Pldi25 Paper579

Uploaded by

azalea.raad
C★: Unifying Programming and Verification in C

ANONYMOUS AUTHOR(S)∗

Ensuring the correct functionality of systems software, given its safety-critical and low-level nature, is a
primary focus in formal verification research and applications. Despite advances in verification tooling,
conventional programmers are rarely involved in the verification of their own code, resulting in higher
development and maintenance costs for verified software. A key barrier to programmer participation in
verification practices is the disconnect of environments and paradigms between programming and verification
practices, which limits accessibility and real-time verification.
We introduce C★, a proof-integrated language design for C programming. C★ extends C with verification
capabilities, powered by a symbolic execution engine and an LCF-style proof kernel. It enables real-time
verification by allowing programmers to embed proof-code blocks alongside implementation code, facilitating
interactive updates to the current proof state. Its expressive and extensible proof support allows users to build
reusable libraries of logical definitions, theorems, and programmable proof automation. Crucially, C★ unifies
implementation and proof code development by using C as the common language.
We implemented a prototype of C★ and evaluated it on a representative benchmark of small C programs and
a challenging real-world case study: the attach function of pKVM’s buddy allocator. Our results demonstrate
that C★ supports the verification of a broad subset of C programming idioms and effectively handles complex
reasoning tasks in real-world scenarios.

Additional Key Words and Phrases: software verification, real-time verification, C programming, LCF-style
theorem proving, separation logic, symbolic execution

1 INTRODUCTION

Background. Systems software forms the infrastructure of modern computing, providing the
low-level foundation on which all higher-level applications operate. Given this critical role, recent
years have seen considerable advances in the formal verification of systems software components [2,
8, 16, 22, 24–27, 33, 36, 40, 41].
In this paper, we focus on the verification of software implemented in the C programming
language, which remains widely used due to its predictable performance, fine-grained control over
system resources, and the vast amount of existing critical code written in it. Significant progress has
been made in verification frameworks and toolchains for C programs [3, 9, 14, 15, 20, 21, 26, 28, 34,
35, 39, 42]. The substantial progress in the development of formally verified software components
and verification tools has demonstrated the feasibility of large-scale verification, and has brought
us closer to the vision where all critical software should be verified [19].
Despite these successes, formally verified software projects remain costly, in the sense that they
require specialized teams with significant expertise, and usually need person-years to complete [22,
26]. For the wider adoption of verification practices, the development and maintenance costs for
verified software must be reduced. One significant source of these costs is the lack
of involvement from programmers, who carry out most of the implementation work yet rarely
participate in the verification of their own code.
Existing approaches. One reason for this lack of involvement is that the verification of C
programs usually requires an external environment, e.g., an interactive theorem prover such as
Coq, which requires programmers to learn a proving paradigm significantly different from the C
programming experience. Examples in this category include AutoCorres [13], VST [3] and its recent
variant in Iris [28], as well as the Live Verification framework [15]. The first three translate existing
C programs into certain logical representations in the meta-logic (a monadic shallow embedding or
a deep embedding) and then require programmers to conduct proofs around these representations
in their underlying theorem prover. The Live Verification framework takes another approach by
composing the program lazily and incrementally along the proof process, relying on the specific
mechanism of existential meta-variables in Coq to represent a partially constructed program.
To encourage more involvement from programmers, other C verification tools provide language-level
integration of programming and verification to make the verification process more accessible
to programmers. There are two main categories: (i) assertion-based verifiers such as Frama-C [21]
and VST-A [42], and (ii) advanced-type-based verifiers such as RefinedC [39] and CN [35]. These tools
allow programmers to annotate a C program with intermediate assertions (e.g., loop invariants)
or advanced types (e.g., ownership and refinement types) in addition to the specifications, to guide
the verification process. These tools are usually (semi-)automated, in the sense that they employ
an assertion or type checker to automatically verify whether the program conforms to the specifications
with the help of the programmer-provided annotations. However, when automation falls short,
programmers again need to switch to an external theorem proving environment to complete
the verification (e.g., in Coq [35, 39, 42]). To mitigate the issue, VeriFast [20], an assertion-based
automated verifier for C programs, provides limited proof support such that programmers can
annotate the program with a fixed set of proof commands and write ghost lemma functions to
perform certain forms of inductive reasoning. However, VeriFast lacks the expressiveness and
extensibility in proof support required for the collaborative verification of low-level systems
software between programmers and proof experts.

Our goal. As discussed above, there is no satisfactory verification tooling for conventional
systems programmers. In this paper, we aim to design and implement a new C verification tool
that satisfies the following two criteria:
• it should provide language-level integration of programming and verification; and
• it should provide comprehensive proving capabilities within C’s programming paradigm.
To further enhance the usability of the C verification tool, we consider one more criterion:
• it should provide support for real-time verification, i.e., the tool should be able to provide a
  static summary of the program state at every program point and allow programmers to inspect
  every intermediate proof state inside a proof.

Our approach. In this paper, we propose C★, a proof-integrated language that embeds full-fledged
verification and proving capabilities in C. We highlight the three key designs of C★ below.
• We adapt the assertion-based design by allowing programmers to annotate a program with
  separation-logic assertions and incorporating forward symbolic execution, which abstracts the
  complexities of concrete semantics and maintains a static summary of the symbolic program
  state after processing a program fragment.
• We integrate the C programming language with LCF-style proof support for higher-order
  logic, which provides a comprehensive and extensible interface for programming formal
  proofs and transforming symbolic states, facilitating the development of high-level reasoning
  abstractions (as proof-support libraries) using the full power of C.
• With the previous two designs, C★ is ready to support real-time verification: forward symbolic
  execution provides a summary of the symbolic program state at every program point, and the
  LCF-style proof support allows programmers to inspect and manipulate the proof state using
  the familiar programming constructs of C.
We implemented a prototype of C★ and evaluated it on a suite of C programs to demonstrate
C★’s practicality for the development of verified programs. Specifically, our evaluation shows that
C★ (i) supports systems programming idioms and a large subset of C language features, (ii) provides
sufficient expressiveness for advanced ownership and functional reasoning, and (iii) is capable of
verifying realistic C programs.

Contributions. In this paper, we make the following contributions:
• We propose a proof-integrated language design that embeds specifications and proof code in C
  programs, provides comprehensive reasoning capabilities within C’s programming paradigm,
  and thus makes formal verification practices accessible to programmers.
• We implemented our design as the C★ toolchain by extending C with two well-established
  components: a symbolic-execution engine and an LCF-style proof kernel, interfacing both to
  create a lightweight yet powerful verification workflow.
• We evaluated our implementation of C★ on a suite of benchmark programs from the literature
  and a realistic case study to show it is effective in developing verified C programs with the
  help of C★’s standard proof-support library.

2 A GUIDED TOUR OF C★
In this section, we present a guided tour for developing a verified C program in C★. We take the
clear function shown in Listing 1 as the running example, whose desired functionality is to reset
len contiguous bytes that start from a base address to. The implementation code of clear consists of
lines 3, 6, 8, 10, 17, 19, 20, 22, and 24; the others are verification-specific code. In the implementation
code, the programmer declares a local variable i in line 8, followed by a loop from line 10 to line 22,
where each loop iteration sets the i-th byte from the base address to to zero and increments i by
one until i reaches the function parameter len.
We explain the verification-specific code in a way that guides the reader through the incremental
development process following C★’s workflow. Our explanation follows the three criteria
mentioned in §1: §2.1 for language-level integration of programming and verification, e.g., how the
user writes specifications and assertions about clear; §2.2 for comprehensive proving capabilities via
C programming, e.g., how the user proves that the implementation of clear conforms to its specification;
and §2.3 for real-time program verification, e.g., how C★ aids the user during the incremental
development.

2.1 Language-level Integration of Programming and Verification
The first step in verifying that a program is correct is to specify what correctness means for it.
It is a common practice to specify a function’s expected behavior by formulating its pre- and
post-conditions, i.e., the expected program states before calling the function and after returning
from it. For the desired functionality of clear, the pre-condition could be that the function parameter
len is non-negative and the other parameter to is a base address that points to a memory chunk of
at least len contiguous bytes. The corresponding post-condition would be that len contiguous bytes
starting from the address to are set to zero. To formulate such conditions formally for low-level
heap-manipulating programs such as clear, we adapt separation logic in C★’s design.

Background (Separation Logic). The development of separation logic [31, 37] is driven by the
desire to verify heap-manipulating low-level programs in a modular manner, especially for handling
the flexibility of aliasing. Its most salient features are (i) a logical connective, the
separating conjunction, which expresses non-aliasing properties between heap fragments in a succinct
way, and (ii) a characteristic proof rule, the frame rule, which extends the program locality of
Hoare logic rules with spatial locality when reasoning about heap-manipulating programs.

 1  #include "cstarlib.h"
 2  #include "clear.h"
 3  void clear(void *to, int len)
 4  [[require(`fact(len >= 0) ** undef_array_at(to, Tchar, len)`)]]
 5  [[ensure(`array_at(to, Tchar, replicate(len, 0))`)]]
 6  {
 7    « term params = `data_at(&"to", Tptr, to) ** data_at(&"len", Tint, len)`; »
 8    int i = 0;
 9    « /* proof: establish invariant */ »
10    while (i < len)
11    [[invariant(`∃(i:integer).
12      fact(0 <= i && i <= len) **
13      data_at(&"i", Tint, i) ** ${params:hprop} **
14      array_at(to, Tchar, replicate(i, 0)) **
15      undef_array_at(to + i * sizeof(Tchar), Tchar, len - i)
16    `)]]
17    {
18      « single_out_location(); »
19      *((char *)to + i) = (char) 0;
20      i = i + 1;
21      « /* proof: re-establish invariant */ »
22    }
23    « /* proof: establish post-condition */ »
24  }
Listing 1. An example verified C program in C★.

Table 1. Concrete notations for some separation-logic predicates used in C★.

Separation-logic Predicate   Concrete Notation      (Simplified) Definition
empty predicate              emp                    𝜆ℎ. ℎ = ∅
embedded proposition         fact(p)                𝜆ℎ. ℎ = ∅ ∧ 𝑝
                             pure(p)                𝜆ℎ. 𝑝
singleton maps-to            data_at(x, ty, v)      𝜆ℎ. ℎ = (𝑥 ↦_ty 𝑣) ∧ 𝑣𝑎𝑙𝑖𝑑_ty(𝑥, 𝑣)
                             undef_data_at(x, ty)   𝜆ℎ. ℎ = (𝑥 ↦_ty _) ∧ 𝑣𝑎𝑙𝑖𝑑_ty(𝑥, _)
separating conjunction       hp1 ** hp2             𝜆ℎ. ∃ℎ₁ ℎ₂. ℎ₁ ⊎ ℎ₂ = ℎ ∧ ℎ𝑝₁ ℎ₁ ∧ ℎ𝑝₂ ℎ₂

Tab. 1 lists the concrete notations for some separation-logic predicates used in C★. The standard
proof-support library of C★ provides some other widely-used predicates for program verification
tasks. For example, array_at(p, ty, lst) represents a consecutive array of elements of C type ty
starting at the address p, where the 𝑛-th element of the array is represented by the 𝑛-th element in
the logic-level list lst. Another example, undef_array_at(p, ty, len), represents an array starting at
address p with len undefined values, each of which is uninitialized or irrelevant to the verification.
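
To make the definitions in Tab. 1 concrete, the following is a minimal executable sketch in plain C; it is not part of C★. It models a heap as a finite map over eight byte-sized addresses and evaluates emp, data_at, undef_data_at, and ** by brute force. All names (struct heap, struct pred, holds, sat) are hypothetical, the 𝑣𝑎𝑙𝑖𝑑 side condition is omitted, and addresses are assumed to be below CELLS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { CELLS = 8 };

/* A toy heap: bit x of dom says whether address x is owned; val[x] is its byte. */
struct heap { unsigned dom; unsigned char val[CELLS]; };

enum kind { EMP, DATA_AT, UNDEF_DATA_AT, STAR };

struct pred {
    enum kind kind;
    unsigned x;                 /* address, for DATA_AT / UNDEF_DATA_AT */
    unsigned char v;            /* value, for DATA_AT */
    const struct pred *l, *r;   /* operands, for STAR */
};

/* Does p hold of the sub-heap of h restricted to the address set dom? */
static bool holds(const struct pred *p, const struct heap *h, unsigned dom)
{
    switch (p->kind) {
    case EMP:
        return dom == 0;
    case DATA_AT:
        assert(p->x < CELLS);
        return dom == (1u << p->x) && h->val[p->x] == p->v;
    case UNDEF_DATA_AT:
        assert(p->x < CELLS);
        return dom == (1u << p->x);
    case STAR:
        /* Try every split of dom into disjoint d1 and dom & ~d1
         * (d1 ranges over all sub-masks of dom, including 0). */
        for (unsigned d1 = dom; ; d1 = (d1 - 1) & dom) {
            if (holds(p->l, h, d1) && holds(p->r, h, dom & ~d1))
                return true;
            if (d1 == 0)
                break;
        }
        return false;
    }
    return false;
}

/* Does p hold of the whole heap h? */
bool sat(const struct pred *p, const struct heap *h)
{
    return holds(p, h, h->dom);
}
```

In this model, a heap owning addresses 0 and 1 satisfies data_at(0, 7) ** data_at(1, 9) when those values are stored there, but never data_at(0, 7) ** data_at(0, 7), because the two conjuncts would have to own the same cell. This is the non-aliasing reading of separating conjunction.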
With separation logic, the user can specify the pre- and post-conditions in lines 4 and 5, respectively:
• C★ uses the C attribute syntax [[require]]¹ to enclose a pre-condition. In line 4, the predicate
  fact(len >= 0) represents an empty heap with the condition that len is non-negative. The
  predicate undef_array_at(to, Tchar, len) represents a memory chunk of len contiguous bytes,
  where Tchar is the logic-level representation of the C type char. Using separating conjunction
  ** to compose the two predicates yields a precise formulation of the intended pre-condition.
• C★ uses [[ensure]] for post-conditions. In line 5, the predicate array_at(to, Tchar, replicate(len, 0))
  represents a memory chunk of len contiguous zeros, where the logic-level term
  replicate(len, 0) creates a list of len zeros. This, again, precisely corresponds to the intended
  post-condition we discussed earlier.

¹ In actual code, all the attributes are prefixed with the [[cstar::]] namespace for disambiguation.

Readers may have noticed the uses of quotations `...` inside pre- and post-conditions. The
quotation mechanism allows the user to construct separation-logic predicates and other logic-level
terms using conventional concrete syntax, which is similar to existing assertion-based C
verifiers. But as we will show in §2.2, these terms are first-class values in C★: beyond being directly
written with quotations, they can be computed from expressions, stored in variables, passed as
arguments, and manipulated using the full capabilities of the C programming language. This is one
key difference between C★ and traditional assertion-based C verifiers.
Writing pre- and post-conditions is far from completing the verification, because it is generally
intractable for an algorithm to automatically verify that the function body “transforms” the
pre-condition into the post-condition. Similar to many existing C verifiers, C★ adapts the assertion-based
design to enable a declarative style of verification:

Principle (Declarative Style of Verification). The user annotates the program with
separation-logic assertions about the expected program states that hold at specific program points.

A particularly important class of assertions are loop invariants. C★ also uses the C attribute
[[invariant]] to accompany a loop with its invariant, i.e., a separation-logic predicate that is expected to hold at
the beginning of each loop iteration. Lines 11–16 specify the loop invariant, which intuitively states
that at the i-th iteration, the value of i should be between zero and len (line 12), local variables
and parameters are stored properly in the memory with Tint being the logic-level representation
of the C type int (line 13), and the base address to points to a memory chunk that starts with i
zeros (line 14) followed by len - i unspecified bytes (line 15). Note that in line 13 the code uses an
anti-quotation ${params:hprop} to interpolate a predicate defined in line 7. This again indicates that
predicates are first-class values; we defer the discussion of anti-quotations to §2.2.
With extra assertions including invariants, an assertion-based verifier usually splits the verification
into multiple sub-tasks, each of which corresponds to proving a Hoare triple for a straight-line
program segment. For example, to verify that clear’s implementation code conforms to the pre-
and post-conditions, it is sufficient to complete three sub-tasks (i.e., verifying three Hoare triples):
• prove that the code in line 8 transforms the pre-condition into the loop invariant (line 9);
• prove that the loop body (lines 19 and 20) re-establishes the loop invariant (line 21); and
• prove that the loop invariant, with the loop condition (in line 10) being false, entails the
  post-condition (line 23).
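
The three sub-tasks can be mirrored at run time in plain C. The sketch below is only an analogy under stated assumptions: assert checks a single execution rather than proving all of them, the helper names (first_bytes_zero, invariant_holds, clear_checked) are hypothetical, and the ownership conjuncts of the invariant (data_at, undef_array_at) have no runtime counterpart here.

```c
#include <assert.h>
#include <stdbool.h>

/* Executable reading of array_at(to, Tchar, replicate(n, 0)):
 * the first n bytes starting at `to` are all zero. */
static bool first_bytes_zero(const char *to, int n)
{
    for (int k = 0; k < n; k++)
        if (to[k] != 0)
            return false;
    return true;
}

/* The pure and array_at parts of the loop invariant, checked dynamically. */
static bool invariant_holds(const char *to, int len, int i)
{
    return 0 <= i && i <= len && first_bytes_zero(to, i);
}

void clear_checked(void *to, int len)
{
    assert(len >= 0);                         /* pre-condition: fact(len >= 0) */
    int i = 0;
    assert(invariant_holds(to, len, i));      /* sub-task 1: invariant established */
    while (i < len)
    {
        *((char *)to + i) = (char) 0;
        i = i + 1;
        assert(invariant_holds(to, len, i));  /* sub-task 2: invariant re-established */
    }
    /* sub-task 3: the invariant together with !(i < len) gives i == len,
     * from which the post-condition follows */
    assert(i == len && first_bytes_zero(to, len));
}
```

Calling clear_checked(buf, 4) on a five-byte buffer zeroes the first four bytes and leaves the fifth untouched; any violated assertion would abort the run at the corresponding sub-task.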
Each sub-task, i.e., the proof of each Hoare triple { 𝑃 } 𝑆 { 𝑄 }, involves two parts: (i) reasoning
about the semantics of the program 𝑆, i.e., finding the strongest post-condition 𝑄_sp of 𝑆 w.r.t. the
pre-condition 𝑃, and (ii) carrying out an entailment proof, i.e., proving that 𝑄_sp entails 𝑄. Instead
of employing automated provers for both parts, as many other assertion-based verifiers do, C★
adapts a predictable mechanism of automation by integrating forward symbolic execution to reason
about program semantics, i.e., part (i) of each verification sub-task.

Background (Forward Symbolic Execution). Since Berdine et al. [4], separation logic has been
effectively interpreted as a sound basis for forward symbolic execution, which closely matches
a programmer’s operational intuition about the effects of statements on program states, while
abstracting away concrete semantic details. It achieves this by providing a highly predictable
algorithm for automatically applying structural program logic rules for separation-logic predicates
in suitable forms, i.e., the symbolic heap fragment of separation logic [5].
Using a symbolic-execution engine, C★ computes a symbolic state for each program point. The
symbolic state consists of the values of the program variables and the view of the heap fragments
that are worked on and owned by the program. For example, at the beginning of the function
body of clear (in line 6), symbolic execution uses the pre-condition annotated in line 4 to initialize
the symbolic state to be the same as the following separation-logic predicate, which additionally
contains data_at predicates for the function parameters:

  fact(len >= 0) ** undef_array_at(to, Tchar, len) **
  data_at(&"to", Tptr, to) ** data_at(&"len", Tint, len)

Here Tptr is the logic-level representation of C’s pointer types.
If symbolic execution were always successful, the remaining proof obligations for the user
would all be separation-logic entailments, i.e., part (ii) of each verification sub-task. In §2.2, we
will show C★’s capabilities in supporting the user to develop the logical proofs. On the other hand,
unfortunately, the price of having a predictable symbolic-execution engine is that it will not try to
automate the reasoning and transformations on the symbolic state that a user might find intuitive
to perform. For example, in line 19, the statement assigns to the address ((char *)to + i), but the
symbolic state (the loop invariant in this case) does not explicitly describe the memory cell pointed
to by the address. As a result, C★’s symbolic-execution engine cannot (yet) automatically process
the assignment. In §2.2, nevertheless, we will show how C★’s proving capabilities provide an
operational style of verification, where users can convey and formalize their high-level intuitive
ideas on manipulating the symbolic state.

2.2 Comprehensive Proving Capabilities via C Programming
As discussed in §2.1, C★’s proving capabilities should support its users in the following two tasks:
• developing logical proofs for entailments, and
• manipulating symbolic states inside the implementation code.
In particular, C★’s support should satisfy a key criterion:
• allow the user to programmably develop logical proofs and manipulate symbolic states, using
  C’s conventional programming constructs.
To achieve the aforementioned goals, we adapt LCF-style theorem proving in C★.

Background (LCF-style Theorem Proving). The LCF architecture is a general technique for embedding
formal logics into a programming language. Pioneered by Robin Milner and colleagues
in the early work on the Edinburgh LCF theorem prover [12, 18], its descendants are still widely
used today [17, 30]. In LCF-style provers, a general-purpose programming language is used as the
meta-language to implement object logic entities such as terms, types, and theorems. These are represented
as recursive data structures, making formal proof a programming process of constructing
theorems from a set of axioms using primitive inference rules. Specifically, the axioms are encoded
as constants, and the inference rules are implemented as functions that take premises and return
conclusions as theorems if the rules can be successfully instantiated.
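
The LCF idea can be illustrated with a deliberately tiny kernel in plain C. This is a hypothetical sketch, not C★'s kernel or its C interface: the object logic is implicational propositional logic with the axiom schemas K and S, and every name (term, thm, mk_atom, mk_imp, term_eq, axiom_k, axiom_s, modus_ponens) is made up for illustration. Allocated nodes are never freed, which is acceptable only for a short-lived example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Object-logic terms: propositional atoms and implications. */
typedef struct term {
    bool is_imp;
    int atom;                      /* used when !is_imp */
    const struct term *lhs, *rhs;  /* used when is_imp */
} term;

/* A theorem wraps a proven term. By convention only the kernel functions
 * below create thm values; a real kernel would enforce this by keeping the
 * struct definition private to one translation unit. */
typedef struct thm { const term *concl; } thm;

static const term *mk_term(bool is_imp, int id, const term *l, const term *r)
{
    term *t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->is_imp = is_imp; t->atom = id; t->lhs = l; t->rhs = r;
    return t;
}
const term *mk_atom(int id) { return mk_term(false, id, NULL, NULL); }
const term *mk_imp(const term *a, const term *b) { return mk_term(true, 0, a, b); }

/* Structural equality of terms. */
bool term_eq(const term *a, const term *b)
{
    if (a->is_imp != b->is_imp) return false;
    if (!a->is_imp) return a->atom == b->atom;
    return term_eq(a->lhs, b->lhs) && term_eq(a->rhs, b->rhs);
}

static const thm *mk_thm(const term *concl)  /* kernel-internal constructor */
{
    thm *th = malloc(sizeof *th);
    if (th == NULL) abort();
    th->concl = concl;
    return th;
}

/* Axiom schema K:  |- a ==> (b ==> a) */
const thm *axiom_k(const term *a, const term *b)
{
    return mk_thm(mk_imp(a, mk_imp(b, a)));
}

/* Axiom schema S:  |- (a ==> (b ==> c)) ==> ((a ==> b) ==> (a ==> c)) */
const thm *axiom_s(const term *a, const term *b, const term *c)
{
    return mk_thm(mk_imp(mk_imp(a, mk_imp(b, c)),
                         mk_imp(mk_imp(a, b), mk_imp(a, c))));
}

/* Modus ponens: from |- a ==> b and |- a, conclude |- b.
 * Returns NULL when the premises do not fit the rule. */
const thm *modus_ponens(const thm *ab, const thm *a)
{
    assert(ab != NULL && a != NULL);
    if (!ab->concl->is_imp || !term_eq(ab->concl->lhs, a->concl))
        return NULL;
    return mk_thm(ab->concl->rhs);
}
```

With this kernel, a proof is an ordinary C computation. For an atom a, applying modus_ponens to axiom_s(a, a ==> a, a) and axiom_k(a, a ==> a), and then to the result and axiom_k(a, a), yields the theorem |- a ==> a; a mismatched application simply returns NULL instead of a theorem.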

In C★, we integrate an LCF-style proof kernel with higher-order logic as the object logic. The
kernel is wrapped by a C interface; in other words, the meta-language in C★’s design is the
standard C programming language. To distinguish the code for developing logical proofs and
manipulating symbolic states from ordinary implementation code, C★ introduces proof-code blocks
delimited by the «...» syntax.² Arbitrary C code is allowed in proof-code blocks, with the ability
to introduce bindings and construct values of type term and thm, corresponding to object-logic
terms and theorems, respectively. Recall that in C★, separation-logic predicates
are first-class values. Indeed, they are just term values of object-logic type hprop (short for heap
propositions). The quotation and anti-quotation mechanisms are thereby introduced to conveniently
construct term values. For example, in line 7 of Listing 1, the code stores the data_at predicates
regarding ownership of the parameters, wrapped by a quotation `...`, in a variable called params.
In the following proof-code blocks and assertions (e.g., invariants), the user can use params as if it were
a normal program variable.³ In line 13, the code indeed uses it; combined with the anti-quotation
${params:hprop}, it reduces redundancy when writing the loop invariant.

² In actual code, we use the [[cstar::proof(...)]] attribute to embed proof-code blocks.

Because separation-logic predicates are just values in C★, it becomes natural to write C code to
manipulate symbolic states, which are special kinds of separation-logic predicates. This capability
of C★ enables an operational style of verification:

Principle (Operational Style of Verification). The user manipulates the symbolic state
using arbitrary C code, provided they can give justifications, i.e., theorems for the corresponding
separation-logic entailments, for the manipulations they made.
For example, the operational style of verification is applied in line 18. Here, the symbolic state
computed by the symbolic-execution engine is equivalent to the following predicate:

  ∃(i:integer). fact(i < len) ** fact(0 <= i && i <= len) **
  data_at(&"i", Tint, i) ** data_at(&"to", Tptr, to) ** data_at(&"len", Tint, len) **
  array_at(to, Tchar, replicate(i, 0)) ** undef_array_at(to + i * sizeof(Tchar), Tchar, len - i)

As discussed above about symbolic execution, symbolically executing the next statement (in line
19) would fail, because the symbolic state does not explicitly describe the address ((char *)to + i).
Intuitively, the user should “transform” the symbolic state to some form containing
undef_data_at(to + i * sizeof(Tchar), Tchar) as a separating conjunct, which represents the ownership of the memory
location being stored into. By inspecting the symbolic state, the user can see that a transformation
of the view of the heap is needed to satisfy the requirement, namely splitting the undef_array_at predicate
into the separating conjunction of its head element (as an undef_data_at predicate) and the rest of
the slice (as an undef_array_at predicate with a smaller length and starting at a larger offset). The
split is valid because i < len holds in the current state, which entails len - i > 0,
meaning the slice is non-empty. The corresponding justification for this intuitive
transformation, written as proof code, is wrapped in the proof procedure single_out_location.

The implementation of the proof procedure single_out_location is shown in Listing 2. It demonstrates
how C★ allows proving separation-logic entailments using conventional C programming
constructs and high-level derived rules from the proof-support libraries, making the intuitive
reasoning process easy to implement as proof code. Thanks to the LCF-style design, logical rules
are just C functions that return thm values, which represent proven theorems.
• Derive the transformation rule (lines 2–7). The undef_array_at_select_first theorem, from the
  proof-support library clear.h included by the C★ program in Listing 1, asserts that for any
  uninitialized array, we can single out the first element and treat the remainder as another
  uninitialized array, provided the length of the array is greater than zero. The resulting theorem
  from this derived-rule application is shown by the comment in lines 4–7, where ==> denotes
  standard logical implication and |-- denotes separation-logic entailment.
• Rewrite using linear arithmetic facts (lines 8–14). The rewrite_rule_list function, from the standard
  proof-support library cstarlib.h, takes a NULL-terminated array of equational theorems
  and another theorem, and rewrites the second argument using the equational theorems. It is
  used here to rewrite the theorem using linear arithmetic facts, which are derived automatically
  by calling the arith_rule function, to align with the predicates in the current symbolic state.
  The array-destructing theorem dest_undef_array is re-assigned to the rewritten theorem
  by the statement in line 13.

³ All proof blocks in a function body are in the same scope, and global proof-code blocks (that is, those outside of any function
body) are file-scoped. Local scopes can be created using C blocks {...}. See §4.1 for more details.

 1  void single_out_location(void) {
 2    thm dest_undef_array =
 3      undef_array_at_select_first(`to + i * sizeof(Tchar)`, `Tchar`, `len - i`);
 4    /* len - i > 0 ==>
 5       undef_array_at(to + i * sizeof(Tchar), Tchar, len - i) |--
 6       undef_data_at(to + i * sizeof(Tchar), Tchar) **
 7       undef_array_at((to + i * sizeof(Tchar)) + sizeof(Tchar), Tchar, len - i - 1) */
 8    thm arith_facts[] = {
 9      arith_rule(`len - i > 0 <=> i < len`),
10      arith_rule(`len - i - 1 == len - (i + 1)`),
11      arith_rule(`(to + i * sizeof(Tchar)) + sizeof(Tchar) ==
12                  to + (i + 1) * sizeof(Tchar)`), NULL };
13    dest_undef_array = rewrite_rule_list(arith_facts, dest_undef_array);
14    /* rewrite using linear arithmetic facts */
15    thm final_thm = local_apply(get_symbolic_state(), dest_undef_array);
16    /* perform local transformation with frame inferred from the symbolic state */
17    set_symbolic_state(final_thm); /* update the symbolic state */
18  }
Listing 2. Proof procedure: single out the first element of the uninitialized slice.

• Perform local transformation with frame inferred from the symbolic state (lines 15–17). The
  local_apply function is an important derived rule in cstarlib.h. It allows programmers to perform
  a local transformation with the frame being inferred from the symbolic state. Specifically,
  it takes two arguments: the current symbolic state and a local-transformation theorem. In
  line 15, we fetch the current symbolic state using the built-in get_symbolic_state function,
  and use the local_apply function to perform a local transformation justified by the theorem
  dest_undef_array. The result of the transformation is then passed back to the symbolic-execution
  engine using the built-in set_symbolic_state function in line 17.
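
The three arithmetic facts handed to arith_rule in Listing 2 can be sanity-checked in plain C. The sketch below is a test over a small range of values, not a proof, and it does not use any C★ API: the function names are hypothetical, `to` is modeled as a plain integer address, and `size` stands in for sizeof(Tchar). The range is kept small so that no signed overflow can occur.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks the three linear-arithmetic facts from Listing 2 for one choice of
 * values. `to` is an integer address and `size` stands in for sizeof(Tchar);
 * both are assumptions of this sketch. */
bool arith_facts_hold(long to, long i, long len, long size)
{
    bool f1 = ((len - i > 0) == (i < len));                     /* line 9 */
    bool f2 = (len - i - 1 == len - (i + 1));                   /* line 10 */
    bool f3 = ((to + i * size) + size == to + (i + 1) * size);  /* lines 11-12 */
    return f1 && f2 && f3;
}

/* Exhaustively tests all values with magnitude at most `bound`. */
bool arith_facts_hold_on_range(long bound)
{
    assert(bound >= 0);
    for (long to = 0; to <= bound; to++)
        for (long i = -bound; i <= bound; i++)
            for (long len = -bound; len <= bound; len++)
                if (!arith_facts_hold(to, i, len, 1))
                    return false;
    return true;
}
```

The contrast with Listing 2 is the point of the example: testing gives confidence on the sampled values only, whereas arith_rule returns a thm value that the kernel accepts for all values of to, i, and len.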

2.3 Real-time Program Verification
In the previous two sections, we used the verification of the function clear to illustrate (i) how C★
incorporates separation logic and forward symbolic execution to provide language-level integration
of programming and verification, as well as (ii) how C★ integrates LCF-style proof support for
higher-order logic to provide comprehensive proving capabilities within C’s programming paradigm.
In particular, separation-logic predicates are first-class values and proof-code blocks can manipulate
symbolic states and proof states.
We claim that C★ achieves real-time program verification, i.e., the user can carry out verification
as they program the implementation code incrementally. It achieves this goal by orchestrating the
symbolic-execution engine and the LCF-style proof kernel together, creating a proof-supporting
runtime that runs proof-code blocks and symbolic execution of program segments in an interleaved
manner, and provides the symbolic state at every program point in implementation code as well as
the proof state in proof-code blocks.
[Fig. 1: workflow diagram. The user's C★ program (implementation code, specifications and intermediate assertions, proof-code blocks) is sliced into assertion-annotated code segments and its proof code is assembled into the operational proof program; symbolic execution generates verification conditions, which together with proof code form the residual proof program. Both proof programs run on top of the LCF-style proof kernel. The three phases are labeled I (Translation), II (Operational Proof Checking), and III (Residual Proof Checking).]
Fig. 1. The ideal workflow of C★’s proof-checking phase.

Firstly, C★ is capable of providing the symbolic state at every program point, given that (i) every function has pre- and post-conditions, (ii) every loop has an invariant, and (iii) the required maps-to predicates are present in the symbolic state before executing a primitive statement. This is achieved by a combination of forward symbolic execution for separation logic and the ability to write proof-code blocks to manipulate the symbolic state. The symbolic state in the symbolic-execution engine is always represented as a separation-logic assertion in a canonical form, known as symbolic heaps [4]. When a statement (e.g., an assignment) is symbolically executed, the symbolic-execution engine requires that the primitive maps-to predicate (i.e., data_at or undef_data_at) for each accessed memory location be present in the current symbolic heap as a separating conjunct; if so, the engine modifies the symbolic heap locally [20]. For example, when executing the store statement *((char *)to + i) = (char) 0 in line 19 of Listing 1, the symbolic-execution engine confirms that the current symbolic heap contains the assertion `undef_data_at(to + i * sizeof(Tchar), Tchar)`, representing the ownership of the memory location being stored into, and then replaces the predicate with `data_at(to + i * sizeof(Tchar), Tchar, 0)`, reflecting the effect of the store statement. The other predicates in the symbolic heap are left unchanged, as justified by the frame rule of separation logic. It is worth noting that for the symbolic-execution engine to achieve this, C★ needs to first execute the proof-code block in line 18 of Listing 1 to transform the symbolic heap accordingly, before symbolically executing the store statement.
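The local update performed for a store can be pictured with a small toy model. The sketch below is not the C★ engine: the `conjunct` type, the `sym_store` function, and the use of concrete integers for addresses are all assumptions made for illustration. It only shows the idea that a store needs a maps-to conjunct for the target address, turns it into a `data_at` fact, and leaves every other conjunct (the frame) untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not the C* engine): a symbolic heap is a flat
   array of maps-to conjuncts joined by separating conjunction. */
typedef enum { UNDEF_DATA_AT, DATA_AT } pred_kind;

typedef struct {
    pred_kind kind;  /* undef_data_at or data_at */
    long addr;       /* address, simplified here to a concrete integer */
    long value;      /* meaningful only when kind == DATA_AT */
} conjunct;

/* Symbolically execute `*addr = value`.
   Returns false when no maps-to conjunct for addr is present, i.e. the
   store cannot be justified by the current symbolic heap. */
static bool sym_store(conjunct *heap, size_t n, long addr, long value)
{
    assert(heap != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (heap[i].addr == addr) {
            heap[i].kind = DATA_AT;  /* ownership is kept, contents become known */
            heap[i].value = value;
            return true;             /* all other conjuncts are the frame */
        }
    }
    return false;
}
```

In this model, a failed `sym_store` corresponds to the situation described above in which a proof-code block must first expose the required maps-to predicate before the store can be executed.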
Secondly, C★’s runtime environment for running proof-code blocks is capable of providing the proof state in proof-code blocks. The proof state records the concrete values of term and thm variables declared in proof-code blocks, as well as proof functions and theorems included from proof-support libraries. For example, when a user is developing the function clear and writes line 18 of Listing 1 to call single_out_location, the environment of the proof-supporting runtime should be able to find the function definition of single_out_location in Listing 2. This ability comes from LCF-style theorem proving, where proofs are ordinary programs that manipulate terms and theorems. Thus, C★ can assemble all the proof-code blocks inside a function, together with the proof-support functions they depend on, into a C program, compile it, and execute it to record the concrete values of variables.

3 CORE DESIGN
In this section, we present the core design of C★ from two perspectives. Following the discussion in §2.3, we explain C★’s internal mechanisms from the developer’s perspective in §3.1, i.e., C★’s verification-specific workflow and its proof-supporting runtime. Following the guided tour in §2.1 and §2.2, we describe C★’s language features from the user’s perspective in §3.2, i.e., C★’s verification-specific interface and its proving capabilities.

3.1 Workflow of C★ Toolchain
The C★ toolchain consists of three components: the C★ compiler, the LCF-style proof kernel, and the symbolic-execution engine. The interaction between these components during the proof-checking phase of a C★ program is illustrated conceptually in Fig. 1. After proof checking, the deployment phase becomes straightforward: because all verification-specific annotations are wrapped in C attributes, the verified program can be compiled directly with C compilers such as gcc or clang.
The proof-checking phase of a C★ program can be summarized as a three-stage process: the translation stage, the operational proof checking stage, and the residual proof checking stage.

(a) Operational Proof Program

     1  code_segment_t seg1 = /*
     2  void clear(void *to, int len)
     3    [[require(...)]]
     4    [[ensure(...)]]
     5  { */;
     6  code_segment_t seg2 = /*
     7    int i = 0;
     8    while (i < len)
     9      [[invariant(...)]]
    10    { */;
    11  code_segment_t seg3 = /*
    12      *((char *)to + i) = (char) 0;
    13      i = i + 1;
    14    }
    15  } */;
    16  int main(void) {
    17    feed_program_segment(seg1);
    18    term params = ...; // line 7 of Listing 1
    19    feed_program_segment(seg2);
    20    single_out_location(); // line 18 of Listing 1
    21    feed_program_segment(seg3);
    22  }

(b) Residual Proof Program

     1  term vc1 = /* establish invariant
     2                (line 9 of Listing 1) */;
     3  term vc2 = /* re-establish invariant
     4                (line 21 of Listing 1) */;
     5  term vc3 = /* establish post-condition
     6                (line 23 of Listing 1) */;
     7
     8  thm proof1() {
     9    /* user-provided proof code for vc1 */
    10  }
    11  thm proof2() {
    12    /* user-provided proof code for vc2 */
    13  }
    14  thm proof3() {
    15    /* user-provided proof code for vc3 */
    16  }
    17
    18  int main(void) {
    19    assert_prove(proof1(), vc1);
    20    assert_prove(proof2(), vc2);
    21    assert_prove(proof3(), vc3);
    22  }

Fig. 2. Demonstration of C★’s workflow using the running example in §2.

Translation stage. In the first stage, the C★ compiler processes the input C★ program and produces an operational proof program. The input C★ program consists of three parts: (i) implementation code, (ii) annotations for specifications (e.g., [[require]] and [[ensure]] attributes) and intermediate assertions (e.g., [[invariant]] attributes), as well as (iii) embedded proof-code blocks. During translation, the C★ compiler combines part (i) and part (ii) to form the annotated C code, slices it into segments that are separated by proof-code blocks, and stores the segments as serializable data structures in the proof code. For part (iii), i.e., the proof-code blocks, the C★ compiler assembles them into the main code for execution. The compiler also handles syntax extensions such as quotation and anti-quotation, translating them into applications of term-constructing functions.
Fig. 2 demonstrates C★’s workflow using the verification of the clear function shown in §2. Note that here we treat the commented-out proof-code blocks in lines 9, 21, and 23 of Listing 1 as absent, i.e., as not inserted into the implementation code. Fig. 2a is the assembled operational proof program: lines 1–15 are three code segments split by the proof-code blocks in lines 7 and 18 of Listing 1. In the main function, we use the built-in function feed_program_segment to feed a code segment to the symbolic-execution engine, as explained below for the second stage.
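The slicing step described above can be approximated by a purely textual splitter. The following is a minimal sketch under strong assumptions: the function name `nth_segment` is hypothetical, proof-code blocks are assumed to be written literally as `[[proof(` ... `)]]`, and a proof body is assumed never to contain the character sequence `)]]`. The actual C★ compiler works on parsed code rather than raw strings, so this only illustrates what "segments separated by proof-code blocks" means.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the k-th code segment (0-based) of src into out. Segments are the
   stretches of text between [[proof( ... )]] blocks.
   Returns 0 on success, -1 if k is out of range, a proof block is
   unterminated, or out is too small. */
static int nth_segment(const char *src, size_t k, char *out, size_t cap)
{
    const char *open = "[[proof(";
    const char *close = ")]]";
    const char *seg = src;

    for (;;) {
        const char *p = strstr(seg, open);            /* next proof block */
        const char *end = p ? p : seg + strlen(seg);  /* end of this segment */
        if (k == 0) {
            size_t len = (size_t)(end - seg);
            if (len >= cap) return -1;
            memcpy(out, seg, len);
            out[len] = '\0';
            return 0;
        }
        if (!p) return -1;                  /* ran out of segments */
        const char *q = strstr(p, close);
        if (!q) return -1;                  /* unterminated proof block */
        seg = q + strlen(close);            /* continue after the block */
        k--;
    }
}
```

With two proof-code blocks in the input, the splitter yields three segments, matching the shape of seg1, seg2, and seg3 in Fig. 2a.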

Remark 3.1 (Reversed role of verification-specific annotations). Whereas in the deployment phase embedded proof-code blocks are ignored by the compiler, in the proof-checking phase they play a central role. Here, the proof code assembled from embedded proof blocks becomes the main code for execution, while the annotated C code is sliced into segments and treated as serializable data, to be fed to the symbolic-execution engine interactively.

Operational proof checking stage. In the second stage, the C★ workflow executes the operational proof program obtained from the translation stage. As discussed in §2, C★ supports two styles of program verification, namely declarative and operational:

• In the declarative style, the user asserts expected symbolic states at specific program points. The symbolic-execution engine uses the asserted symbolic state for further execution and produces verification conditions as output. These verification conditions are gathered and proved in batch later, in the residual proof checking stage.
• In the operational style, the user actively manipulates the symbolic state by writing and executing proof-code blocks. In this way, the operational proof checking stage is naturally real-time: the proof-code blocks are executed interleaved with symbolic execution, feeding the annotated program segments incrementally to the symbolic-execution engine.

More specifically, the C★ workflow handles the execution of each proof block as follows. First, the preceding annotated program segment, represented as serializable data, is fed to the symbolic-execution engine. Next, the proof-code block, which is normal C code, is executed in the proof-supporting runtime (see Remark 3.2 below). Finally, the current symbolic state is updated according to the execution results of the proof-code block: recall that a proof-code block should fetch the symbolic state by calling get_symbolic_state, transform it with proofs, and then call set_symbolic_state at the end to update the symbolic state in the symbolic-execution engine. The proof program shown in Fig. 2a calls these functions implicitly, inside the code of single_out_location, i.e., Listing 2.
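The interleaving just described can be traced with stand-in functions. The sketch below is only a mock-up: the `toy_` functions are hypothetical replacements for the built-ins named in the text, the symbolic state is reduced to a string, and `toy_set_symbolic_state` takes the new state directly rather than a theorem. It records the order in which segments are fed and the state is updated, mirroring the main function of Fig. 2a.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; the real built-ins talk to the engine. */
static char sym_state[128] = "pre";
static char trace[256] = "";

static void log_event(const char *e)
{
    strncat(trace, e, sizeof trace - strlen(trace) - 1);
    strncat(trace, ";", sizeof trace - strlen(trace) - 1);
}

static void toy_feed_program_segment(const char *seg) { log_event(seg); }

static const char *toy_get_symbolic_state(void) { return sym_state; }

static void toy_set_symbolic_state(const char *s)
{
    snprintf(sym_state, sizeof sym_state, "%s", s);
    log_event("set");
}

/* A proof-code block: fetch the state, transform it, write it back. */
static void toy_single_out_location(void)
{
    char next[128];
    snprintf(next, sizeof next, "%.100s+single_out", toy_get_symbolic_state());
    toy_set_symbolic_state(next);
}

/* Same shape as main() in Fig. 2a: segments and proof code alternate. */
static void run_operational_proof(void)
{
    toy_feed_program_segment("seg1");
    toy_feed_program_segment("seg2");
    toy_single_out_location();
    toy_feed_program_segment("seg3");
}
```

After `run_operational_proof()` returns, `trace` shows that the state update happens between the second and third segment, which is the point where the store statement needs the transformed symbolic heap.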

Remark 3.2 (Proof-supporting runtime). The proof programs are executed in C★’s proof-supporting runtime, which is the standard C runtime interfaced with the LCF-style proof kernel and the symbolic-execution engine. The LCF-style proof kernel implements the basic building blocks used in proof code. The symbolic-execution engine takes annotated C program segments as input (via the built-in function feed_program_segment), maintains the symbolic state of the current partial program internally (accessed and modified via the get_symbolic_state and set_symbolic_state built-in functions), and produces verification conditions as output when symbolic execution is completed.

Residual proof checking stage. In the third stage, the C★ compiler collects the output of the symbolic-execution engine and creates the residual proof program, which is a C★ program consisting purely of global proof-code blocks. This program contains proof goals for every undischarged verification condition generated during symbolic execution, which are to be addressed in the C★ proof environment, either by the programmers or with assistance from proof experts. These verification conditions arise from the declarative style of verification: recall that at each assertion or invariant, C★ users are obliged to prove the entailment from the symbolic state maintained by the symbolic-execution engine to the asserted state. Fig. 2b shows the residual proof program for the running example in §2: lines 1–6 are three verification conditions generated by the symbolic-execution engine, lines 8–16 are user-provided proof code for the three verification conditions, respectively, and the main function executes the proof code to check whether it indeed proves the verification conditions.
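The check performed by the residual proof program can be sketched as comparing what each proof establishes against its verification condition. This is a toy model under stated assumptions: terms are plain strings, a theorem is just its conclusion, and `toy_assert_prove` and `check_residual` are hypothetical names. In a real LCF-style kernel a theorem could not be fabricated from a string as the two sample proofs do here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *term;             /* toy: a term is its printed form */
typedef struct { term concl; } thm;   /* toy: a theorem is its conclusion */
typedef thm (*proof_fn)(void);

/* Returns 1 when the theorem states exactly the verification condition. */
static int toy_assert_prove(thm th, term vc)
{
    return th.concl != NULL && strcmp(th.concl, vc) == 0;
}

/* Run every proof against its VC; return how many VCs stay undischarged. */
static size_t check_residual(const proof_fn *proofs, const term *vcs, size_t n)
{
    size_t open = 0;
    for (size_t i = 0; i < n; i++)
        if (!toy_assert_prove(proofs[i](), vcs[i]))
            open++;
    return open;
}

/* Two sample "proofs": one matches its VC below, the other does not. */
static thm proof_ok(void)  { thm t = { "inv |-- inv" }; return t; }
static thm proof_bad(void) { thm t = { "emp |-- emp" }; return t; }
```

A result of zero from `check_residual` plays the role of all `assert_prove` calls in Fig. 2b succeeding.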

Remark 3.3 (C★ developer’s view). From the perspective of a developer, C★ can be seen as an LCF-style higher-order-logic theorem prover embedded within C, tailored for C program verification. It relies on a trusted symbolic-execution engine that serves as an oracle, which automatically derives strongest post-conditions and generates verification conditions from C code segments annotated with separation-logic assertions and specifications. Successfully executing the two proof programs, i.e., operational and residual, is thereby equivalent to verifying the Hoare triple of the entire program, which guarantees that the program conforms to its specification.

3.2 Interface for C★ Users
As overviewed in §2, C★ extends the C programming language with two categories of verification-specific syntactic constructs: (i) specifications and intermediate assertions, and (ii) proof-code blocks. Fig. 3 summarizes these constructs, providing an accessible interface for C★ users. In this section, we explain our design of this interface and at the end exemplify C★’s extensibility in proof support using the implementation of local_apply from C★’s standard proof-support library.

Verification-specific attributes. C★ introduces attributes [[require]], [[ensure]], [[parameter]], and [[argument]] concerning function specifications, [[assert]] and [[invariant]] for intermediate assertions within implementation code, as well as [[proof]] for embedding proof-code blocks.
The attributes [[require]] and [[ensure]] specify a function’s pre-condition and post-condition, respectively, both containing C expressions that evaluate to a term value of object-logic type hprop, i.e., a separation-logic predicate. Both the pre- and post-condition can reference function parameters, and the post-condition can additionally reference the function’s returned value using the reserved symbol __result. The attribute [[parameter(`var:type`)]] introduces a ghost parameter var of object-logic type type that denotes a universally quantified logical variable for the function specification. Correspondingly, the attribute [[argument(`var=value`)]], used before a function call, instantiates the ghost variable with an object-logic term. Fig. 3 illustrates the usage of parameter and argument attributes using a C function reverse that reverses a linked list in-place: the parameter l is a logic-level integer list that encodes the content of the linked list pointed to by p, where the (user-defined) separation-logic predicate ll_repr(p, l) expresses such encoding.
Similar to [[require]] and [[ensure]], the [[assert]] and [[invariant]] attributes take a C expression as input, which evaluates to a separation-logic predicate. The [[assert]] attribute inserts a static assertion about the symbolic state at a program point, supporting a declarative verification style: if non-trivial reasoning is required to prove that the maintained symbolic state entails the asserted state, a verification condition is generated by the symbolic-execution engine. This workflow is explained in detail in §3.1. After processing an assertion, the symbolic-execution engine updates the symbolic state accordingly. The [[invariant]] attribute also inserts an assertion, but it asserts a symbolic state expected at the start of each loop iteration, hence its name. Currently, only while loops are supported, and using break or continue leads the symbolic-execution engine to generate additional verification conditions for the additional control-flow paths.
The [[proof]] attribute wraps a proof-code block. Proof-code blocks may contain arbitrary C code, having access to the LCF-style proof kernel and the symbolic-execution engine. There are two kinds of proof-code blocks: (i) local proof blocks, used within implementation code (e.g., inside a function body), primarily for operational verification and symbolic-state transformation, and (ii) global proof blocks, used outside any implementation code, typically for defining common proof functions or theorems. Typically, all code in a proof-support library (e.g., cstarlib.h) is within global proof blocks. Within a local proof block, the user can reference bindings declared in prior proof blocks in the same function body, as well as bindings declared in global proof blocks.
Specifications and intermediate assertions. Inside the verification-specific attributes, C★ provides a quotation syntax (delimited by `...`). It allows the user to construct object-logic terms using concrete syntax representations and avoid the verbosity of calling term constructors explicitly. Inside quotations, the user can use the anti-quotation mechanism (escaped using ${var:type}) for splicing in computed sub-terms stored in program variables. Together, these two syntax extensions offer a simple yet expressive way to build object-logic terms. Fig. 3 summarizes the concrete syntax for separation-logic predicates and other frequently used object-logic terms and types. There are a few unusual notational conventions, which arise from the LCF-style proof kernel employed by C★. In the object logic, integer literals (i.e., terms of object-logic type integer) take the form &n, where n is a natural number. The logic-level representation of an address is an integer value, e.g., &"x" denotes the address of the program variable named x. We reuse C’s && and || operators to encode logic-level conjunction and disjunction, respectively, and use ==> for standard logical implication. Separation-logic entailments are treated as propositions (i.e., terms of object-logic type bool): the binary operator |-- takes two separation-logic predicates hp1 and hp2 and produces a proposition hp1 |-- hp2, whose meaning is that if a heap satisfies hp1, then it also satisfies hp2.

Verification-specific Attributes

    /* function specification */
    struct ll_node *reverse(struct ll_node *p)
      [[parameter(`l:int_list`)]]
      [[require(`ll_repr(p, l)`)]]
      [[ensure(`ll_repr(__result, rev(l))`)]]
    ;
    struct ll_node *q;
    [[argument(`l=cons(&1,nil())`)]]
    q = reverse(p);

    /* intermediate assertion */
    int n = 42;
    while (n > 0)
      [[invariant(`exists (n:integer).
                   fact(n >= &0) **
                   data_at(&"n", Tint, n)`)]]
    { n = n - 1; }
    [[assert(`data_at(&"n", Tint, &0)`)]];

    /* proof-code block */
    [[proof(
      thm th = arith_rule(
        `n > &0 ==> n - &1 >= &0`);
    )]];

Specifications & Intermediate Assertions

    Separation-logic Predicates
    /* object-logic type: hprop */
    emp                                // empty heap
    fact(&1 > &0)                      // embedded proposition
    pure(&1 > &0)                      // embedded proposition
    data_at(&"x", Tint, &42)           // singleton maps-to
    undef_data_at(&"y", Tint)          // singleton maps-to
    data_at(...) ** data_at(...)       // separating conjunction
    exists (n:integer). fact(n >= &0)  // existential quantifier

    /* object-logic type: bool */
    &1 > &0              // comparison
    true || false        // disjunction
    true && false        // conjunction
    false ==> true       // implication
    fact(x > y) |-- emp  // separation-logic entailment

    /* object-logic type: integer */
    &"var"        // address of a variable
    &42           // integer literal
    sizeof(Tint)  // size of a C-type

    /* object-logic type: ctype */
    Tint   // C-type: int
    Tchar  // C-type: char
    Tptr   // C-type: pointer

Quotation & Anti-quotation

    /* quotation */
    term exp = `&21 + &21`;
    /* anti-quotation */
    term eqn = `&42 == ${exp:integer}`;

Proof-code Interface (Symbolic Execution)

    /* retrieve symbolic state */
    term pre_state = get_symbolic_state();
    /* transform symbolic state */
    thm th = axiom(`${pre_state:hprop} |-- ${new_state:hprop}`);
    set_symbolic_state(th);

Proof-code Interface (LCF-style Proof Kernel)

    Term-specific Utilities
    /* constructor */
    term conj = `emp ** emp`;
    /* destructor */
    term left = left_of_sep(conj);    // emp
    term right = right_of_sep(conj);  // emp
    /* discriminator */
    if (is_sep(conj)) { ... }
    /* equality checker */
    if (equals_term(left, right)) { ... }

    Theorem-specific Utilities
    thm th = undisch(axiom(`i < &2 ==> i < &3`));
    /* get the conclusion */
    term concl = conclusion(th);  // i < &3
    /* get the n-th hypothesis */
    term hyp = nth_hypth(th, 0);  // i < &2

    Definitional Mechanisms
    /* inductive type */
    indtype int_list =
      define_type("int_list =
        nil | cons integer int_list");
    /* recursive function */
    thm nth =
      define(`nth(cons(h,t),0) == h &&
              nth(cons(h,t),SUC(n)) == nth(t,n)`);

    HOL Rules (Selected)
    axiom(`&0 == &1`)                        // |- &0 == &1
    assume(`&0 == &1`)                       // &0 == &1 |- &0 == &1
    disch(assume(`x > &0`), `x > &0`)        // |- x > &0 ==> x > &0
    undisch(axiom(`p ==> q`))                // p |- q
    mp(axiom(`p ==> q`), axiom(`p`))         // |- q
    conjunct(axiom(`p`), axiom(`q`))         // |- p && q
    conjunct1(axiom(`p && q`))               // |- p
    disj_cases(axiom(`p || q`),
               undisch(axiom(`p ==> r`)),
               undisch(axiom(`q ==> r`)))    // |- r
    refl(`x`)                                // |- x == x
    trans(axiom(`x == y`), axiom(`y == z`))  // |- x == z
    symm(axiom(`x == y`))                    // |- y == x
    arith_rule(`i < &2 ==> i < &3`)          // |- i < &2 ==> i < &3

    Separation-logic Rules (Selected)
    hentail_refl(`data_at(&"x", Tint, &0)`)
      // |- data_at(&"x", Tint, &0) |-- data_at(&"x", Tint, &0)
    hentail_trans(axiom(`hp1 |-- hp2`), axiom(`hp2 |-- hp3`))
      // |- hp1 |-- hp3
    hsep_assoc1(axiom(`hp |-- (hp1 ** hp2) ** hp3`))
      // |- hp |-- hp1 ** (hp2 ** hp3)
    hsep_comm(axiom(`hp |-- hp1 ** hp2`))
      // |- hp |-- hp2 ** hp1
    hsep_monotone(axiom(`hp1 |-- hp3`), axiom(`hp2 |-- hp4`))
      // |- hp1 ** hp2 |-- hp3 ** hp4
    hfact_intro(axiom(`p`), axiom(`hp1 |-- hp2`))
      // |- hp1 |-- fact(p) ** hp2
    hfact_elim(axiom(`p ==> (hp1 |-- hp2)`))
      // |- fact(p) ** hp1 |-- hp2
    hsep_hfact1(axiom(`hp |-- (fact(p) ** hp1) ** hp2`))
      // |- hp |-- fact(p) ** (hp1 ** hp2)

Fig. 3. A summary of the verification-specific interface that C★ provides to users.

In C★, separation-logic predicates in specifications and intermediate assertions must adhere to a specific form to enable automated symbolic execution. This specific form is known as symbolic heaps [4], and has the following structure:

$$\exists x_1, \ldots, x_k.\ (P_1 \wedge \cdots \wedge P_m) \wedge (Q_1 * \cdots * Q_n) \qquad \text{(SymHeap)}$$

where the $x_i$ are existentially quantified logical variables, $\wedge$ denotes non-separating conjunction, i.e., standard logical conjunction, and $*$ represents separating conjunction. The $P_i$ are pure facts, i.e., expressions of the form pure(p) that state properties about the global heap. The $Q_j$, known as spatial facts, consist of either primitive maps-to predicates (i.e., data_at or undef_data_at), which are visible to the symbolic-execution engine, or user-defined predicates (e.g., array_at), whose internal structure can be arbitrary and is opaque to the symbolic-execution engine. These spatial facts represent separately-owned local fragments of the heap. We often use the derived form fact(p) = pure(p) && emp to describe pure properties. The derived form satisfies pure(p) && H = fact(p) * H. This formulation allows symbolic heaps to be uniformly represented as separating conjunctions of pure and spatial facts, avoiding the need for non-separating conjunction.
Before symbolically executing any primitive program statement, the symbolic-execution engine verifies that the current symbolic state includes the necessary primitive maps-to predicates for all accessed memory locations. Once this requirement is satisfied, the engine updates the symbolic state as needed, preserving the symbolic-heap form, and possibly generates side conditions to guarantee safe execution, i.e., no runtime error or undefined behavior.
In addition to the primitive predicates and predicates provided in the standard library, C★ users can define and use their own predicates. For instance, the hiter function in the standard proof-support library, defined in the object logic as hiter hps = fold_right (**) hps emp using the higher-order function fold_right, takes a list of separation-logic predicates hps and returns their iterated separating conjunction. We will explain how to implement derived predicates later in this section.
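To make the fold in the definition of hiter concrete, the sketch below computes the printed form of fold_right (**) hps emp for a list of predicate names. It is an illustration only: `toy_hiter` is a hypothetical name, predicates are plain strings rather than object-logic terms, and ** is printed right-associated without parentheses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the printed form of fold_right (**) hps emp into out.
   Returns 0 on success, -1 if an internal or the output buffer is too small. */
static int toy_hiter(const char **hps, size_t n, char *out, size_t cap)
{
    char acc[256] = "emp";   /* fold_right starts from emp ... */
    char tmp[256];

    for (size_t i = n; i-- > 0; ) {   /* ... and wraps from the last element */
        int len = snprintf(tmp, sizeof tmp, "%s ** %s", hps[i], acc);
        if (len < 0 || (size_t)len >= sizeof tmp) return -1;
        strcpy(acc, tmp);
    }
    if (strlen(acc) >= cap) return -1;
    strcpy(out, acc);
    return 0;
}
```

For an empty list the result is just emp, which matches the base case of the fold; each additional predicate is placed to the left of the conjunction built so far.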

Proof-code interface with symbolic execution. In C★, a local proof-code block for performing operational verification retrieves the current symbolic state from the symbolic-execution engine by calling a built-in function get_symbolic_state(). For example, the initial symbolic state can be obtained with term pre_state = get_symbolic_state(). At the end of the proof-code block, the symbolic state can be updated using a call to set_symbolic_state(th), where th is a theorem proving the separation-logic entailment from the current symbolic state (pre_state) to a new state (new_state). This updated state new_state is then set as the current symbolic state. Recall Listing 2 in §2 for an example of using the interface to do local transformations on the symbolic state.
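The contract between the two built-ins can be illustrated with a toy model in which the state update is accepted only when the supplied theorem starts from the current state. This check is an assumption made for the sketch, since the text does not spell out how the engine validates the theorem; the `entail_thm` type and the `toy_` functions are likewise hypothetical, and states are plain strings.

```c
#include <assert.h>
#include <string.h>

/* Toy entailment theorem: |- ante |-- cons */
typedef struct { const char *ante, *cons; } entail_thm;

static const char *current_state = "array_at(to, len)";

static const char *toy_get_symbolic_state(void) { return current_state; }

/* Accept the update only if the theorem's antecedent is the current
   state (an assumed check, not documented engine behavior). */
static int toy_set_symbolic_state(entail_thm th)
{
    if (strcmp(th.ante, current_state) != 0) return -1;
    current_state = th.cons;   /* the consequent becomes the new state */
    return 0;
}
```

In this model a theorem built against an outdated state is rejected, which reflects why a proof-code block fetches the state first and only then constructs the entailment it passes back.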
Proof-code interface with LCF-style proof kernel. In an LCF-style proof environment like that in C★, two fundamental types are provided for logical reasoning: term, representing terms in the object logic, and thm, denoting proven theorems. These types are treated as abstract types in C★, ensuring that users can only manipulate them through the library functions provided by the LCF proof kernel, forbidding direct access to internal data structures.
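The abstract-type discipline can be sketched in plain C with a tiny equational kernel. Everything here is a toy under stated assumptions: the `toy_` rule names are hypothetical, terms are strings, and a theorem is a pair standing for |- lhs == rhs. In a real kernel the struct definition would live in the kernel's own translation unit so that client code could obtain theorems only through the rule functions; a single-file sketch can show that convention but cannot enforce it. Allocated theorems are never freed, which is acceptable only for a sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Would be private to the kernel; clients would see only the typedef. */
struct thm_s { const char *lhs, *rhs; };   /* toy theorem: |- lhs == rhs */
typedef const struct thm_s *thm;

/* Kernel-internal constructor: not meant to be called by proof code. */
static thm mk(const char *l, const char *r)
{
    struct thm_s *t = malloc(sizeof *t);
    if (t) { t->lhs = l; t->rhs = r; }
    return t;
}

/* Rules: the only intended ways to obtain a theorem. */
static thm toy_axiom(const char *l, const char *r) { return mk(l, r); }
static thm toy_refl(const char *x) { return mk(x, x); }            /* |- x == x */
static thm toy_symm(thm a) { return a ? mk(a->rhs, a->lhs) : NULL; }

static thm toy_trans(thm a, thm b)   /* |- x == y, |- y == z  gives  |- x == z */
{
    if (!a || !b || strcmp(a->rhs, b->lhs) != 0) return NULL;
    return mk(a->lhs, b->rhs);
}

/* Read-only accessors. */
static const char *toy_lhs(thm t) { return t->lhs; }
static const char *toy_rhs(thm t) { return t->rhs; }
```

Because `toy_trans` refuses to combine theorems whose middle terms differ, an ill-formed derivation yields NULL instead of a theorem, which is the property the abstract thm type is meant to protect.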
As summarized in Fig. 3, to work with term values, the proof kernel offers a set of functions acting as constructors, destructors, discriminators, and equality checkers, among other utilities. For thm values, the kernel provides the primitive rules needed to prove theorems. These rules encompass both separation-logic entailment rules and higher-order logic rules for general reasoning. Additionally, functions for checking if a proof goal is achieved and for accessing the hypotheses and conclusion of a theorem are available.

    hsep-comm:          $H_1 * H_2 \dashv\vdash H_2 * H_1$
    hsep-assoc:         $(H_1 * H_2) * H_3 \dashv\vdash H_1 * (H_2 * H_3)$
    hsep-cancel-right:  $\dfrac{H_1 \vdash H_1'}{H_1 * H_2 \vdash H_1' * H_2}$
    hexists-monotone:   $\dfrac{\forall x.\ (H \vdash H')}{(\exists x.\ H) \vdash (\exists x.\ H')}$

Fig. 4. Selected separation-logic rules for structural manipulations.

The programmability of C★’s LCF-style proof kernel allows users to extend its functionality by defining customized derived rules or proof-search routines as C functions on top of the primitive proof rules. Furthermore, besides the built-in types such as ctype and hprop, as well as standard functions like sizeof, the kernel’s definitional mechanism enables users to define new (inductive) types, e.g., int_list in Fig. 3, as well as (recursive) functions, e.g., nth in Fig. 3. This definitional mechanism is also used to define new separation-logic predicates, such as hiter mentioned earlier in this section. This flexibility makes C★ expressive for a wide range of verification needs.
Extensible and programmable proof support. With the programmability of LCF-style proof support, proof experts can develop custom proof libraries to simplify common proof patterns, offering high-level derived proof rules and collections of frequently used mathematical properties.
For example, a typical task in operational-style verification is justifying local transformations performed on the symbolic state. By local transformations, we mean picking out specific conjuncts from a symbolic heap, applying a proved-correct separation-logic entailment to these conjuncts, and leaving the rest of the symbolic heap unchanged. Separation logic inherently supports such local transformations; however, using only primitive rules of separation logic (some of which are listed in Fig. 4) requires manually lifting affected conjuncts through layers of separating conjunctions and specifying frames for each transformation. This can be tedious and lead to proof code cluttered with structural manipulations, which detract from the intuitive reasoning process.
To alleviate the need for manually performing such structural manipulations, we implemented a derived rule called local_apply in our standard proof-support library cstarlib.h. Considering the simple case where only one conjunct is affected by the transformation, the automation process of local_apply can be described in four steps as follows.

(i) Repeatedly destruct existential binders in the symbolic heap of the form (SymHeap).
(ii) Find the affected conjunct and lift it to the far-left side of the symbolic heap by using the hsep-comm and hsep-assoc rules repeatedly.
(iii) Apply the hsep-cancel-right rule with other conjuncts to the right as the frame.
(iv) Repeatedly add back existential binders using the hexists-monotone rule.
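Steps (ii) and (iii) can be pictured on a flattened, quantifier-free heap. The sketch below is a toy model, not the library's local_apply: the names `lift_to_front` and `toy_local_apply` are hypothetical, conjuncts are strings, and no theorem is produced. It shows only the data movement: the affected conjunct is brought to the front, replaced according to an entailment from ** frame |-- to ** frame, and the frame keeps its contents and order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy heap: conjuncts[0] ** conjuncts[1] ** ... ** conjuncts[n-1]. */

/* Step (ii): move the conjunct equal to target to index 0, keeping the
   relative order of the others. Returns 0 on success, -1 if absent. */
static int lift_to_front(const char **conjuncts, size_t n, const char *target)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(conjuncts[i], target) == 0) {
            const char *hit = conjuncts[i];
            /* shift conjuncts[0..i-1] one slot to the right */
            memmove(&conjuncts[1], &conjuncts[0], i * sizeof conjuncts[0]);
            conjuncts[0] = hit;
            return 0;
        }
    }
    return -1;
}

/* Steps (ii) and (iii): given an entailment `from |-- to`, rewrite the
   heap from `from ** frame` to `to ** frame`; the frame is untouched. */
static int toy_local_apply(const char **conjuncts, size_t n,
                           const char *from, const char *to)
{
    if (lift_to_front(conjuncts, n, from) != 0) return -1;
    conjuncts[0] = to;
    return 0;
}
```

In the real derived rule each of these moves is justified by a theorem built from hsep-comm, hsep-assoc, and hsep-cancel-right, so the result is a proven entailment rather than a mutated array.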
As a concrete code example, we present the proof function sep_lift_one for performing the second step in Listing 3. It assumes the input symbolic heap septerm is a right-associated separating conjunction. It first calls the derived rule hsep_move to obtain a generalized equality theorem lift_to_left (line 4), which moves the target conjunct one layer outward to the left, in a single step, when it is in the left position of the inner symbolic heap. It then tries to find the target conjunct recursively:

(i) If the target conjunct is never found, it returns NULL.
(ii) If the target conjunct is at the far-right position, it rewrites it using the hsep-comm rule.
(iii) Otherwise, it uses the equality theorem lift_to_left to move the target conjunct out to the left by one layer. The lifting steps work bottom-up as the recursive calls unwind.

     1  thm sep_lift_one(term target, term septerm)
     2  /* assume target is primitive and septerm is a right-associated symbolic heap */
     3  {
     4    thm lift_to_left = hsep_move(target);
     5    /* ∀ (hp1:hprop) (hp2:hprop).
     6       hp1 ** ${target:hprop} ** hp2 -|- ${target:hprop} ** hp1 ** hp2 */
     7
     8    if (is_sep(septerm)) {
     9      term l = left_of_sep(septerm), r = right_of_sep(septerm);
    10      if (equals_term(target, l)) { return rewrite(lift_to_left, septerm); }
    11      else {
    12        if (is_sep(r)) {
    13          thm step1 = rewrite(sep_lift_one(target, r), septerm);
    14          thm step2 = rewrite(lift_to_left, consequent(conclusion(step1)));
    15          return trans(step1, step2);
    16        } else if (equals_term(target, r))
    17          return rewrite(symm(hsep_comm(target)), septerm);
    18      }
    19    } else if (equals_term(target, septerm)) { return refl(septerm); }
    20    return NULL; /* fail if target isn't found */
    21  }

Listing 3. C★ code of sep_lift_one.

4 IMPLEMENTATION AND EVALUATION

In this section, we describe our prototype implementation of C★ and our evaluation of it. In §4.1, we discuss some aspects in which our prototype C★ implementation diverges from the core design in §3. In §4.2, we present an empirical evaluation of our prototype implementation on a suite of C benchmark programs and report some interesting findings on using C★ for program verification.

4.1 Implementation Notes
As in the workflow described in §3.1, our implementation consists of three main components: the C★ compiler, the LCF-style proof kernel, and the symbolic-execution engine.

Implementing the C★ compiler. The C★ compiler, implemented in OCaml, processes C code with C★-specific attributes, managing syntax extensions (quotation and anti-quotation) and translating them to invocations of term-parsing functions and substitution primitives. The compiler also assembles code in the proof blocks to form a proof program that executes in the proof-supporting runtime. Note that the proof program is a C program. Specifically, global proof blocks are moved to the beginning of the generated C program, and each function in the implementation code gives rise to a proof function, with local proof blocks appended in the order of their appearance. The compiler also aligns the concrete annotation syntax (and separation-logic assertion syntax) used in C★ with the external symbolic-execution engine.

Reusing HOL Light proof kernel. In the implementation of C★, we reuse the LCF-style proof kernel of the HOL Light prover [17], a minimal implementation of higher-order logic in OCaml. This avoids the need to build a new LCF-style kernel from scratch in C, while leveraging the extensive libraries available in HOL Light for mathematical reasoning. To support separation-logic entailment proofs needed in program verification, we axiomatize a separation logic theory in HOL Light with a concrete memory model in mind, interpreting the heap as a finite mapping from addresses to bytes and treating integers and pointers the same in higher-order logic.

Table 2. Evaluation of C★. “Impl” is short for “Implementation Code.” “PB” is short for “Proof Block.” “VC” is short for “Verification Condition.” “Spec” is short for “Specification.” “Assert” is short for “Assertion.”

    Class  Name                #Line of Impl  #PB  #VC  #Line of Proof  #Line of Spec/Assert
    #1     address_of_local    32             4    0    14              10
    #1     globals             18             3    0    20              4
    #1     swap                15             0    0    0               7
    #2     multi_branch        24             10   0    102             25
    #2     mutually_recursive  17             2    6    69              13
    #2     no_return           15             1    0    4               9
    #3     malloc_free         9              1    0    9               8
    #4     clear               9              7    0    120             11
    #4     forall              10             7    0    153             13
    #4     reverse             18             7    1    375             58
    #5     attach_page         57             6    3    1616            451

Interfacing with the symbolic-execution engine. In the ideal workflow illustrated in Fig. 1, proof programs communicate with the symbolic-execution engine via functions such as get_symbolic_state(), set_symbolic_state(th), and feed_program_segment(prog). However, we currently lack access to the internal states of the external symbolic-execution engine, which makes it challenging to implement this interactive workflow directly. Consequently, we rely on the annotations that the symbolic-execution engine supports for communication. To simulate the interleaved execution pattern of the ideal workflow, we run the symbolic-execution engine twice for each proof block: once to obtain the symbolic state and once to set it after running the proof code. This is done manually for now.

4.2 Empirical Evaluation
To evaluate the effectiveness of our prototype implementation of C★, we selected a benchmark of small C programs and verified their functional correctness entirely within C★. Most examples are adapted from the VeriFast repository [20], while the buddy allocator example is drawn from CN [35]. Some additional examples were crafted manually to test C★’s handling of complex control-flow structures. This benchmark allows us to (i) test the functionality of the C★ toolchain, including its frontend parser, proof-supporting runtime, and translation phase, (ii) assess the expressiveness of C★’s reasoning capabilities, and (iii) evaluate the usability of C★’s hybrid operational and declarative proof approach.
The complete list of benchmark programs is shown in Tab. 2, and the source code of all benchmark programs is included in the Supplementary Material. We chose these programs to cover a broad spectrum of reasoning patterns, including shared memory access (#1), control-flow constructs (#2), dynamic memory management and interaction with external functions (#3), complex model-level reasoning (#4), and a real-world case study (#5). Tab. 2 presents statistics on the code size of each benchmark program. The column “#Line of Impl” lists the number of lines of implementation code. The total number of lines in each benchmark program is significantly larger because it also includes proof code, reported in the column “#Line of Proof,” as well as specifications and assertions, reported in the column “#Line of Spec/Assert.” We also include (i) the number of proof blocks for the operational style of verification and (ii) the number of verification conditions for the declarative style of verification.
Coverage of C language features. The evaluation demonstrates C★’s support for core C features, especially those that create flexible aliasing patterns and complex control-flow structures:
• Control-flow constructs, including multiple branching (if...else if...), (mutually) recursive functions, break and continue, and (early) return. Benchmark programs address_of_local, multi_branch, mutually_recursive, and no_return use some of these constructs.
• Shared memory access, covering global variables, arrays, addressable local variables, and (multi-level) pointer indirections. Benchmark programs address_of_local, globals, and swap make use of shared memory access.
• Dynamic memory management and interaction with (formally specified) external functions, tested via malloc and free. The benchmark program malloc_free demonstrates these features.
Currently, our prototype implementation does not support switch statements, goto, or looping constructs other than while (i.e., for and do while). We leave support for those features to future work.
Complex logical reasoning. The benchmarks also illustrate C★’s capability for complex logic-level reasoning. The expressiveness of the higher-order logic used in C★ enables users to define functional models (as inductive data types, e.g., lists or trees) and (well-founded) recursive functions that operate on these models (e.g., reversing a list), using high-level definitional mechanisms. Users can also define recursive representation predicates that link the entry points of concrete memory structures to their functional models, a technique typical of separation-logic-based program reasoning [6]. In several instances within our benchmark, we leveraged the pre-existing proof libraries of HOL Light, thereby reducing the effort required for model-level reasoning. Nonetheless, for the benchmark program reverse, we proved four logic-level lemmas and two ownership-related lemmas, which are reusable for reasoning about linked lists.
The proof code in these benchmark programs makes extensive use of C★’s standard proof-support library. In addition to local_apply described in §3.2, the library includes other reusable derived rules. For example, sep_normalize(t) transforms a heap proposition into a canonical form, sep_lift(l,t) lifts a sub-part of the heap proposition t to the far-left side, generalizing the sep_lift_one function in Listing 3, and sep_reorder(t1,t2) checks whether two heap propositions are reorderings of each other (modulo 𝛼-renaming of bound variables).
A real-world case study: buddy allocator. Inspired by CN [35], we applied C★ to a more challenging real-world case study: the attach_page function of the buddy allocator used in pKVM [32]. A buddy allocator manages memory in blocks of size $2^{o} \times 4$ KB, where $o \in \{0, 1, \ldots, \mathit{max\_order} - 1\}$ denotes the order of the block. Each block is aligned according to its size, and the allocator maintains this alignment as an invariant over all blocks.
Two blocks are called buddies if they (i) are adjacent, (ii) have the same order, and (iii) can be merged into a larger block of the next order while preserving alignment. Allocatable memory is divided into pools, each representing a contiguous range of pages. Every pool maintains a doubly-linked list of free blocks for each order, and the allocator searches these lists for a free block of the required size during memory allocation. Readers may refer to [35] for further details on the data structures and helper functions used in this case study.
We verified the attach_page function of the buddy allocator implementation, shown in Fig. 5a. This function receives a released block, identifies any adjacent free buddy block in the pool, and merges the two to form a larger free block. The merging continues iteratively until no free buddy is found or the maximum order is reached. The resulting block is then added back to the pool. The loop invariant of the while loop in the implementation code is shown in Fig. 5b. An interesting finding during verification was that the specifications in CN [35] were not sufficient to guarantee that all free blocks are present in a doubly-linked list. Despite this, we adhered to these weaker specifications for simplicity in our verification efforts.

(a) The implementation code.

    struct hyp_page *__hyp_vmemmap;

    static void attach_page(
        struct hyp_pool *pool, struct hyp_page *pg
    ) {
        struct hyp_page *buddy = NULL;
        u8 order = pg->order;
        pg->order = (u8)HYP_NO_ORDER;
        u8 max_order_ = pool->max_order;
        memset_page_zero(pg,order);
        buddy = __find_buddy_avail(pool,pg,order);
        while ((order + 1) < max_order_ &&
               buddy != NULL) {
            page_remove_from_list_pool(pool,buddy);
            buddy->order = (u8)HYP_NO_ORDER;
            pg = min(pg, buddy);
            order = order + 1;
            buddy = __find_buddy_avail(pool,pg,order);
        }
        pg->order = order;
        page_add_to_list_pool(pool,pg,order);
    }

(b) The invariant of the loop.

    [[invariant(`
      exists buddy_v bi inv_l inv_dl inv_hl i order_v
             pg_v.
        data_at(&"max_order", Tuchar, &max_order) **
        data_at(&"order", Tuchar, &order_v) **
        data_at(&"pg", Tptr, pg_v) **
        data_at(&"buddy", Tptr, buddy_v) **
        data_at(&"pool", Tptr, pool_pre) **
        data_at(&"__hyp_vmemmap", Tptr, vmemmap) **
        (dlist_head_repr pool_pre 0 max_order inv_hl) **
        (free_area_repr
          (is_free_1st inv_l) start end inv_l) **
        (free_area_head_repr
          (is_free_1st inv_l) start end inv_dl) **
        (store_pageinfo_array vmemmap start end inv_l) **
        (store_zero_array
          (i2vaddr i) 0 (PAGE_SIZE * (2 EXP order_v))
          (PAGE_SIZE * (2 EXP order_v))) **
        ${other_facts_and_representation_predicates:hprop
        }
    `)]]

Fig. 5. The attach_page function.

Experience report. The experience of two undergraduate students who used C★ for the benchmark evaluation revealed several usability issues in the current prototype implementation:
• IDE support. The lack of an IDE that shows symbolic states alongside code was a major pain point. Currently, users must run the symbolic-execution engine manually and inspect symbolic states in its lengthy output, which interrupts the workflow. Developing an IDE for C★, e.g., as an editor plugin, is left for future work.
• Proof automation. C★ lacks automation for discharging trivial facts, which makes simple proofs time-consuming. This is partly due to the absence of solver-aided proof automation, on which tools like CN [35] and VeriFast [20] rely heavily. Looking forward, we plan to provide C★ with an interface to encode and delegate proof obligations to external automated theorem provers or frameworks such as Z3 [10] and Why3 [11].
• Proof-support library. Writing separation-logic entailment proofs in C★ currently requires considerable boilerplate code, leading to long and repetitive proof code. This issue arises because C★ lacks a rich set of derived rules for separation-logic reasoning, unlike mature frameworks such as Iris [23], VST [5], or CFML [7]. We expect that expanding C★’s proof-support libraries with more derived rules will improve the conciseness of proof code and developer productivity.

5 RELATED WORK
Live Verification framework. The Live Verification framework [15] is a recently proposed framework with a similar goal of enabling users to verify their low-level code as they write it. The framework is embedded in the Coq proof assistant and provides a real-time display of the symbolic state at the cursor position in the goal panel. After a function has been given a prototype with formal specifications, users develop the function body incrementally by either writing the next line of implementation code or writing Ltac proof scripts to shift the view on the symbolic state or discharge generated side conditions. When this derivation process is finished, a correctness proof is produced alongside the assembled implementation code. With some clever tricks, these Ltac source files can also be viewed as ordinary C code (with Ltac proof scripts in comments) and compiled directly with C compilers.
A key difference between C★ and the Live Verification framework is C★’s focus on accessibility for conventional programmers. In the Live Verification framework, proof development and customization of proof automation require proficiency in Coq’s Ltac tactic language, which diverges from the imperative programming experience familiar to programmers. In contrast, C★ allows proof code to be written in the same language as the implementation code, making it more approachable for conventional programmers.

VeriFast. VeriFast [20] is a state-of-the-art automated verification tool for C and Java based on symbolic execution and separation logic. It has a custom specification language that allows users to define inductive data types, structurally recursive functions, and recursive representation predicates. VeriFast emphasizes predictable automation: during symbolic execution, users unfold and fold predicates manually using the proof commands open and close. A restricted form of existential quantification is supported through pattern matching, and reasoning about first-order values is delegated to the SMT solver. When inductive reasoning is required, VeriFast supports user-written ghost lemma functions, which are verified like implementation code but require a proof of termination and must be observationally pure. VeriFast can handle a substantial subset of C features.
The primary distinction between C★ and VeriFast lies in the extensibility of their proof support. In VeriFast, proof support is limited to a fixed set of built-in ghost statements and basic induction capabilities using lemma functions. In contrast, C★ enables users to develop custom proof rules and automation functions, offering greater flexibility and expressiveness for complex verification tasks. This extensibility also allows experts to create high-level reasoning abstractions that are accessible to programmers.

CN. CN [35] is an ownership and refinement type system for C, targeting the verification of real-world systems software. CN aims for predictable proof automation, employing the Liquid Types [38] approach for decidable automation with an SMT backend, together with heuristics for instantiating quantifiers. It supports sound ownership reasoning at the type level using ideas similar to capabilities [1], splitting the type of a heap fragment into a linear capability type and an unrestricted pointer type to accommodate the flexible aliasing commonly found in real-world code. Additionally, CN is grounded in a realistic semantics, Cerberus [29], which accurately models a large fragment of ISO C.

6 CONCLUSION
In this paper, we presented C★, a new system and language design for verified programming in C. C★ provides (i) language-level integration for both declarative and operational styles of verification, (ii) comprehensive reasoning capabilities in an expressive logic using C’s programming paradigm, and (iii) support for real-time verification. It builds upon the established techniques of separation-logic-based symbolic execution for modular program reasoning, as well as the LCF-style approach to programming proofs. We implemented a prototype of C★ and evaluated its effectiveness for developing verified C programs on a suite of benchmark programs. In the future, we plan to develop an IDE for C★ to enable interactive program verification, interface C★ with solver-aided proof automation to reduce proof effort, and develop more proof-support libraries for C★.

REFERENCES
[1] Amal Ahmed, Matthew Fluet, and Greg Morrisett. 2007. L3: A Linear Language with Locations. Fundam. Informaticae 77, 4 (2007), 397–449. http://content.iospress.com/articles/fundamenta-informaticae/fi77-4-06
[2] Sidney Amani, Alex Hixon, Zilin Chen, Christine Rizkallah, Peter Chubb, Liam O’Connor, Joel Beeren, Yutaka Nagashima, Japheth Lim, Thomas Sewell, Joseph Tuong, Gabriele Keller, Toby C. Murray, Gerwin Klein, and Gernot Heiser. 2016. CoGENT: Verifying High-Assurance File System Implementations. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016, Atlanta, GA, USA, April 2-6, 2016, Tom Conte and Yuanyuan Zhou (Eds.). ACM, 175–188. https://doi.org/10.1145/2872362.2872404
[3] Andrew W. Appel. 2011. Verified Software Toolchain - (Invited Talk). In Programming Languages and Systems - 20th European Symposium on Programming, ESOP 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings (Lecture Notes in Computer Science, Vol. 6602), Gilles Barthe (Ed.). Springer, 1–17. https://doi.org/10.1007/978-3-642-19718-5_1
[4] Josh Berdine, Cristiano Calcagno, and Peter W. O’Hearn. 2005. Symbolic Execution with Separation Logic. In Programming Languages and Systems, Third Asian Symposium, APLAS 2005, Tsukuba, Japan, November 2-5, 2005, Proceedings (Lecture Notes in Computer Science, Vol. 3780), Kwangkeun Yi (Ed.). Springer, 52–68. https://doi.org/10.1007/11575467_5
[5] Qinxiang Cao, Lennart Beringer, Samuel Gruetter, Josiah Dodds, and Andrew W. Appel. 2018. VST-Floyd: A Separation Logic Tool to Verify Correctness of C Programs. J. Autom. Reason. 61, 1-4 (2018), 367–422. https://doi.org/10.1007/S10817-018-9457-5
[6] Arthur Charguéraud. 2016. Higher-order representation predicates in separation logic. In Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, Saint Petersburg, FL, USA, January 20-22, 2016, Jeremy Avigad and Adam Chlipala (Eds.). ACM, 3–14. https://doi.org/10.1145/2854065.2854068
[7] Arthur Charguéraud. 2020. Separation logic for sequential programs (functional pearl). Proc. ACM Program. Lang. 4, ICFP (2020), 116:1–116:34. https://doi.org/10.1145/3408998
[8] Haogang Chen, Daniel Ziegler, Tej Chajed, Adam Chlipala, M. Frans Kaashoek, and Nickolai Zeldovich. 2015. Using Crash Hoare logic for certifying the FSCQ file system. In Proceedings of the 25th Symposium on Operating Systems Principles, SOSP 2015, Monterey, CA, USA, October 4-7, 2015, Ethan L. Miller and Steven Hand (Eds.). ACM, 18–37. https://doi.org/10.1145/2815400.2815402
[9] Ernie Cohen, Markus Dahlweid, Mark A. Hillebrand, Dirk Leinenbach, Michal Moskal, Thomas Santen, Wolfram Schulte, and Stephan Tobies. 2009. VCC: A Practical System for Verifying Concurrent C. In Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany, August 17-20, 2009. Proceedings (Lecture Notes in Computer Science, Vol. 5674), Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel (Eds.). Springer, 23–42. https://doi.org/10.1007/978-3-642-03359-9_2
[10] Leonardo Mendonça de Moura and Nikolaj S. Bjørner. 2008. Z3: An Efficient SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April 6, 2008. Proceedings (Lecture Notes in Computer Science, Vol. 4963), C. R. Ramakrishnan and Jakob Rehof (Eds.). Springer, 337–340. https://doi.org/10.1007/978-3-540-78800-3_24
[11] Jean-Christophe Filliâtre and Andrei Paskevich. 2013. Why3 - Where Programs Meet Provers. In Programming Languages and Systems - 22nd European Symposium on Programming, ESOP 2013, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome, Italy, March 16-24, 2013. Proceedings (Lecture Notes in Computer Science, Vol. 7792), Matthias Felleisen and Philippa Gardner (Eds.). Springer, 125–128. https://doi.org/10.1007/978-3-642-37036-6_8
[12] Mike Gordon. 2000. From LCF to HOL: a short history. In Proof, Language, and Interaction, Essays in Honour of Robin Milner, Gordon D. Plotkin, Colin Stirling, and Mads Tofte (Eds.). The MIT Press, 169–186.
[13] David Greenaway, June Andronick, and Gerwin Klein. 2012. Bridging the Gap: Automatic Verified Abstraction of C. In Interactive Theorem Proving - Third International Conference, ITP 2012, Princeton, NJ, USA, August 13-15, 2012. Proceedings (Lecture Notes in Computer Science, Vol. 7406), Lennart Beringer and Amy P. Felty (Eds.). Springer, 99–115. https://doi.org/10.1007/978-3-642-32347-8_8
[14] David Greenaway, Japheth Lim, June Andronick, and Gerwin Klein. 2014. Don’t sweat the small stuff: formal verification of C code without the pain. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, Michael F. P. O’Boyle and Keshav Pingali (Eds.). ACM, 429–439. https://doi.org/10.1145/2594291.2594296
[15] Samuel Gruetter, Viktor Fukala, and Adam Chlipala. 2024. Live Verification in an Interactive Proof Assistant. Proc. ACM Program. Lang. 8, PLDI (2024), 1535–1558. https://doi.org/10.1145/3656439
[16] Ronghui Gu, Zhong Shao, Hao Chen, Xiongnan (Newman) Wu, Jieung Kim, Vilhelm Sjöberg, and David Costanzo. 2016. CertiKOS: An Extensible Architecture for Building Certified Concurrent OS Kernels. In 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2-4, 2016, Kimberly Keeton and Timothy Roscoe (Eds.). USENIX Association, 653–669. https://www.usenix.org/conference/osdi16/technical-sessions/presentation/gu
[17] John Harrison. 2009. HOL Light: An Overview. In Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany, August 17-20, 2009. Proceedings (Lecture Notes in Computer Science, Vol. 5674), Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel (Eds.). Springer, 60–66. https://doi.org/10.1007/978-3-642-03359-9_4
[18] John Harrison, Josef Urban, and Freek Wiedijk. 2014. History of Interactive Theorem Proving. In Computational Logic, Jörg H. Siekmann (Ed.). Handbook of the History of Logic, Vol. 9. Elsevier, 135–214. https://doi.org/10.1016/B978-0-444-51624-4.50004-6
[19] C. A. R. Hoare, Jayadev Misra, Gary T. Leavens, and Natarajan Shankar. 2009. The verified software initiative: A manifesto. ACM Comput. Surv. 41, 4 (2009), 22:1–22:8. https://doi.org/10.1145/1592434.1592439
[20] Bart Jacobs, Jan Smans, Pieter Philippaerts, Frédéric Vogels, Willem Penninckx, and Frank Piessens. 2011. VeriFast: A Powerful, Sound, Predictable, Fast Verifier for C and Java. In NASA Formal Methods - Third International Symposium, NFM 2011, Pasadena, CA, USA, April 18-20, 2011. Proceedings (Lecture Notes in Computer Science, Vol. 6617), Mihaela Gheorghiu Bobaru, Klaus Havelund, Gerard J. Holzmann, and Rajeev Joshi (Eds.). Springer, 41–55. https://doi.org/10.1007/978-3-642-20398-5_4
[21] Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski. 2015. Frama-C: A software analysis perspective. Formal Aspects Comput. 27, 3 (2015), 573–609. https://doi.org/10.1007/S00165-014-0326-7
[22] Gerwin Klein, June Andronick, Kevin Elphinstone, Toby C. Murray, Thomas Sewell, Rafal Kolanski, and Gernot Heiser. 2014. Comprehensive formal verification of an OS microkernel. ACM Trans. Comput. Syst. 32, 1 (2014), 2:1–2:70. https://doi.org/10.1145/2560537
[23] Robbert Krebbers, Amin Timany, and Lars Birkedal. 2017. Interactive proofs in higher-order concurrent separation logic. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017, Giuseppe Castagna and Andrew D. Gordon (Eds.). ACM, 205–217. https://doi.org/10.1145/3009837.3009855
[24] Ramana Kumar, Magnus O. Myreen, Michael Norrish, and Scott Owens. 2014. CakeML: a verified implementation of ML. In The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January 20-21, 2014, Suresh Jagannathan and Peter Sewell (Eds.). ACM, 179–192. https://doi.org/10.1145/2535838.2535841
[25] Dirk Leinenbach and Thomas Santen. 2009. Verifying the Microsoft Hyper-V Hypervisor with VCC. In FM 2009: Formal Methods, Second World Congress, Eindhoven, The Netherlands, November 2-6, 2009. Proceedings (Lecture Notes in Computer Science, Vol. 5850), Ana Cavalcanti and Dennis Dams (Eds.). Springer, 806–809. https://doi.org/10.1007/978-3-642-05089-3_51
[26] Xavier Leroy. 2009. Formal verification of a realistic compiler. Commun. ACM 52, 7 (2009), 107–115. https://doi.org/10.1145/1538788.1538814
[27] Shih-Wei Li, Xupeng Li, Ronghui Gu, Jason Nieh, and John Zhuang Hui. 2021. A Secure and Formally Verified Linux KVM Hypervisor. In 42nd IEEE Symposium on Security and Privacy, SP 2021, San Francisco, CA, USA, 24-27 May 2021. IEEE, 1782–1799. https://doi.org/10.1109/SP40001.2021.00049
[28] William Mansky and Ke Du. 2024. An Iris Instance for Verifying CompCert C Programs. Proc. ACM Program. Lang. 8, POPL (2024), 148–174. https://doi.org/10.1145/3632848
[29] Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N. M. Watson, and Peter Sewell. 2016. Into the depths of C: elaborating the de facto standards. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2016, Santa Barbara, CA, USA, June 13-17, 2016, Chandra Krintz and Emery D. Berger (Eds.). ACM, 1–15. https://doi.org/10.1145/2908080.2908081
[30] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. 2002. Isabelle/HOL - A Proof Assistant for Higher-Order Logic. Lecture Notes in Computer Science, Vol. 2283. Springer. https://doi.org/10.1007/3-540-45949-9
[31] Peter W. O’Hearn. 2019. Separation logic. Commun. ACM 62, 2 (2019), 86–95. https://doi.org/10.1145/3211968
[32] Android Open Source Project. 2024. Android Virtualization Architecture. Available at https://source.android.com/docs/core/virtualization/architecture.
[33] Jonathan Protzenko, Bryan Parno, Aymeric Fromherz, Chris Hawblitzel, Marina Polubelova, Karthikeyan Bhargavan, Benjamin Beurdouche, Joonwon Choi, Antoine Delignat-Lavaud, Cédric Fournet, Natalia Kulatova, Tahina Ramananandro, Aseem Rastogi, Nikhil Swamy, Christoph M. Wintersteiger, and Santiago Zanella Béguelin. 2020. EverCrypt: A Fast, Verified, Cross-Platform Cryptographic Provider. In 2020 IEEE Symposium on Security and Privacy, SP 2020, San Francisco, CA, USA, May 18-21, 2020. IEEE, 983–1002. https://doi.org/10.1109/SP40000.2020.00114
[34] Jonathan Protzenko, Jean Karim Zinzindohoué, Aseem Rastogi, Tahina Ramananandro, Peng Wang, Santiago Zanella Béguelin, Antoine Delignat-Lavaud, Catalin Hritcu, Karthikeyan Bhargavan, Cédric Fournet, and Nikhil Swamy. 2017. Verified low-level programming embedded in F*. Proc. ACM Program. Lang. 1, ICFP (2017), 17:1–17:29. https://doi.org/10.1145/3110261
[35] Christopher Pulte, Dhruv C. Makwana, Thomas Sewell, Kayvan Memarian, Peter Sewell, and Neel Krishnaswami. 2023. CN: Verifying Systems C Code with Separation-Logic Refinement Types. Proc. ACM Program. Lang. 7, POPL (2023), 1–32. https://doi.org/10.1145/3571194
[36] Tahina Ramananandro, Antoine Delignat-Lavaud, Cédric Fournet, Nikhil Swamy, Tej Chajed, Nadim Kobeissi, and Jonathan Protzenko. 2019. EverParse: Verified Secure Zero-Copy Parsers for Authenticated Message Formats. In 28th USENIX Security Symposium, USENIX Security 2019, Santa Clara, CA, USA, August 14-16, 2019, Nadia Heninger and Patrick Traynor (Eds.). USENIX Association, 1465–1482. https://www.usenix.org/conference/usenixsecurity19/presentation/delignat-lavaud
[37] John C. Reynolds. 2002. Separation Logic: A Logic for Shared Mutable Data Structures. In 17th IEEE Symposium on Logic in Computer Science (LICS 2002), 22-25 July 2002, Copenhagen, Denmark, Proceedings. IEEE Computer Society, 55–74. https://doi.org/10.1109/LICS.2002.1029817
[38] Patrick Maxim Rondon, Ming Kawaguchi, and Ranjit Jhala. 2008. Liquid types. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, Tucson, AZ, USA, June 7-13, 2008, Rajiv Gupta and Saman P. Amarasinghe (Eds.). ACM, 159–169. https://doi.org/10.1145/1375581.1375602
[39] Michael Sammler, Rodolphe Lepigre, Robbert Krebbers, Kayvan Memarian, Derek Dreyer, and Deepak Garg. 2021. RefinedC: automating the foundational verification of C code with refined ownership types. In PLDI ’21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, Virtual Event, Canada, June 20-25, 2021, Stephen N. Freund and Eran Yahav (Eds.). ACM, 158–174. https://doi.org/10.1145/3453483.3454036
[40] Runzhou Tao, Jianan Yao, Xupeng Li, Shih-Wei Li, Jason Nieh, and Ronghui Gu. 2021. Formal Verification of a Multiprocessor Hypervisor on Arm Relaxed Memory Hardware. In SOSP ’21: ACM SIGOPS 28th Symposium on Operating Systems Principles, Virtual Event / Koblenz, Germany, October 26-29, 2021, Robbert van Renesse and Nickolai Zeldovich (Eds.). ACM, 866–881. https://doi.org/10.1145/3477132.3483560
[41] Fengwei Xu, Ming Fu, Xinyu Feng, Xiaoran Zhang, Hui Zhang, and Zhaohui Li. 2016. A Practical Verification Framework for Preemptive OS Kernels. In Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II (Lecture Notes in Computer Science, Vol. 9780), Swarat Chaudhuri and Azadeh Farzan (Eds.). Springer, 59–79. https://doi.org/10.1007/978-3-319-41540-6_4
[42] Litao Zhou, Jianxing Qin, Qinshi Wang, Andrew W. Appel, and Qinxiang Cao. 2024. VST-A: A Foundationally Sound Annotation Verifier. Proc. ACM Program. Lang. 8, POPL (2024), 2069–2098. https://doi.org/10.1145/3632911

Received 20 February 2007; revised 12 March 2009; accepted 5 June 2009