Procedures and Coprocedures
HeyVL statements are placed inside the body of (co)procedures or (co)procs for short. A procedure has a name, a list of parameters, a list of return values, and a list of specification attributes. Caesar tries to verify each procedure in the given HeyVL files using the specification. Coprocedures are like procedures, but the specification is interpreted differently (see below).
Procedures can be called inside other procedures. This enables modular reasoning about code: Prove that some code satisfies the specification once, and then use the specification when at the call site - without reasoning about the procedure's body again.
There are also procedures without an associated body. They are not verified by Caesar, but can be called inside other procedures.
Procs by Example: Increment With 50% Probability
The following procedure named maybe_increment
increments the value of the input parameter init_x
by one with probability 50% and leaves it unchanged with 50% probability, returning the result in output parameter x
.
proc maybe_increment(init_x: UInt) -> (x: UInt)
pre init_x + 0.5
post x
{
x = init_x
var prob_choice: Bool = flip(0.5)
if prob_choice {
x = x + 1
} else {}
}
Verifying this proc checks the following equation:
Let us decompose the example into its parts:
- Keyword
proc
: We verify that holds, i.e. the expected value ofx
after executingmaybe_increment
is at leastinit_x + 0.5
.- If we used the
coproc
keyword instead, we would verify (upper instead of lower bounds).
- If we used the
- We have one input parameter
init_x
of typeUInt
.- Input parameters may not be modified in the program.
- We have one output parameter
x
of typeUInt
.- There may be multiple parameters (input and output), which can be separated by commas (e.g.
init_x: UInt, init_y: UInt
).
- There may be multiple parameters (input and output), which can be separated by commas (e.g.
- The
pre
declares the pre-expectationinit_x + 0.5
. It is evaluated in the initial state (when calling the proc). This is why it is called "pre" (= before running the proc).- The
pre
is an expression of typeEUReal
over the input parameters.
- The
- The
post
is the post-expectationx
and evaluated in the final states of the proc (post = after running the proc). We always compare its expected value against the pre.- The
post
is an expression of typeEUReal
over the input and output parameters.
- The
- The body of the proc assigns
init_x
tox
. We then do a probabilistic coin flip and assigntrue
toprob_choice
with probability0.5
(andfalse
with probability0.5
). It determines the expected value () we look at.- See documentation on statements for more information.
- The body is optional.
Different specifications yield different results. Remember that we have .
Kind | Pre | Post | Verifies? | Explanation |
---|---|---|---|---|
proc | init_x + 0.4 | x | Yes | . |
proc | init_x + 0.6 | x | No | . |
coproc | init_x + 0.6 | x | Yes | . |
coproc | init_x + 0.4 | x | No | . |
Writing Specifications
The specifications in HeyVL follow the probabilistic predicate transformer paradigm. Caesar transforms a post backwards through the program to compute its expected value. The result is then checked against the pre, for each possible initial state.
There is an inductive definition on the statements documentation page for the verification pre-expectation transformer that Caesar uses to compute the expected value of the post after running statements . However, for now it is sufficient to think of expected values as they intuitively are.
The Specification Type: EUReal
for Quantitative Reasoning
In HeyVL, the pre
and post
expressions have type EUReal
, i.e. non-negative real numbers or infinity.
We have .
This allows us to reason about various kinds expected values of probabilistic programs. However, all of these expected values are limited to non-negative values. This is seldom a practical limitation.1
HeyVL supports the syntax of the relatively complete assertion language by Batz et al. This means we can express practically all expected values of interest in our syntax. For more information, read the Caesar documentation on relative completeness.
Embedding Boolean Specifications
The semantics of HeyVL is a generalization of classical (Boolean) predicate transformer reasonining as found in verifiers such as Dafny. The Boolean reasoning can be fully embedded into our quantitative setting.
The embed expression ?(b)
takes a Boolean expression b
and maps it to if b
evaluates to true in the current state.
Otherwise, ?(b)
evaluates to .
The following procedure named forty_two
accepts a single parameter, x
of type UInt
and returns a value y
of type UInt
.
The procedure has a body with just a single assignment statement that increments x
by 1 and saves the result in y
.
proc forty_two(x: UInt) -> (y: UInt)
pre ?(x == 41)
post ?(y == 42)
{
y = x + 1
}
The proc
verifies that y == 42
holds after all executions (post
) when we started in a state that satisfied x == 41
(pre
).
Why do the semantics work out so neatly? Let's examine the semantics manually to see how it works. Recall that the verification pre-expectation transformer in effect computes the expected value of after running statement .
So if we assume holds in the beginning, then we know that holds afterwards. If we assume , then we know nothing () about the final state.
In short: we can use embed expressions ?(b)
to write Boolean assumptions in the pre
and Boolean assertions in the post
.
Embedding Boolean Specifications in Coprocedures
The same kind of reasoning for embedding Boolean specifications applies to coprocedures. However, there is one thing to be careful about: We usually need to negate the Boolean specification to obtain the "intuitive" result.
TLDR: In coprocedures, you'll usually want to use !?(b)
instead of ?(b)
.
Accidental Incorrectness Reasoning
Consider the following modified example which always assigns y = 42
.
coproc forty_two_upper(x: UInt) -> (y: UInt)
pre ?(x == 41)
post ?(y == 42)
{
y = 42
}
This will not verify because it actually checks , which is equivalent to , i.e. the only way the coprocedure does not verify is if there is an initial state satisfying such that does not hold. But since all states satisfy , e.g. the initial state is a counter-example to verification.
One can understand this as an instance of Reverse Hoare Logic or (Partial) Incorrectness Reasoning, i.e. asking a question of the form: "Do all initial states that reach satisfy ?".
How To: Obtaining the above formula via --print-theorem --no-slice-error
.
Using the --print-theorem
command-line flag, you can print the theorem Caesar tries to prove about your (co)procedures.
The result will have some optimizations applied, but it might be helpful to understand what exactly is being verified.
We recommend adding the --no-slice-error
flag to obtain a simpler version that is not cluttered with stuff from slicing for error messages.
For example:
caesar verify --print-theorem --no-slice-error example.heyvl
Usually You Want !?(b)
We often write !?(b)
to abbreviate ?(!(b))
, i.e. mapping b
to if it is true and to otherwise.2
coproc forty_two_upper2(x: UInt) -> (y: UInt)
pre !?(x == 41)
post !?(y == 42)
{
y = 42
}
In this example, the pre
works as expected.
It tells the verifier to verify only from initial states that satisfy .
And the post
encodes that we reach — the set of initial states that reach contains only states that satisfy .
Multiple Or No pre
/post
Annotations
Multiple pre
and post
annotations are logically combined.3
⊓
is the minimum operator and ⊔
maximum operator (see expressions documentation).
pre A pre B | post C post D | |
---|---|---|
proc | pre (A ⊓ B) | post (C ⊓ D) |
coproc | pre (A ⊔ B) | post (C ⊔ D) |
In proc
s, we combine pre
s and post
s with the minimum ⊓
.
In coproc
s, the combination is dual and uses the maximum ⊔
.
This generalizes the Boolean setting neatly:
pre ?(A) pre ?(B) | post ?(C) post ?(D) | |
---|---|---|
proc | pre ?(A && B) | post ?(C && D) |
coproc | pre ?(A || B) | post ?(C || D) |
If we use !?(b)
in coproc
s, then we obtain that pre !?(A) pre !?(B)
is equivalent to pre !?(A && B)
as one might expect.
Same for post
annotations.
The specification is optional; if it's not provided, Caesar will add a default specification: pre ?(true)
and post ?(true)
for procedures and pre ?(false)
and post ?(false)
for coprocedures.
The defaults correspond to the empty conjunction () and empty disjunction (). The quantitative setting behaves the same, we have and .
Procedures Without a Body
Procedures and coprocedures can be written without a corresponding body. Since verification of procedure calls only needs the callee specification, we can just write a procedure specification and call it in other procedures based solely on that specification.
This allows us to follow a programming methodology called programming from specifications, i.e. writing and verifying programs incrementally 4. We can start with just a specification and step-by-step fill in executable code. A procedure call to a procedure without a body then corresponds to what is known in the literature as the specification statement, which is the basis for the programming from specifications methodology.
Procedures with a body can be used to introduce unsoundness even without using assume
statements explicitly.
Calling the following procedure is essentially the same as writing assume ?(false)
.
proc unsound() -> ()
pre ?(true)
post ?(false)
Such procedures are called infeasible or miraculous in the literature.
Calling Procedures
We can call proc
s from proc
s and coproc
s from coproc
s.
Mixing both is unsound and will result in an error.
In the following, we will talk about procedures, but everything applies to coprocedures as well.
Procedure calls make use of the procedure's specification only and do not inspect the procedure body. Somewhat informally, we could say that assuming the callee procedure verifies, the procedure call can be replaced by the procedure's body and the program will still verify. This enables modular reasoning: one can verify a big program and we can re-use already-verified parts of it in other parts.
Example: A Spare Engine
The following example models running an engine with a spare engine available.
In runPrimaryOrSpare
, the primary engine works with probability of 90%.
If it fails, we have another spare available that is guaranteed to work with the same probability.
proc spare() -> (working: Bool)
pre 0.9
post [working]
{
working = flip(0.95)
}
proc runPrimaryOrSpare() -> (working: Bool)
pre 0.99
post [working]
{
working = flip(0.9)
if !working {
working = spare()
} else {}
}
The specification of spare
says working
is true with probability of at least 0.9
(and in fact the coin flip says it is 0.95
).
Thus, we can verify a success probability of runOrSpare
of at least 0.99
.
However, we cannot verify a lower bound of 0.995
because the specification of spare
does not guarantee it (even though it is true here).
Multiple Or No Return Values
If the callee has multiple return values, multiple variables may be assigned to by comma-separating them.
For example, if spare
returned two variables, we could obtain the results via working1, working2 = spare()
.
If the callee has no return values, we just call it without the assignment.
For example, we simply write spare()
as a statement to call it without assignment.
Assert-Assume Understanding of Procedure Calls
Internally, procedure calls are translated to HeyVL verification statements (see statements documentation), consisting of assert
, havoc
, validate
, and assume
statements.
We can understand how procedure calls work by understanding this encoding.
During the verification of runPrimaryOrSpare
, Caesar replaces the call to spare
by this encoding:
proc runPrimaryOrSpare() -> (working: Bool)
pre 0.99
post [working]
{
working = flip(0.9)
if !working {
assert 0.95
havoc success
validate
assume [success]
} else {}
}
How To: Obtaining intermediate encodings via --print-core
.
To obtain the intermediate encodings from Caesar, we can use the --print-core
command-line flag.
This will print the fully desugared HeyVL code for each procedure to standard output.
For example:
caesar verify --print-core example.heyvl
We can now read the encoding as follows:
- Before the call,
assert
thepre
spare()
. This means that the expected value (going backwards) is at most0.95
. - We then forget the current value of
success
viahavoc
. This models thatspare
can have arbitrary effects on return values. - We
validate
(the following assumption must be absolutely true). - We can only
assume
thepost
ofspare
, i.e. thatsuccess
is true.
This generalizes the Boolean setting:
- Before the call, the
pre
must hold. - Forget the current value of modified variables.
- (The
validate
does not have any effect in the Boolean setting.) - Assume that the
post
holds after the call.
Example: Why validate
is needed for soundness.
The validate
statement has no effect in the Boolean setting, but it is crucial for soundness in the quantitative setting.
The formal proof of soundness can be found in the OOPSLA '23 paper.
Here, we show an example where omitting the validate
statement leads to unsoundness.
Consider the following procedure P
that has a pre of 1
and a post of 2
.
proc P() -> ()
pre 1
post 2
{
// body of P
var b: Bool = flip(0.5)
assert ?(b)
// end of body of P
}
The body loses half of the probability mass because the assert ?(b)
statement succeeds only with probability .
It verifies with the pre 1
and post 2
, because the expected value of the post 2
is .
We now consider three different ways to call P
from another procedure S
.
Procedure S
consists solely of the call to P
and has a pre of 1
and a post of 1
.
Therefore, it should not verify, because the expected value of the post 1
is .
- Inlining the body of
P
: We can inline the body ofP
intoS
, which means we replace the call toP
with the body ofP
. This does not verify, as expected.proc S_inlined_call() -> ()
pre 1
post 1
{
// --- this is the body:
var b: Bool = flip(0.5)
assert ?(b)
// --- end of the body
} - Calling
P
withoutvalidate
: We can callP
without thevalidate
statement. This verifies, which is incorrect and unsound.proc S_incorrect_call() -> ()
pre 1
post 1
{
// call P incorrectly without validate
assert 1
assume 2
} - Calling
P
withvalidate
: The correct way to encode the call toP
is by using thevalidate
statement. Then,S
does not verify, which is correct.proc S_correct_call() -> ()
pre 1
post 1
{
// call P correctly with validate
assert 1
validate
assume 2
}
Thus, the validate
statement is crucial for soundness of procedure calls, in the case where the callee's post does not entail the caller's post.
In particular, it can be required when calling procedures where the following holds for the procedure body , post-expectation , and modified variables :
That is, the call encoding's semantics can evaluate to the infimum of the post-expectation over the modified variables when the validate
statement is not present.
If it was present, the call semantics would evaluate to instead --- always a sound lower bound to .
The encoding of Park induction for the verification of while loops makes use of the same statements as the procedure call encoding plus some additional statements. A more formal treatment of the encoding and semantics of procedure calls can be found in our OOPSLA '23 paper.
Footnotes
-
Many verification tasks that require reasoning with negative numbers can be embedded in this framework. First, note that we can still have negative numbers in our program states, we just have to ensure that the
post
is non-negative. Chapter 11 of the paper "Relatively Complete Verification for Probabilistic Programs" by Batz et al. might be of interest for further reading. ↩ -
!?(b)
is not a special kind of operator, it is simply the!
operator applied to?(b)
, i.e.!(?(b))
. The negation operator!
is defined onEUReal
by and for all . Thus,!?(b)
is logically equivalent to?(!b)
. In the OOPSLA '23 paper, we denoted!?(b)
by . ↩ -
The combination of specifications is a logical consequence of their encodings in HeyVL. When verifying a procedure,
pre
annotations will be translated toassume
andpost
annotations will be translated toassert
statements. One can prove that the HeyVL statementsassume A1; assume A2
are equivalent toassume (A1 ⊓ A2)
, and also that the sequenceassert A1; assert A2
is equivalent toassert (A1 ⊓ A2)
. The same equalities can be used for the procedure call encoding. ↩ -
The book Programming from Specifications by Carroll Morgan details the methodology in a very approachable way (for non-probabilistic programs). It is available online at https://www.cs.ox.ac.uk/publications/books/PfS/ in PostScript format (you can convert to PDF using
ps2pdf
command). ISBN: 978-0-13-123274-7. ↩