Soundness of Proof Rules
Using a proof rule to verify a HeyVL program can introduce approximations compared to the original program semantics. These approximations influence guarantees Caesar gives during verification:
- Sound verification: verification succeeds property holds for original program.
- Sound refutation: verification fails property does not hold for original program (counterexample is valid).
Without sound verification, a property may not hold for the original program (unsound verification). On the other hand, without sound refutations, a counter-example may be spurious (not valid in the original program). However, it still is a counter-example to verification, which can be used to e.g. improve loop invariants.
Most program verifiers aim for sound verifications, but Caesar also supports sound refutations. This is useful when you want to show that a certain expectation is not a lower or upper bound.
This guide gives an overview of Caesar's proof rules and how they affect soundness of verifications and refutations. Caesar gives diagnostics when a verification or refutation might be unsound because of an incompatible proof-rule/calculus setup. For an overview of what Caesar does and does not check automatically, see What Is Checked By Caesar.
Quick Overview
To verify or refute a lower/upper bound (proc/coproc), the proof rules must induce the right approximation (Over/Under) between original semantics () and verification condition semantics (). The cards below are the quick reference; the detailed rule mapping appears later in Proof Rule Approximations.
All four examples share the same geometric loop and @invariant annotation; only kind (proc/coproc) and pre differ, so each one is an Over approximation of the original loop semantics.
- Sound Proof
- Sound Refutation
- Unsound Proof
- Unsound Refutation
Caesar reports sound verification: We show that init_x + 2 is an upper bound (coproc) on the expected value of x on termination (@wp).
The actual expected value is init_x + 1.
@wp coproc sound_proof(init_x: UInt) -> (x: UInt)
pre init_x + 2
post x
{
x = init_x
var cont: Bool = true
@invariant(ite(cont, x + 1, x))
while cont {
cont = flip(0.5)
if cont {x = x + 1 } else { }
}
}
Caesar reports a counter-example to the property: We refute that init_x + 2 is a lower bound (proc) on the expected value of x on termination (@wp).
The actual expected value is init_x + 1.
@wp proc sound_refutation(init_x: UInt) -> (x: UInt)
pre init_x + 2
post x
{
x = init_x
var cont: Bool = true
@invariant(ite(cont, x + 1, x))
while cont {
cont = flip(0.5)
if cont { x = x + 1 } else {}
}
}
Caesar reports unsound verification: We want to verify that init_x + 1 is a lower bound (proc) on the expected value of x on termination (@wp), but Park induction does not give us this guarantee.
@wp proc unsound_proof(init_x: UInt) -> (x: UInt)
pre init_x + 1
post x
{
x = init_x
var cont: Bool = true
@invariant(ite(cont, x + 1, x))
while cont {
cont = flip(0.5)
if cont { x = x + 1 } else {}
}
}
Caesar reports a counter-example to verification: it is a counterexample against init_x + 1 as an upper bound (coproc) on the expected value of x on termination (@wp).
But this is only because the @invariant(1) is not inductive (as Caesar reports), not because the specification does not hold.
@wp coproc unsound_refutation(init_x: UInt) -> (x: UInt)
pre init_x + 1
post x
{
x = init_x
var cont: Bool = true
@invariant(1)
while cont {
cont = flip(0.5)
if cont { x = x + 1 } else {}
}
}
Original Program Semantics
The original semantics () is the semantics of the high-level program, which may feature constructs such as loops and recursion. During verification, we want to obtain sound results with respect to the original semantics. There are different ways to define the original semantics, but here we are mainly concerned with how they differ in their treatment of nonterminating loop and recursion runs.
A Zoo of Original Semantics
We distinguish:
- Least Fixed Point (LFP) Semantics: while loops and recursive calls are interpreted via least fixed points.
- This is used in calculi such as and .
- In short: "nonterminating runs contribute post to the expected value".
- Greatest Fixed Point (GFP) Semantics: while loops and recursive calls are interpreted via greatest fixed points.
- This is used in calculi such as .
- In short: "nonterminating runs contribute post to the expected value".
Calculus Annotations
Caesar supports procedure annotations to make the intended calculus explicit:
@wp: weakest pre-expectation calculus (least fixed points, nontermination contributes0).@wlp: weakest liberal pre-expectation calculus (greatest fixed points, nontermination contributes1).@ert: expected runtime calculus (least fixed points).
These annotations let Caesar check additional soundness conditions for proof-rule usage.
We recommend reading the following literature for formal accounts of the above semantics:
- wp (LFP) and wlp (GFP) Semantics: Advanced Weakest Precondition Calculi for Probabilistic Programs, Chapter 4, Kaminski (2019).
- Least Fixpoint Semantics in Detail: J-P: MDP. FP. PP -- Characterizing Total Expected Rewards in Markov Decision Processes as Least Fixed Points with an Application to Operational Semantics of Probabilistic Programs, Batz et al. (2024).
How Caesar Selects Original Semantics
- If a calculus annotation (
@wp,@wlp, or@ert) is present on a (co)proc:- LFP for
@wpand@ert, - GFP for
@wlp.
- LFP for
- Otherwise selected by the proof rule so that verifications are sound (see Proof Rule Approximations).
- E.g. for Induction, GFP semantics are used for procs and LFP semantics for coprocs.
- E.g. for Loop Unrolling, LFP semantics are used for procs and GFP semantics for coprocs.
Approximations, Proofs, and Refutations
Correct results — sound verification or sound refutation — depend on how we approximate the original semantics () by the verification condition semantics (). The verification condition semantics is the semantics of the verification conditions that Caesar generates and reasons about.
Our paper A Deductive Verification Infrastructure for Probabilistic Programs explains the verification condition semantics in detail and covers several of the proof rules.
We distinguish between four kinds of approximations: Exact, Over, Under, or Unknown. Correct results also depend on whether you reason about lower bounds (proc) or upper bounds (coproc). We explain the details in this section.
The figure gives a quick overview: for lower bounds (proc), verification requires Under and refutation requires Over approximations; for upper bounds (coproc), verification requires Over and refutation requires Under approximations. With Exact approximations, both verifications and refutations are sound; with Unknown approximations, neither verifications nor refutations are sound.
Kinds of Approximations
We distinguish four kinds of approximations between the original semantics () and the verification condition semantics (). Let be a HeyVL statement and be a post-expectation.
- Exact: No approximation is performed.
Formally: for all . - Under: The pre-expectation is approximated from below.
Formally: for all . - Over: The pre-expectation is approximated from above.
Formally: for all . - Unknown: None of the above holds.
Approximations are compositional1. For example, the sequential composition of two Exact statements is also Exact. Same for Over and Under. Combining Exact with either Over or Under yields the respective approximation. However, combining Over and Under statements results in an Unknown approximation.
Sound Verifications
When Caesar says that a (co)proc is verified, then a bound on verification condition semantics () has been established.
In a proc, Caesar reasons about lower bounds of the verification condition semantics (), i.e. .
- If is an Under approximation of the original semantics (), then we have sound verifications:
Dually, in a coproc, Caesar reasons about upper bounds of the verification condition semantics (), i.e. .
- If is an Over approximation of the original semantics (), then we have sound verifications:
The results also apply for Exact approximations.
Sound Refutations
When Caesar gives a counterexample for a (co)proc, then a bound on verification condition semantics () has been refuted.
In a proc, a refutation means showing that .
- If is an Over approximation of the original semantics (), then we have sound refutations because cannot be a lower bound of :
Dually, in a coproc, a refutation means showing that .
- If is an Under approximation of the original semantics (), then we have sound refutations because cannot be an upper bound of :
The results also apply for Exact approximations.
Formal Proof: Sound Refutations in procs
Let a proc not verify () and let be an Over approximation (). For the sake of a contradiction, assume .
By transitivity with , we get , hence , contradicting . Therefore .
Proof Rule Approximations
As explained above, proof rules such as @invariant may introduce approximations between the original semantics () and the verification condition semantics ().
Below, we summarize which of Caesar's built-in proof rules induce which approximations, and thus which proof rules are applicable for sound verification/refutation of lower/upper bounds.
| Original Semantics | Approximation | Applicable Proof Rules |
|---|---|---|
| LFP | Over | Procedure Calls (k)-Induction ( @invariant, @k_induction)Positive Almost-Sure Termination Rule ( @past) |
| LFP | Under | Loop Unrolling (@unroll)ω-invariants ( @omega_invariant)Optional Stopping Theorem ( @ost)Almost-Sure Termination Rule ( @ast) |
| GFP | Over | Loop Unrolling (@unroll)ω-invariants ( @omega_invariant) |
| GFP | Under | Procedure Calls (k)-Induction ( @invariant, @k_induction) |
For loops, the loop-body approximation must be compatible with the chosen rule. For example, for k-Induction under LFP, the loop body must be Over so the whole loop is Over. Additionally, the rules Almost-Sure Termination Rule, Positive Almost-Sure Termination Rule, and Optional Stopping Theorem require the loop body to be Exact.
What Is Checked By Caesar
HeyVL is designed as an intermediate verification language and intentionally allows dangerous constructs. See our OOPSLA '23 paper for more background.
Caesar checks many proof-rule soundness conditions automatically, but not all modeling assumptions.
Hard errors:
- In calculus-annotated procedures, calling a procedure with a conflicting calculus annotation is rejected.
- Potentially recursive calls are rejected where Park induction is not sound (
@wp proc,@wlp coproc,@ert proc).
Diagnostics during verification:
- Caesar tracks approximation information per procedure, errors when a proof result may be unsound, and marks counterexamples as potentially spurious when they may not be valid.
Not checked:
- Contradictions make verification trivially succeed — e.g.,
assume ?(false)in aproc; contradictory axioms are a common source. - There is no enforcement that
@ertprocedures containtickstatements, nor that@wp/@wlpprocedures do not.