Skip to main content

Soundness of Proof Rules

Using a proof rule to verify a HeyVL program can introduce approximations compared to the original program semantics. These approximations influence guarantees Caesar gives during verification:

  1. Sound verification: verification succeeds     \implies property holds for original program.
  2. Sound refutation: verification fails     \implies property does not hold for original program (counterexample is valid).

Without sound verification, a property may not hold for the original program (unsound verification). On the other hand, without sound refutations, a counter-example may be spurious (not valid in the original program). However, it still is a counter-example to verification, which can be used to e.g. improve loop invariants.

Most program verifiers aim for sound verifications, but Caesar also supports sound refutations. This is useful when you want to show that a certain expectation is not a lower or upper bound.

This guide gives an overview of Caesar's proof rules and how they affect soundness of verifications and refutations. Caesar gives diagnostics when a verification or refutation might be unsound because of an incompatible proof-rule/calculus setup. For an overview of what Caesar does and does not check automatically, see What Is Checked By Caesar.

Quick Overview

To verify or refute a lower/upper bound (proc/coproc), the proof rules must induce the right approximation (Over/Under) between original semantics (origorig) and verification condition semantics (vcvc). The cards below are the quick reference; the detailed rule mapping appears later in Proof Rule Approximations.

Quick GuidePick approximation by goal and procedure kind.
Example: Sound and Unsound Proofs and Refutations

All four examples share the same geometric loop and @invariant annotation; only kind (proc/coproc) and pre differ, so each one is an Over approximation of the original loop semantics.

Caesar reports sound verification: We show that init_x + 2 is an upper bound (coproc) on the expected value of x on termination (@wp). The actual expected value is init_x + 1.

@wp coproc sound_proof(init_x: UInt) -> (x: UInt)
pre init_x + 2
post x
{
x = init_x
var cont: Bool = true
@invariant(ite(cont, x + 1, x))
while cont {
cont = flip(0.5)
if cont {x = x + 1 } else { }
}
}

Original Program Semantics

The original semantics (origorig) is the semantics of the high-level program, which may feature constructs such as loops and recursion. During verification, we want to obtain sound results with respect to the original semantics. There are different ways to define the original semantics, but here we are mainly concerned with how they differ in their treatment of nonterminating loop and recursion runs.

A Zoo of Original Semantics

We distinguish:

  • Least Fixed Point (LFP) Semantics: while loops and recursive calls are interpreted via least fixed points.
    • This is used in calculi such as wpwp and ertert.
    • In short: "nonterminating runs contribute post 00 to the expected value".
  • Greatest Fixed Point (GFP) Semantics: while loops and recursive calls are interpreted via greatest fixed points.
    • This is used in calculi such as wlpwlp.
    • In short: "nonterminating runs contribute post 11 to the expected value".

Calculus Annotations

Caesar supports procedure annotations to make the intended calculus explicit:

  • @wp: weakest pre-expectation calculus (least fixed points, nontermination contributes 0).
  • @wlp: weakest liberal pre-expectation calculus (greatest fixed points, nontermination contributes 1).
  • @ert: expected runtime calculus (least fixed points).

These annotations let Caesar check additional soundness conditions for proof-rule usage.

RECOMMENDED PAPERS

We recommend reading the following literature for formal accounts of the above semantics:

How Caesar Selects Original Semantics

  1. If a calculus annotation (@wp, @wlp, or @ert) is present on a (co)proc:
    • LFP for @wp and @ert,
    • GFP for @wlp.
  2. Otherwise selected by the proof rule so that verifications are sound (see Proof Rule Approximations).
    • E.g. for Induction, GFP semantics are used for procs and LFP semantics for coprocs.
    • E.g. for Loop Unrolling, LFP semantics are used for procs and GFP semantics for coprocs.

Approximations, Proofs, and Refutations

Correct results — sound verification or sound refutation — depend on how we approximate the original semantics (origorig) by the verification condition semantics (vcvc). The verification condition semantics is the semantics of the verification conditions that Caesar generates and reasons about.

RECOMMENDED READING

Our paper A Deductive Verification Infrastructure for Probabilistic Programs explains the verification condition semantics in detail and covers several of the proof rules.

Overview of which approximation is required to verify/refute lower/upper bounds.

We distinguish between four kinds of approximations: Exact, Over, Under, or Unknown. Correct results also depend on whether you reason about lower bounds (proc) or upper bounds (coproc). We explain the details in this section.

The figure gives a quick overview: for lower bounds (proc), verification requires Under and refutation requires Over approximations; for upper bounds (coproc), verification requires Over and refutation requires Under approximations. With Exact approximations, both verifications and refutations are sound; with Unknown approximations, neither verifications nor refutations are sound.

Kinds of Approximations

Arrows indicate one approximation being stronger than the other.

We distinguish four kinds of approximations between the original semantics (origorig) and the verification condition semantics (vcvc). Let SS be a HeyVL statement and XX be a post-expectation.

  • Exact: No approximation is performed.
    Formally: orig[S](X)=vc[S](X)orig[S](X) = vc[S](X) for all XX.
  • Under: The pre-expectation is approximated from below.
    Formally: orig[S](X)vc[S](X)orig[S](X) \geq vc[S](X) for all XX.
  • Over: The pre-expectation is approximated from above.
    Formally: vc[S](X)orig[S](X)vc[S](X) \geq orig[S](X) for all XX.
  • Unknown: None of the above holds.

Approximations are compositional1. For example, the sequential composition S1; S2S_1;~ S_2 of two Exact statements S1,S2S_1, S_2 is also Exact. Same for Over and Under. Combining Exact with either Over or Under yields the respective approximation. However, combining Over and Under statements results in an Unknown approximation.

Sound Verifications

When Caesar says that a (co)proc is verified, then a bound on verification condition semantics (vcvc) has been established.

In a proc, Caesar reasons about lower bounds prepre of the verification condition semantics (vcvc), i.e. prevc[S](post)pre \leq vc[S](post).

  • If vc[S](post)vc[S](post) is an Under approximation of the original semantics (origorig), then we have sound verifications:
pre    vc[S](post)    orig[S](post).\texttt{pre} \;\leq\; vc[S](post) \;\leq\; orig[S](post).

Dually, in a coproc, Caesar reasons about upper bounds prepre of the verification condition semantics (vcvc), i.e. prevc[S](post)pre \geq vc[S](post).

  • If vc[S](post)vc[S](post) is an Over approximation of the original semantics (origorig), then we have sound verifications:
pre    vc[S](post)    orig[S](post).\texttt{pre} \;\geq\; vc[S](post) \;\geq\; orig[S](post).

The results also apply for Exact approximations.

Sound Refutations

When Caesar gives a counterexample for a (co)proc, then a bound on verification condition semantics (vcvc) has been refuted.

In a proc, a refutation means showing that prevc[S](post)pre \nleq vc[S](post).

  • If vc[S](post)vc[S](post) is an Over approximation of the original semantics (origorig), then we have sound refutations because prepre cannot be a lower bound of orig[S](post)orig[S](post):
pre    vc[S](post)    vc[S](post)    orig[S](post)    pre    orig[S](post).\texttt{pre} \;\nleq\; vc[S](post) \;\land\; vc[S](post) \;\geq\; orig[S](post) \;\Rightarrow\; \texttt{pre} \;\nleq\; orig[S](post).

Dually, in a coproc, a refutation means showing that prevc[S](post)pre \ngeq vc[S](post).

  • If vc[S](post)vc[S](post) is an Under approximation of the original semantics (origorig), then we have sound refutations because prepre cannot be an upper bound of orig[S](post)orig[S](post):
pre    vc[S](post)    vc[S](post)    orig[S](post)    pre    orig[S](post).\texttt{pre} \;\ngeq\; vc[S](post) \;\land\; vc[S](post) \;\leq\; orig[S](post) \;\Rightarrow\; \texttt{pre} \;\ngeq\; orig[S](post).

The results also apply for Exact approximations.

Formal Proof: Sound Refutations in procs

Let a proc not verify (prevc[S](post)pre \nleq vc[S](post)) and let vc[S](post)vc[S](post) be an Over approximation (vc[S](post)orig[S](post)vc[S](post) \geq orig[S](post)). For the sake of a contradiction, assume preorig[S](post)pre \leq orig[S](post).

By transitivity with orig[S](post)vc[S](post)orig[S](post) \leq vc[S](post), we get pre    orig[S](post)    vc[S](post)pre \;\leq\; orig[S](post) \;\leq\; vc[S](post), hence prevc[S](post)pre \leq vc[S](post), contradicting prevc[S](post)pre \nleq vc[S](post). Therefore preorig[S](post)pre \nleq orig[S](post).

Proof Rule Approximations

As explained above, proof rules such as @invariant may introduce approximations between the original semantics (origorig) and the verification condition semantics (vcvc). Below, we summarize which of Caesar's built-in proof rules induce which approximations, and thus which proof rules are applicable for sound verification/refutation of lower/upper bounds.

Original SemanticsApproximationApplicable Proof Rules
LFPOverProcedure Calls
(k)-Induction (@invariant, @k_induction)
Positive Almost-Sure Termination Rule (@past)
LFPUnderLoop Unrolling (@unroll)
ω-invariants (@omega_invariant)
Optional Stopping Theorem (@ost)
Almost-Sure Termination Rule (@ast)
GFPOverLoop Unrolling (@unroll)
ω-invariants (@omega_invariant)
GFPUnderProcedure Calls
(k)-Induction (@invariant, @k_induction)

For loops, the loop-body approximation must be compatible with the chosen rule. For example, for k-Induction under LFP, the loop body must be Over so the whole loop is Over. Additionally, the rules Almost-Sure Termination Rule, Positive Almost-Sure Termination Rule, and Optional Stopping Theorem require the loop body to be Exact.

What Is Checked By Caesar

HeyVL is designed as an intermediate verification language and intentionally allows dangerous constructs. See our OOPSLA '23 paper for more background.

Caesar checks many proof-rule soundness conditions automatically, but not all modeling assumptions.

Hard errors:

  • In calculus-annotated procedures, calling a procedure with a conflicting calculus annotation is rejected.
  • Potentially recursive calls are rejected where Park induction is not sound (@wp proc, @wlp coproc, @ert proc).

Diagnostics during verification:

  • Caesar tracks approximation information per procedure, errors when a proof result may be unsound, and marks counterexamples as potentially spurious when they may not be valid.

Not checked:

  • Contradictions make verification trivially succeed — e.g., assume ?(false) in a proc; contradictory axioms are a common source.
  • There is no enforcement that @ert procedures contain tick statements, nor that @wp/@wlp procedures do not.

Footnotes

  1. Compositionality of approximations follows from the fact that the semantics of HeyVL's statements is monotonic with respect to the expectation ordering. The only exception are non-monotonic negation statements, which always yield an Unknown approximation.