Induction and k-Induction
Induction and the more general k-induction are built-in proof rules to reason about while loops.
The non-probabilistic intuition goes as follows: induction requires an invariant that is maintained by each loop iteration.
The invariant must hold before the loop, and then we are guaranteed that the invariant holds after the loop.
The rule corresponds to the well-known proof rule for loops from Hoare logic.
Generalized to probabilistic wlp semantics, induction allows us to prove a lower bound on the wlp-semantics of a loop.
The invariant I must be an expression whose expected value does not decrease with each loop iteration.
Formally, writing $\Phi_f(X) = [\neg G] \cdot f + [G] \cdot \mathrm{wlp}\llbracket\mathtt{Body}\rrbracket(X)$ for the characteristic function of the loop while G { Body } with respect to the post-expectation $f$:

$$ I \sqsubseteq \Phi_f(I) \quad\text{implies}\quad I \sqsubseteq \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f). $$
A dual version exists for wp
and ert
semantics.
k-induction is a strictly stronger version of induction; refer to the CAV 2021 paper presenting latticed k-induction for more details. It allows us to reason about multiple loop iterations: the invariant is not required to be re-established after just one loop iteration, but may be re-established after 1, 2, ..., up to k loop iterations.
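For intuition, here is a sketch of the 2-induction principle for lower bounds in the notation introduced above (this display is our own rendering of the latticed k-induction principle, not a formula taken from Caesar):

$$ I \sqsubseteq \Phi_f\big(I \sqcup \Phi_f(I)\big) \quad\text{implies}\quad I \sqsubseteq \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f), $$

i.e., after one loop iteration it suffices to either re-establish $I$ directly or run one more iteration; for general $k$, the term $I \sqcup \Phi_f(\cdot)$ is nested $k-1$ times.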
Usage
Using Induction
To use induction, simply add an @invariant(I)
annotation to your loop with a probabilistic invariant I
.
In the example below, the invariant is ite(cont, c + 1, c).
@wp coproc geo1_induction(init_c: UInt) -> (c: UInt)
    pre init_c + 1
    post c
{
    c = init_c
    var cont: Bool = true
    @invariant(ite(cont, c + 1, c))
    while cont {
        var prob_choice: Bool = flip(0.5)
        if prob_choice { cont = false } else { c = c + 1 }
    }
}
In a coproc, Caesar will use the super-invariant version to prove an upper bound on the least fixed-point semantics of the loop (wp and ert semantics, with the characteristic function $\Phi_f$ defined accordingly):

$$ \Phi_f(I) \sqsubseteq I \quad\text{implies}\quad \mathrm{wp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f) \sqsubseteq I. $$

In a proc, Caesar will use the sub-invariant version to prove a lower bound on the greatest fixed-point semantics of the loop (wlp semantics):

$$ I \sqsubseteq \Phi_f(I) \quad\text{implies}\quad I \sqsubseteq \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f). $$
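To illustrate the proc case, here is a minimal sketch of a lower-bound proof via a sub-invariant (this listing is ours and not from the original documentation; the name geo1_wlp_lower and the chosen pre/post are illustrative assumptions). It lower-bounds the wlp of the same loop with respect to the post-expectation ite(c >= init_c, 1, 0), using that expression itself as the sub-invariant:

@wlp proc geo1_wlp_lower(init_c: UInt) -> (c: UInt)
    pre 1
    post ite(c >= init_c, 1, 0)
{
    c = init_c
    var cont: Bool = true
    // the counter c never decreases, so this candidate does not decrease in expectation
    @invariant(ite(c >= init_c, 1, 0))
    while cont {
        var prob_choice: Bool = flip(0.5)
        if prob_choice { cont = false } else { c = c + 1 }
    }
}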
See the section on soundness below for more information on when it is sound to use induction. The section on internal details below explains the precise encoding and its behavior when the invariant is not inductive.
Using k-Induction
To use k-induction, you can use the @k_induction(k, I) annotation on a loop.
It takes two parameters: the number literal k specifies the number of unrollings to be done, and I is the invariant.
@wp coproc geo1_2induction(init_c: UInt) -> (c: UInt)
    pre init_c + 1
    post c
{
    c = init_c
    var cont: Bool = true
    @k_induction(2, c + 1)
    while cont {
        var prob_choice: Bool = flip(0.5)
        if prob_choice { cont = false } else { c = c + 1 }
    }
}
This program will not verify if k
is set to 1
because the invariant c + 1
is not 1-inductive.
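To see why, here is a sketch of the calculation in our own notation (not taken from the documentation), using the wp characteristic function $\Phi(X) = [\neg\mathtt{cont}] \cdot c + [\mathtt{cont}] \cdot \mathrm{wp}\llbracket\mathtt{Body}\rrbracket(X)$ for the post-expectation $c$ and, roughly, the latticed notion of 2-inductivity $\Phi(I \sqcap \Phi(I)) \sqsubseteq I$:

$$ \mathrm{wp}\llbracket\mathtt{Body}\rrbracket(c+1) \;=\; \tfrac{1}{2}(c+1) + \tfrac{1}{2}(c+2) \;=\; c + \tfrac{3}{2}, $$

so 1-inductivity would require $c + \tfrac{3}{2} \le c + 1$ on states with cont = true, which fails. However,

$$ (c+1) \sqcap \Phi(c+1) \;=\; [\neg\mathtt{cont}]\cdot c + [\mathtt{cont}]\cdot(c+1), \qquad \Phi\big((c+1) \sqcap \Phi(c+1)\big) \;=\; [\neg\mathtt{cont}]\cdot c + [\mathtt{cont}]\cdot\big(\tfrac{1}{2}\,c + \tfrac{1}{2}\,(c+2)\big) \;\sqsubseteq\; c+1, $$

so c + 1 is 2-inductive.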
Just like induction, the @k_induction annotation will result in a HeyVL encoding of the loop that checks whether the invariant is k-inductive.
Again, Caesar will encode the super-invariant version for upper bounds in a coproc and the sub-invariant version for lower bounds in a proc.
Check the soundness section and the internal details section below for more information.
We recommend reading the Latticed k-induction paper for more details on the principle of k-induction applied to the verification of probabilistic programs.
Soundness
Use the calculus annotations @wp
, @wlp
, @ert
to have Caesar check that the applied proof rules are sound with respect to the semantics of the chosen calculus.
Then you don't have to worry about this section yourself.
For all loops and all invariant candidates I, the following holds:

- In procs: the verification pre-expectation of the encoding is a lower bound on the wlp-semantics of the original loop (wlp uses greatest fixed-point semantics).
  - Thus: if a proc verifies using @invariant(I), then the specification also holds for the original wlp semantics.
- In coprocs: the verification pre-expectation of the encoding is an upper bound on the wp-semantics of the original loop (wp uses least fixed-point semantics).
  - Thus: if a coproc verifies using @invariant(I), then the specification also holds for the original wp semantics.
- In coprocs: the verification pre-expectation of the encoding is an upper bound on the ert-semantics of the original loop (ert uses least fixed-point semantics).
  - Thus: if a coproc verifies using @invariant(I), then the specification also holds for the original ert semantics.
Stated in terms of fixed points (formalized in the sketch below):

- In procs, @invariant under-approximates the greatest fixed-point loop semantics,
- In coprocs, @invariant over-approximates the least fixed-point loop semantics.
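As a formal sketch in our own notation (vp denotes the verification pre-expectation of the loop encoding and $\Phi_f$ the characteristic function of the loop with respect to the post-expectation $f$; these symbols are not Caesar identifiers):

$$ \text{proc:}\quad \mathrm{vp}(f) \;\sqsubseteq\; \mathrm{gfp}\,\Phi_f \;=\; \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f), \qquad \text{coproc:}\quad \mathrm{vp}(f) \;\sqsupseteq\; \mathrm{lfp}\,\Phi_f. $$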
The same statements hold for k-induction (@k_induction(k, I)
).
Internal Details
The following section describes the internal details of the encoding of the @invariant
and @k_induction
proof rules in Caesar.
We describe the encoding of the proof rules in HeyVL, the verification pre-expectation semantics of the encoding, and a more detailed proof of soundness of the encoding.
More polished formal details on the HeyVL encodings and the (simplified) semantics of the above proof rules are provided in the extended version of our OOPSLA '23 paper, both in the main text and in Appendix C.
For most of this section, we focus on the encoding of loops with @invariant
in a proc
below, i.e. Park induction for lower bounds on greatest fixed-point semantics.
The coproc/least fixed-point case is dual.
The details for k-induction are similar and are handled at the end of this section.
Let @invariant(I) while G { Body }
be a loop with an invariant candidate I
.
HeyVL Encoding
Internally, the loop with the @invariant annotation will be replaced by the following HeyVL encoding:

assert I // I must hold before the loop
havoc modified_vars // forget values of all variables modified by the loop
validate
assume I // assume I holds before each iteration
if G {
    Body
    assert I // I must hold after the loop body
    assume ?(false) // nothing else to establish but I
} else {}
The comments indicate a classical (Boolean) interpretation of the encoding:

1. The assert I statement checks that the invariant holds before the loop.
2. The havoc modified_vars statement forgets the values of all variables that are modified by the loop body.
3. The validate statement ensures that the following inductivity check is "classically" valid.
4. The assume I statement assumes that the invariant holds before each loop iteration.
5. The if G { Body ... } else {} statement executes the loop body if the guard G holds.
6. The assert I statement checks that the invariant holds after the loop body.
7. The assume ?(false) statement says that there is nothing else to establish after the loop body, i.e., the invariant is the only thing we care about.
In the following, we dive into the formal details of the quantitative semantics and soundness of this encoding.
Verification Pre-Expectation Semantics
In effect, the semantics of the above proc encoding evaluates to the following value in an initial state $\sigma$ w.r.t. the post-expectation $f$:

$$ \mathrm{vp}(f)(\sigma) \;=\; \begin{cases} I(\sigma) & \text{if } I \text{ is inductive w.r.t. } f \text{ from } \sigma, \\ 0 & \text{otherwise.} \end{cases} $$

The two cases:

- If the invariant is inductive in the initial state $\sigma$, then the encoding evaluates to $I(\sigma)$ (the value of the invariant in that state).
- If the invariant is not inductive, then the encoding evaluates to $0$.

The check for whether $I$ is inductive corresponds to the part of the encoding above starting with the assume I statement.
In terms of wlp semantics, the common1 definition of inductivity can be expressed as

$$ I \;\sqsubseteq\; [\neg G] \cdot f + [G] \cdot \mathrm{wlp}\llbracket\mathtt{Body}\rrbracket(I). $$
Local Inductive Invariants
Caesar's encoding provides a more refined version of the inductivity check. Intuitively, an invariant is locally inductive if it is inductive in the initial state $\sigma$ and also in all states reachable from $\sigma$.

Reachability is over-approximated in this encoding by the set of states that can be reached from $\sigma$ by setting the variables in modified_vars to arbitrary values. Formally, we consider the set $\Sigma_\sigma$ of states that agree with $\sigma$ on all variables except those in modified_vars:

$$ \Sigma_\sigma \;=\; \{\, \sigma' \mid \sigma'(x) = \sigma(x) \text{ for all } x \notin \mathit{modified\_vars} \,\}. $$

Therefore, in the above semantics, the statement that $I$ is an inductive invariant w.r.t. $f$ from $\sigma$ (for lower bounds) in a proc is formally stated as follows:

$$ \text{for all } \sigma' \in \Sigma_\sigma: \quad I(\sigma') \;\le\; \big([\neg G] \cdot f + [G] \cdot \mathrm{wlp}\llbracket\mathtt{Body}\rrbracket(I)\big)(\sigma'). $$

If the above condition holds, then one can apply the theorem of Park induction on the sub-lattice of expectations defined on only the states $\Sigma_\sigma$ to obtain, for the state $\sigma$:2

$$ I(\sigma) \;\le\; \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f)(\sigma), $$

where the characteristic function and the involved expectations are restricted to $\Sigma_\sigma$ (see footnote 2). We call this technique local Park induction.
Example: The Advantage of Local Invariants
Consider the following example, where an upper bound is verified on the expected runtime of a geometric loop.
@ert coproc geo_past(p: UReal) -> (c: UInt)
pre ite(p < 1, 1/(1-p), \infty)
post 0
{
c = 0
var cont: Bool = true
@invariant(ite(cont, 1/(1-p), 0))
while cont {
p = p // removing this makes the program verify
reward 1
cont = flip(p)
if cont { c = c + 1 } else {}
}
}
The above example verifies only because we check for local inductivity of the invariant ite(cont, 1/(1-p), 0).
Here, local inductivity allows Caesar to use the fact that p < 1 holds during the loop.
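As a sketch of the calculation (our own working, using the ert characteristic function with post-expectation 0): for a state with cont = true and p < 1, one loop iteration costs 1 and re-establishes the invariant in expectation,

$$ 1 + p \cdot \frac{1}{1-p} + (1-p) \cdot 0 \;=\; \frac{(1-p) + p}{1-p} \;=\; \frac{1}{1-p}, $$

and for cont = false the check is simply $0 \le 0$. For $p \ge 1$, the term 1/(1-p) no longer denotes the intended value, which is exactly the situation that local inductivity rules out as long as p is not modified (and hence not havoced).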
For a counter-example to verification, consider the modified program below, which modifies the variable p inside the loop.
The modification is a trivial assignment p = p, so it does not change the loop's actual semantics.
However, Caesar will now havoc the variable p as well, which means that the invariant ite(cont, 1/(1-p), 0) is no longer inductive.
@ert coproc geo_past(init_p: UReal) -> (c: UInt)
    pre ite(init_p < 1, 1/(1-init_p), \infty)
    post 0
{
    var p: UReal = init_p
    c = 0
    var cont: Bool = true
    @invariant(ite(cont, 1/(1-p), 0))
    while cont {
        p = p // trivial assignment, but modifies `p`
        reward 1
        cont = flip(p)
        if cont { c = c + 1 } else {}
    }
}
The counter-example from Caesar shows that the variable p in the loop was assigned the value 2, clearly indicating that the global knowledge p < 1 has been lost.
Using the invariant annotation @invariant(ite(p < 1, ite(cont, 1/(1-p), 0), \infty)) re-adds the lost information and makes the above program verify again.
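For reference, here is the counter-example program with the strengthened invariant plugged in (this combined listing is assembled by us from the program and the invariant above; the name geo_past_fixed is ours):

@ert coproc geo_past_fixed(init_p: UReal) -> (c: UInt)
    pre ite(init_p < 1, 1/(1-init_p), \infty)
    post 0
{
    var p: UReal = init_p
    c = 0
    var cont: Bool = true
    // strengthened invariant: re-adds the knowledge p < 1 that havocing p would otherwise lose
    @invariant(ite(p < 1, ite(cont, 1/(1-p), 0), \infty))
    while cont {
        p = p // trivial assignment, but modifies `p`
        reward 1
        cont = flip(p)
        if cont { c = c + 1 } else {}
    }
}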
Soundness of the Encoding
The above semantics guarantees soundness of the @invariant proof rule in procs: for all post-expectations $f$ and initial states $\sigma$,

$$ \mathrm{vp}(f)(\sigma) \;\le\; \mathrm{wlp}\llbracket\mathtt{while}\ G\ \{\ \mathtt{Body}\ \}\rrbracket(f)(\sigma). $$

In case 1 of the semantics of the above encoding, the invariant $I$ is inductive w.r.t. $f$ from $\sigma$.
The encoding evaluates to $I(\sigma)$.
By local Park induction, this is a lower bound on the wlp-semantics of the loop, thus soundness holds.

In case 2, $I$ is not inductive.
However, the encoding then evaluates to $0$, the trivial lower bound on the wlp-semantics of a loop, thus soundness holds trivially.

The cases are dual for upper-bound semantics in coprocs: if the invariant is not inductive, then the trivial upper bound $\infty$ is returned.
If the invariant is inductive, $I(\sigma)$ is returned as before.
k-Induction Encoding and Interpretation
The encoding is similar for k-induction (@k_induction).
It also evaluates to $I(\sigma)$ if the invariant is k-inductive, and to the trivial bound ($0$ in a proc, $\infty$ in a coproc) otherwise.
Refer to the k-induction paper for the precise definition and properties of k-inductive invariants.
For 2-induction for lower bounds, one obtains an encoding of the following form:
assert I // I must hold before the loop
havoc modified_vars // forget values of all variables modified by the loop
validate
assume I // assume I holds before each iteration
if G {
    Body
    if ⊔ { // angelic choice
        assert I // I must hold after the loop body
        assume ?(false) // nothing else to establish but I
    } else {
        if G {
            Body
            assert I // I must hold after the loop body
            assume ?(false) // nothing else to establish but I
        } else {}
    }
} else {}
One can give an intuitive Boolean interpretation of this encoding of 2-induction as well. It is similar to the interpretation for (1-)induction, with the relevant difference in step 6:
1. The assert I statement checks that the invariant holds before the loop.
2. The havoc modified_vars statement forgets the values of all variables that are modified by the loop body.
3. The validate statement ensures that the following inductivity check is "classically" valid.
4. The assume I statement assumes that the invariant holds before each loop iteration.
5. The if G { Body ... } else {} statement executes the loop body if the guard G holds.
6. The if ⊔ { ... } else { ... } statement allows choosing angelically whether to run the loop functional once or twice.
7. The assert I statements check that the invariant holds after each branch of the loop body.
8. The assume ?(false) statements say that there is nothing else to establish after the loop body, i.e., the invariant is the only thing we care about.
Choosing higher values of k in the @k_induction(k, I) annotation will result in a similar encoding with nested angelic choices, allowing the loop body to be run up to k times to establish the invariant again.
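For illustration, here is a sketch of how the encoding would look for @k_induction(3, I), extrapolating the nesting pattern above (this listing is our own and not generated by Caesar):

assert I // I must hold before the loop
havoc modified_vars // forget values of all variables modified by the loop
validate
assume I // assume I holds before each iteration
if G {
    Body
    if ⊔ { // angelic choice: establish I now ...
        assert I
        assume ?(false)
    } else { // ... or run a second iteration
        if G {
            Body
            if ⊔ { // angelic choice: establish I now ...
                assert I
                assume ?(false)
            } else { // ... or run a third iteration
                if G {
                    Body
                    assert I // I must hold after at most three iterations
                    assume ?(false)
                } else {}
            }
        } else {}
    }
} else {}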
Note that the angelic choices if ⊔ { ... } else { ... } can also be rewritten via coassert statements.
For 2-induction, the encoding of the loop iterations then becomes:
... // omitted
if G {
    Body
    coassert I // may stop here and establish I
    if G {
        Body
        assert I // I must hold after the loop body
        assume ?(false) // nothing else to establish but I
    } else {}
}
The above explanation follows the version of k-induction for lower bounds on greatest fixed-point semantics.
However, in the literature, k-induction is often defined for upper bounds on least fixed-point semantics.3
Then, the explanation is dual, and the encoding is similar, essentially just with the angelic choices if ⊔ { ... } else { ... } replaced by demonic choices if ⊓ { ... } else { ... }.
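As a formal sketch of the two dual principles in our own notation (following the latticed k-induction paper; $\Phi$ denotes the loop's characteristic function, with $k$ applications of $\Phi$ in total):

$$ \text{upper bounds:}\quad \Phi\big(I \sqcap \Phi(I \sqcap \cdots \Phi(I))\big) \sqsubseteq I \;\text{ implies }\; \mathrm{lfp}\,\Phi \sqsubseteq I, \qquad \text{lower bounds:}\quad I \sqsubseteq \Phi\big(I \sqcup \Phi(I \sqcup \cdots \Phi(I))\big) \;\text{ implies }\; I \sqsubseteq \mathrm{gfp}\,\Phi. $$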
Footnotes
- A thorough exposition on the usual definition of inductive invariants for probabilistic programming is found in Benjamin Kaminski's PhD thesis at Definition 5.1. ↩
- Formally, one maps every expectation on the full state space $\Sigma$ to an expectation in the sub-lattice local to $\sigma$ with modified variables modified_vars, i.e., to an expectation defined only on the states in $\Sigma_\sigma$. In addition, one needs to adjust the definition of the characteristic function $\Phi$ to be defined on $\Sigma_\sigma$ instead of $\Sigma$. ↩
- For example, the Latticed k-induction paper defines and interprets latticed k-induction only for upper bounds on least fixed-point semantics. ↩