OOPSLA '23 Artifact Evaluation Guide
Welcome to the artifact for our OOPSLA '23 submission "A Deductive Verification Infrastructure for Probabilistic Programs".
Contained within the artifact:
- Our tool Caesar, which parses HeyVL programs and tries to verify them. Caesar is our main implementation contribution and the focus of this artifact.
- A script to reproduce our benchmarks (Table 2).
- Our prototypical tool pgcl2heyvl, which takes annotated pGCL programs and produces a HeyVL file encoding the required proof obligations.
- Our full source code.
Contents
- Getting Started
- Piece by Piece: How Our Artifact Supports the Paper's Claims
- Running Our Benchmarks
- The Caesar Tool and Its Source Code
- pgcl2heyvl: Generating HeyVL Files From pGCL
- A Guide to Custom Examples
- Appendix: Accepted pGCL Syntax by the pgcl2heyvl Tool
1. Getting Started
Requirements.
- We use Docker and provide an x86 (amd64) image.
- 16 GB of RAM and enough disk space for the artifact.
- Our benchmark set should terminate in under 10 minutes.
- Note: On ARM machines, Docker will run the x86 image under emulation and print a warning. In these setups, we have observed a slowdown of about 10x.
Downloading the artifact.
Either download the image from Zenodo and then load it:
`docker image load -i caesar.tar.gz`
Alternatively, pull it from GitHub packages:
`docker pull ghcr.io/philipp15b/caesar:oopsla23-aec --platform linux/amd64`
Entering the artifact environment.
Simply run the `caesar` image with Docker. This will open a `bash` shell in the `/root/caesar` directory with the `caesar` and `pgcl2heyvl` commands available.
`docker run -it ghcr.io/philipp15b/caesar:oopsla23-aec`
The image is based on Debian Bullseye (slim), so the `apt` package manager is available. The editors `vim` and `nano` are installed already.
Running the benchmarks. To reproduce our benchmarks (Table 2), execute
`fish run-benchmarks.fish`
The script will run the benchmarks listed in `benchmarks.txt` in sequence (usually in under 10 minutes). After completion, the results are printed as an ASCII table to the terminal as well as written to the CSV file `benchmark-results.csv`.
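If you want to post-process the results programmatically rather than read the ASCII table, the CSV file can be loaded with Python's standard `csv` module. The following is a minimal sketch; the column names `benchmark` and `time` are assumptions for illustration, so check the actual header line of the `benchmark-results.csv` produced by `run-benchmarks.fish` and adjust accordingly.

```python
# Sketch: summarize a benchmark-results CSV.
# NOTE: the column names "benchmark" and "time" are hypothetical;
# adapt them to the header of the CSV the script actually emits.
import csv
import io

def summarize(csv_text: str) -> dict:
    """Return per-benchmark wall-clock times and their total."""
    rows = csv.DictReader(io.StringIO(csv_text))
    times = {row["benchmark"]: float(row["time"]) for row in rows}
    return {"times": times, "total": sum(times.values())}

# Usage with synthetic data (not actual results from the paper):
sample = "benchmark,time\nrabin,1.25\ngeo,0.40\n"
print(summarize(sample))
```

In the container, replace the synthetic string with `open("benchmark-results.csv").read()` after the script has finished.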
Documentation.
- We provide more detailed usage and syntax instructions in our documentation. It is available online, and its source can be found in `docs/src` (Markdown files).
- Caesar has Rustdoc documentation, but we do not include the generated files or the Rust compiler in this artifact.
2. Piece by Piece: How Our Artifact Supports the Paper's Claims
Section 5.2 of our paper states our key claims with respect to this artifact.
In this document,
- Section 2.1 explains how to run our benchmarks (Table 2).
- Section 2.3 explains how to automatically generate HeyVL files using our prototypical frontend pgcl2heyvl.
2.1. Running Our Benchmarks
To reproduce our benchmarks (Table 2), execute
`fish run-benchmarks.fish`
The script will run the benchmarks listed in `benchmarks.txt` in sequence. After completion (usually in under 10 minutes), the results are printed as an ASCII table to the terminal as well as written to the CSV file `benchmark-results.csv`.
Note: To allow reproducing results on slower machines and virtualized environments, we increased the timeout for each benchmark from 10s to 60s.