Replay and Verification
Overview
Section titled “Overview”Replay is how you re-derive the world from recorded observations. Verification is how you prove the derived world is correct. Together they give you a reproducible, auditable proof that your agent logic does what you intend — without reading the generated rules.
This guide covers the mechanics of both and shows how to integrate them into a CI pipeline. It builds on the Fixtures and Invariants guide, which covers writing fixtures and declaring invariants.
How Replay Works
Section titled “How Replay Works”Replay feeds observations through the mapper and evaluator in strict order, producing the full derived state from scratch.
jacqos replay fixtures/happy-path.jsonlThe replay pipeline processes each observation sequentially:
Observation → Mapper → Atoms → Evaluator → Fixed point → Next observationFor each observation:
- The mapper extracts atoms — your Rhai mapper transforms the raw observation payload into semantic atoms
- The evaluator runs to a fixed point — all
.dhrules fire until no new facts, retractions, or intents can be derived - Invariants are checked — every invariant must hold after each fixed point
- The next observation is processed — and the cycle repeats
After all observations are processed, the resulting world state contains every derived fact, every fired intent, and the full provenance graph linking them back to their source observations.
Effects During Replay
Section titled “Effects During Replay”Live effects do not execute during ordinary fixture replay. When the evaluator derives an intent that would normally trigger an effect (an HTTP call, an LLM completion), replay uses recorded provider captures and observations instead. This is what makes replay deterministic — the same fixture always produces the same world state, regardless of external service availability.
If a fixture contains effect-producing intents but no recorded captures or outcome observations, replay reports the missing effect observations as warnings.
Replay From Checkpoints
Section titled “Replay From Checkpoints”For large observation histories, replaying from the beginning can be slow. JacqOS supports checkpoint-based replay:
# Full replay from scratchjacqos replay fixtures/happy-path.jsonl
# The evaluator can checkpoint intermediate state# and resume from the last stable checkpointCheckpoints store the evaluator’s intermediate state (facts, provenance edges, stratum progress) at a specific observation boundary. Resuming from a checkpoint skips the observations that were already processed. The final world state is identical whether you replay from scratch or from a checkpoint — this is verified by the determinism check in jacqos verify.
Determinism Guarantees
Section titled “Determinism Guarantees”Replay determinism is a hard guarantee, not an aspiration. The same observations, evaluated by the same evaluator, always produce byte-identical derived state.
What this means concretely:
| Same | Result |
|---|---|
| Same observations + same evaluator digest | Byte-identical facts, intents, and provenance |
| Same observations + different evaluator digest | Different derived state (by design — the rules changed) |
| Same evaluator digest + different observations | Different derived state (by design — the evidence changed) |
What the Determinism Check Verifies
Section titled “What the Determinism Check Verifies”jacqos verify runs every fixture replay twice — once as part of the normal verification pipeline, and once from a clean database. The two runs must produce identical world digests:
Run 1: normal replay → world_digest_aRun 2: clean-db replay → world_digest_bassert world_digest_a == world_digest_bIf these diverge, something in the pipeline is non-deterministic — a mapper is using ambient state, a helper has side effects, or the evaluator has a bug. The determinism check catches all of these.
What Contributes to the World Digest
Section titled “What Contributes to the World Digest”The world digest is a cryptographic hash covering:
- Every derived fact (relation name, arguments, assertion/retraction status)
- Every derived intent
- The evaluator digest that produced them
- The observation sequence that was replayed
Two world digests match if and only if the derived state is byte-identical.
The Verification Pipeline
Section titled “The Verification Pipeline”jacqos verify runs ten check families across the selected fixture corpus:
| Check | What it verifies |
|---|---|
| Fixture replay | Every fixture replays without errors |
| Golden fixtures | Derived state matches expected facts and intents |
| Invariants | All invariants hold after every fixed point |
| Candidate-authority lints | Acceptance-gated evidence never skips the required candidate.* acceptance boundary |
| Provenance bundle | Provenance graph is exported for each fixture |
| Replay determinism | Clean-database replay produces identical world digest |
| Generated scenarios | Property-tested observation sequences find no violations |
| Shadow reference | Shadow evaluator agrees with the product evaluator |
| Secret redaction | No secret material appears in verification artifacts |
| Composition | Multi-agent namespace composition passes, fails, or is skipped when the app has only one agent-owned namespace |
A fixture passes only if every applicable check passes. The overall verification passes only if every fixture passes and every non-skipped global gate stays green.
Shadow Reference Evaluator
Section titled “Shadow Reference Evaluator”The shadow reference evaluator is a second, independent evaluator that processes the same observations and must produce the same derived state. This catches implementation bugs in the primary evaluator — if the shadow disagrees, something is wrong.
The shadow evaluator comparison runs automatically during jacqos verify. You don’t need to configure it.
Generated Scenarios
Section titled “Generated Scenarios”Beyond replaying your defined fixtures, jacqos verify generates random observation sequences and checks that all invariants hold. When a generated sequence violates an invariant, the verifier shrinks it to a minimal counterexample.
Verification Bundle Format
Section titled “Verification Bundle Format”Every jacqos verify run produces a verification bundle — a JSON artifact containing the complete proof of the verification run. Bundles are written to generated/verification/.
jacqos verify# => Wrote verification bundle to generated/verification/<app-id>.jsonThe bundle filename is <app_id>.json, where app_id is the value declared in jacqos.toml. For an app with app_id = "jacqos-appointment-booking", the bundle lands at generated/verification/jacqos-appointment-booking.json.
Bundle Structure
Section titled “Bundle Structure”{ "version": "jacqos_verify_v1", "app_id": "my-booking-app", "evaluator_digest": "sha256:a1b2c3...", "prompt_bundle_digest": "sha256:d4e5f6...", "llm_complete_active": false, "status": "passed", "composition_analysis_path": "generated/verification/composition-analysis-sha256-<digest>.json", "composition_analysis": { ... }, "summary": { ... }, "checks": [ ... ], "redaction_findings": [], "fixtures": [ ... ]}Top-Level Fields
Section titled “Top-Level Fields”| Field | Description |
|---|---|
version | Bundle format version (jacqos_verify_v1) |
app_id | Application identifier from jacqos.toml |
evaluator_digest | Hash of the ontology IR, mapper semantics, and helper digests |
prompt_bundle_digest | Hash of prompt files (present only if prompts exist) |
llm_complete_active | Whether the llm.complete capability is declared |
status | passed, failed, or skipped |
composition_analysis_path | Relative path to the companion composition-analysis report when the composition check passed |
composition_analysis | Embedded composition-analysis artifact when the composition check passed |
summary | Aggregate counts and fixture-level summaries |
checks | The verification checks with passed/failed/skipped status and detail text |
redaction_findings | Any secret material detected in artifacts |
fixtures | Per-fixture verification artifacts |
The persisted bundle intentionally omits wall-clock timestamps and durations so
re-running verification does not create noisy diffs in checked-in proof
artifacts. Use jacqos export benchmark-report when you need runtime timings.
When the composition gate passes, jacqos verify also writes
generated/verification/composition-analysis-sha256-<evaluator_digest>.json and
embeds the same portable report into the bundle. That report is static with
respect to store history: it depends on the ontology and fixture corpus, not on
the current SQLite state.
Per-Fixture Artifacts
Section titled “Per-Fixture Artifacts”Each entry in the fixtures array contains the complete verification evidence for one fixture:
| Field | Description |
|---|---|
fixture | Relative path to the fixture file |
status | passed or failed |
observation_digest | Hash of the observation sequence |
world_digest | Hash of the derived world state |
replay | Replay summary (observation, atom, fact, intent counts) |
golden | Golden fixture comparison (expected vs. actual) |
determinism | Determinism check result |
shadow | Shadow evaluator conformance result |
generated_scenarios | Property testing results |
invariant_failures | Detailed invariant violation reports |
provenance_graph | Full provenance graph export |
The provenance graph in each fixture artifact contains every derivation edge — from observations to atoms to facts to intents. This is the same data Studio surfaces in the drill inspector and timeline. Visual graph rendering ships in V1.1.
Verifying a Pinned Composition Report
Section titled “Verifying a Pinned Composition Report”For multi-agent apps, the composition gate produces an auditable artifact you can pin in source control. Re-run jacqos verify with --composition-report to confirm the pinned report still matches the current ontology and fixture corpus:
jacqos verify --composition-report \ generated/verification/composition-analysis-sha256-<evaluator_digest>.jsonThe flag reuses the existing report as the expected baseline. The verify run regenerates the composition analysis from the current sources and fails if the regenerated artifact diverges from the pinned report. Use it when you want one command to confirm both that the app passes verification and that no agent-owned namespace, cross-namespace dependency, or invariant-coverage value has shifted since the report was checked in.
The report is static with respect to store history. It records:
- Namespace reduct partitions — which relations belong to which agent-owned namespace
- Cross-namespace dependencies — every edge that crosses a namespace boundary, with monotonicity labels
- Namespace-cycle severity — whether any cross-namespace cycle violates the composition contract
- Invariant fixture coverage — which invariants are exercised by which fixtures
Pin the report whenever a multi-agent app reaches a known-good shape. Any future change that perturbs the composition surface will fail verification with a precise diff against the pinned baseline. Apps with zero or one agent-owned namespace skip this gate; for them, --composition-report is unnecessary.
Standalone generation is also available: jacqos composition check --report <path> writes the report without running the rest of jacqos verify, and jacqos composition verify-report <report> validates a pinned report on its own. See the debugging workflow for the full inspection loop.
CI Integration
Section titled “CI Integration”Verification bundles are designed for CI pipelines. The jacqos verify exit code tells your CI whether the build passes:
| Exit code | Meaning |
|---|---|
0 | All checks passed |
2 | Verification failures (fixture, invariant, or determinism) |
1 | Other error (missing fixtures, configuration issue) |
Basic CI Setup
Section titled “Basic CI Setup”# GitHub Actions examplename: Verifyon: [push, pull_request]
jobs: verify: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4
- name: Install JacqOS run: | curl -fsSL https://www.jacqos.io/install.sh | sh
- name: Verify run: jacqos verify
- name: Upload verification bundle if: always() uses: actions/upload-artifact@v4 with: name: verification-bundle path: generated/verification/What to Store as CI Artifacts
Section titled “What to Store as CI Artifacts”Store the verification bundle (generated/verification/*.json) as a build artifact. It contains everything needed to understand what was verified and whether it passed:
- The evaluator digest — which version of the rules was tested
- Per-fixture world digests — the exact derived state for each scenario
- Invariant check results — which invariants were exercised and how many values were tested
- Counterexamples — any generated scenarios that found violations
- Redaction audit — proof that no secrets leaked into artifacts
Verification as a PR Gate
Section titled “Verification as a PR Gate”Use the verification digest as a merge gate. Two PRs that produce the same evaluator digest and verification digest are semantically equivalent — they derive the same facts from the same observations.
- name: Verify and check digest run: | jacqos verify # The bundle includes the evaluator digest and per-fixture world digests # Your review process can compare these against the base branchComparing Across Branches
Section titled “Comparing Across Branches”To understand what a code change does to derived state:
- Run
jacqos verifyon the base branch — save the verification bundle - Run
jacqos verifyon the feature branch — save the verification bundle - Compare the evaluator digests — if they match, the semantic behavior is identical
- If they differ, compare per-fixture world digests to see which scenarios changed
This is the CI equivalent of the Activity Compare lens coming in V1.1 — same concept, machine-readable format.
Verifying Live Ingress
Section titled “Verifying Live Ingress”Live ingress should still end in a fixture proof. A serve run gives you
operational handles such as run_id, SSE event_id, and adapter receipts, but
those are not the semantic contract. The contract is the observation sequence
and the derived model.
For a live path, keep the replay loop explicit:
jacqos observe --jsonl fixtures/shared-reality.jsonl --lineage live-demo --create-lineage --jsonjacqos run --lineage live-demo --once --shadow --jsonjacqos replay fixtures/shared-reality.jsonljacqos verifyWhen you convert a chat session, webhook delivery, or multi-agent subscriber
scenario into a fixture, preserve the observations that crossed the mapper
boundary. Do not assert on local run_id values or SSE event ids. Assert on
facts, intents, invariant violations, contradictions, and effect receipts.
This is what makes live debugging and CI agree: Studio can inspect the live
serve surfaces, while jacqos verify proves the same behavior from a clean
observation history.
Exporting Verification Bundles
Section titled “Exporting Verification Bundles”You can export the verification bundle separately from running verification:
# Run verification (always writes to generated/verification/)jacqos verify
# Export as a standalone artifactjacqos export verification-bundleThe exported bundle is the same JSON artifact that jacqos verify writes. The export subcommand is useful when you want to ship the bundle to a different location or system.
Worked Example: CI for Appointment Booking
Section titled “Worked Example: CI for Appointment Booking”Here’s a complete CI workflow for the appointment-booking app:
name: Appointment Booking Verificationon: push: paths: - 'ontology/**' - 'mappings/**' - 'helpers/**' - 'fixtures/**' - 'jacqos.toml'
jobs: verify: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4
- name: Install JacqOS run: curl -fsSL https://www.jacqos.io/install.sh | sh
- name: Replay happy path run: jacqos replay fixtures/happy-path.jsonl
- name: Replay contradiction path run: jacqos replay fixtures/contradiction-path.jsonl
- name: Full verification run: jacqos verify
- name: Upload bundle if: always() uses: actions/upload-artifact@v4 with: name: verification-bundle path: generated/verification/The replay steps are optional — jacqos verify runs all fixtures automatically. Running them separately gives you per-fixture timing and output in the CI log.
Going Deeper
Section titled “Going Deeper”When a verification run fails, your next step is the debugging loop. When it passes, your next step is locking the result in.
- Debugging Workflow — the end-to-end loop for turning a failed
jacqos verifyinto a fix, including provenance drill-downs and pinned composition-report inspection - Debugging with Provenance — tracing facts and intents back to the exact observations that produced them
- Golden Fixtures — concept deep-dive on digest-backed behavior proof
- Invariant Review — universal constraints that hold across all evaluation states
- Fixtures and Invariants — practical guide to writing fixtures and declaring invariants
- Evaluation Package — the portable contract boundary that ships with each verified app
- CLI Reference — every flag on
jacqos replay,jacqos verify,jacqos composition, andjacqos export