Replay and Verification

Replay is how you re-derive the world from recorded observations. Verification is how you prove the derived world is correct. Together they give you a reproducible, auditable proof that your agent logic does what you intend — without reading the generated rules.

This guide covers the mechanics of both and shows how to integrate them into a CI pipeline. It builds on the Fixtures and Invariants guide, which covers writing fixtures and declaring invariants.

Replay feeds observations through the mapper and evaluator in strict order, producing the full derived state from scratch.

jacqos replay fixtures/happy-path.jsonl

The replay pipeline processes each observation sequentially:

Observation → Mapper → Atoms → Evaluator → Fixed point → Next observation

For each observation:

  1. The mapper extracts atoms — your Rhai mapper transforms the raw observation payload into semantic atoms
  2. The evaluator runs to a fixed point — all .dh rules fire until no new facts, retractions, or intents can be derived
  3. Invariants are checked — every invariant must hold after each fixed point
  4. The next observation is processed — and the cycle repeats

After all observations are processed, the resulting world state contains every derived fact, every fired intent, and the full provenance graph linking them back to their source observations.
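For intuition, the replay loop can be sketched in a few lines of Python. Everything here (the toy mapper, rules, and invariant) is a hypothetical stand-in, not a JacqOS API:

```python
# Minimal sketch of the replay loop, assuming facts are plain tuples.
# The mapper, rules, and invariants below are illustrative, not JacqOS APIs.

def evaluate_to_fixed_point(facts, rules):
    """Fire rules until no new facts can be derived."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for fact in rule(facts):
                if fact not in facts:
                    facts.add(fact)
                    changed = True

def replay(observations, mapper, rules, invariants):
    facts = set()
    for obs in observations:                   # strict order
        facts |= set(mapper(obs))              # 1. mapper extracts atoms
        evaluate_to_fixed_point(facts, rules)  # 2. rules fire to a fixed point
        for inv in invariants:                 # 3. invariants checked per fixed point
            assert inv(facts), f"invariant violated after {obs}"
    return facts                               # full derived state

# Toy domain: every request derives a booking.
world = replay(
    observations=[{"who": "alice"}, {"who": "bob"}],
    mapper=lambda obs: [("requested", obs["who"])],
    rules=[lambda fs: {("booked", who) for (rel, who) in fs if rel == "requested"}],
    invariants=[lambda fs: all(("requested", w) in fs
                               for (r, w) in fs if r == "booked")],
)
```

The real evaluator is stratified and tracks retractions and provenance; the sketch only captures the sequential observation-then-fixed-point shape.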

Live effects do not execute during ordinary fixture replay. When the evaluator derives an intent that would normally trigger an effect (an HTTP call, an LLM completion), replay uses recorded provider captures and observations instead. This is what makes replay deterministic — the same fixture always produces the same world state, regardless of external service availability.

If a fixture contains effect-producing intents but no recorded captures or outcome observations, replay reports the missing effect observations as warnings.

For large observation histories, replaying from the beginning can be slow. JacqOS supports checkpoint-based replay:

# Full replay from scratch
jacqos replay fixtures/happy-path.jsonl
# The evaluator can checkpoint intermediate state
# and resume from the last stable checkpoint

Checkpoints store the evaluator’s intermediate state (facts, provenance edges, stratum progress) at a specific observation boundary. Resuming from a checkpoint skips the observations that were already processed. The final world state is identical whether you replay from scratch or from a checkpoint — this is verified by the determinism check in jacqos verify.
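The equivalence guarantee can be illustrated with a toy fold, where `step` stands in for one observation's mapper-plus-evaluator pass (hypothetical, not a JacqOS API):

```python
# Sketch of checkpoint-equivalent replay: resuming from a snapshot taken at an
# observation boundary must yield the same state as replaying from scratch.
# `step` is a hypothetical stand-in for one mapper + fixed-point pass.

def step(state, obs):
    return state | {("seen", obs)}

def replay(observations, state=frozenset()):
    for obs in observations:
        state = step(state, obs)
    return state

observations = ["o1", "o2", "o3", "o4"]

full = replay(observations)

checkpoint = replay(observations[:2])           # snapshot after the second observation
resumed = replay(observations[2:], checkpoint)  # skip already-processed observations

assert full == resumed                          # identical final world state
```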

Replay determinism is a hard guarantee, not an aspiration. The same observations, evaluated by the same evaluator, always produce byte-identical derived state.

What this means concretely:

  • Same observations + same evaluator digest → byte-identical facts, intents, and provenance
  • Same observations + different evaluator digest → different derived state (by design — the rules changed)
  • Same evaluator digest + different observations → different derived state (by design — the evidence changed)

jacqos verify runs every fixture replay twice — once as part of the normal verification pipeline, and once from a clean database. The two runs must produce identical world digests:

Run 1: normal replay → world_digest_a
Run 2: clean-db replay → world_digest_b
assert world_digest_a == world_digest_b

If these diverge, something in the pipeline is non-deterministic — a mapper is using ambient state, a helper has side effects, or the evaluator has a bug. The determinism check catches all of these.

The world digest is a cryptographic hash covering:

  • Every derived fact (relation name, arguments, assertion/retraction status)
  • Every derived intent
  • The evaluator digest that produced them
  • The observation sequence that was replayed

Two world digests match if and only if the derived state is byte-identical.
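A canonical digest over those inputs can be sketched as follows. The field layout and sorting scheme are illustrative assumptions; the actual digest format is defined by JacqOS:

```python
# Sketch of a world digest: a canonical hash over the derived state.
# The field names and canonicalization here are assumptions for illustration.
import hashlib
import json

def world_digest(facts, intents, evaluator_digest, observation_digest):
    canonical = json.dumps(
        {
            "facts": sorted(facts),      # sort so the hash is order-independent
            "intents": sorted(intents),
            "evaluator_digest": evaluator_digest,
            "observations": observation_digest,
        },
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

a = world_digest([["booked", "alice", "assert"]], [["notify"]], "sha256:e1", "sha256:o1")
b = world_digest([["booked", "alice", "assert"]], [["notify"]], "sha256:e1", "sha256:o1")
assert a == b    # identical state, identical digest
```

Because the evaluator digest is part of the input, a rule change alone is enough to produce a different world digest even when the derived facts happen to coincide.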

jacqos verify runs ten check families across the selected fixture corpus:

  • Fixture replay: Every fixture replays without errors
  • Golden fixtures: Derived state matches expected facts and intents
  • Invariants: All invariants hold after every fixed point
  • Candidate-authority lints: Acceptance-gated evidence never skips the required candidate.* acceptance boundary
  • Provenance bundle: Provenance graph is exported for each fixture
  • Replay determinism: Clean-database replay produces identical world digest
  • Generated scenarios: Property-tested observation sequences find no violations
  • Shadow reference: Shadow evaluator agrees with the product evaluator
  • Secret redaction: No secret material appears in verification artifacts
  • Composition: Multi-agent namespace composition passes; skipped when the app has only one agent-owned namespace

A fixture passes only if every applicable check passes. The overall verification passes only if every fixture passes and every non-skipped global gate stays green.

The shadow reference evaluator is a second, independent evaluator that processes the same observations and must produce the same derived state. This catches implementation bugs in the primary evaluator — if the shadow disagrees, something is wrong.

The shadow evaluator comparison runs automatically during jacqos verify. You don’t need to configure it.

Beyond replaying your defined fixtures, jacqos verify generates random observation sequences and checks that all invariants hold. When a generated sequence violates an invariant, the verifier shrinks it to a minimal counterexample.
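Shrinking can be sketched as a greedy deletion loop, where `violates` stands in for "replay the sequence and check invariants" (hypothetical, not the JacqOS implementation):

```python
# Sketch of counterexample shrinking: when a generated observation sequence
# violates an invariant, drop observations while the violation persists.
# `violates` is a hypothetical stand-in for replay-and-check-invariants.

def shrink(sequence, violates):
    """Greedily remove elements while the sequence still fails."""
    i = 0
    while i < len(sequence):
        candidate = sequence[:i] + sequence[i + 1:]
        if violates(candidate):
            sequence = candidate   # still fails without this observation
        else:
            i += 1                 # this observation is needed to reproduce
    return sequence

# Toy invariant: a "cancel" must never arrive before its "book".
def violates(seq):
    return "cancel" in seq and (
        "book" not in seq or seq.index("cancel") < seq.index("book")
    )

minimal = shrink(["noise", "cancel", "noise", "book", "noise"], violates)
```

The minimal counterexample is what the verifier reports, so you debug the shortest observation sequence that reproduces the violation rather than the full random run.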

Every jacqos verify run produces a verification bundle — a JSON artifact containing the complete proof of the verification run. Bundles are written to generated/verification/.

jacqos verify
# => Wrote verification bundle to generated/verification/<app-id>.json

The bundle filename is <app_id>.json, where app_id is the value declared in jacqos.toml. For an app with app_id = "jacqos-appointment-booking", the bundle lands at generated/verification/jacqos-appointment-booking.json.

{
  "version": "jacqos_verify_v1",
  "app_id": "my-booking-app",
  "evaluator_digest": "sha256:a1b2c3...",
  "prompt_bundle_digest": "sha256:d4e5f6...",
  "llm_complete_active": false,
  "status": "passed",
  "composition_analysis_path": "generated/verification/composition-analysis-sha256-<digest>.json",
  "composition_analysis": { ... },
  "summary": { ... },
  "checks": [ ... ],
  "redaction_findings": [],
  "fixtures": [ ... ]
}
  • version: Bundle format version (jacqos_verify_v1)
  • app_id: Application identifier from jacqos.toml
  • evaluator_digest: Hash of the ontology IR, mapper semantics, and helper digests
  • prompt_bundle_digest: Hash of prompt files (present only if prompts exist)
  • llm_complete_active: Whether the llm.complete capability is declared
  • status: passed, failed, or skipped
  • composition_analysis_path: Relative path to the companion composition-analysis report when the composition check passed
  • composition_analysis: Embedded composition-analysis artifact when the composition check passed
  • summary: Aggregate counts and fixture-level summaries
  • checks: The verification checks with passed/failed/skipped status and detail text
  • redaction_findings: Any secret material detected in artifacts
  • fixtures: Per-fixture verification artifacts

The persisted bundle intentionally omits wall-clock timestamps and durations so re-running verification does not create noisy diffs in checked-in proof artifacts. Use jacqos export benchmark-report when you need runtime timings.
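A CI gate over the bundle can be as small as parsing the JSON and failing on any non-passed status. The field names follow the bundle layout above; the gating logic itself is an illustrative sketch:

```python
# Sketch of a CI gate over the verification bundle. Field names follow the
# documented bundle layout; the gating policy is illustrative.
import json

def gate(bundle_json):
    bundle = json.loads(bundle_json)
    failed = [f["fixture"] for f in bundle["fixtures"] if f["status"] != "passed"]
    if bundle["status"] != "passed" or failed:
        raise SystemExit(f"verification failed: {failed or bundle['status']}")
    return bundle["evaluator_digest"]

digest = gate(json.dumps({
    "version": "jacqos_verify_v1",
    "status": "passed",
    "evaluator_digest": "sha256:a1b2c3",
    "fixtures": [{"fixture": "fixtures/happy-path.jsonl", "status": "passed"}],
}))
```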

When the composition gate passes, jacqos verify also writes generated/verification/composition-analysis-sha256-<evaluator_digest>.json and embeds the same portable report into the bundle. That report is static with respect to store history: it depends on the ontology and fixture corpus, not on the current SQLite state.

Each entry in the fixtures array contains the complete verification evidence for one fixture:

  • fixture: Relative path to the fixture file
  • status: passed or failed
  • observation_digest: Hash of the observation sequence
  • world_digest: Hash of the derived world state
  • replay: Replay summary (observation, atom, fact, intent counts)
  • golden: Golden fixture comparison (expected vs. actual)
  • determinism: Determinism check result
  • shadow: Shadow evaluator conformance result
  • generated_scenarios: Property testing results
  • invariant_failures: Detailed invariant violation reports
  • provenance_graph: Full provenance graph export

The provenance graph in each fixture artifact contains every derivation edge — from observations to atoms to facts to intents. This is the same data Studio surfaces in the drill inspector and timeline. Visual graph rendering ships in V1.1.

For multi-agent apps, the composition gate produces an auditable artifact you can pin in source control. Re-run jacqos verify with --composition-report to confirm the pinned report still matches the current ontology and fixture corpus:

jacqos verify --composition-report \
generated/verification/composition-analysis-sha256-<evaluator_digest>.json

The flag reuses the existing report as the expected baseline. The verify run regenerates the composition analysis from the current sources and fails if the regenerated artifact diverges from the pinned report. Use it when you want one command to confirm both that the app passes verification and that no agent-owned namespace, cross-namespace dependency, or invariant-coverage value has shifted since the report was checked in.
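Conceptually, the check is a field-by-field comparison between the pinned baseline and the regenerated report. This sketch mirrors that behavior; it is not the JacqOS implementation:

```python
# Sketch of the pinned-report comparison behind --composition-report.
# The report fields shown here are illustrative, not the exact schema.

def diff_reports(pinned, regenerated):
    """Return the keys whose values diverge between the two reports."""
    keys = set(pinned) | set(regenerated)
    return sorted(k for k in keys if pinned.get(k) != regenerated.get(k))

pinned = {"namespaces": ["billing", "booking"], "cycles": []}
regenerated = {"namespaces": ["audit", "billing", "booking"], "cycles": []}

divergent = diff_reports(pinned, regenerated)  # a new namespace shifts the surface
```

An empty divergence list means the composition surface is unchanged; anything else fails the verify run with the offending fields.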

The report is static with respect to store history. It records:

  • Namespace reduct partitions — which relations belong to which agent-owned namespace
  • Cross-namespace dependencies — every edge that crosses a namespace boundary, with monotonicity labels
  • Namespace-cycle severity — whether any cross-namespace cycle violates the composition contract
  • Invariant fixture coverage — which invariants are exercised by which fixtures

Pin the report whenever a multi-agent app reaches a known-good shape. Any future change that perturbs the composition surface will fail verification with a precise diff against the pinned baseline. Apps with zero or one agent-owned namespace skip this gate; for them, --composition-report is unnecessary.

Standalone generation is also available: jacqos composition check --report <path> writes the report without running the rest of jacqos verify, and jacqos composition verify-report <report> validates a pinned report on its own. See the debugging workflow for the full inspection loop.

Verification bundles are designed for CI pipelines. The jacqos verify exit code tells your CI whether the build passes:

  • 0: All checks passed
  • 1: Other error (missing fixtures, configuration issue)
  • 2: Verification failures (fixture, invariant, or determinism)
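A wrapper script can translate those codes into readable CI log output. The mapping follows the table above; invoking jacqos via subprocess is an assumption about your environment, not a JacqOS API:

```python
# Sketch of a CI wrapper that interprets the documented exit codes.
# Invoking the jacqos binary via subprocess is an environmental assumption.
import subprocess

EXIT_MEANING = {
    0: "all checks passed",
    1: "other error (missing fixtures, configuration issue)",
    2: "verification failures (fixture, invariant, or determinism)",
}

def run_verify():
    """Run jacqos verify and print a human-readable result for the CI log."""
    result = subprocess.run(["jacqos", "verify"])
    meaning = EXIT_MEANING.get(result.returncode, "unknown exit code")
    print(f"jacqos verify: {meaning}")
    return result.returncode
```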
# GitHub Actions example
name: Verify
on: [push, pull_request]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install JacqOS
        run: |
          curl -fsSL https://www.jacqos.io/install.sh | sh
      - name: Verify
        run: jacqos verify
      - name: Upload verification bundle
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: verification-bundle
          path: generated/verification/

Store the verification bundle (generated/verification/*.json) as a build artifact. It contains everything needed to understand what was verified and whether it passed:

  • The evaluator digest — which version of the rules was tested
  • Per-fixture world digests — the exact derived state for each scenario
  • Invariant check results — which invariants were exercised and how many values were tested
  • Counterexamples — any generated scenarios that found violations
  • Redaction audit — proof that no secrets leaked into artifacts

Use the digests as a merge gate. Two PRs that produce the same evaluator digest and the same per-fixture world digests are semantically equivalent — they derive the same facts from the same observations.

- name: Verify and check digest
  run: |
    jacqos verify
    # The bundle includes the evaluator digest and per-fixture world digests
    # Your review process can compare these against the base branch

To understand what a code change does to derived state:

  1. Run jacqos verify on the base branch — save the verification bundle
  2. Run jacqos verify on the feature branch — save the verification bundle
  3. Compare the evaluator digests — if they match, the semantic behavior is identical
  4. If they differ, compare per-fixture world digests to see which scenarios changed
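Step 4 amounts to diffing the per-fixture world digests between the two bundles. This sketch follows the documented bundle layout; the comparison logic itself is illustrative:

```python
# Sketch of diffing per-fixture world digests between two verification bundles.
# Field names follow the documented bundle layout; the data here is made up.

def changed_fixtures(base_bundle, feature_bundle):
    base = {f["fixture"]: f["world_digest"] for f in base_bundle["fixtures"]}
    feat = {f["fixture"]: f["world_digest"] for f in feature_bundle["fixtures"]}
    return sorted(p for p in base.keys() & feat.keys() if base[p] != feat[p])

base = {"evaluator_digest": "sha256:aaa", "fixtures": [
    {"fixture": "fixtures/happy-path.jsonl", "world_digest": "sha256:111"},
    {"fixture": "fixtures/contradiction-path.jsonl", "world_digest": "sha256:222"},
]}
feature = {"evaluator_digest": "sha256:bbb", "fixtures": [
    {"fixture": "fixtures/happy-path.jsonl", "world_digest": "sha256:111"},
    {"fixture": "fixtures/contradiction-path.jsonl", "world_digest": "sha256:999"},
]}

changed = changed_fixtures(base, feature)  # only the contradiction path changed
```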

This is the CI equivalent of the Activity Compare lens coming in V1.1 — same concept, machine-readable format.

Live ingress should still end in a fixture proof. A serve run gives you operational handles such as run_id, SSE event_id, and adapter receipts, but those are not the semantic contract. The contract is the observation sequence and the derived model.

For a live path, keep the replay loop explicit:

jacqos observe --jsonl fixtures/shared-reality.jsonl --lineage live-demo --create-lineage --json
jacqos run --lineage live-demo --once --shadow --json
jacqos replay fixtures/shared-reality.jsonl
jacqos verify

When you convert a chat session, webhook delivery, or multi-agent subscriber scenario into a fixture, preserve the observations that crossed the mapper boundary. Do not assert on local run_id values or SSE event ids. Assert on facts, intents, invariant violations, contradictions, and effect receipts.

This is what makes live debugging and CI agree: Studio can inspect the live serve surfaces, while jacqos verify proves the same behavior from a clean observation history.

You can export the verification bundle separately from running verification:

# Run verification (always writes to generated/verification/)
jacqos verify
# Export as a standalone artifact
jacqos export verification-bundle

The exported bundle is the same JSON artifact that jacqos verify writes. The export subcommand is useful when you want to ship the bundle to a different location or system.

Worked Example: CI for Appointment Booking


Here’s a complete CI workflow for the appointment-booking app:

name: Appointment Booking Verification
on:
  push:
    paths:
      - 'ontology/**'
      - 'mappings/**'
      - 'helpers/**'
      - 'fixtures/**'
      - 'jacqos.toml'
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install JacqOS
        run: curl -fsSL https://www.jacqos.io/install.sh | sh
      - name: Replay happy path
        run: jacqos replay fixtures/happy-path.jsonl
      - name: Replay contradiction path
        run: jacqos replay fixtures/contradiction-path.jsonl
      - name: Full verification
        run: jacqos verify
      - name: Upload bundle
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: verification-bundle
          path: generated/verification/

The replay steps are optional — jacqos verify runs all fixtures automatically. Running them separately gives you per-fixture timing and output in the CI log.

When a verification run fails, your next step is the debugging loop. When it passes, your next step is locking the result in.

  • Debugging Workflow — the end-to-end loop for turning a failed jacqos verify into a fix, including provenance drill-downs and pinned composition-report inspection
  • Debugging with Provenance — tracing facts and intents back to the exact observations that produced them
  • Golden Fixtures — concept deep-dive on digest-backed behavior proof
  • Invariant Review — universal constraints that hold across all evaluation states
  • Fixtures and Invariants — practical guide to writing fixtures and declaring invariants
  • Evaluation Package — the portable contract boundary that ships with each verified app
  • CLI Reference — every flag on jacqos replay, jacqos verify, jacqos composition, and jacqos export