Verification Bundle

Overview

Stability: contract-backed reference

Authority: The normative bundle contract lives in spec/jacqos/v1/verification-bundle.schema.json. This page summarizes the exported artifact and highlights the fields you will inspect most often.

A verification bundle is the JSON proof artifact produced by jacqos verify. It captures the evidence you use to inspect a verification run: check results, per-fixture world digests, provenance graphs, counterexamples, and redaction audit results.

Bundles are written to generated/verification/<app_id>.json.

Representative Shape

The schema authority owns validation, required fields, and allowed shapes. The excerpt below shows the current top-level structure:

{
  "version": "jacqos_verify_v1",
  "app_id": "string",
  "evaluator_digest": "string",
  "prompt_bundle_digest": "string | null",
  "llm_complete_active": false,
  "generated_at": "string?",
  "status": "passed | failed | skipped",
  "composition_analysis_path": "string?",
  "composition_analysis": { "CompositionAnalysisArtifact" }?,
  "summary": { "VerifySummary" },
  "checks": [ "VerificationCheck[]" ],
  "redaction_findings": [ "RedactionFinding[]" ],
  "fixtures": [ "FixtureVerificationArtifact[]" ]
}

Top-Level Fields

Field	Type	Description
`version`	string	Bundle format version. Currently `jacqos_verify_v1`
`app_id`	string	Application identifier from `jacqos.toml`
`evaluator_digest`	string	`hash(ontology IR, mapper semantics, helper digests)` — the semantic identity
`prompt_bundle_digest`	string?	Hash of prompt files. Present only if the app uses prompts
`llm_complete_active`	bool	Whether the `llm.complete` capability is declared in `jacqos.toml`
`generated_at`	string?	Optional timestamp. Checked-in bundles normally omit it after normalization
`status`	string	Overall bundle result: `passed`, `failed`, or `skipped`
`composition_analysis_path`	string?	Relative path to the companion composition-analysis report when the composition check passed
`composition_analysis`	object?	Embedded composition-analysis artifact when the composition check passed
`summary`	object	Aggregate fixture counts, world digests, and rule-shape summary
`checks`	array	The verification gate results with machine-readable names and status
`redaction_findings`	array	Potential secret exposures found in verification artifacts
`fixtures`	array	Per-fixture verification evidence including replay, golden, determinism, and provenance details

Checked-in verification bundles deliberately leave out volatile data so they stay diff-stable across repeated runs: runtime durations are not recorded, and the optional generated_at timestamp is normally dropped during normalization. Use the benchmark report when you need timing data.
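As a minimal sketch of reading these top-level fields, the snippet below loads a bundle file and rejects an unexpected format version. The helper names load_bundle and describe are hypothetical, and the code assumes only the field names listed in the table above; the schema remains the authority on what is required.

```python
import json
from pathlib import Path

# Format version documented on this page.
EXPECTED_VERSION = "jacqos_verify_v1"


def load_bundle(path):
    """Read a verification bundle and check its format version.

    Hypothetical helper: it is not part of the jacqos CLI.
    """
    bundle = json.loads(Path(path).read_text(encoding="utf-8"))
    version = bundle.get("version")
    if version != EXPECTED_VERSION:
        raise ValueError(f"unexpected bundle version: {version!r}")
    return bundle


def describe(bundle):
    """Build a one-line summary from documented top-level fields."""
    return (
        f"{bundle['app_id']}: {bundle['status']} "
        f"(evaluator {bundle['evaluator_digest']}, "
        f"{len(bundle['fixtures'])} fixture(s))"
    )
```

A caller would pass the path of a file under generated/verification/ to load_bundle and print the result of describe.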

Summary

The summary object contains aggregate counts across all fixtures:

Field	Type	Description
`passed`	bool	Overall pass/fail
`fixture_count`	number	Total fixtures verified
`failed_fixture_count`	number	Fixtures that failed one or more checks
`evaluator_digest`	string	Same as top-level
`prompt_bundle_digest`	string?	Same as top-level
`llm_complete_active`	bool	Same as top-level
`observation_count`	number	Total observations replayed across all fixtures
`atom_count`	number	Total atoms extracted
`fact_count`	number	Total facts derived
`intent_count`	number	Total intents derived
`contradiction_count`	number	Total contradictions detected
`invariant_violation_count`	number	Total invariant violations
`shadow_conforming_fixture_count`	number	Fixture replays that matched the shadow reference evaluator
`rule_shape_summary`	object	Rule counts by shape: guarded, frontier-guarded, acyclic, star-join-friendly, and unconstrained
`bundle_path`	string	Relative path to the bundle file
`fixtures`	array	Per-fixture summary (fixture path, pass/fail, counts, digests)
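Three summary fields are documented as "Same as top-level", so a quick consistency check is possible. The sketch below is one way to do it; summary_mismatches is a hypothetical helper name, and the check is an illustration rather than something jacqos verify is known to perform.

```python
# Summary fields that this page documents as mirroring the top-level bundle.
MIRRORED_FIELDS = (
    "evaluator_digest",
    "prompt_bundle_digest",
    "llm_complete_active",
)


def summary_mismatches(bundle):
    """Return the mirrored field names whose summary value differs from the top level."""
    summary = bundle["summary"]
    return [
        field
        for field in MIRRORED_FIELDS
        if summary.get(field) != bundle.get(field)
    ]
```

An empty list means the summary agrees with the top-level values for all three fields.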

Checks

The checks array contains one entry per verification check. Each check has:

Field	Type	Description
`name`	string	Check identifier
`status`	string	`passed`, `failed`, or `skipped`
`detail`	string	Human-readable summary of the check result

Check Names

Name	What it verifies
`fixture_replay`	All fixtures replayed without errors
`golden_fixtures`	Derived state matches fixture expectations
`invariants`	All invariants held after every fixed point
`candidate_authority_lints`	Acceptance-gated candidate evidence stayed behind an explicit acceptance relation
`provenance_bundle`	Provenance graph was exported for each fixture
`replay_determinism`	Clean-database replay produced identical world digests
`generated_scenarios`	Property-tested scenarios found no invariant violations
`shadow_reference_evaluator`	Shadow evaluator produced identical derived state
`secret_redaction`	No secret material found in verification artifacts
`composition`	Multi-agent namespace composition gate; the entry records whether it passed, failed, or was skipped
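Because every check carries a machine-readable name and status, a CI script can group them without parsing the detail text. A minimal sketch, using only the name and status fields described above (checks_by_status is a hypothetical helper name):

```python
def checks_by_status(bundle):
    """Group check names by their status value."""
    grouped = {"passed": [], "failed": [], "skipped": []}
    for check in bundle["checks"]:
        # setdefault keeps the sketch tolerant of a status not listed on this page.
        grouped.setdefault(check["status"], []).append(check["name"])
    return grouped
```

Reading grouped["failed"] then gives the list of gates to investigate, and each entry's detail field holds the human-readable explanation.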

Composition Analysis

When a multi-agent app passes the composition gate, the bundle embeds the same portable artifact that jacqos export composition-analysis writes under generated/verification/.

That artifact captures:

namespace reduct partitions and namespace inventories
cross-namespace dependencies and namespace cycles
monotonic versus non-monotonic strata summaries
invariant-fixture coverage over the checked-in fixture corpus
policy lifecycle coverage when the app declares policy.* relations: missing lifecycle surfaces, policy-to-rule edges, and policy-dependent intent.* relations
human-review coverage when the app reads review.* evidence or declares review-like relations: missing review surfaces, review-to-rule edges, and review-dependent intent.* relations
evidence-obligation coverage for intent rules: explicit unknown, missing, fresh, stale, required-evidence surfaces, and warnings for default-allow negation

A visual composition health panel and graph-side namespace views ship with the V1.1 Studio rule-graph surface; in V1, the artifact is the canonical inspection surface, and the CLI report and the file you check into generated/verification/ are the same contract.

policy_lifecycle_coverage is a reporting surface, not a hidden policy engine. It lets you inspect whether policy-backed decisions have source document, effective-window, jurisdiction, approver, exception, supersession, and rule-coverage evidence in the ontology. The actual allow/block decision still comes from ordinary facts, invariants, and proposal ratification rules.

human_review_coverage is the same kind of diagnostic surface for human approval. It inventories review evidence predicates, review-like relations, rules that read review evidence, and intents downstream of accepted reviews. It does not make reviews an override; your ontology still has to derive the accepted review, quorum, expiry, delegation, break-glass, and revocation facts.

evidence_obligation_coverage catches the closed-world footgun: an intent rule that says “act if nothing blocks me” without also requiring positive evidence. The report inventories explicit required, unknown, missing, fresh, and stale surfaces, then names negated intent rules that do or do not have positive authorization, review, policy, freshness, or evidence gates.

If the composition check fails or is skipped, the bundle still records the checks entry, but it does not embed the report. Export or recompute the standalone composition-analysis artifact when you need the full failure details, or when you want to inspect a single-agent app’s namespace structure independently of jacqos verify.
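The embed-or-not behavior described above can be handled with a small lookup. This is a sketch under the assumption that an absent or null composition_analysis field means "not embedded"; composition_view is a hypothetical helper name.

```python
def composition_view(bundle):
    """Return (composition check status, embedded report or None)."""
    status = next(
        (c["status"] for c in bundle["checks"] if c["name"] == "composition"),
        None,  # no composition entry found in checks
    )
    report = bundle.get("composition_analysis")
    return status, report
```

When the report comes back as None, the standalone artifact from jacqos export composition-analysis is the place to look for details.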

Fixture Artifacts

Each entry in the fixtures array contains the complete verification evidence for one fixture:

Core Fields

Field	Type	Description
`fixture`	string	Relative path to the fixture file
`status`	string	`passed` or `failed`
`observation_digest`	string	Hash of the observation sequence
`world_digest`	string	Hash of the derived world state

Replay Summary

The replay object contains counts from the replay:

Field	Type	Description
`observation_count`	number	Observations in this fixture
`atom_count`	number	Atoms extracted
`fact_count`	number	Facts derived
`intent_count`	number	Intents derived
`contradiction_count`	number	Contradictions detected
`invariant_violation_count`	number	Invariant violations during replay

Golden Fixture Report

The golden object compares derived state against expectations:

Field	Type	Description
`status`	string	`passed` or `failed`
`expectation_path`	string?	Path to the expectation file (null if no expectations defined)
`diff`	object	Missing and unexpected facts, intents, contradictions, and invariant violations

When an expected fact or intent is missing, diff also includes why-not entries:

Field	Type	Description
`why_not_missing_facts`	array	Candidate rules for each missing fact, with the body clause that blocked derivation
`why_not_missing_intents`	array	Candidate rules for each missing intent, with the missing relation, blocked negation, unsupported helper, or comparison that prevented it

Each why-not rule report names the rule id, source span, status, satisfied clause count, and per-clause evidence. This is the bundle-level surface for “why didn’t this happen?” debugging: unratified proposals, missing reviews, stale-evidence facts, and blocking negations show up as explicit body-clause outcomes rather than as an unexplained absent tuple.
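To find which fixtures need why-not debugging, a script can walk the golden reports. The sketch below only counts why-not entries, because the exact shape of each rule report is defined by the schema rather than this page; golden_failures is a hypothetical helper name, and the .get defaults are defensive assumptions.

```python
def golden_failures(bundle):
    """List fixtures whose golden comparison failed, with why-not entry counts."""
    rows = []
    for fixture in bundle["fixtures"]:
        golden = fixture["golden"]
        if golden["status"] != "failed":
            continue
        diff = golden.get("diff", {})
        rows.append(
            {
                "fixture": fixture["fixture"],
                "expectation_path": golden.get("expectation_path"),
                "why_not_fact_reports": len(diff.get("why_not_missing_facts", [])),
                "why_not_intent_reports": len(diff.get("why_not_missing_intents", [])),
            }
        )
    return rows
```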

Determinism Report

The determinism object verifies replay reproducibility:

Field	Type	Description
`status`	string	`passed` or `failed`

A failed determinism check means the evaluator produced different world digests across two identical replays — indicating non-deterministic behavior in the mapper, helper, or evaluator.

Shadow Conformance Report

The shadow object compares the product evaluator against the shadow reference:

Field	Type	Description
`status`	string	`passed` or `failed`
`world_digest`	string	Product evaluator’s world digest
`shadow_world_digest`	string	Shadow evaluator’s world digest
`product_revision_digest`	string	Product evaluator revision
`shadow_revision_digest`	string	Shadow evaluator revision
`detail`	string	Description of any differences
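A shadow divergence shows up as two different digests in the same report. As a minimal sketch, assuming each fixture entry carries the shadow object with the fields above (shadow_divergences is a hypothetical helper name):

```python
def shadow_divergences(bundle):
    """Return (fixture path, detail) for fixtures where the two evaluators disagree."""
    diverging = []
    for fixture in bundle["fixtures"]:
        shadow = fixture["shadow"]
        if shadow["world_digest"] != shadow["shadow_world_digest"]:
            diverging.append((fixture["fixture"], shadow["detail"]))
    return diverging
```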

Generated Scenario Report

The generated_scenarios object summarizes property testing:

Field	Type	Description
`status`	string	`passed` or `failed`
`scenario_count`	number	Total generated observation sequences tested
`failures`	array	Invariant violations found by generated scenarios

Invariant Failures

The invariant_failures array contains a detailed report for each invariant violation:

Field	Type	Description
`invariant_name`	string	Name of the violated invariant
`observation_index`	number	Which observation triggered the violation
`counterexample`	object	The concrete values that violated the constraint
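These three fields are enough to print a compact failure line per violation. A sketch, assuming invariant_failures sits on each fixture entry as described (format_invariant_failures is a hypothetical helper name):

```python
import json


def format_invariant_failures(fixture):
    """Render one readable line per invariant violation in a fixture entry."""
    lines = []
    for failure in fixture.get("invariant_failures", []):
        # sort_keys keeps the rendered counterexample stable across runs.
        counterexample = json.dumps(failure["counterexample"], sort_keys=True)
        lines.append(
            f"{fixture['fixture']}: {failure['invariant_name']} violated at "
            f"observation {failure['observation_index']}: {counterexample}"
        )
    return lines
```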

Provenance Graph

The provenance_graph object contains the full derivation graph for the fixture — every edge from observations to atoms to facts to intents. This is the same data Studio surfaces in the drill inspector and timeline. Visual graph rendering ships in V1.1.

Redaction Findings

The redaction_findings array lists any secret material detected in verification artifacts. Each finding identifies the location and type of the potential exposure.

An empty array means the redaction audit passed — no obvious secret material was found in exported fixtures, counterexamples, or the bundle itself.

The redaction audit is an export-boundary check, not a legal-retention system. For the broader privacy model behind encrypted BlobRefs, legal hold, tombstoned access, and digest-preserving redacted projections, see the V1 privacy and retention contract in spec/jacqos/v1/privacy-retention.md.

Using Bundles in CI

Exit Codes

Code	Meaning
`0`	All checks passed
`1`	Other error
`2`	Verification failures
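A CI wrapper can translate these codes into messages. The sketch below assumes jacqos verify can be invoked without extra arguments, which this page does not confirm; the command is a parameter for that reason, and both function names are hypothetical.

```python
import subprocess

# Meanings taken from the exit-code table on this page.
EXIT_MEANINGS = {
    0: "all checks passed",
    1: "other error",
    2: "verification failures",
}


def describe_exit_code(code):
    """Map an exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")


def run_verify(command=("jacqos", "verify")):
    """Run the verifier and return (exit code, meaning).

    The default command line is an assumption; pass the exact
    invocation your project uses.
    """
    completed = subprocess.run(list(command), check=False)
    return completed.returncode, describe_exit_code(completed.returncode)
```

Distinguishing code 2 from code 1 lets a pipeline report "the app failed verification" separately from "the tool could not run".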

Comparing Bundles

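When two bundles share the same evaluator_digest and every per-fixture world_digest matches, the two evaluator versions derived identical state for the covered fixtures, which is the bundle-level evidence that they are semantically equivalent. Where world digests differ, the differing entries pinpoint which fixtures diverge.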
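The comparison can be sketched as a function over two loaded bundles. It relies only on the evaluator_digest, fixture, and world_digest fields documented on this page; compare_bundles and the keys of its result are hypothetical names.

```python
def compare_bundles(left, right):
    """Compare two bundles by evaluator digest and per-fixture world digest."""
    left_worlds = {f["fixture"]: f["world_digest"] for f in left["fixtures"]}
    right_worlds = {f["fixture"]: f["world_digest"] for f in right["fixtures"]}
    shared = left_worlds.keys() & right_worlds.keys()
    return {
        "same_evaluator": left["evaluator_digest"] == right["evaluator_digest"],
        # Fixtures present in both bundles whose derived world state differs.
        "diverging": sorted(p for p in shared if left_worlds[p] != right_worlds[p]),
        # Fixtures present on one side only cannot be compared.
        "only_left": sorted(left_worlds.keys() - right_worlds.keys()),
        "only_right": sorted(right_worlds.keys() - left_worlds.keys()),
    }
```

A result with same_evaluator true and the three lists empty corresponds to the matching case described above; any other result names the fixtures to inspect.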

Storing Bundles

Store bundles as CI artifacts or commit them to generated/verification/. They serve as the audit trail: “this evaluator, with these fixtures, produced these facts, and all invariants held.”

Next Steps

Replay and Verification Guide — how to use replay and verification in practice
Evaluation Package — the portable contract boundary
CLI Reference — jacqos verify and jacqos export verification-bundle commands
Golden Fixtures — concept deep-dive on digest-backed behavior proof