What You Just Saw
In Plain Language
You just watched a live AI agent get corrupted, and watched JacqOS contain it. Here is what actually happened.
- The agent talked to a real model. When you sent the trigger phrase, the model broke and tried to give away a $68,900 truck for $1.
- That $1 offer did not reach the customer. It arrived as a proposal — a piece of evidence — and passed through a gate you can see and inspect: an ontology rule that checks every proposed offer against the dealership’s pricing floor.
- The fair offer in the first message passed. The $1 offer was refused. Every step has a receipt you can trace back to the message that caused it.
Two ideas did all the work.
1. The model proposes; it does not act
In a classic agent loop, the model picks an action and the runtime runs it, so the
model’s decision is the behaviour. JacqOS breaks that link. The model’s output
became proposal.offer_suggested: a non-authoritative fact, evidence that the
model suggested something. Nothing the model emits is self-executing. An offer
only reaches the customer if a separate rule promotes that proposal into an
action.
That single move is what makes corruption survivable. A prompt injection can change what the model proposes. It cannot change what the platform is willing to act on.
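The separation described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the JacqOS API: `Proposal`, `record_model_output`, `promote`, and `sent_offers` are hypothetical names, and the vehicle ID and message text are made up. The only name taken from the demo is the fact kind `proposal.offer_suggested`.

```python
from dataclasses import dataclass


# Hypothetical types for illustration; not the JacqOS API.
@dataclass(frozen=True)
class Proposal:
    kind: str            # the fact kind, e.g. "proposal.offer_suggested"
    vehicle_id: str      # made-up identifier
    price_usd: int
    source_message: str  # the message that led the model to suggest this


# The only path to the customer. The model never writes here directly.
sent_offers = []


def record_model_output(vehicle_id, price_usd, source_message):
    """Store what the model suggested as evidence. Nothing is sent."""
    return Proposal("proposal.offer_suggested", vehicle_id, price_usd, source_message)


def promote(proposal, approved):
    """Turn a proposal into an action only when a separate check approved it."""
    if approved:
        sent_offers.append(proposal)
    return approved


# A corrupted model can change what is proposed...
injected = record_model_output("truck-1", 1, "message containing the trigger phrase")

# ...but an unapproved proposal never becomes an action.
promote(injected, approved=False)
print(len(sent_offers))
```

However the model is manipulated, all it can alter in this sketch is the contents of a `Proposal`; it holds no reference to `sent_offers`.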
2. A rule decides — deterministically
Between the proposal and the customer sits an ontology rule. It authorized the
fair offer (decision.approved.sales.send_offer) and refused the $1 offer
(decision.rejected.sales.send_offer, reason below_minimum_price). The policy
lives in a rule and a pricing-floor fact you can read and review — not in the
model’s mood, and not in a prompt the attack already slipped past.
Earlier in the demo the model refused a plain “$1” request on its own. That felt safe — until the trigger phrase defeated it. The rule was not defeated, because it never depended on the model behaving in the first place.
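A minimal Python sketch of such a gate follows. It is not written in the JacqOS rule language, and it makes several assumptions: the `Decision` type, the `decide_send_offer` function, the `PRICING_FLOOR_USD` table, the vehicle ID, the floor of $60,000, and the fair offer of $66,000 are all invented for illustration. Only the decision names and the `below_minimum_price` reason come from the demo.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical types and values for illustration; not the JacqOS API.
@dataclass(frozen=True)
class Decision:
    kind: str
    reason: Optional[str] = None


# Assumed pricing-floor fact. The demo's actual floor is not stated here.
PRICING_FLOOR_USD = {"truck-1": 60_000}


def decide_send_offer(vehicle_id, price_usd):
    """Pure function of the proposed price and the floor fact.

    It never consults the model, so the same inputs always give the
    same decision, whatever the model was told.
    """
    floor = PRICING_FLOOR_USD[vehicle_id]
    if price_usd < floor:
        return Decision("decision.rejected.sales.send_offer", "below_minimum_price")
    return Decision("decision.approved.sales.send_offer")


fair = decide_send_offer("truck-1", 66_000)  # assumed fair offer
absurd = decide_send_offer("truck-1", 1)     # the injected $1 offer

print(fair.kind)
print(absurd.kind, absurd.reason)
```

The point of the sketch is that the check reads only the proposal and a reviewable fact. Changing the outcome means changing the rule or the floor, not the prompt.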
The One-Paragraph Version
Your AI agents can hallucinate, change their minds, get prompt-injected, and propose absurd things, and unsafe suggestions are still structurally incapable of reaching the world. The safety is not a policy layer bolted on, or a prompt you hope holds. It is a property of the system: proposals are evidence, and only a rule you can review turns evidence into action.
The physics-engine analogy captures it in one sentence: agents propose moves, and the world refuses to enter states that would violate the physics. What you watched in Studio is that refusal happening in real time, against a real model, with a complete debug trail.
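One way to picture that debug trail is as a chain of records, each pointing at the record that caused it. The Python sketch below is an assumption about the shape of such a trail, not the format Studio uses: the `records` table, the `caused_by` field, the record IDs, the `message.received` kind, and the `trace` function are all hypothetical.

```python
# Hypothetical records for illustration; not the Studio trace format.
# Each record names the record that caused it.
records = {
    "msg-2": {"kind": "message.received", "caused_by": None},
    "prop-2": {"kind": "proposal.offer_suggested", "caused_by": "msg-2"},
    "dec-2": {"kind": "decision.rejected.sales.send_offer", "caused_by": "prop-2"},
}


def trace(record_id):
    """Walk from a record back to the message that started the chain."""
    chain = []
    current = record_id
    while current is not None:
        chain.append(records[current]["kind"])
        current = records[current]["caused_by"]
    return chain


for kind in trace("dec-2"):
    print(kind)
```

Starting from the refusal, the walk passes through the proposal and ends at the message that triggered it.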
Where To Go Next
You can go in any of three directions from here. None is required, and you can come back for another later.
Go deeper on a pattern
The dealership demo is one of two containment patterns JacqOS is built for. Read the pattern page that matches your use case: the real-world failure, the guarantee, and the code.
- LLM Decision Containment — the dealership pattern. For any AI that proposes a commercial or operational action: refunds, offers, remediations, purchases.
- Fallible Sensor Containment — the symmetric pattern for inputs you can’t fully trust: voice parsers, vision models, OCR, extraction.
Build your own
Put this under your own domain right now. The Build track scaffolds a verified app in one command.
Understand why this works
If you want to know why the containment is sound, and why it doesn’t depend on trusting the AI, that lives under Foundations. Entirely optional; you can ship a verified app without ever loading a theory page.
- Observation-First Thinking — the mental model behind the platform.