Datalog in Fifteen Minutes
Why Datalog at all
If you write SQL, you already know most of Datalog. A Datalog rule is
a WHERE clause with a name on it. Where SQL gives you SELECT … FROM … WHERE … and the columns you want, Datalog gives you a head — the
shape of the row you are deriving — and a body that looks exactly like
a SQL WHERE over a join. Every rule is a tiny named view, and the
engine keeps applying those views until nothing new appears.
Datalog earns its keep when the queries depend on each other. A
recursive Common Table Expression (WITH RECURSIVE …) in PostgreSQL
is the SQL idiom for “follow the graph until it stops changing.”
Datalog generalises that pattern: every rule may depend on every other
rule, and the engine works out the order, runs the joins to a fixed
point, and stops. The WHERE clause is the program. There is no
imperative scaffolding around it.
This is not a new paradigm. Datalog has been studied since the late
1970s. Its dialects run inside production systems such as Datomic,
LogicBlox, and Soufflé, and inside static analysers such as Doop and
CodeQL. It is one of the best-understood query languages in computer
science. AI models are already proficient at reading and writing it,
and .dh stays inside that training distribution on purpose.
JacqOS uses Datalog because of one structural property that imperative
languages cannot give you: every derived fact carries the exact set
of inputs that produced it. A Datalog engine knows which observations
fed which atoms, which atoms grounded which join, and which join fired
which rule. That chain is provenance, and it is what powers Studio’s
zero-code debugger and jacqos verify’s replay guarantees. You don’t
get that for free from a for loop.
The shape of a rule
A Datalog rule has three parts: a head, the symbol :- (read it
as “if”), and a body. Here is one rule, lifted from JacqOS’s
appointment-booking example:

    rule intent.reserve_slot(req, slot) :-
      booking_request(req, _, slot),
      slot_available(slot),
      not booking_terminal(req).

Walk through it left to right.
- Head. intent.reserve_slot(req, slot) is the row being derived. The relation name is intent.reserve_slot; req and slot are variables. Variables are lowercase and have no declared type at the rule site; the relation declaration in schema.dh says what they hold.
- :-. Pronounce it “if.” Everything to the left is true when everything on the right is true. This is the only direction the arrow ever points.
- Body. A comma-separated list of conditions. The comma is logical AND: every condition must hold for the rule to fire. There is no OR inside a single rule body; if you need disjunction, you write two rules with the same head (Datalog evaluates them independently and unions the results).
- booking_request(req, _, slot). A positive condition. It matches any tuple in the booking_request relation whose first column we will call req and whose third column we will call slot. The _ is a wildcard: match anything, name nothing.
- slot_available(slot). A second positive condition. The variable slot is shared with the previous condition, so this is exactly a SQL inner join: only those (req, slot) pairs survive where the same slot is also present in slot_available.
- not booking_terminal(req). A negated condition. The rule fires only if there is no booking_terminal(req) row for the current binding of req. Negation is allowed under specific rules we will get to in a moment.
- “.” Every rule ends with a period.
Read together: “An intent to reserve a slot exists for (req, slot)
when there is a booking request for that slot, the slot is available,
and the booking is not already in a terminal state.” That is the
entire rule. There is no orchestration, no sequencing, no event
handler. The engine finds bindings; the rule fires.
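To make the join reading concrete, here is a minimal Python sketch of what one evaluation of this rule body amounts to: a join on slot followed by an anti-join on req. The sample tuples and the function name reserve_slot_intents are hypothetical, and this is not how the JacqOS engine is implemented; it only mirrors the rule's logic.

```python
# Hypothetical sample tuples; the relation names mirror the rule above.
booking_request = {
    ("r1", "a@example.com", "s1"),
    ("r2", "b@example.com", "s2"),
    ("r3", "c@example.com", "s1"),
}
slot_available = {("s1",)}
booking_terminal = {("r3",)}


def reserve_slot_intents(booking_request, slot_available, booking_terminal):
    """Evaluate the intent.reserve_slot rule body once over the given relations."""
    derived = set()
    for req, _email, slot in booking_request:   # booking_request(req, _, slot)
        if (slot,) not in slot_available:       # slot_available(slot): join on slot
            continue
        if (req,) in booking_terminal:          # not booking_terminal(req)
            continue
        derived.add((req, slot))                # head: intent.reserve_slot(req, slot)
    return derived


intents = reserve_slot_intents(booking_request, slot_available, booking_terminal)
print(sorted(intents))  # [('r1', 's1')]
```

Request r2 drops out because its slot is not available, and r3 drops out because it is already terminal.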
Facts, derivation, and fixed point
Datalog separates the world into two kinds of facts.
Base facts are the inputs. In JacqOS, base facts come from
observations through the built-in atom(observation_ref, predicate, value) relation. When a webhook arrives, a Rhai mapper flattens it
into atoms; those atoms are the only base facts the engine ever sees.
Derived facts are everything else: every relation you declare in
schema.dh and define with rule …. The engine starts with the base
facts, applies the rules in whatever order their dependencies allow,
and produces new facts. Newly derived facts may themselves match the
body of another rule, so the engine loops. It stops when one full pass
produces no new tuples. That stopping point is called a fixed point.
If you have written a recursive CTE in SQL, the picture is identical.

    WITH RECURSIVE reachable(src, dst) AS (
      SELECT src, dst FROM edge
      UNION
      SELECT r.src, e.dst
      FROM reachable r JOIN edge e ON r.dst = e.src
    )
    SELECT * FROM reachable;

The .dh version of the same idea:

    relation edge(src: text, dst: text)
    relation reachable(src: text, dst: text)

    rule reachable(a, b) :- edge(a, b).
    rule reachable(a, c) :- reachable(a, b), edge(b, c).

Two rules, one head. The engine computes the fixed point of reachable
and stops. Because the observation lineage is finite, the fixed point
always exists and is unique.
The mental model is: imagine a recursive CTE that runs until it stabilises, where every named CTE in the query can reference every other one, and the engine plans the dependency order for you.
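The loop-until-stable idea can be sketched in a few lines of Python. This is a naive evaluation of the two reachable rules, written for illustration only; real engines typically use more efficient strategies, and nothing here reflects the JacqOS implementation.

```python
def reachable(edges):
    """Naive fixed-point evaluation of the two reachable rules."""
    facts = set(edges)                  # reachable(a, b) :- edge(a, b).
    while True:
        new = {
            (a, c)                      # reachable(a, c) :- reachable(a, b), edge(b, c).
            for (a, b) in facts
            for (b2, c) in edges
            if b == b2
        }
        if new <= facts:                # a full pass added nothing: fixed point
            return facts
        facts |= new


edges = {("a", "b"), ("b", "c"), ("c", "a")}
result = reachable(edges)
print(len(result))  # 9
```

The sample graph contains a cycle, yet the loop still terminates: three nodes can yield at most nine pairs, and once all of them are present a pass produces nothing new.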
Negation
Look back at the appointment-booking rule:

    rule intent.reserve_slot(req, slot) :-
      booking_request(req, _, slot),
      slot_available(slot),
      not booking_terminal(req).

That not booking_terminal(req) is a negated condition. JacqOS
accepts negation under one rule of thumb: you can only negate
something the engine already knows the full answer to.
The engine enforces this by sorting rules into strata. A stratum
is a layer that gets fully computed before the next layer begins.
booking_terminal is derived in a lower stratum than
intent.reserve_slot, so by the time the intent rule runs, the engine
already has the complete set of booking_terminal tuples and can
correctly answer “no, there is no such fact.” This is stratified
negation, and it is the only flavour of negation .dh accepts.
The practical version of the rule:
- If you negate a relation, that relation must be derivable without ever (directly or transitively) negating yours back.
- Recursive negation — “X holds when not Y holds when not X holds” — is rejected at load time. It has no well-defined answer.
- Multiple rules can derive the same relation, and a program can span many strata; the engine works out the layering automatically.
You don’t have to think about strata while authoring most of the time. If you write a cycle that involves negation, the loader rejects the file and points at the offending pair of rules. Fix the cycle and move on.
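For readers curious how a loader could detect such a cycle, here is a sketch of the textbook stratification procedure in Python. The function name stratify, the shape of the deps mapping, and the reduced dependency list are assumptions made for illustration; the actual JacqOS loader and its diagnostics may work differently.

```python
def stratify(deps):
    """Assign a stratum to every relation, or raise on recursion through negation.

    deps maps a head relation to a list of (body_relation, is_negated) pairs.
    """
    relations = set(deps) | {rel for body in deps.values() for rel, _ in body}
    stratum = {rel: 0 for rel in relations}
    changed = True
    while changed:
        changed = False
        for head, body in deps.items():
            for rel, negated in body:
                # A negated dependency must sit strictly below the head.
                needed = stratum[rel] + 1 if negated else stratum[rel]
                if stratum[head] < needed:
                    stratum[head] = needed
                    changed = True
                    # More strata than relations is only possible with a negative cycle.
                    if stratum[head] >= len(relations):
                        raise ValueError(f"negation cycle through {head!r}")
    return stratum


# Hypothetical, reduced dependency list for the booking rules in this guide.
deps = {
    "booking_request": [("atom", False)],
    "slot_available": [
        ("slot_listed", False),
        ("slot_hold_active", True),
        ("booking_confirmed", True),
    ],
    "intent.reserve_slot": [
        ("booking_request", False),
        ("slot_available", False),
        ("booking_terminal", True),
    ],
}
strata = stratify(deps)
print(strata["booking_terminal"], strata["intent.reserve_slot"])  # 0 1
```

Feeding it a pair like x depends on not y and y depends on not x raises ValueError, which corresponds to the load-time rejection described above.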
Aggregates
Aggregates compute summary values across matching tuples. .dh
supports count, sum, min, and max. The syntax mirrors SQL’s
aggregate functions, but they live inside a rule body and bind a
fresh variable on the left.
The appointment-booking example uses count to enforce a uniqueness
invariant:

    invariant no_double_hold(slot) :- count slot_hold_active(_, slot) <= 1.

Read it as: “for every slot, the number of slot_hold_active(_, slot)
tuples must be at most one.” If two reservations try to hold
the same slot at the same time, this invariant fails and jacqos verify
rejects the timeline before any effect fires.
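As a rough illustration of what this invariant checks, the Python sketch below counts active holds per slot and reports any slot with more than one. The function name violated_slots and the sample tuples are hypothetical; how jacqos verify actually evaluates invariants is not shown here.

```python
from collections import Counter


def violated_slots(slot_hold_active):
    """Return slots that break no_double_hold: more than one active hold."""
    holds_per_slot = Counter(slot for _req, slot in slot_hold_active)
    return sorted(slot for slot, n in holds_per_slot.items() if n > 1)


# Hypothetical (request, slot) tuples: two requests hold s1 at once.
holds = {("r1", "s1"), ("r2", "s1"), ("r3", "s2")}
print(violated_slots(holds))  # ['s1']
```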
A binding example using count inside a rule body:

    rule slot_booking_count(slot, n) :- n = count booking_confirmed(_, slot).

n is a fresh variable; = is binding (it assigns the aggregate’s
result), not equality comparison. For sum, min, and max, you
also name the column to aggregate over:

    rule total_booked(total) :- total = sum booking_price(_, amount), amount.

V1 restriction: aggregates must be finite and non-recursive. You cannot aggregate over a relation that depends, directly or transitively, on the relation you are deriving. The engine rejects recursive aggregation at load time because the result has no well-defined value (think of an aggregate over a stream that depends on its own running sum). If you need rolling totals, materialise the inputs in a lower stratum first.
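In SQL terms, the two aggregate rules behave like a GROUP BY and a plain aggregate. The Python sketch below mirrors that reading over already-computed input relations. The sample data is hypothetical, and the handling of empty inputs is an assumption of this sketch rather than documented .dh behaviour.

```python
from collections import defaultdict


def slot_booking_count(booking_confirmed):
    """n = count booking_confirmed(_, slot), grouped by the head variable slot."""
    counts = defaultdict(int)
    for _req, slot in booking_confirmed:
        counts[slot] += 1
    # Assumption: a slot with no confirmed booking yields no row at all.
    return {(slot, n) for slot, n in counts.items()}


def total_booked(booking_price):
    """total = sum booking_price(_, amount), amount: no grouping variable."""
    return {(sum(amount for _req, amount in booking_price),)}


# Hypothetical input relations, fully computed before aggregation runs.
booking_confirmed = {("r1", "s1"), ("r2", "s1"), ("r3", "s2")}
booking_price = {("r1", 40), ("r2", 60)}

print(sorted(slot_booking_count(booking_confirmed)))  # [('s1', 2), ('s2', 1)]
print(total_booked(booking_price))                    # {(100,)}
```

Both functions read a finished input relation and never feed their own output back in, which is the non-recursive shape the V1 restriction requires.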
Reading a JacqOS .dh file
Now read the appointment-booking ontology end to end. Open
examples/jacqos-appointment-booking/ontology/rules.dh.
The first rule:

    rule booking_request(req, email, slot) :-
      atom(obs, "booking.request_id", req),
      atom(obs, "booking.email", email),
      atom(obs, "booking.slot_id", slot).

Trace it head-to-toe.
- Head. booking_request(req, email, slot): the rule derives tuples in the booking_request relation, declared in schema.dh with three text columns.
- Body, line one. atom(obs, "booking.request_id", req): match an atom whose predicate is the literal string "booking.request_id". Bind its observation reference to obs and its value to req.
- Body, lines two and three. Two more atom(...) matches that share the same obs variable. Sharing obs means all three atoms must come from the same observation. This is how Datalog expresses “these pieces of evidence belong together”: by joining on a shared variable.
- No negation, no aggregates, no helpers. This is a pure star-shaped join: every condition shares the pivot variable obs. Star joins are the easiest shape for the engine to optimise and the easiest for Studio to render as provenance.
When the engine fires this rule, it produces one
booking_request(req, email, slot) tuple for every observation that
carries all three atoms. That tuple is now a derived fact with full
provenance back to the originating observation. Every downstream rule
that mentions booking_request(...) consumes those tuples; if you
later ask Studio “where did this booking come from?”, it walks the
provenance chain back to the exact observation row in the log.
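The star join and its provenance can be sketched in Python as follows. The function derive_booking_requests, the observation refs, and the choice to record provenance as a set of observation refs per derived tuple are all assumptions for illustration; Studio's actual provenance model is richer than this.

```python
def derive_booking_requests(atoms):
    """Join three atoms on the shared obs variable, recording provenance.

    atoms is an iterable of (observation_ref, predicate, value) tuples.
    Returns {derived_tuple: set of observation refs that produced it}.
    """
    # Index atoms by observation so the join pivots on obs.
    by_obs = {}
    for obs, predicate, value in atoms:
        by_obs.setdefault(obs, {}).setdefault(predicate, set()).add(value)

    derived = {}
    for obs, preds in by_obs.items():
        for req in preds.get("booking.request_id", ()):
            for email in preds.get("booking.email", ()):
                for slot in preds.get("booking.slot_id", ()):
                    derived.setdefault((req, email, slot), set()).add(obs)
    return derived


# Hypothetical atoms: obs-1 carries all three predicates, obs-2 does not.
atoms = [
    ("obs-1", "booking.request_id", "r1"),
    ("obs-1", "booking.email", "a@example.com"),
    ("obs-1", "booking.slot_id", "s1"),
    ("obs-2", "booking.request_id", "r2"),
]
facts = derive_booking_requests(atoms)
print(facts)  # {('r1', 'a@example.com', 's1'): {'obs-1'}}
```

Observation obs-2 produces nothing because it lacks the email and slot atoms, and the derived tuple keeps a pointer back to obs-1.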
A rule with negation, lifted from the same file:

    rule slot_available(slot) :-
      slot_listed(slot),
      not slot_hold_active(_, slot),
      not booking_confirmed(_, slot).

A slot is available when it is listed, no active hold exists for it,
and no confirmed booking exists for it. Both negations are legal
because slot_hold_active and booking_confirmed are derived in
lower strata than slot_available.
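One detail worth spelling out is the wildcard under negation: not slot_hold_active(_, slot) means there is no tuple for that slot, whatever the first column holds. A minimal Python sketch, assuming the two negated relations are already fully computed (the sample data is hypothetical):

```python
def slot_available(slot_listed, slot_hold_active, booking_confirmed):
    """slot_listed(slot), not slot_hold_active(_, slot), not booking_confirmed(_, slot)."""
    # Project each negated relation onto its slot column; the wildcard column is ignored.
    held = {slot for _req, slot in slot_hold_active}
    booked = {slot for _req, slot in booking_confirmed}
    return {(slot,) for (slot,) in slot_listed if slot not in held and slot not in booked}


slot_listed = {("s1",), ("s2",), ("s3",)}
slot_hold_active = {("r1", "s1")}
booking_confirmed = {("r2", "s2")}
print(slot_available(slot_listed, slot_hold_active, booking_confirmed))  # {('s3',)}
```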
You can now read every .dh file in the repo. The vocabulary is:
positive condition, negated condition, aggregate binding, helper
binding, comparison. Every rule is some combination of those five.
The reference page enumerates the exact grammar.
What you can do next
You have everything you need to read the reference and start writing rules of your own.
- .dh Language Reference: the full grammar, every operator, every diagnostic, every rule shape the validator recognises.
- Model-Theoretic Foundations: why this fragment composes, why the restrictions exist, and what Gaifman locality and the guarded fragment actually buy you.
- Build Your First App: scaffold an app, edit a rule, replay a fixture, and watch jacqos verify go green.
When the reference uses a term you have not seen before, come back here. The five vocabulary words above carry the entire language.