
Datalog in Fifteen Minutes

If you write SQL, you already know most of Datalog. A Datalog rule is a WHERE clause with a name on it. Where SQL gives you SELECT … FROM … WHERE … and the columns you want, Datalog gives you a head — the shape of the row you are deriving — and a body that looks exactly like a SQL WHERE over a join. Every rule is a tiny named view, and the engine keeps applying those views until nothing new appears.

Datalog earns its keep when the queries depend on each other. A recursive Common Table Expression (WITH RECURSIVE …) in PostgreSQL is the SQL idiom for “follow the graph until it stops changing.” Datalog generalises that pattern: every rule may depend on every other rule, and the engine works out the order, runs the joins to a fixed point, and stops. The WHERE clause is the program. There is no imperative scaffolding around it.

This is not a new paradigm. Datalog has been studied since the late 1970s, ships inside production systems such as Datomic, LogicBlox, and Soufflé, underpins static analysers such as Doop and CodeQL, and is one of the best-understood query languages in computer science. AI models are already proficient at reading and writing it; .dh stays inside that training distribution on purpose.

JacqOS uses Datalog because of one structural property that imperative languages cannot give you: every derived fact carries the exact set of inputs that produced it. A Datalog engine knows which observations fed which atoms, which atoms grounded which join, and which join fired which rule. That chain is provenance, and it is what powers Studio’s zero-code debugger and jacqos verify’s replay guarantees. You don’t get that for free from a for loop.

A Datalog rule has three parts: a head, the symbol :- (read it as “if”), and a body. Here is one rule, lifted from JacqOS’s appointment-booking example:

rule intent.reserve_slot(req, slot) :-
    booking_request(req, _, slot),
    slot_available(slot),
    not booking_terminal(req).

Walk through it left to right.

  • Head. intent.reserve_slot(req, slot) is the row being derived. The relation name is intent.reserve_slot; req and slot are variables. Variables are lowercase and have no declared type at the rule site — the relation declaration in schema.dh says what they hold.
  • :-. Pronounce it “if.” Everything to the left is true when everything on the right is true. This is the only direction the arrow ever points.
  • Body. A comma-separated list of conditions. The comma is logical AND — every condition must hold for the rule to fire. There is no OR inside a single rule body; if you need disjunction, you write two rules with the same head (Datalog evaluates them independently and unions the results).
  • booking_request(req, _, slot). A positive condition. It matches any tuple in the booking_request relation whose first column we will call req and whose third column we will call slot. The _ is a wildcard — match anything, name nothing.
  • slot_available(slot). A second positive condition. The variable slot is shared with the previous condition, so this is exactly a SQL inner join: only those (req, slot) pairs survive where the same slot is also present in slot_available.
  • not booking_terminal(req). A negated condition. The rule fires only if there is no booking_terminal(req) row for the current binding of req. Negation is allowed under specific rules we will get to in a moment.
  • . Every rule ends with a period.

Read together: “An intent to reserve a slot exists for (req, slot) when there is a booking request for that slot, the slot is available, and the booking is not already in a terminal state.” That is the entire rule. There is no orchestration, no sequencing, no event handler. The engine finds bindings; the rule fires.
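One way to picture how the engine finds bindings for this rule is a loop over sets of tuples. The sketch below is plain Python, not JacqOS code: the sample rows and the function name reserve_slot_intents are invented for illustration, and a real engine is free to evaluate the join very differently.

```python
# Illustrative stand-in for the engine: each relation is a set of tuples.
# All rows below are made up for this sketch.
booking_request = {
    ("req-1", "ann@example.com", "slot-9"),
    ("req-2", "bob@example.com", "slot-9"),
    ("req-3", "cy@example.com", "slot-4"),
}
slot_available = {("slot-9",)}
booking_terminal = {("req-2",)}


def reserve_slot_intents(booking_request, slot_available, booking_terminal):
    """Evaluate the body of intent.reserve_slot once over the given relations."""
    derived = set()
    for req, _email, slot in booking_request:   # booking_request(req, _, slot)
        if (slot,) not in slot_available:        # slot_available(slot): join on slot
            continue
        if (req,) in booking_terminal:           # not booking_terminal(req)
            continue
        derived.add((req, slot))                 # head: intent.reserve_slot(req, slot)
    return derived


intents = reserve_slot_intents(booking_request, slot_available, booking_terminal)
print(sorted(intents))  # [('req-1', 'slot-9')]
```

req-2 drops out because it is terminal, and req-3 drops out because slot-4 is not available. Nothing in the function says when to run; it only says which tuples follow from which.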

Datalog separates the world into two kinds of facts.

Base facts are the inputs. In JacqOS, base facts come from observations through the built-in atom(observation_ref, predicate, value) relation. When a webhook arrives, a Rhai mapper flattens it into atoms; those atoms are the only base facts the engine ever sees.
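To make the shape of those atoms concrete, here is a rough Python illustration of flattening a nested payload into (observation_ref, predicate, value) triples. It is not the Rhai mapper API: the payload layout, the observation reference "obs-1", and the helper flatten_to_atoms are all assumptions made for this sketch. Only the dotted predicate names are taken from the appointment-booking example.

```python
def flatten_to_atoms(observation_ref, payload, prefix=""):
    """Yield one (observation_ref, predicate, value) triple per leaf of payload."""
    for key, value in payload.items():
        predicate = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested objects extend the dotted predicate name.
            yield from flatten_to_atoms(observation_ref, value, predicate)
        else:
            yield (observation_ref, predicate, value)


# Hypothetical webhook body; the real payload shape is not given in this guide.
payload = {
    "booking": {
        "request_id": "req-1",
        "email": "ann@example.com",
        "slot_id": "slot-9",
    }
}
atoms = set(flatten_to_atoms("obs-1", payload))
for atom in sorted(atoms):
    print(atom)
```

Every triple keeps the same observation reference, which is what later lets rules join atoms from one observation together.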

Derived facts are everything else: every relation you declare in schema.dh and define with rule …. The engine starts with the base facts, applies every rule, and adds whatever new facts the rules produce. Newly derived facts may themselves match the body of another rule, so the engine applies the rules again. It stops when one full pass produces no new tuples. That stopping point is called a fixed point.

If you have written a recursive CTE in SQL, the picture is identical.

WITH RECURSIVE reachable(src, dst) AS (
    SELECT src, dst FROM edge
    UNION
    SELECT r.src, e.dst
    FROM reachable r JOIN edge e ON r.dst = e.src
)
SELECT * FROM reachable;

The .dh version of the same idea:

relation edge(src: text, dst: text)
relation reachable(src: text, dst: text)
rule reachable(a, b) :- edge(a, b).
rule reachable(a, c) :- reachable(a, b), edge(b, c).

Two rules, one head. The engine computes the fixed point of reachable and stops. Because the observation lineage is finite and these two rules only ever add tuples, the least fixed point always exists and is unique.
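The loop-until-nothing-changes idea can be sketched in a few lines of Python. This is the textbook naive evaluation strategy, not a description of how the JacqOS engine is implemented, and the three sample edges are invented. For comparison, the same data is run through the recursive CTE above, using Python's built-in sqlite3 module as a stand-in for PostgreSQL.

```python
import sqlite3

edge = {("a", "b"), ("b", "c"), ("c", "d")}


def reachable_fixed_point(edge):
    """Naive evaluation: re-apply both rules until a full pass adds nothing."""
    reachable = set()
    while True:
        new = set(edge)              # reachable(a, b) :- edge(a, b).
        new |= {                     # reachable(a, c) :- reachable(a, b), edge(b, c).
            (a, c)
            for (a, b) in reachable
            for (b2, c) in edge
            if b == b2
        }
        if new <= reachable:         # no new tuples: this is the fixed point
            return reachable
        reachable |= new


def reachable_via_sqlite(edge):
    """Run the recursive CTE from the text against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", sorted(edge))
    rows = conn.execute(
        """
        WITH RECURSIVE reachable(src, dst) AS (
            SELECT src, dst FROM edge
            UNION
            SELECT r.src, e.dst
            FROM reachable r JOIN edge e ON r.dst = e.src
        )
        SELECT src, dst FROM reachable
        """
    ).fetchall()
    conn.close()
    return set(rows)


closure = reachable_fixed_point(edge)
print(sorted(closure))
print(closure == reachable_via_sqlite(edge))  # True
```

Both functions also terminate on cyclic graphs: the set (and SQL's UNION) discards duplicates, so a pass eventually adds nothing new.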

The mental model is: imagine a recursive CTE that runs until it stabilises, where every named CTE in the query can reference every other one, and the engine plans the dependency order for you.

Look back at the appointment-booking rule:

rule intent.reserve_slot(req, slot) :-
    booking_request(req, _, slot),
    slot_available(slot),
    not booking_terminal(req).

That not booking_terminal(req) is a negated condition. JacqOS accepts negation under one rule of thumb: you can only negate something the engine already knows the full answer to.

The engine enforces this by sorting rules into strata. A stratum is a layer that gets fully computed before the next layer begins. booking_terminal is derived in a lower stratum than intent.reserve_slot, so by the time the intent rule runs, the engine already has the complete set of booking_terminal tuples and can correctly answer “no, there is no such fact.” This is stratified negation, and it is the only flavour of negation .dh accepts.

The practical version of the rule:

  • If you negate a relation, that relation must be derivable without ever (directly or transitively) negating yours back.
  • Recursive negation — “X holds when not Y holds when not X holds” — is rejected at load time. It has no well-defined answer.
  • Multiple rules can derive the same relation, and a program can span many strata; the engine works out the layering automatically.

You don’t have to think about strata while authoring most of the time. If you write a cycle that involves negation, the loader rejects the file and points at the offending pair of rules. Fix the cycle and move on.
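For the curious, the layering can be computed with a short, well-known algorithm: start every relation at stratum 0, then keep raising a head above anything it negates until the numbers settle. The Python below is a sketch of that textbook procedure, not the JacqOS loader. The function name stratify and the dependency triples are assumptions, with the triples read off the two appointment-booking rules shown in this guide.

```python
def stratify(dependencies):
    """Assign each relation a stratum number, or raise if negation is cyclic.

    dependencies: iterable of (head, body_relation, is_negated) triples.
    """
    deps = list(dependencies)
    relations = {name for head, body, _ in deps for name in (head, body)}
    stratum = {name: 0 for name in relations}
    changed = True
    while changed:
        changed = False
        for head, body, is_negated in deps:
            # A head sits at least as high as what it uses,
            # and strictly higher than what it negates.
            required = stratum[body] + (1 if is_negated else 0)
            if stratum[head] < required:
                stratum[head] = required
                changed = True
                # More layers than relations can only mean a cycle through "not".
                if required >= len(relations):
                    raise ValueError(
                        f"negation cycle involving {head!r} and {body!r}"
                    )
    return stratum


booking_dependencies = [
    ("intent.reserve_slot", "booking_request", False),
    ("intent.reserve_slot", "slot_available", False),
    ("intent.reserve_slot", "booking_terminal", True),
    ("slot_available", "slot_listed", False),
    ("slot_available", "slot_hold_active", True),
    ("slot_available", "booking_confirmed", True),
]
strata = stratify(booking_dependencies)
print(strata["booking_terminal"], strata["intent.reserve_slot"])  # 0 1
```

Feeding it a pair such as ("x", "y", True) and ("y", "x", True) raises the error, which mirrors the "X holds when not Y holds when not X holds" case the loader rejects.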

Aggregates compute summary values across matching tuples. .dh supports count, sum, min, and max. The syntax mirrors SQL’s aggregate functions, but they live inside a rule or invariant body, where the result is either compared directly or bound to a fresh variable on the left.

The appointment-booking example uses count to enforce a uniqueness invariant:

invariant no_double_hold(slot) :-
    count slot_hold_active(_, slot) <= 1.

Read it as: “for every slot, the number of slot_hold_active(_, slot) tuples must be at most one.” If two reservations try to hold the same slot at the same time, this invariant fails and jacqos verify rejects the timeline before any effect fires.
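In SQL terms this is a GROUP BY slot with a HAVING COUNT(*) > 1 check that must return no rows. A small Python sketch of the same check follows; the helper name double_held_slots and the sample holds are invented, and it assumes slot_hold_active has two columns, (request, slot), as the invariant suggests.

```python
from collections import Counter


def double_held_slots(slot_hold_active):
    """Return the slots that violate count slot_hold_active(_, slot) <= 1."""
    holds_per_slot = Counter(slot for _req, slot in slot_hold_active)
    return {slot for slot, n in holds_per_slot.items() if n > 1}


# Made-up rows: two requests hold slot-9, one holds slot-4.
slot_hold_active = {
    ("req-1", "slot-9"),
    ("req-2", "slot-9"),
    ("req-3", "slot-4"),
}
violations = double_held_slots(slot_hold_active)
print(violations)  # {'slot-9'}
```

An empty result means the invariant holds for every slot.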

A binding example using count inside a rule body:

rule slot_booking_count(slot, n) :-
    n = count booking_confirmed(_, slot).

n is a fresh variable; = is binding (it assigns the aggregate’s result), not equality comparison. For sum, min, and max, you also name the column to aggregate over:

rule total_booked(total) :-
    total = sum booking_price(_, amount), amount.
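Read as set operations, the two aggregate rules above behave like a grouped count and a whole-relation sum. The Python sketch below assumes that reading. The sample rows are invented, and what the engine derives for an empty input relation is not covered in this guide, so the sketch makes no claim about that case.

```python
from collections import Counter

# Made-up rows: (request, slot) and (request, amount).
booking_confirmed = {("req-1", "slot-9"), ("req-3", "slot-4"), ("req-4", "slot-4")}
booking_price = {("req-1", 40), ("req-3", 25), ("req-4", 40)}


def slot_booking_count(booking_confirmed):
    """n = count booking_confirmed(_, slot): one (slot, n) tuple per slot."""
    counts = Counter(slot for _req, slot in booking_confirmed)
    return set(counts.items())


def total_booked(booking_price):
    """total = sum over the amount column of booking_price."""
    # req-1 and req-4 both cost 40; they are distinct tuples, so both count.
    return {(sum(amount for _req, amount in booking_price),)}


print(sorted(slot_booking_count(booking_confirmed)))  # [('slot-4', 2), ('slot-9', 1)]
print(total_booked(booking_price))                    # {(105,)}
```

The variable that stays free in the head (slot in the first rule) plays the role of SQL's GROUP BY column; the second rule has no such variable, so it produces a single row.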

V1 restriction: aggregates must be finite and non-recursive. You cannot aggregate over a relation that depends, directly or transitively, on the relation you are deriving. The engine rejects recursive aggregation at load time because the result has no well-defined value (think of an aggregate over a stream that depends on its own running sum). If you need rolling totals, materialise the inputs in a lower stratum first.

Now read the appointment-booking ontology end to end. Open examples/jacqos-appointment-booking/ontology/rules.dh. The first rule:

rule booking_request(req, email, slot) :-
    atom(obs, "booking.request_id", req),
    atom(obs, "booking.email", email),
    atom(obs, "booking.slot_id", slot).

Trace it head-to-toe.

  1. Head. booking_request(req, email, slot) — the rule derives tuples in the booking_request relation, declared in schema.dh with three text columns.
  2. Body, line one. atom(obs, "booking.request_id", req) — match an atom whose predicate is the literal string "booking.request_id". Bind its observation reference to obs and its value to req.
  3. Body, lines two and three. Two more atom(...) matches that share the same obs variable. Sharing obs means all three atoms must come from the same observation. This is how Datalog expresses “these pieces of evidence belong together” — by joining on a shared variable.
  4. No negation, no aggregates, no helpers. This is a pure star-shaped join — every condition shares the pivot variable obs. Star joins are the easiest shape for the engine to optimise and the easiest for Studio to render as provenance.

When the engine fires this rule, it produces one booking_request(req, email, slot) tuple for every observation that carries all three atoms. That tuple is now a derived fact with full provenance back to the originating observation. Every downstream rule that mentions booking_request(...) consumes those tuples; if you later ask Studio “where did this booking come from?”, it walks the provenance chain back to the exact observation row in the log.
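The star join and its provenance can be mimicked in Python by grouping atoms per observation and remembering which observation produced each derived tuple. This is only a sketch of the idea: the function derive_booking_requests, the observation references, and the sample atoms are assumptions, and the provenance JacqOS records is richer than a set of observation references.

```python
from collections import defaultdict
from itertools import product

FIELDS = ("booking.request_id", "booking.email", "booking.slot_id")

# Made-up atoms: obs-1 carries all three fields, obs-2 lacks a slot id.
atoms = {
    ("obs-1", "booking.request_id", "req-1"),
    ("obs-1", "booking.email", "ann@example.com"),
    ("obs-1", "booking.slot_id", "slot-9"),
    ("obs-2", "booking.request_id", "req-2"),
    ("obs-2", "booking.email", "bob@example.com"),
}


def derive_booking_requests(atoms):
    """Map each derived booking_request tuple to the observations behind it."""
    values = defaultdict(lambda: defaultdict(set))  # obs -> predicate -> values
    for obs, predicate, value in atoms:
        values[obs][predicate].add(value)

    provenance = defaultdict(set)
    for obs, by_predicate in values.items():
        # Joining on obs: all three atoms must come from this one observation.
        # product() yields nothing if any of the three fields is missing.
        columns = [by_predicate.get(field, ()) for field in FIELDS]
        for req, email, slot in product(*columns):
            provenance[(req, email, slot)].add(obs)
    return dict(provenance)


derived = derive_booking_requests(atoms)
print(derived)  # {('req-1', 'ann@example.com', 'slot-9'): {'obs-1'}}
```

obs-2 produces no tuple because one arm of the star is missing, and the surviving tuple can be traced back to obs-1 by a dictionary lookup.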

A rule with negation, lifted from the same file:

rule slot_available(slot) :-
    slot_listed(slot),
    not slot_hold_active(_, slot),
    not booking_confirmed(_, slot).

A slot is available when it is listed, no active hold exists for it, and no confirmed booking exists for it. Both negations are legal because slot_hold_active and booking_confirmed are derived in lower strata than slot_available.

You can now read every .dh file in the repo. The vocabulary is: positive condition, negated condition, aggregate binding, helper binding, comparison. Every rule is some combination of those five. Helper bindings are not shown on this page, and comparisons appear only in the invariant example (<= 1); the reference page covers both and enumerates the exact grammar.

You have everything you need to read the reference and start writing rules of your own.

  • .dh Language Reference — the full grammar, every operator, every diagnostic, every rule shape the validator recognises.
  • Model-Theoretic Foundations — why this fragment composes, why the restrictions exist, and what Gaifman locality and the guarded fragment actually buy you.
  • Build Your First App — scaffold an app, edit a rule, replay a fixture, and watch jacqos verify go green.

When the reference uses a term you have not seen before, come back here. The five vocabulary words above carry the entire language.
