Architecture¶

The chain invariant¶

For each chain id, the audit_trail table holds rows in id ascending order. Every row carries three signature columns:

previous_hash — the SHA-256 hash of the previous row in the same chain (empty string at genesis).
hash — SHA-256 of this row's canonicalized payload (including previous_hash), publicly verifiable without any secret.
hmac — HMAC-SHA-256(hash, secret), where secret is the operator's signing key resolved per row via the secret_id column.

Three invariants hold for every row; the first links each row to its predecessor:

row[n].previous_hash = row[n-1].hash          (linking constraint)
row[n].hash          = SHA-256(canonical(payload columns))
row[n].hmac          = HMAC-SHA-256(hash, secret(row[n].secret_id))

with row[0].previous_hash = '' (genesis row of each chain).

canonical() recursively ksort(SORT_STRING)s the payload and encodes it via json_encode(JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR). The exact byte output is pinned by tests/src/Unit/CanonicalizeBytesTest.php so any change to the encoding fails CI before it can ship — otherwise every existing row's stored hash would silently become unverifiable.

The payload columns that participate in canonical():

[
  'channel'                 => '…',  // PSR-3 channel
  'chain'                   => '…',  // chain id
  'severity'                =>  …,    // RFC 5424 numeric severity
  'action'                  => '…',  // event verb (create / update / …)
  'resource'                => '…',  // subject identifier
  'context_permanent'       => '…',  // JSON-encoded permanent bucket
  'context_transient_hash'  => '…',  // SHA-256 of the transient bucket
  'created'                 => '…',  // microsecond Unix timestamp
  'secret_id'               =>  …,    // signing key id
  'previous_hash'           => '…',  // chain link
]
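The canonicalization and signing steps can be sketched as follows. This is a minimal illustration, not the module's code: the function names are hypothetical, and leaving list-style arrays unsorted is an assumption. Only the ksort(SORT_STRING) call, the json_encode flags, and the SHA-256 / HMAC-SHA-256 primitives come from the description above.

```php
<?php
// Sketch only. Function names are hypothetical; the sort flag, the
// json_encode flags, and the hash primitives come from the text above.

/** Sorts associative keys as strings at every depth; lists keep their order (assumption). */
function sketch_sort_keys(array $data): array {
  $isList = $data === [] || array_keys($data) === range(0, count($data) - 1);
  if (!$isList) {
    ksort($data, SORT_STRING);
  }
  foreach ($data as $key => $value) {
    if (is_array($value)) {
      $data[$key] = sketch_sort_keys($value);
    }
  }
  return $data;
}

/** Encodes the sorted payload with the flags named in the text. */
function sketch_canonical(array $payload): string {
  return json_encode(
    sketch_sort_keys($payload),
    JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR
  );
}

/** Derives the hash and hmac columns for one row. */
function sketch_sign(array $payload, string $secret): array {
  $hash = hash('sha256', sketch_canonical($payload));
  return ['hash' => $hash, 'hmac' => hash_hmac('sha256', $hash, $secret)];
}

// Illustrative values only.
$payload = [
  'channel' => 'finance',
  'chain' => 'finance',
  'severity' => 5,
  'action' => 'update',
  'resource' => 'entity:node/42',
  'context_permanent' => '{"title":"Acte"}',
  'context_transient_hash' => '',
  'created' => '1700000000000000',
  'secret_id' => 1,
  'previous_hash' => '',
];
$signed = sketch_sign($payload, 'demo-secret-bytes');
echo sketch_canonical($payload), "\n", $signed['hash'], "\n";
```

Because the keys are sorted before encoding, the order in which a caller assembles the payload array does not affect the resulting hash.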

Two retention tiers split the row's context across separate columns:

context_permanent is signed raw in canonical(). Kept forever; never purged by retention.
context_transient is signed via context_transient_hash only — the column itself is NULLed at retention by the transient-purge pass, and a covering audit_trail_segment row attests the transition (transient_purged_at != 0). The chain still verifies because only the hash is in canonical.

This shape lets GDPR right-to-erasure operate on the transient column without breaking chain verification, while still distinguishing a legitimate purge from an attacker NULLing a column to hide content. The verifier fails a bare NULL when both conditions hold: the row's write-time context_transient_hash was non-empty, and the row doesn't fall within any covering segment with transient_purged_at != 0 or archived_at != 0. (Archive ops legitimize the NULL because the archive NDJSON envelope captured the row's transient state at archive time, so the row's pre-purge bytes live in the file even after the live column is gone; see NDJSON envelope shape below.)

Why publicly-verifiable hash plus operator HMAC¶

Two layers:

The hash column is recomputable by anyone with the row data — no secret required. A third-party auditor can walk the chain, recompute every hash, and confirm the linking constraint. This is the public-verifiability anchor.
The hmac column proves the row was signed by an operator who held the matching secret bytes. Without the secret the hmac cannot be forged; with it, an attacker can sign rows but still cannot rewrite the existing chain (their forgery would need to also rewrite every downstream previous_hash and recompute every hash and hmac, and the operator could detect that via a TSA-anchored snapshot).

Verification runs in two flavors: verifyChainPublic() (hash chain only — auditor mode) and verifyChain() (hash chain plus HMAC — operator mode).
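The difference between the two modes can be sketched with an in-memory chain. Everything here is an assumption made for illustration: the function names, the reduced three-key payload, and the array row shape. It is not the signature or behavior of the module's verifyChainPublic() / verifyChain().

```php
<?php
// Sketch only: function names and the in-memory row shape are assumptions;
// the real verifier reads rows from the audit_trail table.

function sketch_canonical_json(array $payload): string {
  ksort($payload, SORT_STRING);
  return json_encode($payload, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}

/** Appends a signed row to an in-memory chain (fixture helper). */
function sketch_append(array $rows, string $action, int $secretId, string $secret): array {
  $previous = $rows ? $rows[count($rows) - 1]['hash'] : '';
  $payload = ['action' => $action, 'secret_id' => $secretId, 'previous_hash' => $previous];
  $hash = hash('sha256', sketch_canonical_json($payload));
  $rows[] = ['payload' => $payload, 'hash' => $hash, 'hmac' => hash_hmac('sha256', $hash, $secret)];
  return $rows;
}

/**
 * Auditor mode when $secrets is NULL (hash chain only); operator mode when
 * $secrets maps secret_id => key bytes (hash chain plus HMAC).
 */
function sketch_verify(array $rows, ?array $secrets = NULL): bool {
  $expectedPrevious = '';
  foreach ($rows as $row) {
    $payload = $row['payload'];
    if ($payload['previous_hash'] !== $expectedPrevious) {
      return FALSE;
    }
    if (!hash_equals(hash('sha256', sketch_canonical_json($payload)), $row['hash'])) {
      return FALSE;
    }
    if ($secrets !== NULL) {
      $secret = $secrets[$payload['secret_id']] ?? NULL;
      if ($secret === NULL || !hash_equals(hash_hmac('sha256', $row['hash'], $secret), $row['hmac'])) {
        return FALSE;
      }
    }
    $expectedPrevious = $row['hash'];
  }
  return TRUE;
}

$chain = sketch_append([], 'create', 1, 'operator-key');
$chain = sketch_append($chain, 'update', 1, 'operator-key');

// A forger without the key can still build a self-consistent hash chain.
$forged = sketch_append([], 'create', 1, 'wrong-key');

var_dump(sketch_verify($chain));                          // public walk
var_dump(sketch_verify($forged));                         // public walk on the forgery
var_dump(sketch_verify($forged, [1 => 'operator-key']));  // operator walk on the forgery
```

The forged chain passes the public walk because its hashes are internally consistent; only the HMAC check, which needs the operator's key, rejects it.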

Multi-tamper walk¶

If the verifier sees a row that doesn't validate, it records the broken range and keeps walking rather than stopping at the first failure. The verdict's broken_ranges list contains every contiguous failed range in chain order. The message advertises additional ranges so operators don't acknowledge the first range and assume the chain is otherwise clean.
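A minimal sketch of how per-row verdicts could be folded into contiguous ranges. The function name and the id => bool input shape are hypothetical, and the sketch does not model how the real verifier re-synchronizes after a break.

```php
<?php
// Sketch only: takes per-row verdicts (id => valid?) in chain order and
// groups consecutive failures into [from, to] ranges.

function sketch_broken_ranges(array $validById): array {
  $ranges = [];
  $current = NULL;
  foreach ($validById as $id => $valid) {
    if (!$valid) {
      if ($current === NULL) {
        $current = ['from' => $id, 'to' => $id];
      }
      else {
        $current['to'] = $id;
      }
    }
    elseif ($current !== NULL) {
      // A valid row closes the open range.
      $ranges[] = $current;
      $current = NULL;
    }
  }
  if ($current !== NULL) {
    $ranges[] = $current;
  }
  return $ranges;
}

$ranges = sketch_broken_ranges([1 => TRUE, 2 => FALSE, 3 => FALSE, 4 => TRUE, 5 => FALSE]);
print_r($ranges);
```

With this input the walk reports two ranges, 2 to 3 and 5 to 5, rather than stopping at row 2.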

Why per-chain (not global)¶

A single global chain has appeal — it gives a total ordering over every event in the system. In practice the cost is too high:

Every INSERT must read the global head, serialize behind it, and write. Under load (a busy WebDAV save plus a node update hitting at the same time) the lock acquisitions stack up.
A single break — a corrupted row, an accidental schema migration that touched the table — invalidates verification for every row downstream, including unrelated subsystems.
Cross-channel cause-and-effect is rarely the forensic question (you usually want "what happened to this acte", not "what happened to the system at 14:23").

The module's default routing maps each PSR-3 channel to a chain of the same id. Operators that want to funnel related channels into a single chain register an audit_trail_chain config entity with the target chain id and a channels[] list naming the source channels. The choice is per-channel on the input side and per-chain everywhere else (storage, verification, retention) so chain composition is a deployment decision, not a code one.
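The routing rule can be sketched as a lookup with a fallback. The function name and the plain-array stand-in for audit_trail_chain config entities are assumptions for illustration; only the channels[] list and the "same id by default" rule come from the text.

```php
<?php
// Sketch only: $chainEntities stands in for audit_trail_chain config
// entities, keyed by target chain id, each with a channels[] list.

function sketch_resolve_chain(string $channel, array $chainEntities): string {
  foreach ($chainEntities as $chainId => $entity) {
    if (in_array($channel, $entity['channels'], TRUE)) {
      return (string) $chainId;
    }
  }
  // Default routing: a channel maps to a chain of the same id.
  return $channel;
}

$chainEntities = [
  'legal' => ['channels' => ['finance', 'webdav']],
];
echo sketch_resolve_chain('finance', $chainEntities), "\n";
echo sketch_resolve_chain('php', $chainEntities), "\n";
```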

Per-chain write serialization¶

INSERTs into the same chain need to be serialized — otherwise two concurrent writers might both read the same head hash and emit two rows with the same previous_hash, breaking the chain.

The module acquires a Drupal lock named audit_trail.write:<chain_id> for the duration of SELECT head → compute hash + hmac → INSERT. Locks are per-chain, so unrelated chains do not contend.

Schema-level defense-in-depth: a UNIQUE index on (chain, previous_hash) makes chain forks impossible at the database level. Even if the application-level lock fails to serialize, the second row's INSERT fails with a uniqueness violation rather than silently extending the chain in two places.

If the application lock cannot be acquired within 5 seconds, the log entry is dropped from the chain rather than blocking the request indefinitely — the underlying dblog or syslog modules still record the entry on their side. Every drop bumps audit_trail.dropped_under_contention in State, and a non-zero counter surfaces on /admin/reports/status as a warning so the gap is operator-visible.
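The write path described in this section can be sketched with in-memory stand-ins. SketchLock and the $state array are hypothetical substitutes for the Drupal lock backend and State API and do not mirror their interfaces. The hash step is simplified to SHA-256 over the head hash plus a payload string.

```php
<?php
// Sketch only: SketchLock and $state are in-memory stand-ins, not the
// Drupal lock backend or State API.

final class SketchLock {
  private array $held = [];

  public function acquire(string $name): bool {
    if (isset($this->held[$name])) {
      return FALSE;
    }
    $this->held[$name] = TRUE;
    return TRUE;
  }

  public function release(string $name): void {
    unset($this->held[$name]);
  }
}

/** Returns TRUE when the row was chained, FALSE when it was dropped. */
function sketch_chained_write(SketchLock $lock, array &$chains, array &$state, string $chainId, string $payload, float $timeout = 5.0): bool {
  $name = 'audit_trail.write:' . $chainId;
  $deadline = microtime(TRUE) + $timeout;
  while (!$lock->acquire($name)) {
    if (microtime(TRUE) >= $deadline) {
      // Give up instead of blocking the request; make the gap visible.
      $key = 'audit_trail.dropped_under_contention';
      $state[$key] = ($state[$key] ?? 0) + 1;
      return FALSE;
    }
    usleep(10000);
  }
  try {
    // SELECT head -> compute hash -> INSERT, all under the per-chain lock.
    $rows = $chains[$chainId] ?? [];
    $head = $rows ? $rows[count($rows) - 1]['hash'] : '';
    $chains[$chainId][] = ['previous_hash' => $head, 'hash' => hash('sha256', $head . $payload)];
    return TRUE;
  }
  finally {
    $lock->release($name);
  }
}

$lock = new SketchLock();
$chains = [];
$state = [];
sketch_chained_write($lock, $chains, $state, 'finance', 'first');
sketch_chained_write($lock, $chains, $state, 'finance', 'second');
```

Because the lock name embeds the chain id, a writer stuck on one chain does not delay writers on another.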

Storage layout¶

A single Drupal-managed table audit_trail holds all chains. Each row carries:

Column	Notes
`id`	Serial primary key, monotonically increasing
`created`	Microsecond Unix timestamp (string, 16 chars)
`channel`	Originating PSR-3 channel
`chain`	Chain group id (defaults to channel)
`severity`	RFC 5424 numeric (0=emergency .. 7=debug)
`action`	Event verb (create / update / delete / …); used by the entries listing
`resource`	Subject identifier (e.g. `entity:node/42`, `webdav:files/contract.docx`)
`context_permanent`	JSON-encoded permanent-bucket payload; signed raw in `hash`
`context_transient`	JSON-encoded transient-bucket payload; NULLed at retention by the transient-purge cron pass (the cleared range is attested by a transient-purge segment). When the operator opts out of transient-purge, the raw bytes are preserved into the archive NDJSON envelope at archive time and live there until file-purge; the column itself is gone after live-purge regardless. Signed via `context_transient_hash` (only the hash is in the canonical, never the raw bytes)
`context_transient_hash`	SHA-256 of `context_transient` at write time; empty string when the bucket had no content
`secret_id`	Integer id of the `audit_trail_secret` entity that produced `hmac`
`previous_hash`	`hash` of the previous row in the same `chain` (empty at genesis)
`hash`	SHA-256 of `canonical(payload columns)` — publicly verifiable
`hmac`	HMAC-SHA-256(hash, secret(`secret_id`)) — operator-verifiable

A compound (chain, id) index covers the hot read path (fetching a chain's head for the next INSERT, walking a chain for verification). A UNIQUE index on (chain, previous_hash) enforces the no-fork invariant at the schema level. Secondary indexes on channel, created, and action support filtering on the admin entries page.

`audit_trail_segment.lifecycle_hmac` default value¶

The lifecycle_hmac column defaults to '' (empty string) at the schema level. This is defensive: if a code path ever inserts an archive bookkeeping row without computing the HMAC, the column ends up non-NULL but empty, and the verifier treats an empty lifecycle_hmac as an invalid signature (an HMAC-SHA-256 digest is always 64 hex characters, never the empty string).

In practice every code path that creates an archive record also computes the HMAC inline, so the default is unreachable. It stays in the schema as belt-and-suspenders insurance.

Write-path index trade-off¶

audit_trail carries six secondary indexes plus the (chain, previous_hash) UNIQUE key. On high-write workloads (e.g. 1 000 rows/second sustained into a single chain) the bottleneck is index maintenance — each INSERT updates seven B-trees.

For 1.0 the index set is what every operator wants by default (filter by channel, severity, action, created, chain). Sites that profile a hot write path and confirm specific filters are never used can DROP those indexes via a custom hook_update_N; the chain semantics don't depend on the secondary indexes, only on (chain, previous_hash) (UNIQUE) and (chain, id) (compound). A future settings flag for the secondary index set is on the roadmap if real operators ask for it.

`id` column ceiling¶

The id column is declared as serial unsigned, which on MySQL / MariaDB resolves to INT UNSIGNED — maximum 4,294,967,295 rows. For typical Drupal sites this is fine; for high-volume audit deployments (IoT telemetry, high-frequency financial logs) the ceiling is ~12 years at one million rows per day.
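The ceiling estimate is plain arithmetic and can be re-run for other write rates. The helper name is hypothetical, and the sketch assumes a 64-bit PHP build so the ceiling fits in an integer.

```php
<?php
// Sketch only: years until an INT UNSIGNED id column (2^32 - 1) is
// exhausted at a constant daily write rate.

function sketch_years_to_ceiling(int $rowsPerDay, int $maxId = 4294967295): float {
  return $maxId / $rowsPerDay / 365.25;
}

printf("%.2f years at 1,000,000 rows/day\n", sketch_years_to_ceiling(1000000));
printf("%.2f years at 10,000,000 rows/day\n", sketch_years_to_ceiling(10000000));
```

At one million rows per day this works out to roughly 11.8 years, consistent with the ~12 year figure above.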

Operators approaching the ceiling have two paths:

Archive aggressively. Auto-archive + live-purge moves old rows out of the live table; ids of purged rows are not reused, but they no longer occupy table space.
Migrate to bigserial (planned for 2.0). No update hook is shipped in 1.0 because the schema change is expensive on large tables and no live deployment is anywhere near the ceiling.

A second table, audit_trail_checkpoint, records verification checkpoints — (chain, last_id, last_hash, created, hmac, secret_id) rows minted each time the verifier walks a chain cleanly to its head. The next incremental walk starts after last_id and uses last_hash as its expected_previous, bounding verification cost to events accumulated since the last checkpoint. The checkpoint's own HMAC is recomputed before trust; a forged checkpoint falls back to a full walk from genesis. The checkpoints are an optimization layer: lose them and a full walk from genesis still proves chain integrity. See verification for the operational pattern.
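The checkpoint fallback can be sketched as a function that picks the walk's starting point. The function names are hypothetical, and the set of fields covered by the checkpoint HMAC (and their pipe-joined encoding) is an assumption; the text only states that the HMAC is recomputed before the checkpoint is trusted.

```php
<?php
// Sketch only: which fields the checkpoint HMAC covers is an assumption.

function sketch_checkpoint_hmac(array $checkpoint, string $secret): string {
  $message = implode('|', [
    $checkpoint['chain'],
    $checkpoint['last_id'],
    $checkpoint['last_hash'],
    $checkpoint['created'],
  ]);
  return hash_hmac('sha256', $message, $secret);
}

/** Picks where the next walk starts: after a trusted checkpoint, or at genesis. */
function sketch_walk_start(?array $checkpoint, array $secrets): array {
  $genesis = ['after_id' => 0, 'expected_previous' => ''];
  if ($checkpoint === NULL) {
    return $genesis;
  }
  $secret = $secrets[$checkpoint['secret_id']] ?? NULL;
  if ($secret === NULL) {
    return $genesis;
  }
  if (!hash_equals(sketch_checkpoint_hmac($checkpoint, $secret), $checkpoint['hmac'])) {
    // Forged or corrupted checkpoint: fall back to a full walk.
    return $genesis;
  }
  return ['after_id' => $checkpoint['last_id'], 'expected_previous' => $checkpoint['last_hash']];
}

$secrets = [1 => 'operator-key'];
$checkpoint = [
  'chain' => 'finance',
  'last_id' => 120,
  'last_hash' => hash('sha256', 'row-120'),
  'created' => '1700000000000000',
  'secret_id' => 1,
];
$checkpoint['hmac'] = sketch_checkpoint_hmac($checkpoint, $secrets[1]);
print_r(sketch_walk_start($checkpoint, $secrets));
```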

A third table, audit_trail_segment, records the lifecycle state of every contiguous chain id range that has been processed in some way (archived, transient-purged, live-purged, file-purged). One row per range, created lazily on the first lifecycle op that touches the range and surviving the full lifecycle (including file-purge) so the verifier can bridge across long-purged ranges via the row's anchor_before / anchor_after columns. Each lifecycle transition records its timestamp + the audit_trail.id of the chain event that attested it (segment_archived, segment_transient_purged, segment_live_purged, segment_file_purged); the chain event carries the segment id in resource='segment:<N>' and the verifier cross-checks both directions of the mutual reference. Three independent HMACs cover different concerns on the row: hmac (identity, sealed at creation), archive_hmac (archive content, sealed at archive op), lifecycle_hmac (re-signed on every state mutation under the currently-active secret).

A fourth table, audit_trail_acknowledgment, records operator attestations that a specific row range is known to be unverifiable (e.g. signed by a now-deleted secret, or restored from a backup with no chain link). The verifier silently skips acknowledged ranges and reports them in the verdict.

Retention lifecycle¶

Four cron passes carry rows through retention, gated by four thresholds clocked from the row's created timestamp:

transient-purge  →  archive  →  live-purge  →  file-purge
   ↑                  ↑              ↑              ↑
 transient_purge   archive_after   live_purge    file_purge
 _after (opt-in)   (required)      _after        _after

The settings form enforces transient_purge_after < archive_after < live_purge_after < file_purge_after at submit time. Transient-purge runs first in the cron tick because it must: once a row is archived, the NDJSON file has frozen whatever transient state the live row was in. A late transient-purge (after archive) would NULL only the live column while the file still holds the bytes — defeating the operator's data-minimization intent.

Each pass writes a chained event row attesting its transition (segment_transient_purged, segment_archived, segment_live_purged, segment_file_purged). The audit_trail_segment bookkeeping row records the matching *_at timestamp + the chained event's audit_trail.id in the *_event_id column. Both writes happen inside one DB transaction so a process crash between them can't persist a "ghost segment" (state flag set, event id zero).

transient_purge_after is the only optional threshold. An empty value disables the pass entirely; archives written for that chain carry the raw context_transient bytes into the NDJSON envelope as a sibling of the canonical payload (see below). Operators who DO configure transient-purge can also disable it per chain via the zero-duration sentinel PT0S on the per-chain form.
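The ordering rule enforced by the settings form can be sketched as a validator. The function name is hypothetical, and expressing each threshold as an integer number of seconds (with NULL for a disabled transient-purge) is an assumption about representation, not the form's actual storage format.

```php
<?php
// Sketch only: thresholds are plain seconds here; NULL means the optional
// transient-purge pass is disabled.

function sketch_threshold_errors(?int $transientPurge, int $archive, int $livePurge, int $filePurge): array {
  $errors = [];
  if ($transientPurge !== NULL && $transientPurge >= $archive) {
    $errors[] = 'transient_purge_after must be shorter than archive_after';
  }
  if ($archive >= $livePurge) {
    $errors[] = 'archive_after must be shorter than live_purge_after';
  }
  if ($livePurge >= $filePurge) {
    $errors[] = 'live_purge_after must be shorter than file_purge_after';
  }
  return $errors;
}

$day = 86400;
print_r(sketch_threshold_errors(30 * $day, 90 * $day, 365 * $day, 3650 * $day));
print_r(sketch_threshold_errors(120 * $day, 90 * $day, 365 * $day, 3650 * $day));
```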

Segments as the lifecycle unit¶

Each pass operates on segments — contiguous id ranges recorded in audit_trail_segment — not on individual rows. A segment is just a bookkeeping row pinning [from_id, to_id] plus the lifecycle stamps (archived_at, transient_purged_at, live_purged_at, file_purged_at). Rows themselves stay in audit_trail; the segment table records which ranges have been processed in which way.

A segment is bare when all four stamps are zero (it exists but no lifecycle op has touched the range yet), transient-purged when transient_purged_at != 0, archived when archived_at != 0, and so on. Stamps accrue in the order transient_purged_at → archived_at → live_purged_at → file_purged_at; each lifecycle pass advances segments that match its prerequisite stamp pattern.
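As a reading aid, the stamp pattern can be turned into a label. The function name and the choice to report only the most advanced stamp are assumptions; the stamps themselves are not mutually exclusive.

```php
<?php
// Sketch only: reports the most advanced lifecycle stamp set on a segment.

function sketch_segment_state(array $segment): string {
  if ($segment['file_purged_at'] != 0) {
    return 'file-purged';
  }
  if ($segment['live_purged_at'] != 0) {
    return 'live-purged';
  }
  if ($segment['archived_at'] != 0) {
    return 'archived';
  }
  if ($segment['transient_purged_at'] != 0) {
    return 'transient-purged';
  }
  return 'bare';
}

$segment = [
  'from_id' => 1,
  'to_id' => 400,
  'transient_purged_at' => 0,
  'archived_at' => 0,
  'live_purged_at' => 0,
  'file_purged_at' => 0,
];
echo sketch_segment_state($segment), "\n";
```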

Chain writes are tied to lifecycle transitions, not segment existence. Creating a bare segment is silent — the bookkeeping row lands in audit_trail_segment but no chain event is written. The first lifecycle pass that advances the segment (transient-purge, archive, …) is what emits the segment_<verb> chain event and stamps the matching *_event_id column. The verifier discovers a segment via the row that references it (via the row's context_transient_hash covering rule, or the segment's anchor_before / anchor_after pinned in archive_hmac), not via a "segment created" event — there isn't one.

A direct consequence: bare segments without lifecycle stamps carry no chain reference. The application code can DELETE a redundant bare bookkeeping row without leaving any chain reference dangling, because none was ever written. (In practice the cron pipeline does not exercise this anymore -- ensureSegmentCoverage() treats existing segments as obstacles and slices the new mints around them rather than subsuming them. The safe-to-DELETE shape remains a property of the data model, not an action the cron passes take.)

The cron pipeline shares one mental model:

Cover orphan rows (rows past the earliest-needed retention cutoff not yet inside any segment). The coverage pass groups them into closed granularity buckets and mints one bare segment per uncovered gap via ChainArchiver::ensureSegmentCoverage(). Coverage is the SOLE minter of bare segments in the cron pipeline.
Walk existing segments in the prerequisite lifecycle state past each downstream pass's cutoff, and advance them by calling the matching ChainArchiver method. Each downstream pass (transient-purge, archive, live-purge, orphan-heal, file-purge) only walks; none mints.

Operator-driven entry points (drush audit_trail:archive, the segments admin Save action, the tugboat seeder) also call ensureSegmentCoverage() directly when they need bares outside the cron cadence.

Restore is a one-way ratchet, not a lifecycle escape¶

ChainArchiver::restore() brings an archived segment's rows back into audit_trail for incident response or operator inspection. On a successful restore, the segment row's live_purged_at is cleared back to 0 and live_purged_event_id is re-pointed at the freshly-minted segment_restored chain event. lifecycle_hmac is re-signed under the restored event's signing secret. Three things follow:

Retention re-applies. The cron live-purge pass selects segments via live_purged_at = 0 AND to_created < cutoff; restored segments match again, so the next eligible tick re-DELETEs the rows and stamps a fresh segment_live_purged event. Restored rows can't dodge the live_purge_after ceiling.
Operators inspecting restored rows must pause cron (or temporarily lift live_purge_after). With cron running, the rows are re-purged on the next tick that finds the segment past the live_purge_after cutoff.
Lifecycle stamps don't progress strictly monotonically anymore. The original transient_purged_at → archived_at → live_purged_at → file_purged_at ordering invariant pins the first time each transition fires; restore + re-purge can produce a chain with multiple segment_live_purged events for the same segment, in id-ascending order, interleaved with a segment_restored event. The segment row carries the LATEST live-purge transition; the chain history carries the full sequence.

The verifier's segment-event cross check is aware of this shape -- live_purged_event_id on the segment row is allowed to point at any later same-chain, same-segment event whose action is segment_live_purged or segment_restored. See docs/verification.md for the lookup rule and what tampers it still catches. file_purged_at is left untouched on restore -- if the file was already file-purged before restore (only possible via the override_path import path), the cron file-purge pass's own file_purged_at = 0 filter prevents a second file-purge from firing.
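The "retention re-applies" behavior follows directly from the selection predicate and can be sketched in a few lines. The function names and array shapes are hypothetical. The sketch implements only the live_purged_at = 0 AND to_created < cutoff filter quoted above and omits any other prerequisites the real pass checks, as well as the lifecycle_hmac re-signing.

```php
<?php
// Sketch only: segments are plain arrays; other prerequisites of the real
// live-purge pass and the lifecycle_hmac re-signing are omitted.

function sketch_live_purge_candidates(array $segments, int $cutoff): array {
  return array_values(array_filter($segments, function (array $segment) use ($cutoff): bool {
    return $segment['live_purged_at'] === 0 && $segment['to_created'] < $cutoff;
  }));
}

/** Mirrors the two restore-side mutations described above. */
function sketch_mark_restored(array $segment, int $restoredEventId): array {
  $segment['live_purged_at'] = 0;
  $segment['live_purged_event_id'] = $restoredEventId;
  return $segment;
}

$cutoff = 2000;
$purged = ['id' => 7, 'to_created' => 1000, 'live_purged_at' => 1500, 'live_purged_event_id' => 41];
$restored = sketch_mark_restored($purged, 58);

echo count(sketch_live_purge_candidates([$purged], $cutoff)), "\n";
echo count(sketch_live_purge_candidates([$restored], $cutoff)), "\n";
```

A purged segment is skipped by the filter, while the same segment after restore matches again, which is why restored rows are re-purged unless cron is paused.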

Restore ordering: event first, rows second, segment row last¶

restore() is a three-step orchestration guarded by two upfront checks; each step has a specific concurrency contract.

Pre-flight -- refuse upfront when the segment is already in the finalized-restored state (live_purged_at = 0 AND live_purged_event_id != 0). This prevents a double-call on a healthy segment from emitting an orphan segment_restored event before Step 2's PK collision rolls it back. The importFromFile-then-restore path is preserved because a freshly-imported segment has live_purged_event_id = 0.
Step 1 -- emit the segment_restored chain event under a brief chain-write lock. The lock guards the chain anchor + per-chain id sequence for the single INSERT, then releases. Step 0 (the in-flight-event check) runs inside the lock so two concurrent restore() calls on the same segment can't both pass the check and both emit; the second caller throws on the in-flight detection (the crashed-mid-restore signature that the pre-flight check cannot detect, because the segment row's lifecycle pointer hasn't moved yet).
Step 2 -- replay archived rows + acks inside a DB transaction with no chain-write lock held. Logger writes on the same chain proceed unblocked; their AUTO_INCREMENT ids never collide with the explicit archived ids (which are always below the live head). A mid-loop throw rolls the transaction back so partial INSERTs don't persist.
Step 3 -- reacquire the chain-write lock briefly, reload the segment row, sign lifecycle_hmac against the FRESH lifecycle fields, then UPDATE the four restore-mutated columns (live_purged_at = 0, live_purged_event_id = restored event id, lifecycle_secret_id, lifecycle_hmac). The lock serializes Step 3 against filePurge(), the only sibling lifecycle op whose preconditions are satisfied during the Step 1 -> Step 3 window -- a concurrent file-purge would otherwise leave lifecycle_hmac signed against stale file_purged_at = 0 values.

The chain attestation is therefore committed BEFORE rows land in the live table. This is the property that lets Step 2 run without holding the chain-write lock: every intermediate state (Step 1 done / Step 2 partially done / Step 2 fully done / Step 3 done) verifies cleanly against the existing verifier rules, because:

The verifier does not require rows in [from_id, to_id] to be absent while segment.live_purged_at != 0. It walks visible rows and validates their hash + previous_hash linkage.
segment_restored is a non-cross-checkable lifecycle action; the verifier doesn't strict-equal it against any segment-row column.
The existing live-purge supersession rule already accepts segment.live_purged_event_id pointing at a later segment_restored event (post-Step-3 state). Pre-Step-3 (segment row still references the old purge event) is the baseline strict-match case.

If Step 1 succeeds but Step 2 or Step 3 fails, the segment is in a partially-applied state: the chain records the restore intent, but the rows either rolled back (Step 2 failure) or landed without the segment-row finalization (Step 3 failure). A retry of restore() is refused by Step 0 -- the operator must verify chain integrity, audit_trail row presence in the archived window, and the segment row state, then finalize directly (re-run the missing step manually under SQL). The restore() API does not offer an automated resume path; the partial-state error message names the emitted event id and the [from_id, to_id] window so the operator knows exactly what to inspect.

Coverage pass: minting bares around granularity buckets¶

segment_granularity (hour / day / week / month) controls the bare segment width minted by the coverage pass. hour is intended for staging / testing workflows where waiting a full day for the first segment to materialize is impractical; production installs typically run with day or coarser. The coverage pass groups orphan rows into closed buckets -- buckets whose end-of-window has passed the coverage cutoff -- and mints one bare segment per uncovered gap inside each closed bucket via ensureSegmentCoverage(). Downstream passes (transient-purge, archive, live-purge, file-purge) operate on segments coverage produced; none of them mint.

                            cutoff (earliest needed)
                              ↓
 time  ──── bucket A ──── │ ── bucket B (still open) ── …
        ┌──────────────┐  │   ┌────────────────────┐
 rows   │ R1 R2 R3 R4  │  │   │ R5 R6 R7           │
        └──────────────┘  │   └────────────────────┘
              ↓ closed    │         ↓ open
        mint one bare     │   skip -- a future row may
        segment over      │   still land in bucket B
        [R1.id, R4.id]    │   before its window closes
        ↓
  segment A: bare (no lifecycle stamps yet)
        ↓
  later cron tick past transient_purge_after:
  transientPurgePass() advances segment A,
  stamps transient_purged_at and emits chain event.

The coverage cutoff slides:

With transient_purge_after set, coverage uses transient_purge_after_us -- rows enter segments just before transient-purge needs them.
Without transient_purge_after, coverage falls back to archive_after_us -- rows enter segments just before archive needs them.

The "closed bucket" requirement matters because a still-open bucket may receive new rows in the next cron tick, which would either land in a row id below the bare's from_id (impossible -- ids are monotonic per chain) or require amending an immutable segment (forbidden by the chain attestation). Closed-bucket minting guarantees the bucket boundaries match what archive will later turn into a single NDJSON file.
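The closed-bucket rule can be sketched as a grouping pass. This is an illustration under simplifying assumptions, not ensureSegmentCoverage(): the function name is hypothetical, timestamps are integer seconds rather than microsecond strings, buckets are fixed-width windows aligned to the epoch, and existing segments are ignored.

```php
<?php
// Sketch only: groups orphan rows into fixed-width buckets and mints one
// bare [from_id, to_id] range per bucket whose window ended at or before
// the cutoff. Existing segments are not considered.

function sketch_mint_bares(array $rows, int $cutoff, int $width = 86400): array {
  $buckets = [];
  foreach ($rows as $row) {
    $start = intdiv($row['created'], $width) * $width;
    if ($start + $width > $cutoff) {
      // Bucket still open: a future row may land in it.
      continue;
    }
    if (!isset($buckets[$start])) {
      $buckets[$start] = ['from_id' => $row['id'], 'to_id' => $row['id']];
    }
    else {
      $buckets[$start]['from_id'] = min($buckets[$start]['from_id'], $row['id']);
      $buckets[$start]['to_id'] = max($buckets[$start]['to_id'], $row['id']);
    }
  }
  ksort($buckets);
  return array_values($buckets);
}

$rows = [
  ['id' => 1, 'created' => 10],
  ['id' => 2, 'created' => 80000],
  ['id' => 3, 'created' => 90000],
];
print_r(sketch_mint_bares($rows, 86400));
```

With the cutoff at the end of the first day, only rows 1 and 2 are covered; row 3 waits until its own bucket closes.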

NDJSON envelope shape¶

Each archive file is a sequence of newline-delimited JSON envelopes. Three envelope types appear, in this order:

{"payload": <canonical>, "transient": <raw|null>, "type": "row"}
…
{"payload": <ack-canonical>, "type": "ack"}
…
{"payload": <archive-record-canonical>, "type": "archive_record"}

row envelopes carry one audit row each, ordered by ascending id. payload is the canonical form the row's hash was computed over (channel, chain, severity, action, resource, context_permanent, context_transient_hash, created, secret_id, previous_hash), byte-identical to what the live verifier reconstructs from the DB row. transient rides outside payload because the canonical / hash / HMAC layer was sealed at write time and cannot include bytes that the cron transient-purge pass may later drop. The verifier hash-binds the side-channel back to the canonical via the already-signed context_transient_hash: when transient is non-null, sha256(transient) == payload.context_transient_hash must hold; otherwise the verifier reports tampering at that row.

The transient field is optional. An envelope without it is read as "no forensic side channel available"; the verifier falls back to the segment-coverage rule on the live row.

ack envelopes carry one acknowledgment each, snapshotted at archive time so the chain can be verified against a restored copy without consulting the live audit_trail_acknowledgment table.
The archive_record envelope is the last line and carries the metadata the verifier needs to rebuild the audit_trail_segment row from the file alone after a total-DB-loss scenario: chain, range, anchors, row + ack counts, range timestamps, signing secret id, and the segment's identity HMAC.

The whole file's SHA-256 (over data lines + footer) is signed into audit_trail_segment.archive_hmac at archive time; any byte-level edit anywhere in the file is detectable independent of the per-row HMAC layer.
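The per-row side-channel check can be sketched for a single NDJSON line. The function name and return labels are hypothetical, and the sketch assumes transient is carried as the raw JSON string of the transient bucket; it does not cover the whole-file archive_hmac check.

```php
<?php
// Sketch only: return labels are hypothetical, and "transient" is assumed
// to be the raw JSON string of the transient bucket.

function sketch_check_row_envelope(string $line): string {
  $envelope = json_decode($line, TRUE, 512, JSON_THROW_ON_ERROR);
  if (($envelope['type'] ?? NULL) !== 'row') {
    return 'not-a-row';
  }
  $transient = $envelope['transient'] ?? NULL;
  if ($transient === NULL) {
    // No side channel: fall back to the segment-coverage rule.
    return 'no-side-channel';
  }
  $expected = $envelope['payload']['context_transient_hash'];
  return hash_equals($expected, hash('sha256', $transient)) ? 'bound' : 'mismatch';
}

$raw = '{"ip":"203.0.113.7","uid":7}';
$line = json_encode([
  'payload' => ['context_transient_hash' => hash('sha256', $raw)],
  'transient' => $raw,
  'type' => 'row',
], JSON_UNESCAPED_SLASHES);
echo sketch_check_row_envelope($line), "\n";
```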

Forensic envelope¶

Every chained row carries a small envelope of forensic metadata that the framework attaches automatically, without a bridge or contributor needing to opt in. It lands in context_transient so the regular retention contract applies: a GDPR purge clears the actor / IP / request URI along with the rest of the row's transient payload.

Key	Source (in order)	Purpose
`uid`	`$context['uid']` if pre-stamped by `LoggerChannel`, else `currentUser->id()`	Actor identity at write time. `0` for anonymous.
`request_uri`	`$context['request_uri']` if pre-stamped, else `request_stack->getCurrentRequest()`	Path the actor was on when the row landed. Empty on non-HTTP paths.
`ip`	`$context['ip']` if pre-stamped, else `$request->getClientIp()`	Client IP. Empty outside HTTP.
`message_template`	Always the PSR-3 message string the logger received	Lets the entry-detail page re-substitute placeholders at view time.

Drupal core's LoggerChannel::log() stamps uid, ip, and request_uri onto the PSR-3 context before any handler sees the entry, so the framework's first lookup almost always wins. The fallback path covers drush invocations, kernel tests, and any direct call into the audit_trail.logger service that bypasses LoggerChannel.

The envelope is not pluggable: a bridge cannot register additional forensic fields. The fixed schema keeps the column shape predictable and the GDPR purge contract auditable — every transient column NULL-out wipes the same set of fields regardless of which bridge produced the row. Bridges that want additional attribution emit it through a contributor (subject to the same retention rules) or as plain context keys (which land in transient alongside the envelope).

Snapshot-delta bucket format¶

Contributors that snapshot the state of an audited subject on each event (entity create/update/delete, webdav lock/copy/move, etc.) historically emitted dual snapshots under two top-level bucket keys:

{"before": { … }, "after": { … }}

The bucket is normalized on the write path into a compact self-describing shape that keeps every field value at most once and carries diff hints alongside the current state:

{
  "_v": 1,
  "state": {
    "extra": "x",
    "status": 1,
    "tags": ["a", "b"],
    "title": "New"
  },
  "delta": {
    "new": ["extra"],
    "original": {
      "title": "Old",
      "old_field": "old_value"
    }
  },
  "key_order": ["title", "status", "old_field", "tags", "extra"]
}

_v is the wire-format version. Readers consult it before trusting the rest of the shape so future format evolutions can ship without rewriting existing rows.
state is the current state of every field after the event. Updated fields keep their post-event value here; newly-added fields too; unchanged fields just sit alongside with no annotation.
delta.new is the list of field names that appeared for the first time on this event. Values for those fields live in state — the list is names-only.
delta.original is the sparse map of previous values for fields whose before-value differed from state. For updated fields, the current value lives in state; for removed fields, the key is absent from state entirely and the value lives only here.
key_order is the merged before/after iteration order — state keys plus any removed-field keys inserted at their natural before-position. Survives canonical JSON encoding because arrays preserve order; object keys would otherwise be alphabetized at storage time.
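The fold from a before/after pair into this shape can be sketched as follows. This is not the module's SnapshotDelta::compute(): the function name is hypothetical, and the key_order merge rule used here (before-order first, newly added keys appended) is an assumption that happens to reproduce the example above.

```php
<?php
// Sketch only: not the module's SnapshotDelta::compute(). The key_order
// merge rule (before-order first, new keys appended) is an assumption.

function sketch_fold(array $before, array $after): array {
  $new = [];
  $original = [];
  foreach ($after as $field => $value) {
    if (!array_key_exists($field, $before)) {
      $new[] = $field;
    }
    elseif ($before[$field] != $value) {
      // Loose comparison: changed field, keep the previous value.
      $original[$field] = $before[$field];
    }
  }
  foreach ($before as $field => $value) {
    if (!array_key_exists($field, $after)) {
      // Removed field: its value survives only in delta.original.
      $original[$field] = $value;
    }
  }
  $keyOrder = array_keys($before);
  foreach ($new as $field) {
    $keyOrder[] = $field;
  }
  return [
    '_v' => 1,
    'state' => $after,
    'delta' => ['new' => $new, 'original' => $original],
    'key_order' => $keyOrder,
  ];
}

$before = ['title' => 'Old', 'status' => 1, 'old_field' => 'old_value', 'tags' => ['a', 'b']];
$after = ['title' => 'New', 'status' => 1, 'tags' => ['a', 'b'], 'extra' => 'x'];
$bucket = sketch_fold($before, $after);
echo json_encode($bucket, JSON_UNESCAPED_SLASHES), "\n";
```

The state keys come out in insertion order here; the alphabetized order shown in the example is produced later by canonical JSON encoding.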

Write-path overview:

flowchart LR
  CB["Contributor returns<br/>{before, after, ...}"]
  EB["AuditTrailLogger::<br/>encodeBucket()"]
  V{has _v?}
  SC["SnapshotDelta::<br/>compute()"]
  JSON["canonical JSON<br/>{_v, state, delta, key_order, ...}<br/>→ context_*"]
  CB --> EB
  EB --> V
  V -->|yes,<br/>opt-out| JSON
  V -->|no| SC
  SC --> JSON

Field classification at render time, given the shape above:

Field is	Classification
In `delta.new` AND `state`	added
In `delta.original` AND `state`	changed (before in `delta.original`, after in `state`)
In `delta.original` only	removed (value in `delta.original`)
In `state` only	unchanged
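The table translates into a small lookup. The function name and the 'unknown' label for fields present nowhere in the bucket are hypothetical; the four classifications follow the table.

```php
<?php
// Sketch only: classifies one field of a folded bucket per the table above.

function sketch_classify(array $bucket, string $field): string {
  $state = $bucket['state'] ?? [];
  $new = $bucket['delta']['new'] ?? [];
  $original = $bucket['delta']['original'] ?? [];
  $inState = array_key_exists($field, $state);

  if ($inState && in_array($field, $new, TRUE)) {
    return 'added';
  }
  if (array_key_exists($field, $original)) {
    return $inState ? 'changed' : 'removed';
  }
  return $inState ? 'unchanged' : 'unknown';
}

$bucket = [
  '_v' => 1,
  'state' => ['extra' => 'x', 'status' => 1, 'tags' => ['a', 'b'], 'title' => 'New'],
  'delta' => [
    'new' => ['extra'],
    'original' => ['title' => 'Old', 'old_field' => 'old_value'],
  ],
  'key_order' => ['title', 'status', 'old_field', 'tags', 'extra'],
];
foreach ($bucket['key_order'] as $field) {
  echo $field, ': ', sketch_classify($bucket, $field), "\n";
}
```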

Per-action semantics:

Create — emitted as state only (no delta). The row's action=create column implies "everything is new".
Delete — emitted as state only, carrying the pre-delete snapshot. The row's action=delete column implies "everything was removed".
Update — emitted as state + delta blocks for whatever changed.

Equality¶

Field comparison uses loose !=. Drupal field storage routinely surfaces purely cosmetic scalar-type drift ('10000.00' vs 10000, '1' vs 1) across read/write round-trips that strict comparison would flag as a real change. PHP 8+ tightened loose equality enough that 0 == '' returns FALSE, so the most notorious historical type-juggling gotchas no longer apply.

Contributors that need domain-specific equality semantics (fuzzy floats, whitespace-normalized strings, EXIF-stripped image bytes) pre-compute the snapshot-delta shape themselves and emit the resulting _v / state / delta / key_order keys directly. The framework's auto-fold leaves any bucket already carrying a _v key untouched.

Contributor contract¶

The simplest path: emit before and/or after as top-level peers in a bucket. The framework's SnapshotDelta::compute() folds them into the canonical shape at write time, in AuditTrailLogger::encodeBucket(). Existing contributors that already use the dual-snapshot pattern get the new wire format with zero plugin code change.

Contributors that need stricter equality opt out by emitting the canonical shape directly (with _v at the top); the auto-fold leaves their bucket alone.

The HMAC secret¶

Secrets are first-class config entities: each audit_trail_secret entity carries an integer secret_id column matching the per-row secret_id, a key_id referencing a drupal/key Key entity holding the actual bytes, and a lifecycle status (pending / active / retired).

Why Key-backed: the module never stores the secret bytes in its own storage. The Key module's provider plugins decide where bytes live — config storage, environment variable, file outside the webroot, AWS Secrets Manager, HashiCorp Vault, HSM-backed providers, etc. — and the module dispatches through that abstraction at write and verify time.

Rotation¶

Operators create a new pending audit_trail_secret entity backed by a new Key, then activate it (admin form or SecretRepositoryInterface::rotate()). The repository:

Saves the new entity as active.
Iterates other active entities and retires them.

The order is deliberate: a crash between the two saves leaves two active entities (benign — both have valid Key bytes, fresh writes still succeed) rather than zero (which would halt every chained write). getCurrentSecretId() picks the highest-id active when more than one exists, so fresh writes land on the just-promoted secret even in the transient two-active window. Operators converge to the steady state by re-running activate or by calling retire directly.
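The ordering argument can be checked with a small in-memory model. This is a sketch under stated assumptions: $secrets maps secret_id to status, and rotate() and currentSecretId() are hypothetical stand-ins for the repository methods, which save config entities rather than mutate an array.

```php
<?php

/**
 * Activates one secret, then retires every other active one.
 *
 * Hypothetical in-memory model of the two-step rotation.
 */
function rotate(array &$secrets, int $newId): void {
  // Step 1: promote first, so there is never a moment with zero active.
  $secrets[$newId] = 'active';
  // Step 2: retire the rest. A crash before this loop leaves two active.
  foreach ($secrets as $id => $status) {
    if ($id !== $newId && $status === 'active') {
      $secrets[$id] = 'retired';
    }
  }
}

/**
 * Picks the highest-id active secret, or NULL when none is active.
 */
function currentSecretId(array $secrets): ?int {
  $active = array_keys($secrets, 'active', TRUE);
  return $active ? max($active) : NULL;
}

$secrets = [1 => 'retired', 2 => 'active', 3 => 'pending'];

// Simulate a crash between the two saves: only step 1 has run.
$crashed = $secrets;
$crashed[3] = 'active';
$duringWindow = currentSecretId($crashed);

// A complete rotation converges to a single active secret.
rotate($secrets, 3);
$afterRotation = currentSecretId($secrets);
```

In both the crashed and the completed case, fresh writes resolve to secret 3, which is why the two-active window is harmless.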

Rotation does not rewrite existing rows: every row carries the secret_id it was signed under, and the verifier dispatches per-row. A chain can span any number of rotated secrets without re-signing.
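Per-row dispatch can be sketched with PHP's built-in hash_hmac() and hash_equals(). The function name verifyRowHmac(), the $keys map (standing in for the Key-backed lookup), and the hex encoding of hash and hmac are all assumptions made for illustration.

```php
<?php

/**
 * Verifies one row's HMAC with the secret it was signed under.
 *
 * Hypothetical sketch: $keys maps secret_id => key bytes.
 */
function verifyRowHmac(array $row, array $keys): bool {
  if (!isset($keys[$row['secret_id']])) {
    // Unknown secret: the row cannot be verified.
    return FALSE;
  }
  $expected = hash_hmac('sha256', $row['hash'], $keys[$row['secret_id']]);
  // hash_equals() compares in constant time.
  return hash_equals($expected, $row['hmac']);
}

$keys = [1 => 'old-secret-bytes', 2 => 'new-secret-bytes'];

// Two rows of one chain, signed before and after a rotation.
$rows = [];
foreach ([1, 2] as $secretId) {
  $hash = hash('sha256', 'payload-' . $secretId);
  $rows[] = [
    'secret_id' => $secretId,
    'hash' => $hash,
    'hmac' => hash_hmac('sha256', $hash, $keys[$secretId]),
  ];
}

// Every row verifies against its own secret; nothing was re-signed.
$allValid = array_reduce(
  $rows,
  fn(bool $ok, array $row): bool => $ok && verifyRowHmac($row, $keys),
  TRUE
);
```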

Drop-in via the `logger` service tag¶

AuditTrailLogger implements LoggerInterface and is registered with the logger service tag. Drupal's logger.factory collects every tagged logger and dispatches each log call to all of them — exactly the mechanism dblog and syslog use. No bespoke API, no replacement of core services.

Consumers do not need to know audit_trail exists. A call like:

\Drupal::logger('finance')->notice('Acte signed', [
  'chain' => TRUE,
  'action' => 'state_change',
  'resource' => 'node/' . $nid,
]);

works whether audit_trail is installed or not. With the module installed and the entry's context flag set, the call additionally lands in the chain. Without the module, only dblog / syslog record it.

The canonical write path for modules that audit their own business events is the orchestrator service AuditTrailInterface::event(). It pre-buckets context across the retention tiers via the ContextContributor plugin pipeline, then dispatches through the same logger. Buggy contributor plugins cannot cascade into the caller: every applies() / contribute() call is wrapped in try/catch, and exceptions are surfaced to the audit_trail logger channel without aborting the event.
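The isolation guarantee can be illustrated with a small loop. This is a sketch, not the orchestrator's code: the Contributor interface, collectContext(), the array_merge() merge strategy, and the $logError callback (standing in for the audit_trail logger channel) are hypothetical.

```php
<?php

/**
 * Hypothetical contributor shape; the real plugin interface may differ.
 */
interface Contributor {
  public function applies(array $event): bool;
  public function contribute(array $event): array;
}

/**
 * Collects context from every contributor, isolating failures.
 */
function collectContext(array $event, array $contributors, callable $logError): array {
  $context = [];
  foreach ($contributors as $name => $contributor) {
    try {
      if ($contributor->applies($event)) {
        $context = array_merge($context, $contributor->contribute($event));
      }
    }
    catch (\Throwable $e) {
      // Report and move on; one bad plugin must not abort the event.
      $logError($name . ': ' . $e->getMessage());
    }
  }
  return $context;
}

$good = new class implements Contributor {
  public function applies(array $event): bool { return TRUE; }
  public function contribute(array $event): array { return ['actor' => 'uid:1']; }
};
$buggy = new class implements Contributor {
  public function applies(array $event): bool { throw new \RuntimeException('boom'); }
  public function contribute(array $event): array { return []; }
};

$errors = [];
$context = collectContext(
  ['action' => 'update'],
  ['buggy' => $buggy, 'good' => $good],
  function (string $message) use (&$errors): void { $errors[] = $message; }
);
```

The buggy plugin's exception ends up in $errors while the healthy plugin's context is still collected.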

Hot-path resolution cache¶

Every PSR-3 log call that reaches Drupal's logger pipeline hits AuditTrailLogger::log(). The vast majority of those calls are NOT meant to chain (they have no chain key in context). To keep that fast path cheap, the logger lazy-builds a per-request chain registry on the first log() call and reuses it for every subsequent call.

The registry holds:

entities — every active audit_trail_chain entity, indexed by id.
channel_claim — PSR-3 channel → chain id (first-match wins, deterministic via ksort).
auto_channels — subset of channel_claim restricted to channels that resolve to a mode: auto chain. Drives the implicit-write short-circuit. The default chain contributes its OWN claimed channels (and its id, via channel-as-id); it does NOT auto-claim every other channel via fallback — otherwise a default: mode: auto configuration would chain every log call on the site, including unrelated dblog noise like PHP deprecations.
any_auto — TRUE when at least one active entity is in mode: auto. When FALSE, every implicit log call exits immediately.
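A rough model of how the four registry keys could be derived from the active chains. The input shape (each chain as an array with channels and mode), the mode name explicit, and the choice to ksort by chain id are assumptions; the channel-as-id rule for the default chain is left out for brevity.

```php
<?php

/**
 * Builds the per-request registry from active chain definitions.
 *
 * Hypothetical model: each chain is ['channels' => string[], 'mode' => string].
 */
function buildRegistry(array $chains): array {
  // Sort by chain id so first-match-wins is deterministic.
  ksort($chains, SORT_STRING);
  $channelClaim = [];
  $autoChannels = [];
  $anyAuto = FALSE;
  foreach ($chains as $id => $chain) {
    $isAuto = $chain['mode'] === 'auto';
    $anyAuto = $anyAuto || $isAuto;
    foreach ($chain['channels'] as $channel) {
      if (isset($channelClaim[$channel])) {
        // An earlier chain already claimed this channel.
        continue;
      }
      $channelClaim[$channel] = $id;
      if ($isAuto) {
        $autoChannels[$channel] = $id;
      }
    }
  }
  return [
    'entities' => $chains,
    'channel_claim' => $channelClaim,
    'auto_channels' => $autoChannels,
    'any_auto' => $anyAuto,
  ];
}

$registry = buildRegistry([
  'finance' => ['channels' => ['finance', 'billing'], 'mode' => 'auto'],
  'default' => ['channels' => ['billing', 'system'], 'mode' => 'explicit'],
]);
```

In this example default sorts first and wins the billing channel, so only finance ends up in auto_channels.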

The implicit-write short-circuit:

log(level, msg, context):
  if context.chain === FALSE                                → drop
  if context.chain not set:
    if !any_auto                                            → drop
    if channel not in auto_channels                         → drop
  resolveChain → write

On a site with zero mode: auto chains, every dblog-style log call exits in two array reads — no entity load, no foreach, no ksort.
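The short-circuit translates to a few lines of PHP. This is a sketch over a prebuilt registry array, not the logger's actual code: shouldChain() is a hypothetical name, and treating any explicit non-FALSE chain value as "proceed to resolution" is an assumption.

```php
<?php

/**
 * Decides whether a log call should reach the chain writer.
 *
 * Hypothetical sketch; $registry uses the any_auto / auto_channels keys.
 */
function shouldChain(string $channel, array $context, array $registry): bool {
  if (isset($context['chain'])) {
    // Explicit opt-out drops; any other explicit value proceeds.
    return $context['chain'] !== FALSE;
  }
  // Implicit call: two array reads decide the outcome.
  return $registry['any_auto'] && isset($registry['auto_channels'][$channel]);
}

$registry = ['any_auto' => TRUE, 'auto_channels' => ['finance' => 'finance']];
$noAuto = ['any_auto' => FALSE, 'auto_channels' => []];

$implicitAuto = shouldChain('finance', [], $registry);
$implicitNoise = shouldChain('php', [], $registry);
$optedOut = shouldChain('finance', ['chain' => FALSE], $registry);
$explicit = shouldChain('php', ['chain' => TRUE], $noAuto);
$noAutoSite = shouldChain('finance', [], $noAuto);
```

Only $implicitAuto and $explicit are TRUE. The other three calls are dropped before any chain resolution happens.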

Lifecycle: the registry follows Drupal's static config-entity cache. Saving a chain entity invalidates via the ChainRegistryInvalidator hook so the next log() call in the same request picks up the change.