
Verifying integrity

Verification proves — at a chosen moment in time — that no row in a chain has been altered, removed or inserted since it was written.

The service

Drupal\audit_trail\AuditTrailVerifier exposes three core methods:

$verifier = \Drupal::service('audit_trail.verifier');

// Discover.
$chain_ids = $verifier->listChains();
// → ['notarial', 'webdav', 'finance', …]

// Per chain.
$result = $verifier->verifyChain('notarial');
// → [
//     'ok' => TRUE,
//     'count' => 4827,
//     'first_broken_id' => NULL,
//     'message' => 'Chain "notarial" verified: 4827 entries intact.',
//   ]

// All chains in one call.
$all = $verifier->verifyAll();
// → ['notarial' => […], 'webdav' => […], …]

verifyChain walks the chain in id order. For each row it checks three layers, in order:

  1. Chain link. Confirms row.previous_hash matches the previous row's hash column. Catches inserted, removed, or reordered rows.
  2. Public hash. Recomputes SHA-256(canonicalize(payload)) and compares with row.hash in constant time. The canonical payload is built from the row's channel, chain, severity, action, resource, context_permanent, context_transient_hash, created, secret_id, and previous_hash columns. The raw context_transient is not part of the canonical payload (only its write-time SHA-256 hash is), which is what lets the cron purge worker NULL the transient column at retention without breaking verification. This layer is publicly verifiable: no secret is required, and anyone with read access to the row can reproduce the check.
  3. Operator HMAC. Loads the secret keyed by row.secret_id from the configured SecretRepository (Key-module backend), recomputes HMAC-SHA-256(row.hash, secret) and compares with row.hmac in constant time (hash_equals). Catches rows inserted directly into the DB by an attacker who has table write access but lacks the signing secret.
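Layers 2 and 3 can be sketched in plain PHP. This is a minimal illustration, not the module's code: canonicalize() here uses JSON over a fixed column order as a stand-in for the real canonical encoding, and the function names and sample values are hypothetical.

```php
<?php
declare(strict_types=1);

// Columns the text lists as part of the canonical payload.
const CANONICAL_COLUMNS = [
  'channel', 'chain', 'severity', 'action', 'resource',
  'context_permanent', 'context_transient_hash', 'created',
  'secret_id', 'previous_hash',
];

// ASSUMPTION: JSON over a fixed column order stands in for the module's
// real canonical encoding, which is not specified here.
function canonicalize(array $row): string {
  $payload = [];
  foreach (CANONICAL_COLUMNS as $column) {
    $payload[$column] = $row[$column] ?? NULL;
  }
  return json_encode($payload, JSON_THROW_ON_ERROR);
}

// Layer 2: public hash. No secret is involved.
function publicHashMatches(array $row): bool {
  return hash_equals(hash('sha256', canonicalize($row)), $row['hash']);
}

// Layer 3: operator HMAC over the stored row hash.
function operatorHmacMatches(array $row, string $secret): bool {
  return hash_equals(hash_hmac('sha256', $row['hash'], $secret), $row['hmac']);
}

// Sign a sample row the way a writer would (illustrative values only).
$secret = 'example-operator-secret';
$row = [
  'channel' => 'audit', 'chain' => 'notarial', 'severity' => 6,
  'action' => 'document_signed', 'resource' => 'node:42',
  'context_permanent' => '{"uid":7}',
  'context_transient' => '{"ip":"203.0.113.9"}',
  'context_transient_hash' => hash('sha256', '{"ip":"203.0.113.9"}'),
  'created' => 1700000000, 'secret_id' => 1, 'previous_hash' => '',
];
$row['hash'] = hash('sha256', canonicalize($row));
$row['hmac'] = hash_hmac('sha256', $row['hash'], $secret);

// An edited column leaves the stored hash stale.
$tampered = ['action' => 'document_deleted'] + $row;
// A retention purge NULLs the raw transient column only.
$purged = ['context_transient' => NULL] + $row;
```

In this sketch the untouched row passes both checks, $tampered fails the public hash, and $purged still passes because the raw transient column never enters the canonical payload.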

The split between layers 2 and 3 is deliberate: layer 2 surfaces tampering even when the row's secret is unavailable (rotated out or deleted); layer 3 detects unsigned forgeries. A mismatch at layer 1 or 2 surfaces as a structural break; a mismatch at layer 3 (or a missing secret) surfaces as an authentication break. The verifier reports both flags independently on the verdict.

The first mismatch wins — the result reports the row id and a human-readable diagnostic. Subsequent rows are not walked (the chain is broken from that point downstream anyway).
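The "first mismatch wins" behaviour can be illustrated with a toy walk over the chain-link layer only. The helper name, the empty-string genesis value, and the row shape are assumptions made for this sketch, not the module's internals.

```php
<?php
declare(strict_types=1);

/**
 * Walks rows in id order and stops at the first broken chain link.
 * Hypothetical helper: only layer 1 (previous_hash linkage) is modelled.
 */
function walkChainLinks(array $rows, string $genesis_previous_hash = ''): array {
  usort($rows, fn (array $a, array $b) => $a['id'] <=> $b['id']);
  $expected = $genesis_previous_hash;
  $count = 0;
  foreach ($rows as $row) {
    if (!hash_equals($expected, $row['previous_hash'])) {
      // Stop here: rows downstream of the break are not walked.
      return ['ok' => FALSE, 'count' => $count, 'first_broken_id' => $row['id']];
    }
    $expected = $row['hash'];
    $count++;
  }
  return ['ok' => TRUE, 'count' => $count, 'first_broken_id' => NULL];
}

// Build a three-row toy chain.
$rows = [];
$previous = '';
foreach ([1, 2, 3] as $id) {
  $hash = hash('sha256', "row-$id|$previous");
  $rows[] = ['id' => $id, 'previous_hash' => $previous, 'hash' => $hash];
  $previous = $hash;
}

$clean = walkChainLinks($rows);
// Remove row 2: row 3 now points at a hash that no longer precedes it.
$with_gap = walkChainLinks([$rows[0], $rows[2]]);
```

Here $clean reports three intact entries, while $with_gap reports the break at id 3 after one verified entry.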

Per-row secret_id dispatch means a single chain can span any number of rotated secrets without breaking verification — older rows verify under their original signing secret, newer rows under the current one. A row referencing a secret id that no longer exists (retired secret, or a forged value) is reported as "secret #N not available" at the row's id, which the operator investigates as either a legitimate retirement (cross-check WORM archive) or a forgery attempt.
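A rough sketch of per-row secret dispatch follows. It assumes an in-memory array in place of the SecretRepository, and the function name and the messages other than "secret #N not available" are hypothetical.

```php
<?php
declare(strict_types=1);

// ASSUMPTION: $secrets_by_id is a plain array standing in for the
// SecretRepository lookup keyed by row.secret_id.
function verifyHmacWithDispatch(array $row, array $secrets_by_id): array {
  $id = $row['secret_id'];
  if (!isset($secrets_by_id[$id])) {
    return ['ok' => FALSE, 'message' => "secret #{$id} not available"];
  }
  $expected = hash_hmac('sha256', $row['hash'], $secrets_by_id[$id]);
  return hash_equals($expected, $row['hmac'])
    ? ['ok' => TRUE, 'message' => 'HMAC verified']
    : ['ok' => FALSE, 'message' => 'HMAC mismatch'];
}

$secrets = [1 => 'old-secret', 2 => 'current-secret'];

// Helper that signs a row hash under a given secret id.
$sign = fn (string $hash, int $secret_id): array => [
  'hash' => $hash,
  'secret_id' => $secret_id,
  'hmac' => hash_hmac('sha256', $hash, $secrets[$secret_id]),
];

$old_row = $sign(hash('sha256', 'a'), 1);  // written before rotation
$new_row = $sign(hash('sha256', 'b'), 2);  // written after rotation
$orphan = ['secret_id' => 7] + $old_row;   // references an unknown secret
```

Both $old_row and $new_row verify against the same repository, while $orphan produces the "secret #7 not available" diagnostic rather than a plain mismatch.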

What a clean walk proves

  • No row between the genesis row and the chain head has been edited (column tampering would change the recomputed hash, and with it the HMAC computed over that hash).
  • No row has been inserted in the middle (a new row's previous_hash would mismatch the surrounding rows).
  • No row has been removed (the row after the deletion would have a previous_hash pointing to a vanished hash).

Verification does NOT prove:

  • That a row at the chain head wasn't WRITTEN by a forger who has the secret. (The signature is symmetric; possession of the secret + database write access lets you produce a chain that validates.) Mitigation: external WORM export, RFC 3161 timestamps on batch boundaries — see security.
  • That every audit-worthy action made it into the chain. (A bug in the consumer that silently drops calls, or a code path that bypasses \Drupal::logger(), is invisible to AuditTrailVerifier.)

Segment-event cross-reference

For each segment_* event the verifier walks (segment_archived, segment_transient_purged, segment_live_purged, segment_file_purged), it confirms a mutual reference with the segment row the event names:

  • The chain event's resource field must be segment:<id>.
  • The matching audit_trail_segment.<transition>_event_id column must point back at the event's id.

A mismatch surfaces as "segment row may have been rolled back to a pre-transition state" in broken_ranges. This catches the canonical rollback tamper: an attacker with DB write access but no operator secret who tries to undo a lifecycle transition by clearing <transition>_event_id back to 0.
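The mutual reference can be sketched as a single predicate. This assumes the column name is derived from the action by dropping the segment_ prefix (consistent with live_purged_event_id, but the other column names are not confirmed here); the function name and row shapes are hypothetical.

```php
<?php
declare(strict_types=1);

function segmentCrossReferenceHolds(array $event, array $segment): bool {
  // ASSUMPTION: the column name is the action minus its "segment_" prefix,
  // e.g. segment_live_purged -> live_purged_event_id.
  $transition = substr($event['action'], strlen('segment_'));
  $column = $transition . '_event_id';

  // The event must name the segment, and the segment must point back.
  $event_names_segment = $event['resource'] === 'segment:' . $segment['id'];
  $segment_points_back = ($segment[$column] ?? 0) === $event['id'];
  return $event_names_segment && $segment_points_back;
}

$event = ['id' => 120, 'action' => 'segment_archived', 'resource' => 'segment:9'];
$segment = ['id' => 9, 'archived_event_id' => 120];
// The rollback tamper described above: the pointer is cleared back to 0.
$rolled_back = ['id' => 9, 'archived_event_id' => 0];
```

With these values the intact segment satisfies the check and the rolled-back one does not, even though the chain event itself is untouched.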

Live-purge supersession exemption. Restore is a legitimate reversal of a prior live-purge (see architecture.md). When the verifier sees a segment_live_purged event whose id doesn't match the segment's live_purged_event_id, it does one extra O(1) primary-key lookup on the value the segment points at. The mismatch is accepted only when the referenced event:

  1. Exists in audit_trail (rules out rollback-to-zero).
  2. Lives on the same chain as the current walk (rules out cross-chain pointer forging).
  3. Has resource = 'segment:<same_id>' (rules out pointers at events for a different segment).
  4. Has action segment_live_purged OR segment_restored (rules out pointers at archive / file-purge / unrelated events).
  5. Has an id strictly greater than the event under verification (rules out pointers at an older live-purge, which would let an attacker hide a more recent purge).
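The five conditions can be read as one boolean predicate. The sketch below assumes the referenced event has already been loaded by primary key (or is NULL when missing); the function name and array shapes are illustrative, not the module's API.

```php
<?php
declare(strict_types=1);

/**
 * @param array $event
 *   The segment_live_purged event under verification.
 * @param array|null $referenced
 *   The event the segment's live_purged_event_id points at, or NULL.
 */
function supersessionAccepted(array $event, ?array $referenced, string $chain, int $segment_id): bool {
  return $referenced !== NULL                                   // 1. exists
    && $referenced['chain'] === $chain                          // 2. same chain
    && $referenced['resource'] === "segment:{$segment_id}"      // 3. same segment
    && in_array($referenced['action'],
         ['segment_live_purged', 'segment_restored'], TRUE)     // 4. allowed action
    && $referenced['id'] > $event['id'];                        // 5. strictly newer
}

$event = ['id' => 200, 'chain' => 'notarial',
  'action' => 'segment_live_purged', 'resource' => 'segment:9'];
$restore = ['id' => 250, 'chain' => 'notarial',
  'action' => 'segment_restored', 'resource' => 'segment:9'];
$older_purge = ['id' => 150, 'action' => 'segment_live_purged'] + $restore;
$archive = ['action' => 'segment_archived'] + $restore;
```

A later segment_restored on the same chain and segment is accepted; a missing event, an older live-purge, or an archive event is rejected and the strict mismatch stands.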

Forged segment_restored events can't exploit this exemption: the row's HMAC is checked at verifyRow() before the cross-reference helper runs, so an event that was not signed with the operator secret never reaches the exemption code.

Other segment transitions (segment_archived, segment_file_purged, segment_transient_purged) are not reversible by restore, so their strict-equality check stays in force — any rollback tamper on those columns is still surfaced.

Running verification periodically

The standard pattern is a cron-driven verification job that calls drush audit_trail:verify and alerts (via Drupal watchdog → external monitor, or via email, or via a status report block) on any non-ok result. The drush command exits non-zero when any chain breaks, so a one-liner crontab covers the integration:

0 * * * * drush audit_trail:verify || echo "audit_trail:verify exited non-zero" | mail -s "audit_trail alert" admin@example.test

For embedded use (e.g., a custom alerting script that talks to PagerDuty), call the verifier service directly:

$results = \Drupal::service('audit_trail.verifier')->verifyAll();
$bad = array_filter($results, fn ($r) => !$r['ok']);
if ($bad !== []) {
  $msg = "AUDIT_TRAIL INTEGRITY BREAK:\n";
  foreach ($bad as $chain => $r) {
    $msg .= "  - {$chain}: {$r['message']}\n";
  }
  fwrite(STDERR, $msg);
  exit(1);
}
echo "All chains verified.\n";

Performance: incremental verification + checkpoints

A full walk costs O(chain length): fine for hundreds of entries, too slow for the multi-million-entry chains that a long-lived install accumulates. The module keeps verification cheap with per-chain checkpoints:

  • Every time verifyChainIncremental() walks a chain cleanly to its current head, it mints a row in audit_trail_checkpoint recording (chain, last_id, last_hash, created) plus an hmac column signing the tuple.
  • The next call reads the most recent checkpoint for that chain, starts the walk after last_id, expecting last_hash as the genesis-equivalent of previous_hash.
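The resume-from-checkpoint idea can be sketched as follows. Only the chain-link layer is modelled, rows are assumed to arrive sorted by id, and the function name and empty-string genesis value are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Walks only the rows after the checkpoint, treating last_hash as the
 * expected previous_hash of the first new row. Hypothetical helper.
 */
function walkIncremental(array $rows, ?array $checkpoint): array {
  $last_id = $checkpoint['last_id'] ?? 0;
  $expected = $checkpoint['last_hash'] ?? '';
  $count = 0;
  foreach ($rows as $row) {
    if ($row['id'] <= $last_id) {
      continue;  // Already covered by the checkpoint.
    }
    if (!hash_equals($expected, $row['previous_hash'])) {
      return ['ok' => FALSE, 'count' => $count,
        'first_broken_id' => $row['id'], 'checkpoint' => $checkpoint];
    }
    $expected = $row['hash'];
    $last_id = $row['id'];
    $count++;
  }
  // Clean walk to the head: mint a fresh checkpoint tuple.
  return ['ok' => TRUE, 'count' => $count, 'first_broken_id' => NULL,
    'checkpoint' => ['last_id' => $last_id, 'last_hash' => $expected]];
}

// Toy chain of five rows.
$rows = [];
$previous = '';
foreach (range(1, 5) as $id) {
  $hash = hash('sha256', "row-$id|$previous");
  $rows[] = ['id' => $id, 'previous_hash' => $previous, 'hash' => $hash];
  $previous = $hash;
}

$first = walkIncremental(array_slice($rows, 0, 3), NULL);  // rows 1..3
$second = walkIncremental($rows, $first['checkpoint']);    // rows 4..5 only
```

The second call verifies two rows rather than five, which is the saving that makes hourly polling cheap.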

A typical operational pattern:

  • Cron hourly: verifyAll() (default: incremental). Each chain walks only the rows since its last checkpoint — minutes-to-hours of activity, hundreds to thousands of entries at most. Sub-second.
  • Cron weekly (or on-demand): verifyAll(full: TRUE). Full cold walk from genesis to head. Useful as a belt-and-braces check even though incremental walks already validate checkpoint signatures (see below): a full walk re-derives every HMAC from the secret and ignores checkpoints entirely.

Checkpoints are themselves signed: each checkpoint row carries an hmac column over (chain || last_id || last_hash || created), keyed by the secret_id that signed the audit row at last_id. verifyChainIncremental() validates the checkpoint's signature before trusting it. A forged or modified checkpoint fails the check, the verifier falls back to a full walk from genesis, and the result is flagged with checkpoint_forged => TRUE plus a warning in the message so operators can investigate the forgery itself as a security event.
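Checkpoint signing and validation might look roughly like this. The "|" field separator is an assumption (the module's actual byte layout for the tuple is not specified here), and the function names and sample values are hypothetical.

```php
<?php
declare(strict_types=1);

// ASSUMPTION: fields are joined with "|" before signing; the module's real
// encoding of (chain || last_id || last_hash || created) may differ.
function checkpointMessage(array $checkpoint): string {
  return implode('|', [
    $checkpoint['chain'], $checkpoint['last_id'],
    $checkpoint['last_hash'], $checkpoint['created'],
  ]);
}

function signCheckpoint(array $checkpoint, string $secret): array {
  $checkpoint['hmac'] = hash_hmac('sha256', checkpointMessage($checkpoint), $secret);
  return $checkpoint;
}

function checkpointIsAuthentic(array $checkpoint, string $secret): bool {
  $expected = hash_hmac('sha256', checkpointMessage($checkpoint), $secret);
  return hash_equals($expected, $checkpoint['hmac']);
}

// The secret would be the one that signed the audit row at last_id.
$secret = 'example-operator-secret';
$checkpoint = signCheckpoint([
  'chain' => 'notarial', 'last_id' => 4754,
  'last_hash' => hash('sha256', 'row-4754'), 'created' => 1700000000,
], $secret);

// An attacker moves the checkpoint forward to hide rows 4755..4800.
$forged = ['last_id' => 4800] + $checkpoint;

// A failed check means: ignore the checkpoint and walk from genesis.
$fall_back_to_full_walk = !checkpointIsAuthentic($forged, $secret);
```

Because the attacker cannot recompute the hmac without the secret, moving last_id forward invalidates the checkpoint instead of shortening the walk.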

Checkpoints are an optimization, not a source of truth. They speed up the common case (cron polling), but the chain itself is the authoritative record. Lose all checkpoints and a full walk still verifies the chain end-to-end.

For chains expected to outgrow what a single-process walk handles even with weekly full verification (multi-million row, multi-year archives), the roadmap plans for chain rotation (yearly closure + fresh chain) and external WORM export + qualified TSA timestamping — at which point old chains live in a write-once archive and are verified once, against their qualified timestamp, rather than re-walked from the DB.

API

$verifier = \Drupal::service('audit_trail.verifier');

// Incremental — fast, default.
$verifier->verifyChainIncremental('notarial');
// → ['ok' => TRUE, 'count' => 73, 'checkpoint_minted' => TRUE,
//    'message' => 'Chain "notarial" verified incrementally:
//                  73 new entries since last checkpoint at id 4754.
//                  Checkpoint refreshed.']

// Manually mint a checkpoint (independent of verification).
$verifier->mintCheckpoint('notarial');

// Full cold walk — slow, on demand.
$verifier->verifyChain('notarial');

// All chains: incremental by default, pass full: TRUE for cold.
$verifier->verifyAll();
$verifier->verifyAll(full: TRUE);

What a broken chain means

If verifyChain returns ok => FALSE, treat it as a security incident:

  • The break could be benign — a developer ran an ad-hoc SQL UPDATE in dev to fix a typo, a database migration altered the table. Verify the timestamp on the broken row against the local change log.
  • The break could be adversarial — someone with database access edited a row to cover an unauthorized action. Treat as a compromise: rotate the master secret, audit other systems, follow the incident response plan for the deployment.

Either way the chain stays usable for new entries from the break onwards (previous_hash of the next row records the broken row's hash so the chain "heals" from there). But every row from the break to the verification point is now considered unverified — annotate the incident in your audit log, preferably in the same chain (with a chain: TRUE event explaining the discrepancy), so the trace stays self-describing.