Roadmap

Tracked here so the design discussions that fed into this module don't get lost between 1.0 and 2.0.

In the 1.0 line

Chained PSR-3 logger plugged into Drupal's standard logger pipeline via the logger service tag.
Per-chain audit_trail_chain config entities with mode: flag (default) / mode: auto selection, channel claims via channels[] list, per-chain retention overrides, and per-chain contributor pipeline configuration.
Two-tier retention model (context_permanent / context_transient) with hash-only signing of the transient column for GDPR-compatible purge. Segment-row attestation legitimizes the NULL'd column at verify time so attacker-NULL is distinguishable from operator purge.
Publicly-verifiable SHA-256 hash chain plus operator HMAC-SHA-256 layer per row.
Multi-tamper detection in a single verifier walk — every contiguous broken range surfaces in verdict.broken_ranges without operators needing to ack-and-re-verify iteratively.
Schema-level chain-fork prevention via UNIQUE(chain, previous_hash).
Multi-secret rotation via audit_trail_secret config entities backed by drupal/key. Activate-new-first / retire-old-second ordering so a crash mid-rotation leaves two actives (benign) instead of zero (halts writes). Per-row secret_id lets a chain span any number of rotated secrets.
Integrated Key-backed secret storage — the module consults drupal/key providers at write / verify time. Provider choice (config / file / env / cloud-managed / HSM-backed) is the operator's, not the module's.
Signed verification checkpoints in audit_trail_checkpoint for incremental walks.
Auto-archive lifecycle: three cron-driven transitions across four stages (live → archived → live-purged → file-purged), with per-chain overrides and WORM-archive bridging in the verifier.
Operator acknowledgments in audit_trail_acknowledgment for known-unverifiable row ranges (e.g. secrets lost in an incident).
Cron auto-verify with checkpoint minting and a runtime-requirements surface on /admin/reports/status.
Silent-row-drop counter on chain-write lock contention — surfaces as a status-report warning so sustained contention can't be exploited to hide rows.
Permission split: view audit_trail reports (read-only) vs run audit_trail verification (gates the CPU-expensive verify routes) vs administer audit_trail (secrets / chains / acks).
Admin pages: entries listing with side-by-side diff, chain collection, secret list, archive list with per-archive SHA-256 + lifecycle HMAC verdicts, ack list, settings form.
Bundled submodules: audit_trail_entity (generic entity events bridge), audit_trail_entity_paragraphs (paragraph ancestry contributor), audit_trail_tsa (RFC-3161 TSA timestamping). WebDAV bridging is a separate top-level contrib module (audit_trail_webdav) released independently.
Drush commands: audit_trail:verify, audit_trail:auto-archive, audit_trail:archive, audit_trail:purge, audit_trail:archive-restore, audit_trail:timestamp (TSA), audit_trail:rotate-secret / audit_trail:retire-secret (lifecycle).
Comprehensive test coverage: 185+ tests across unit / kernel / functional, covering happy-path writes, every tampering shape (row edit / delete / insert / hmac forge / chain-link break), multi-tamper detection, secret-rotation atomicity, contributor isolation, byte-stability of canonicalize(), TSA auth-options threading, lock-contention drop counter, chain-archive lifecycle, ack workflow.
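
The SHA-256 chain, the per-row HMAC layer, and the single-walk multi-tamper report listed above can be illustrated with a minimal, language-neutral sketch in Python. It is not the module's PHP implementation: the genesis marker, the JSON-based canonicalize() stand-in, the row field layout, and the choice to hash previous_hash plus the canonical payload are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed genesis marker, not the module's actual value


def canonicalize(payload: dict) -> bytes:
    # Stand-in for a byte-stable encoding: sorted keys, no extra whitespace.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def row_hash(previous_hash: str, payload: dict) -> str:
    # Public layer: anyone can recompute this without a secret.
    return hashlib.sha256(previous_hash.encode() + canonicalize(payload)).hexdigest()


def sign(secret: bytes, digest: str) -> str:
    # Operator layer: HMAC-SHA-256 over the row hash.
    return hmac.new(secret, digest.encode(), hashlib.sha256).hexdigest()


def append(chain: list, payload: dict, secret_id: str, secrets: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    digest = row_hash(prev, payload)
    chain.append({
        "previous_hash": prev,
        "payload": payload,
        "hash": digest,
        "secret_id": secret_id,  # lets one chain span rotated secrets
        "hmac": sign(secrets[secret_id], digest),
    })


def verify(chain: list, secrets: dict) -> list:
    """Walk once; return every contiguous range of failing row indexes."""
    broken_ranges, start = [], None
    expected_prev = GENESIS
    for i, row in enumerate(chain):
        expected_hmac = sign(secrets[row["secret_id"]], row["hash"])
        ok = (
            row["previous_hash"] == expected_prev
            and row["hash"] == row_hash(row["previous_hash"], row["payload"])
            and hmac.compare_digest(row["hmac"], expected_hmac)
        )
        if not ok and start is None:
            start = i
        elif ok and start is not None:
            broken_ranges.append((start, i - 1))
            start = None
        # Resynchronize on the stored hash so one bad row does not
        # cascade into flagging every later row.
        expected_prev = row["hash"]
    if start is not None:
        broken_ranges.append((start, len(chain) - 1))
    return broken_ranges
```

In this sketch, tampering with two non-adjacent rows yields two separate ranges from one verify() call, because the walk resynchronizes on each row's stored hash instead of stopping at the first failure.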

Near-term (1.x point releases)

Archive list lazy-load — the archive collection page recomputes SHA-256 on each visible NDJSON file at render time. Lazy-load the per-row badges via AJAX so the page becomes interactive in O(1) instead of O(archived files).
audit_trail:purge --resume — pick up after a partial purge run that crashed mid-way, instead of forcing operators to manually inspect the audit_trail_segment table to figure out the right next id range.
JSON-extract functional indexes — for sites that filter the entries list by uid / ip / request_uri on a multi-million-row chain, add generated columns + indexes in a settings-toggle-driven update hook. Default off (most installs don't filter at that scale).
TSA end-to-end test fixture — containerized TSA in audit_trail_tsa/tests/fixtures/ so the cron-driven TSA anchor flow gets covered without mocking the openssl pipeline.
Archive-format versioning column on the audit_trail_segment row: opt-in field declaring the NDJSON encoding (uncompressed | gzip | future variants). Default empty = uncompressed (current behavior); future compression / sharding MRs land once this column is in place. Small change; ships ahead of the first format-changing MR so writer + verifier branch on the column rather than the file content.
Multi-chain archive interleaving test — coverage gap: no kernel test currently exercises archive ops on two chains running concurrently. Pin the chain-write lock's per-chain isolation under interleaved access.
Published docs site — mkdocs build deployed alongside the drupal.org project page so operators get a navigable site instead of having to read the raw markdown. Read-the-Docs or a similar static host; the mkdocs.yml is already configured.
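
The archive-format versioning item above comes down to a reader that branches on a declared format instead of sniffing file bytes. A rough Python sketch of that dispatch follows; the function name read_archive and the exact column values are hypothetical, and the module's real reader would be PHP.

```python
import gzip
import json


def read_archive(raw: bytes, fmt: str) -> list:
    """Decode NDJSON archive bytes according to the declared format value."""
    if fmt in ("", "uncompressed"):
        # Empty column means the current, uncompressed behavior.
        text = raw.decode("utf-8")
    elif fmt == "gzip":
        text = gzip.decompress(raw).decode("utf-8")
    else:
        # Unknown values fail loudly rather than being guessed from content.
        raise ValueError(f"unknown archive format: {fmt!r}")
    return [json.loads(line) for line in text.splitlines() if line]
```

Failing on an unknown value, rather than falling back to content detection, keeps writer and verifier in agreement about what a file is supposed to contain.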

Medium-term (1.x / 2.0 candidates)

drupal/audit_log StorageBackendInterface integration so users of that module can opt into HMAC chaining without leaving their workflow. Bridges the two modules; collaboration rather than competition.
Crypto agility — store the algorithm name alongside each row (hash_algorithm column defaulting to sha-256), let the writer stamp new rows with a fresher algorithm, let the verifier dispatch per-row. Migrates the chain forward without rewriting existing rows when a primitive needs replacing. Required for multi-decade archival retention (notarial 75-year, regulatory 10-year).
Multi-instance distributed locking — Drupal's default database lock backend works fine for single-node deployments. Sites scaling horizontally with no central database lock need a Redis-backed lock. Document the configuration, ship a services.yml example.
audit_trail.id migration to bigserial — the current serial unsigned ceiling at ~4.3 billion rows is comfortable for typical sites but tight for IoT / high-frequency-trading deployments. Schema migration is expensive on large tables; ship in 2.0 with an explicit update hook + operator-visible timing warning.
Plugin-instance cache in AuditTrail::event() — the contributor-pipeline orchestrator currently re-instantiates every enabled #[ContextContributor] plugin on each event() call. Cache instances per chain so a high-volume caller doesn't pay the createInstance() cost per row. Drop in when telemetry shows the per-event microsecond budget matters.
Verifier code split (Walker / CheckpointStore / AcknowledgmentStore) — AuditTrailVerifier is approaching the size where carving out the checkpoint + ack responsibilities into collaborator services would aid navigation. Pure refactor, no behavior change.
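
The per-row dispatch described in the crypto agility item could look roughly like the Python sketch below. Only the hash_algorithm column name and its sha-256 default come from the item itself; the registry, the sha3-256 entry, and the digest_for helper are illustrative assumptions.

```python
import hashlib

DEFAULT_ALGORITHM = "sha-256"

# Map stored column values to hashlib constructors; entries are illustrative.
ALGORITHMS = {
    "sha-256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}


def digest_for(row: dict, material: bytes) -> str:
    """Hash material with whichever algorithm the row was stamped with."""
    # Rows written before the column existed fall back to the default.
    name = row.get("hash_algorithm") or DEFAULT_ALGORITHM
    try:
        factory = ALGORITHMS[name]
    except KeyError:
        raise ValueError(f"unsupported hash algorithm: {name!r}") from None
    return factory(material).hexdigest()
```

Because the verifier looks up the algorithm per row, old rows keep verifying under their original primitive while new rows are stamped with a newer one, and no existing row needs rewriting.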

Long-term (legal-grade archival)

Field-level redaction (GDPR-grade data minimization): wholesale archive + purge handles row-level retention, but doesn't help when ops needs to drop a SINGLE field on a SINGLE row (e.g. erase a PII column under a right-to-erasure request) while keeping the rest of the row verifiable. The plan: a new audit_trail_redaction table recording (chain, row_id, field, original_value_hash, redacted_at, uid, secret_id, hmac). The redacted field is NULLed in the row's context, and the verifier rebuilds the canonical payload by reading the redaction record's original_value_hash in place of the missing field, so the row's stored hash still derives. Operator-driven via drush / UI; redactions are themselves chain-anchored evidence. Cleaner long-term GDPR posture than archive + purge for surgical erasure requests.
Storage backend abstraction — pull the DB write behind a StorageBackendInterface. Plug-ins to consider:
Append-only file backend (logrotate-friendly, harder to tamper at OS level than a DB table).
KMS-sealed backend that signs each row with an HSM-held key (escrowed material).
WORM cloud bucket backend for direct write-once persistence without local archives.
Conditional "forget" stage for old archive bookkeeping rows (see brainstorming below).
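
One way the field-level redaction plan above could keep a row's stored hash derivable is sketched below in Python. It rests on an assumption the roadmap does not state: that the canonical payload commits to each context field through a per-field SHA-256 digest, so a recorded original_value_hash can stand in for a NULLed value. The helper names and the in-memory redactions mapping are hypothetical.

```python
import hashlib
import json


def value_hash(value) -> str:
    # Digest of a single field value (assumed per-field commitment).
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode()).hexdigest()


def row_hash(context: dict, redactions: dict) -> str:
    """Hash a row from per-field digests, using recorded digests for redacted fields."""
    commitments = {}
    for field, value in context.items():
        if value is None and field in redactions:
            # Redacted field: substitute the recorded original_value_hash.
            commitments[field] = redactions[field]
        else:
            commitments[field] = value_hash(value)
    encoded = json.dumps(commitments, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode()).hexdigest()


def redact(context: dict, redactions: dict, field: str) -> None:
    # Record the digest first, then NULL the value in the row's context.
    redactions[field] = value_hash(context[field])
    context[field] = None
```

Under that assumption, a field NULLed together with a redaction record still reproduces the stored hash, while a field NULLed without a record does not, which is what lets the verifier tell an operator redaction from tampering.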

Brainstorming (not committed to a timeline)

Conditional "forget" stage for old archive bookkeeping

The shipped 4-stage lifecycle ends at file-purge: the NDJSON file is gone but the audit_trail_segment bookkeeping row stays forever, providing the bridge anchors the verifier needs to span the empty range.

A 5th "forget" stage that deletes the bookkeeping row is technically safe in one configuration: when the live segment is fully detached from genesis (every row before the live segment has been live-purged), only the archive immediately preceding the live segment is consulted by the verifier. Older archive bookkeeping rows are ornamental — their anchor_after hashes don't line up with any live row's previous_hash.

What you give up by forgetting: public-hash linkage back to genesis. With only the bridge archive kept, you can HMAC-verify "the live chain plus its immediate-pre-live anchor are operator-authentic" — but you can't independently reconstruct the SHA-256 hash chain from genesis anymore. For operators who don't care about multi-decade chain walkability, it's a clean trade.

Why it isn't on the roadmap yet: storage savings are modest (a 10-year monthly-archived chain = ~120 bookkeeping rows, total ~10 KB); the bookkeeping is forensic evidence of how the chain got from genesis to here; the use case is rare. If a concrete operator need surfaces, the rule above is a clean starting point.

Hash-salt secondary HMAC (defense in depth)

Today each row's hmac column signs (row.hash, secret). A secondary HMAC over (row.hash, salt) where salt is a per-secret random byte string would let an operator detect secret-extraction attacks: if an attacker steals the secret bytes and starts forging rows, their forgeries wouldn't carry the secondary HMAC (the salt isn't in the key material; it lives in a separate Key entity). Detection-not-prevention; useful for incident-response posture. Not on the roadmap proper because the threat model it addresses (attacker holds the secret but not the salt) is narrow and operationally fragile (rotating the salt requires re-signing every row, same shape as a full key rotation). Captured here so the conversation doesn't get lost.
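
The detection idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a design: the hmac_salt column name, the tag() helper, and the verdict strings are hypothetical, and both tags are computed as HMAC-SHA-256 over the row hash.

```python
import hashlib
import hmac


def tag(key: bytes, row_hash: str) -> str:
    # HMAC-SHA-256 over the row hash, keyed by either the secret or the salt.
    return hmac.new(key, row_hash.encode(), hashlib.sha256).hexdigest()


def check_row(row: dict, secret: bytes, salt: bytes) -> str:
    primary_ok = hmac.compare_digest(row["hmac"], tag(secret, row["hash"]))
    secondary_ok = hmac.compare_digest(row["hmac_salt"], tag(salt, row["hash"]))
    if primary_ok and secondary_ok:
        return "authentic"
    if primary_ok:
        # Valid primary HMAC without a valid secondary one: whoever wrote
        # the row held the secret but not the salt.
        return "suspected-secret-extraction"
    return "invalid"
```

A row that passes the primary check but fails the secondary one is the signal of interest: it is what a forgery made with a stolen secret, but without the separately stored salt, would look like.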

Submitting to drupal.org

A separate axis from version progression. Items needed to publish on drupal.org as a project:

Drupal.org Security Advisory opt-in: coordinate with the drupal.org security team for SA-CONTRIB coverage. The module's tamper-evidence promise makes covered status particularly valuable. Requires going through the maintainer approval workflow.
hook_uninstall() cleanup verification — confirm every config entity, schema table, State key, and Key reference the module installs is cleanly removed on uninstall, leaving no orphan data on a fresh-uninstall test.
Update-hook chain audit: confirm every update_N is reversible where possible, and document it as one-way otherwise. No versions have been released yet, so the update-hook surface is empty; this becomes relevant the first time we ship schema changes against a released version.
Cspell project-words cleanup — .cspell-project-words.txt has accumulated entries over development; sanity-check that every word is genuinely necessary (no typos slipped in as permanent dictionary entries).
PHPStan baseline-free check on a fresh checkout — confirm phpstan analyse runs clean without any baseline ignores. Important for drupal.org's static-analysis CI pipeline.
README badges / project metadata for the drupal.org project page — pipeline status, latest tagged release, PHPUnit coverage, Drupal core compatibility matrix.

Out-of-scope (other tools do it better)

Real-time stream into a SIEM — for sites already running Splunk / ELK / Sentry / Datadog Logs, the existing core/syslog + a syslog → SIEM pipe handles this. The chain is a complement (integrity proof on a small subset of events), not a replacement.
Entity revision tracking — that's audit_log's expertise. The two modules can coexist.
Application-layer access control — chain integrity doesn't say anything about whether the actor was authorized to perform the recorded action. That stays the responsibility of the consumer module.
Asymmetric / third-party-verifiable signatures — the module's HMAC layer is symmetric (operator proves authenticity to themselves). For third-party-verifiable provenance, layer TSA timestamps (already shipped) and WORM archival on top. A future module could add an asymmetric signature column for full PKI-grade provenance, but that's a different problem domain.