| Record | Timestamp | System | User | Role | Action | Entity | Field | Reason | IP / Host | Batch |
|---|---|---|---|---|---|---|---|---|---|---|
The premise. Across the last decade of FDA warning letters and 483 observations, the data-integrity findings cited most often — shared logins, after-hours edits without reason, deletion-after-creation patterns, post-batch edits, audit-trail discontinuity, vague or missing reasons for change — are all mechanically detectable from audit-trail exports. Most sites don't look. AuditTrail Sentinel looks.
Pure Python, ~1,400 LOC, fully unit-tested. pandas for tabular work, lxml for XML audit-trail exports, regex for pattern detection inside reason-for-change strings, SQLite for inter-rule queries and finding persistence. CLI for batch runs (cron-able), Streamlit front-end for QA review. Designed and documented to run as a controlled, validated utility under GAMP 5 Category 5 with versioned URS, IQ/OQ/PQ, and audit trail of its own findings.
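As an illustration of the regex layer for reason-for-change strings, here is a minimal sketch that sorts a reason into `missing`, `vague`, or `ok`. The pattern list, the `MIN_WORDS` threshold, and the function name `classify_reason` are assumptions made for this example, not the tool's shipped rule set.

```python
import re

# Hypothetical patterns: entries that record that something changed, not why.
VAGUE_PATTERNS = [
    re.compile(r"^(n/?a|none|tbd|\.+|-+|x+)$", re.IGNORECASE),
    re.compile(
        r"^(update[ds]?|change[ds]?|edit(ed)?|fix(ed)?|correct(ed|ion)?|error|typo|test)$",
        re.IGNORECASE,
    ),
    re.compile(r"^per (supervisor|manager|qa)$", re.IGNORECASE),
]
MIN_WORDS = 3  # assumed threshold: shorter reasons are treated as vague


def classify_reason(reason):
    """Return 'missing', 'vague', or 'ok' for a reason-for-change string."""
    text = (reason or "").strip()
    if not text:
        return "missing"
    if any(p.match(text) for p in VAGUE_PATTERNS):
        return "vague"
    if len(text.split()) < MIN_WORDS:
        return "vague"
    return "ok"
```

A word-count floor is a blunt instrument and will flag some legitimate short reasons, so in practice it would be a tunable threshold rather than a hard rule.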
Plugins for each system family — LIMS (LabWare, SampleManager), CDS (Empower, Chromeleon), MES (Werum PAS-X, Aspen PEM), instrument-resident logs (LabX, KQCL, FTIR vendor logs). Each plugin maps the system's native audit-trail schema to a normalized internal record: record_id · sequence_num · timestamp · system · user_id · user_role · action · entity_type · entity_id · field · old_value · new_value · reason_for_change · ip_address · hostname · session_id · batch_id · batch_status.
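The normalized record can be pictured as a frozen dataclass carrying the eighteen fields listed above, with each plugin reduced to a column map plus a conversion step. This is a sketch only: the native column names (`AuditID`, `SeqNo`, and so on) belong to an imaginary export and do not reflect any vendor's real schema, and `normalize` is a hypothetical helper name.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class NormalizedRecord:
    record_id: str
    sequence_num: int
    timestamp: datetime
    system: str
    user_id: str
    user_role: str
    action: str
    entity_type: str
    entity_id: str
    field: Optional[str]
    old_value: Optional[str]
    new_value: Optional[str]
    reason_for_change: Optional[str]
    ip_address: Optional[str]
    hostname: Optional[str]
    session_id: Optional[str]
    batch_id: Optional[str]
    batch_status: Optional[str]


# Hypothetical mapping for an imaginary CDS export (normalized -> native).
EXAMPLE_CDS_MAP = {
    "record_id": "AuditID",
    "sequence_num": "SeqNo",
    "timestamp": "EventTime",
    "user_id": "UserName",
    "user_role": "UserRole",
    "action": "EventType",
    "entity_type": "ObjectType",
    "entity_id": "ObjectID",
    "field": "FieldName",
    "old_value": "OldValue",
    "new_value": "NewValue",
    "reason_for_change": "Comment",
}


def normalize(native_row, column_map, system):
    """Map one native audit-trail row (a dict) to a NormalizedRecord."""
    values = {"system": system}
    for f in fields(NormalizedRecord):
        if f.name == "system":
            continue
        raw = native_row.get(column_map.get(f.name))
        # Unmapped columns and empty strings both become None.
        values[f.name] = raw if raw not in ("", None) else None
    # Conversions fail loudly if the sequence number or timestamp is absent.
    values["sequence_num"] = int(values["sequence_num"])
    values["timestamp"] = datetime.fromisoformat(values["timestamp"])
    return NormalizedRecord(**values)
```

Treating empty strings as `None` means a blank reason field and an absent one look the same to downstream rules, which keeps the "missing reason" check in one place.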
Each rule is a separate Python module with a documented detection function, a unit-test suite, configurable thresholds, and a default severity weight. The orchestrator runs all enabled rules sequentially against the SQLite store, persists findings with cross-references to the source records, and emits a JSON manifest plus a human-readable report. Thresholds and severity weights are set in per-site configuration, so each deployment can tune sensitivity without touching code.
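The rule and orchestrator arrangement described here might look roughly like the sketch below. The `Rule` dataclass, the `run_rules` function, the `findings` table layout, the shape of `site_config`, and the example rule `detect_missing_reason` are all assumptions made for illustration; the human-readable report is omitted.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    rule_id: str
    name: str
    default_weight: float
    # detect(conn, thresholds) returns (record_id, detail) pairs.
    detect: Callable[[sqlite3.Connection, dict], List[Tuple[str, str]]]


def detect_missing_reason(conn, thresholds):
    """Example rule: modifications with no reason for change."""
    rows = conn.execute(
        "SELECT record_id FROM records WHERE action = 'MODIFY' "
        "AND (reason_for_change IS NULL OR TRIM(reason_for_change) = '')"
    ).fetchall()
    return [(r[0], "modification without reason for change") for r in rows]


def run_rules(conn, rules, site_config):
    """Run enabled rules in order, persist findings, return a JSON manifest."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS findings ("
        "rule_id TEXT, record_id TEXT, weight REAL, detail TEXT)"
    )
    manifest = {"rules": []}
    for rule in rules:
        cfg = site_config.get(rule.rule_id, {})
        if not cfg.get("enabled", True):
            continue
        # Site configuration overrides the rule's default weight.
        weight = cfg.get("weight", rule.default_weight)
        hits = rule.detect(conn, cfg.get("thresholds", {}))
        conn.executemany(
            "INSERT INTO findings VALUES (?, ?, ?, ?)",
            [(rule.rule_id, rec_id, weight, detail) for rec_id, detail in hits],
        )
        manifest["rules"].append(
            {"rule_id": rule.rule_id, "weight": weight, "findings": len(hits)}
        )
    conn.commit()
    return json.dumps(manifest, indent=2)
```

Storing the `record_id` with every finding is what gives each finding its cross-reference back to the source record, and keeping the weight on the finding row preserves the value that was in force when the run happened.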
| Rule | Name | Primary ALCOA+ attributes | MHRA 2018 citation |
|---|---|---|---|
| R-001 | Shared Account Detection | Attributable | §6.2 · Access control |
| R-002 | Unauthorized Privilege Escalation | Attributable · Accurate | §6.3 · User access management |
| R-003 | Abnormal Time-Stamp Clustering | Contemporaneous · Accurate | §6.6 · Data review |
| R-004 | Deletion-After-Creation | Original · Enduring | §6.16 · Data lifecycle |
| R-005 | Sequence Gap Detection | Complete | §3.5 · Audit trail completeness |
| R-006 | Post-Batch Edit | Contemporaneous · Enduring | §6.6 · Contemporaneous record |
| R-007 | After-Hours Edit Pattern | Contemporaneous · Attributable | §6.6 · Data review timing |
| R-008 | Audit Trail Gap (System-Wide) | Complete · Enduring | §3.5 · Audit trail continuity |
| R-009 | Original Record Modification | Original · Accurate | §6.16 · Reason for change |
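To make one row of the table concrete, here is a minimal sketch of how a check like R-005 (Sequence Gap Detection) could work. It assumes each source system numbers its audit-trail entries with consecutive integers, which is not true of every system; the function name `find_sequence_gaps` and the `max_missing` threshold are hypothetical.

```python
from collections import defaultdict


def find_sequence_gaps(records, max_missing=0):
    """Find holes in per-system sequence numbering.

    records: iterable of (system, sequence_num) pairs.
    Returns a list of (system, prev_seq, next_seq, missing_count) tuples
    for every jump that skips more than max_missing numbers.
    """
    by_system = defaultdict(set)
    for system, seq in records:
        by_system[system].add(seq)

    gaps = []
    for system in sorted(by_system):
        seqs = sorted(by_system[system])
        # Compare each sequence number with its successor.
        for prev, nxt in zip(seqs, seqs[1:]):
            missing = nxt - prev - 1
            if missing > max_missing:
                gaps.append((system, prev, nxt, missing))
    return gaps
```

Duplicate sequence numbers are collapsed by the set and would need their own check. The sketch also cannot see records missing before the first or after the last exported entry, so the export boundaries would have to be verified separately.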
All records, users, IPs, hostnames, and findings shown above are fabricated for portfolio demonstration purposes. The dataset is generated client-side from a seeded random process designed to plant a known number of violations across each rule pack so the demo produces verifiable, repeatable results. No real GxP audit-trail data is present anywhere in this artifact.
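The idea of a seeded generator that plants a known number of violations can be sketched as follows. This is an illustration in Python, not the demo's client-side generator; the function name `generate_demo_records`, the record fields shown, and the choice of sequence gaps as the planted violation are assumptions.

```python
import random


def generate_demo_records(seed, n_records=200, n_gaps=3):
    """Build fabricated records with exactly n_gaps planted sequence gaps.

    Returns (records, planted), where planted lists the (prev_seq, next_seq)
    pair around each gap so a detector's output can be checked against it.
    """
    rng = random.Random(seed)  # private RNG: the same seed gives the same data
    gap_positions = set(rng.sample(range(1, n_records), n_gaps))

    records, planted = [], []
    seq = 0
    for i in range(n_records):
        seq += 1
        if i in gap_positions:
            skipped = rng.randint(1, 4)
            planted.append((seq - 1, seq + skipped))
            seq += skipped
        records.append(
            {
                "record_id": f"DEMO-{i:05d}",
                "sequence_num": seq,
                "user_id": f"user{rng.randint(1, 8):02d}",  # fabricated
            }
        )
    return records, planted
```

Returning the planted list alongside the data is what makes the result verifiable: a rule either reports exactly those gaps or it does not.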
Planned next: multivariate scoring (instead of per-rule independent weights), an ML-based anomaly layer trained on per-site baselines, automatic ticket creation into the QMS via API, and a Power BI back-end so QA can trend findings the same way they trend deviations. The path is clear; the rule layer comes first because it's deterministic and inspector-explainable.