RzR

Posted on Jun 14

Stop Hand-Rolling Audit Logs in EF Core

#efcore #audit #trail #csharp

Every team that touches regulated data eventually has the same meeting. Someone from compliance asks a deceptively simple question: who changed this customer's email address, when, and what was it before? And the room goes quiet, because the honest answer is that nobody knows. The change happened three sprints ago, the previous value is gone, and the closest thing to a record is a ModifiedBy column that says "system."

If you have lived through that meeting, the rest of this will feel familiar. And if you are the manager who has to explain to the auditor, or to legal, or to a customer why the answer is "we'll have to look into it," it probably stung more.

This post is written for both seats at that table. There is enough detail here for the developer who has to wire it up, and enough framing for the engineering manager who has to decide whether it is worth the team's time. Skip ahead if a section is not yours; the code blocks are for the people writing code, and the build-versus-buy math is for the people approving it.

Most .NET teams solve auditing the slow way. They sprinkle logging calls through their service layer, add CreatedBy and UpdatedBy columns to every table, maybe write a base repository that tries to diff entities by hand. It works until it doesn't. Someone adds a new entity and forgets the audit hook. A bulk update bypasses the service layer entirely. The diff logic captures the new value but not the old one. Six months in, you have audit coverage that is roughly 70 percent complete and 100 percent untrustworthy, which is worse than none at all because it gives you false confidence.

RzR.DataVigil takes a different bet. Instead of asking you to remember to audit, it hooks into the one place every tracked change has to pass through anyway: EF Core's SaveChanges.

The interceptor does the watching

Here is the core idea. EF Core has an interceptor pipeline, and DataVigil plugs into it with AuditSaveChangesInterceptor. When SavingChanges fires, it walks the ChangeTracker, picks out every Added, Modified, or Deleted entry that implements IAuditable, and builds a record of exactly what changed at the property level. Old value and new value, side by side, for every field that moved.

Your entities do not change shape. You do not write diff code. You add a marker interface and the watching happens underneath you.

public class Order : IAuditable
{
    public int Id { get; set; }
    public string CustomerEmail { get; set; }
    public decimal TotalAmount { get; set; }
    public string Status { get; set; }
}

That is the whole entity-side change. From that point on, every create, update, and delete on Order produces an audit transaction. Entities that do not implement IAuditable are ignored completely. The interceptor skips them as it walks the change tracker, so the parts of your model you do not care about pay close to nothing.

The shape of what gets captured is worth pausing on, because it is more honest than most homegrown systems. A create records each new value against a null old value. An update records only the properties that actually changed, with their previous and current values. A delete records the previous values against null. No noise, no rows full of unchanged fields pretending to be changes.
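If you are curious what that looks like mechanically, here is a minimal sketch built on EF Core's public SaveChangesInterceptor base class. It is not DataVigil's source: IAuditableSketch, PropertyChange, and SketchAuditInterceptor are hypothetical names, the changes are only collected in memory, and only the synchronous SavingChanges path is covered. It assumes the Microsoft.EntityFrameworkCore package and C# 9 or later.

```csharp
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;

// Hypothetical marker interface standing in for the library's IAuditable.
public interface IAuditableSketch { }

// One captured property-level change (hypothetical shape).
public sealed record PropertyChange(
    string Entity, string Property, string Action, object? OldValue, object? NewValue);

public sealed class SketchAuditInterceptor : SaveChangesInterceptor
{
    // Collected in memory for the sketch; a real implementation would persist these.
    public List<PropertyChange> Captured { get; } = new();

    public override InterceptionResult<int> SavingChanges(
        DbContextEventData eventData, InterceptionResult<int> result)
    {
        if (eventData.Context is { } context)
        {
            // Entries<T>() yields only tracked entities assignable to the marker type.
            foreach (var entry in context.ChangeTracker.Entries<IAuditableSketch>())
            {
                var entity = entry.Metadata.ClrType.Name;
                foreach (var prop in entry.Properties)
                {
                    var name = prop.Metadata.Name;
                    switch (entry.State)
                    {
                        case EntityState.Added:      // old value is null
                            Captured.Add(new(entity, name, "Created", null, prop.CurrentValue));
                            break;
                        case EntityState.Modified when prop.IsModified: // changed properties only
                            Captured.Add(new(entity, name, "Updated", prop.OriginalValue, prop.CurrentValue));
                            break;
                        case EntityState.Deleted:    // new value is null
                            Captured.Add(new(entity, name, "Deleted", prop.OriginalValue, null));
                            break;
                    }
                }
            }
        }
        return base.SavingChanges(eventData, result);
    }
}
```

An interceptor like this is registered through EF Core's own optionsBuilder.AddInterceptors(...). Note that IsModified is also true for every property after a blanket Update() call, so a production version would likely compare original and current values as well.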

Who, not just what

Capturing the change is half the job. The half that compliance actually cares about is attribution, and this is where most roll-your-own systems fall apart.

DataVigil enriches every transaction with context. In an ASP.NET Core app, you call AddAuditTrailAspNetCore() and it pulls the user identity straight from HttpContext: the NameIdentifier or sub claim for the user ID, the Name claim for the username, and the remote IP address. It looks for an X-Correlation-Id or X-Request-Id header and falls back to the current Activity if neither exists, so the audit record ties back to the exact request that caused it. Trace IDs come along for the ride. You get a record that can answer "who changed what, when, and as part of which request" without you wiring a single thing into your controllers.
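To make that lookup order concrete, here is a rough sketch of the fallback chain described above. It is an illustration, not the library's code: AttributionSketch and both method names are made up, and headers are modelled as a plain dictionary instead of ASP.NET Core's header collection so the snippet runs without the web stack.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Security.Claims;

public static class AttributionSketch
{
    // User ID: prefer the NameIdentifier claim, then fall back to the JWT-style "sub" claim.
    public static string? ResolveUserId(ClaimsPrincipal user) =>
        user.FindFirst(ClaimTypes.NameIdentifier)?.Value
        ?? user.FindFirst("sub")?.Value;

    // Correlation ID: X-Correlation-Id, then X-Request-Id, then the current Activity's ID.
    // HTTP header names are case-insensitive, so pass a dictionary built with
    // StringComparer.OrdinalIgnoreCase.
    public static string? ResolveCorrelationId(IReadOnlyDictionary<string, string> headers)
    {
        if (headers.TryGetValue("X-Correlation-Id", out var correlationId)
            && !string.IsNullOrWhiteSpace(correlationId))
            return correlationId;

        if (headers.TryGetValue("X-Request-Id", out var requestId)
            && !string.IsNullOrWhiteSpace(requestId))
            return requestId;

        // Null when no Activity is active on this execution context.
        return Activity.Current?.Id;
    }
}
```

The point of the ordering is that an ID supplied by the caller wins over one generated locally, which is what lets a single audit row line up with logs from upstream services.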

And before you ask: yes, it works outside the web. Background workers and console apps have no HttpContext, so you drop the ASP.NET Core package and set the user yourself through IAuditScopeContext.

scopeContext.SetUser(new AuditUserInfo
{
    UserId = "worker-1",
    UserName = "BackgroundWorker",
    IpAddress = "127.0.0.1"
});

Now your nightly batch job writes audit records attributed to worker-1 instead of a shrug. That distinction matters more than it sounds. The first time an auditor asks why a thousand records changed at 3 a.m., "the reconciliation worker did it, here is the run" is a much better answer than silence.

GDPR is built in, not bolted on

This is the part that earns the library its place in a regulated codebase, and it is genuinely the strongest thing about it.

Auditing and privacy law are in natural tension. You want to record everything that changed, but you are not allowed to keep a plaintext copy of someone's Social Security number sitting in an audit table forever just because it appeared in an update. DataVigil resolves this with field-level policies that run before anything reaches the database.

options.Gdpr.ForEntity<Customer>(e =>
{
    e.ExcludeOnStorage(c => c.CreditCardNumber);   // never written
    e.MaskOnStorage(c => c.Email);                 // j***n@mail.com
    e.HashOnStorage(c => c.Ssn);                   // SHA-256 digest
    e.AnonymizeOnStorage(c => c.Phone);            // [ANONYMIZED]
});

You decide, per field, whether a value is excluded entirely, masked, hashed, anonymized, or run through a transform you write yourself. The credit card number never lands in the audit store. The email is masked. The SSN is hashed, so you can still prove two records referenced the same value without ever storing the value.
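For a sense of what masking and hashing do to a value, here is a small sketch using only the .NET base class library (Convert.ToHexString needs .NET 5 or later). The post does not show how DataVigil implements these policies, so RedactionSketch and its methods are hypothetical, and the mask format is simply inferred from the j***n@mail.com comment above. One thing to keep in mind: SSNs come from a small keyspace, so a plain unkeyed digest can be reversed by enumeration. The keyed HMAC variant is included as one common mitigation; whether the library salts or keys its hashes is not stated here.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class RedactionSketch
{
    // Keeps the first and last character of the local part:
    // "john@mail.com" becomes "j***n@mail.com".
    public static string MaskEmail(string email)
    {
        var at = email.IndexOf('@');
        if (at < 2)
        {
            // Local part too short to reveal anything safely (or no '@' at all).
            return "***" + (at >= 0 ? email.Substring(at) : string.Empty);
        }
        return $"{email[0]}***{email[at - 1]}{email.Substring(at)}";
    }

    // Plain SHA-256, hex-encoded. Equal inputs give equal digests, which is
    // what lets two audit rows be matched without storing the raw value.
    public static string Sha256Hex(string value)
    {
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(Encoding.UTF8.GetBytes(value)));
    }

    // Keyed variant: without the key, an attacker cannot precompute digests
    // for every possible low-entropy input.
    public static string HmacSha256Hex(string value, byte[] key)
    {
        using var hmac = new HMACSHA256(key);
        return Convert.ToHexString(hmac.ComputeHash(Encoding.UTF8.GetBytes(value)));
    }
}
```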

Then there is a second, separate layer for retrieval. Storing data safely is one thing; controlling who can read it back is another. DataVigil lets you gate fields by role or claim at query time.

options.Gdpr.ForEntity<Customer>(e =>
{
    e.MaskOnRetrieval(c => c.Email, access => access
        .AllowRoles("Admin", "Auditor"));

    e.AnonymizeOnRetrieval(c => c.Ssn, access => access
        .AllowClaim("gdpr", "full"));
});

A support tier-one user querying the audit trail sees masked emails. An auditor with the right role sees them in full. The same stored record, two different views, decided by who is asking. Roles are checked first, then claims, and a field with no match stays masked. You do not build this. You declare it.
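The decision rule in that last paragraph is small enough to sketch. What follows is an assumption-laden illustration of "roles first, then claims, otherwise masked", not the library's internals: FieldAccessRule and RetrievalGateSketch are hypothetical types, and the caller is represented by a standard ClaimsPrincipal.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

// Hypothetical access rule: allowed roles plus allowed (type, value) claims.
public sealed record FieldAccessRule(
    IReadOnlyCollection<string> Roles,
    IReadOnlyCollection<(string Type, string Value)> Claims);

public static class RetrievalGateSketch
{
    // True when the caller may see the unmasked value.
    public static bool CanSeeFullValue(ClaimsPrincipal user, FieldAccessRule rule)
    {
        // Roles are checked first...
        if (rule.Roles.Any(user.IsInRole)) return true;

        // ...then claims.
        return rule.Claims.Any(c => user.HasClaim(c.Type, c.Value));
    }

    // No match means the caller gets the masked form.
    public static string Project(
        string fullValue, string maskedValue, ClaimsPrincipal user, FieldAccessRule rule) =>
        CanSeeFullValue(user, rule) ? fullValue : maskedValue;
}
```

Defaulting to the masked form is the important design choice: a misconfigured or missing rule fails closed instead of leaking the field.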

The piece that will make a compliance officer actually smile is the right-to-erasure support. GDPR Article 17 says a person can demand their personal data be wiped. With most audit systems that is a contradiction, because the whole point of an audit log is that you do not delete from it. DataVigil threads the needle:

await auditStore.AnonymizeByUserAsync("user-123");

That call scrubs the user's identity (UserId, UserName, and IpAddress) across every transaction they ever touched, replacing each with [ERASED], while leaving the structural audit records intact. You satisfy the erasure request and keep a defensible audit trail. Those two goals usually fight each other. Here they do not.
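Conceptually, erasure of this kind is a rewrite of the identity columns and nothing else. Here is a minimal in-memory sketch of the idea; AuditTransactionSketch is a hypothetical record shape (the library's actual type is not shown in this post), and a real store would do this as an UPDATE against its backend.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical audit transaction shape, for illustration only.
public sealed record AuditTransactionSketch(
    Guid Id, DateTime TimestampUtc, string UserId, string UserName,
    string IpAddress, string EntityName);

public static class ErasureSketch
{
    public const string Erased = "[ERASED]";

    // Replaces the identity fields on every transaction made by the given user.
    // Id, timestamp, and entity name are untouched, so the trail keeps its shape.
    public static List<AuditTransactionSketch> AnonymizeByUser(
        IEnumerable<AuditTransactionSketch> transactions, string userId) =>
        transactions
            .Select(t => t.UserId == userId
                ? t with { UserId = Erased, UserName = Erased, IpAddress = Erased }
                : t)
            .ToList();
}
```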

The cost nobody puts on the sprint board

Here is the part for whoever owns the roadmap.

A homegrown audit system rarely shows up as a line item. It arrives as scope creep on a dozen other tickets. A developer adds two hours of audit plumbing to a feature, then another two when the requirements shift, then a day six months later debugging why the trail went silent after a refactor. None of it is tracked as "audit work," so it never gets estimated, and it never gets credited. It just quietly eats velocity, and the bill comes due during a compliance review when you discover the coverage has holes and someone has to spend a sprint backfilling.

The build-versus-buy math here is not close. The capability you would spend weeks building and then maintaining indefinitely (interceptor wiring, property diffing, user attribution, field-level redaction, retrieval gating, erasure, retention) already exists, tested, behind a marker interface and a few lines of registration. The maintenance cost moves off your team and onto the library. When EF Core 10 lands, that is the maintainer's problem to keep current, not a backlog item that competes with your roadmap.

There is a risk angle too, and it is the one that should get a manager's attention. An incomplete audit trail is a liability dressed up as a feature. It tells the organization "we have auditing" while quietly failing to record the exact events that matter most, which is precisely the situation that turns a routine GDPR inquiry into a bad week. Pushing the audit logic down to the interceptor removes the human step where coverage gets forgotten. That is not a convenience. It is the difference between a control you can attest to and one you are hoping holds.

And the GDPR features above are not just developer ergonomics. Field-level masking, hashing, role-gated retrieval, and one-call right-to-erasure are the concrete artifacts you point to when someone asks how you handle personal data in your logs. They turn an abstract policy commitment into something a developer can show working in code, which is exactly what an auditor wants to see and exactly what is hardest to produce after the fact.

Store it wherever you already live

None of this forces a database on you. DataVigil ships four storage providers behind a single IAuditStore interface: SQL Server, PostgreSQL, and MongoDB for the real backends, plus flat JSON files (one per day) for when you just need something quick or local. The SQL providers create their own schema with sensible indexes on timestamp, user, and correlation ID, and you point them at a separate database so a careless migration on your app data cannot take the audit trail down with it.

If none of the four fit, you implement IAuditStore yourself. Four methods: save, query, anonymize by user, purge before a date. That is the entire contract.
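A contract that small is easy to picture. The sketch below shows a four-method store backed by a list, the kind of thing you might use in tests. Treat every signature as a guess: IAuditStoreSketch and AuditEntrySketch are hypothetical stand-ins, and the real IAuditStore will have its own parameter and return types, so check the library before copying any of this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical entry shape, trimmed to what the sketch needs.
public sealed record AuditEntrySketch(DateTime TimestampUtc, string UserId, string EntityName);

// Hypothetical four-method contract; the real IAuditStore signatures may differ.
public interface IAuditStoreSketch
{
    Task SaveAsync(AuditEntrySketch entry);
    Task<IReadOnlyList<AuditEntrySketch>> QueryAsync(Func<AuditEntrySketch, bool> filter);
    Task<int> AnonymizeByUserAsync(string userId);
    Task<int> PurgeBeforeAsync(DateTime cutoffUtc);
}

public sealed class InMemoryAuditStoreSketch : IAuditStoreSketch
{
    private readonly object _gate = new();
    private readonly List<AuditEntrySketch> _entries = new();

    public Task SaveAsync(AuditEntrySketch entry)
    {
        lock (_gate) _entries.Add(entry);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<AuditEntrySketch>> QueryAsync(Func<AuditEntrySketch, bool> filter)
    {
        // Snapshot under the lock so callers never enumerate a list being mutated.
        lock (_gate)
            return Task.FromResult<IReadOnlyList<AuditEntrySketch>>(_entries.Where(filter).ToList());
    }

    public Task<int> AnonymizeByUserAsync(string userId)
    {
        lock (_gate)
        {
            var count = 0;
            for (var i = 0; i < _entries.Count; i++)
            {
                if (_entries[i].UserId != userId) continue;
                _entries[i] = _entries[i] with { UserId = "[ERASED]" };
                count++;
            }
            return Task.FromResult(count); // number of entries rewritten
        }
    }

    public Task<int> PurgeBeforeAsync(DateTime cutoffUtc)
    {
        // Removes entries strictly older than the cutoff and returns how many went.
        lock (_gate)
            return Task.FromResult(_entries.RemoveAll(e => e.TimestampUtc < cutoffUtc));
    }
}
```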

That last method, purge, hooks into a built-in retention service. Set a window, register the background service, and old records get cleaned up on a daily cycle without a cron job you have to babysit.

options.Storage.WithRetention(90); // purge entries older than 90 days
services.AddAuditRetentionService();
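If you have never written one of these loops yourself, this is roughly what a daily retention cycle amounts to. It is a sketch under assumptions, not what AddAuditRetentionService registers: RetentionSketch is a hypothetical helper, the purge is passed in as a delegate, and PeriodicTimer requires .NET 6 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetentionSketch
{
    // Everything strictly older than this instant falls outside the retention window.
    public static DateTime Cutoff(DateTime nowUtc, int retentionDays) =>
        nowUtc.AddDays(-retentionDays);

    // Runs one purge immediately, then one per interval until cancelled.
    public static async Task RunAsync(
        Func<DateTime, CancellationToken, Task> purgeBeforeAsync,
        int retentionDays, TimeSpan interval, CancellationToken ct)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            do
            {
                // Recompute the cutoff on every cycle so the window slides forward.
                await purgeBeforeAsync(Cutoff(DateTime.UtcNow, retentionDays), ct);
            }
            while (await timer.WaitForNextTickAsync(ct));
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path: the host cancelled the token.
        }
    }
}
```

With a 90-day window and a TimeSpan.FromDays(1) interval, this matches the behaviour described above: a purge every day, each one trimming whatever has aged past the window.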

What it actually buys you

The honest pitch is not that DataVigil does something you could never do yourself. You could build all of this. The pitch is that the version you build yourself will have gaps, and the gaps are exactly where the risk lives: the entity someone forgot to hook, the bulk operation that skipped the service layer, the SSN that got logged in plaintext because nobody thought about that field.

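Moving the audit logic down to the interceptor closes the gap by construction for everything that goes through the change tracker: if a write passes through SaveChanges, it is audited whether or not anyone remembered to ask for it. That structural guarantee is the whole thing. It has one boundary worth knowing: set-based operations such as ExecuteUpdate and ExecuteDelete (EF Core 7 and later) and raw SQL sent through ExecuteSqlRaw never touch the change tracker, so no SaveChanges interceptor can see them. Keep those away from audited tables, or record them explicitly. Everything else (the GDPR policies, the multi-store support, the erasure call) is what makes the guarantee livable in a codebase that has to answer to a regulator.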

The library targets .NET Standard 2.1 and works with EF Core 5 through 9, so it slots into nearly anything you are running today. If you have a .NET app that holds data someone might one day ask questions about, the setup is about fifteen lines in Program.cs and a marker interface on your entities.

The next time compliance walks in and asks who changed what, you would rather have the answer ready than promise to go digging. Auditing is one of those things that is cheap to add before you need it and miserable to reconstruct after, which makes it a rare case where the developer's instinct and the manager's risk calculus point in the same direction.

The repo is here, and the wiki walks you from zero to a working trail in about five minutes. Fifteen lines in Program.cs for the team that builds it, one fewer open question for the person who has to answer for it.