Ask a developer what GDPR compliance means and you'll get one of two answers: "we added a cookie banner" or "that's a legal problem, not an engineering problem."
Both are wrong.
GDPR compliance is fundamentally an engineering problem. It requires changes to your database schema, your API layer, your logging infrastructure, your backup strategy, and your deployment pipeline. The cookie banner is maybe 5% of it. The other 95% lives in code that most teams never write — until a Data Protection Authority comes knocking, or a user submits a Subject Access Request and the team realizes they have no way to fulfill it.
Let me walk through what GDPR compliance actually requires at the systems level, with real implementation details.
## The Scope Problem: You Can't Protect What You Can't Map
Before you write a single line of compliance code, you need to answer a deceptively hard question: where does personal data live in your system?
This is data mapping, and it's where most compliance efforts either succeed or fall apart. In a typical SaaS application, personal data doesn't sit neatly in a `users` table. It's scattered across:
- Primary database: user profiles, preferences, billing info
- Application logs: IP addresses, user agents, request paths with query parameters containing emails
- Error tracking: Sentry, Datadog, LogRocket — full stack traces with user context
- Analytics: event streams with user IDs, session recordings
- Email services: SendGrid, Postmark — email addresses, delivery logs, open tracking
- Payment processors: Stripe, Paddle — customer objects, invoice history
- File storage: S3 buckets with user-uploaded content, profile photos
- CDN logs: Cloudflare, CloudFront — IP addresses, geolocation data
- Search indexes: Elasticsearch, Algolia — denormalized user data
- Cache layers: Redis — session data, user objects
- Message queues: RabbitMQ, SQS — events containing user data in transit
- Third-party integrations: CRM systems, support tools like Intercom or Zendesk
- Backups: database snapshots that contain everything above
That's potentially 15+ systems where a single user's personal data exists. GDPR Article 30 requires you to maintain a Record of Processing Activities (ROPA) that documents every one of these, including what data is stored, why, the legal basis for processing, retention periods, and any third-party transfers.
Here's what a minimal data map entry looks like in practice:
```typescript
interface ProcessingActivity {
  system: string;
  dataCategories: string[];       // "email", "ip_address", "name", etc.
  purpose: string;                // "account authentication", "error tracking"
  legalBasis: string;             // "consent", "contract", "legitimate_interest"
  retentionPeriod: string;        // "30 days", "duration of contract + 6 months"
  thirdPartyRecipients: string[]; // "Stripe Inc (US)", "Sentry.io (US)"
  transferMechanism?: string;     // "SCCs", "adequacy decision"
  deletionMethod: string;         // "API call", "automatic TTL", "manual process"
}
```
Most teams don't have this documented. They don't know which systems hold personal data, let alone the retention periods or deletion mechanisms for each one. Building this inventory is the first real engineering task, and it requires touching every service in your architecture.
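Once the inventory exists, it can be checked mechanically. The sketch below assumes the `ProcessingActivity` shape above; `findMappingGaps` is a hypothetical helper, not part of any library, and flags the two omissions that most often block erasure and DSAR workflows later:

```typescript
interface ProcessingActivity {
  system: string;
  dataCategories: string[];
  purpose: string;
  legalBasis: string;
  retentionPeriod: string;
  thirdPartyRecipients: string[];
  transferMechanism?: string;
  deletionMethod: string;
}

// Flag entries that would block an erasure or DSAR workflow later:
// a missing deletion method, or a third-party transfer without a
// documented transfer mechanism.
function findMappingGaps(activities: ProcessingActivity[]): string[] {
  const gaps: string[] = [];
  for (const a of activities) {
    if (!a.deletionMethod.trim()) {
      gaps.push(`${a.system}: no deletion method documented`);
    }
    if (a.thirdPartyRecipients.length > 0 && !a.transferMechanism) {
      gaps.push(`${a.system}: third-party transfer without a mechanism`);
    }
  }
  return gaps;
}
```

Running a check like this in CI keeps the data map honest: a new service either documents its deletion path or fails the build.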
## Right to Erasure: The Hardest Delete You'll Ever Write
GDPR Article 17 gives users the right to request deletion of their personal data. On the surface, this sounds simple: `DELETE FROM users WHERE id = ?`. In practice, it's one of the most complex distributed systems problems you'll face.
Here's why. When a user requests erasure, you need to:
- Delete their data from every system in your data map
- Do it within one month, the Article 12 deadline (commonly tracked as 30 days, extendable for complex requests)
- Prove you did it (audit trail)
- Handle cases where you legally must retain some data (tax records, fraud prevention)
- Propagate the deletion to any third parties you've shared the data with
Let's look at what a real erasure pipeline looks like:
```typescript
interface ErasureRequest {
  id: string;
  userId: string;
  requestedAt: Date;
  deadline: Date; // requestedAt + 30 days
  status: 'pending' | 'processing' | 'completed' | 'partially_completed';
  systems: SystemErasureStatus[];
}

interface SystemErasureStatus {
  system: string;
  status: 'pending' | 'completed' | 'failed' | 'retained';
  retainedReason?: string; // Legal basis for keeping data
  completedAt?: Date;
  error?: string;
}
```
The deletion pipeline needs to be orchestrated — you can't just fire off parallel deletes and hope for the best. Some systems have dependencies:
```typescript
async function processErasureRequest(request: ErasureRequest) {
  const userId = request.userId;

  // Phase 1: Stop processing immediately
  await disableAccount(userId);
  await revokeAllSessions(userId);
  await removeFromActiveQueues(userId);

  // Phase 2: Delete from primary systems
  // Order matters — delete from dependents first
  await deleteFromSearchIndex(userId);  // Algolia/Elasticsearch
  await deleteFromCache(userId);        // Redis sessions, cached objects
  await deleteFromAnalytics(userId);    // Anonymize event streams
  await deleteFromFileStorage(userId);  // S3 uploads, profile photos

  // Phase 3: Delete from third-party services
  await deleteFromEmailService(userId);     // SendGrid contacts
  await deleteFromErrorTracking(userId);    // Sentry user data
  await deleteFromPaymentProcessor(userId); // Stripe (with legal retention check)
  await deleteFromSupportTool(userId);      // Intercom/Zendesk

  // Phase 4: Delete from primary database
  // This goes last because other systems may reference user data
  await deleteFromDatabase(userId);

  // Phase 5: Handle data that must be retained
  await anonymizeRetainedRecords(userId); // Invoices, tax records

  // Phase 6: Log the completion
  await logErasureCompletion(request);
}
```
Each of those functions is its own challenge. Let's look at a couple of the hard ones.
### Anonymizing Instead of Deleting
For financial records (invoices, tax receipts), you're often legally required to retain them for 5-10 years depending on jurisdiction. But you can't keep them with personal data attached. The solution is anonymization:
```typescript
async function anonymizeRetainedRecords(userId: string) {
  // Replace personal data with anonymous identifiers.
  // Keep the financial data intact for tax/audit purposes.
  await prisma.invoice.updateMany({
    where: { userId },
    data: {
      customerName: 'REDACTED',
      customerEmail: 'REDACTED',
      customerAddress: 'REDACTED',
      // Keep: amount, tax, date, invoice number (required for accounting)
    },
  });

  await prisma.transaction.updateMany({
    where: { userId },
    data: {
      userEmail: null,
      userName: null,
      // Keep: amount, date, reference number
    },
  });
}
```
### The Backup Problem
Here's the question that trips up every engineering team: what about backups?
Your database backups contain the user's data. Are you going to restore every backup, delete the user, and re-create the backup? For most teams, that's operationally impossible.
The common approach is to maintain a "tombstone" or exclusion list — a record of deleted user IDs that gets checked whenever a backup is restored. If you restore from a backup, the restoration process must check the erasure log and re-delete any users who were erased after the backup was taken.
```typescript
// On backup restore, run this before the application starts.
// backupTimestamp is when the restored snapshot was taken.
async function reconcileErasures(backupTimestamp: Date) {
  // Users erased after the backup was taken have just had their
  // data restored — it must be deleted again.
  const erasedUsers = await prisma.erasureLog.findMany({
    where: {
      completedAt: { gte: backupTimestamp },
    },
  });

  for (const record of erasedUsers) {
    await processErasureRequest(record);
  }
}
```
This is non-trivial infrastructure. It requires discipline in your backup restoration process and an erasure log that itself is never included in the data that gets deleted.
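The core of the tombstone check can be expressed as a pure function, which makes it easy to test independently of any database. A sketch under assumed names (`ErasureLogEntry` mirrors the erasure log described above):

```typescript
interface ErasureLogEntry {
  userId: string;
  completedAt: Date;
}

// Given the user IDs present in a restored backup and the erasure log,
// return the IDs that must be re-erased: anyone whose erasure completed
// after the backup was taken may have reappeared in the restore.
function usersToReErase(
  restoredUserIds: string[],
  erasureLog: ErasureLogEntry[],
  backupTakenAt: Date
): string[] {
  const tombstones = new Set(
    erasureLog
      .filter((e) => e.completedAt.getTime() > backupTakenAt.getTime())
      .map((e) => e.userId)
  );
  return restoredUserIds.filter((id) => tombstones.has(id));
}
```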
## Consent Management: More Than a Boolean
Most applications store consent as a single boolean field: `marketingConsent: true`. That's insufficient under GDPR. You need:
- Granularity: Separate consent for each processing purpose
- Versioning: What did the consent text say when the user agreed?
- Timestamps: When was consent given or withdrawn?
- Proof: Enough context to demonstrate the consent was freely given and informed
Here's a more complete consent model:
```prisma
model Consent {
  id          String    @id @default(cuid())
  userId      String
  purpose     String    // "marketing_emails", "analytics", "data_sharing"
  granted     Boolean
  version     String    // Version of the consent text shown
  source      String    // "signup_form", "settings_page", "api"
  ipAddress   String?
  userAgent   String?
  grantedAt   DateTime
  withdrawnAt DateTime?
  createdAt   DateTime  @default(now())

  @@index([userId, purpose])
}
```
When a user updates their consent preferences, you don't update the existing record — you create a new one. The full history must be preserved:
```typescript
async function updateConsent(
  userId: string,
  purpose: string,
  granted: boolean,
  context: { ip: string; userAgent: string; source: string }
) {
  // Withdraw the current consent
  await prisma.consent.updateMany({
    where: { userId, purpose, withdrawnAt: null },
    data: { withdrawnAt: new Date() },
  });

  // Create the new consent record
  await prisma.consent.create({
    data: {
      userId,
      purpose,
      granted,
      version: getCurrentConsentTextVersion(purpose),
      source: context.source,
      ipAddress: context.ip,
      userAgent: context.userAgent,
      grantedAt: new Date(),
    },
  });

  // Propagate the change
  if (!granted) {
    await stopProcessingForPurpose(userId, purpose);
  }
}
```
The `stopProcessingForPurpose` function is where it gets real. If someone withdraws consent for marketing emails, you need to immediately unsubscribe them from your email service, remove them from any active email sequences, and ensure no queued emails get sent. If they withdraw consent for analytics, you need to stop tracking their activity and potentially anonymize historical data.
## Audit Logging: Your Compliance Safety Net
GDPR Article 5(2) requires you to demonstrate compliance — the "accountability principle." This means every access, modification, or deletion of personal data should be logged in an immutable audit trail.
This isn't your regular application logging. Audit logs need to be:
- Immutable: append-only, cannot be modified or deleted
- Complete: every read, write, and delete of personal data
- Searchable: you need to find all access to a specific user's data
- Retained: long enough to respond to regulatory inquiries
```typescript
interface AuditLogEntry {
  id: string;
  timestamp: Date;
  actor: {
    type: 'user' | 'system' | 'admin' | 'api_key';
    id: string;
    ip?: string;
  };
  action: 'read' | 'create' | 'update' | 'delete' | 'export' | 'share';
  resource: {
    type: string; // "user_profile", "payment_info", "consent_record"
    id: string;
  };
  dataSubjectId: string; // The user whose data was affected
  details?: Record<string, unknown>; // What changed (old/new values for updates)
  legalBasis?: string;
}
```
Implementing this at the application level means intercepting every database operation that touches personal data. With Prisma, you can use middleware via `$use` (newer Prisma versions replace middleware with client extensions, but the interception pattern is the same):
```typescript
prisma.$use(async (params, next) => {
  const personalDataModels = [
    'User', 'Profile', 'Address', 'PaymentMethod', 'Consent',
  ];

  if (personalDataModels.includes(params.model ?? '')) {
    const result = await next(params);
    await auditLog.create({
      action: mapPrismaAction(params.action),
      resource: {
        type: params.model!,
        id: extractId(params, result),
      },
      actor: getCurrentActor(), // From request context
      dataSubjectId: extractDataSubjectId(params, result),
      timestamp: new Date(),
    });
    return result;
  }

  return next(params);
});
```
The challenge here is performance. You're adding a write operation to every database query that touches personal data. For high-throughput applications, buffer audit events and flush in batches, or push them to a message queue for asynchronous processing.
## Data Subject Access Requests (DSARs)
Under Article 15, any user can request a complete copy of all personal data you hold about them. You have one month to respond (commonly tracked as a 30-day deadline). This means you need a system that can:
- Identify every piece of data associated with a user across all systems
- Compile it into a portable, machine-readable format (typically JSON or CSV)
- Deliver it securely (not via unencrypted email)
```typescript
async function generateDataExport(userId: string): Promise<DataExport> {
  const [
    profile,
    consents,
    orders,
    supportTickets,
    activityLog,
    emailHistory,
    analyticsData,
  ] = await Promise.all([
    exportUserProfile(userId),
    exportConsentHistory(userId),
    exportOrders(userId),
    exportSupportTickets(userId),
    exportActivityLog(userId),
    exportEmailHistory(userId),
    exportAnalyticsData(userId),
  ]);

  return {
    exportDate: new Date().toISOString(),
    dataController: {
      name: 'Your Company Ltd',
      email: 'dpo@yourcompany.com',
    },
    data: {
      profile,
      consents,
      orders,
      supportTickets,
      activityLog,
      emailHistory,
      analyticsData,
    },
  };
}
```
Every system you can delete from, you also need to export from. If your data map is incomplete, your DSAR responses will be incomplete — and that's a compliance failure.
## Data Breach Notification
GDPR Article 33 requires you to notify the relevant supervisory authority within 72 hours of becoming aware of a personal data breach. That's not a lot of time. You need detection systems monitoring for unauthorized access, assessment procedures to determine scope, notification templates ready to go, and documentation infrastructure. All of this needs to exist before anything goes wrong — building it during a breach is too late.
## Why Automation Matters
Look at everything above: data mapping, erasure pipelines, consent management, audit logging, DSAR fulfillment, data processing agreement (DPA) tracking, breach notification. For a small team, building and maintaining all of this is a multi-month engineering project. For a large organization with dozens of services, it's a permanent headcount.
The core problem is that compliance is a continuous process, not a one-time implementation. Regulations change. Your architecture evolves. New services get added. Every change requires updating your data map, adjusting your erasure pipeline, updating your DSAR export, and verifying your audit logging still covers everything.
Manual processes break down at scale. A spreadsheet-based data map goes stale within weeks. A hand-written erasure script misses the new service someone added last sprint.
This is where automation changes the equation. Instead of treating compliance as a project with a finish line, automated platforms keep your compliance posture current as your systems change.
ComplianceBureau automates the heavy lifting: data mapping and processing records, DSAR and erasure request workflows with deadline tracking, consent management with full audit trails, breach notification procedures with regulatory templates, and sub-processor monitoring. It replaces the spreadsheets, manual scripts, and ad-hoc processes with a system that stays current as your architecture evolves.
## Compliance Checklist for Engineering Teams
Use this as a starting point:
### Data Mapping
- [ ] All systems storing personal data are documented
- [ ] Each system has a defined retention period
- [ ] Deletion mechanism documented for each system
- [ ] Third-party processors identified with DPAs in place
### Right to Erasure
- [ ] Erasure pipeline covers all systems in data map
- [ ] Anonymization in place for legally retained records
- [ ] Backup restoration process includes erasure reconciliation
- [ ] Erasure completion is logged with audit trail
### Consent Management
- [ ] Consent recorded per purpose with timestamps
- [ ] Consent text version tracked
- [ ] Withdrawal immediately stops processing for that purpose
- [ ] Full consent history preserved (append-only)
### Audit Logging
- [ ] All personal data access/modification is logged
- [ ] Logs are immutable and retained per policy
- [ ] Logs include actor, action, resource, and timestamp
- [ ] Logs are searchable by data subject ID
### Data Subject Requests
- [ ] DSAR export covers all systems
- [ ] Export delivered in machine-readable format
- [ ] Secure delivery mechanism (not plain email)
- [ ] 30-day deadline tracked with alerts
### Breach Preparedness
- [ ] Unauthorized access monitoring in place
- [ ] Breach assessment procedure documented
- [ ] Notification templates ready for authority and individuals
- [ ] 72-hour notification deadline built into incident response
## The Bottom Line
GDPR compliance is not a cookie banner. It's not a privacy policy page. It's a set of engineering systems that need to be built, maintained, and continuously updated as your application evolves.
Start with data mapping. If you don't know where personal data lives, nothing else matters.
If you'd rather not build all of this from scratch, ComplianceBureau automates GDPR compliance workflows — data mapping, erasure pipelines, consent management, audit trails, and regulatory reporting — so your engineering team can focus on building product instead of compliance infrastructure.