GDPR Data Minimisation: The Principle Most Businesses Ignore (And How to Apply It)
Article 5(1)(c) of GDPR states that personal data shall be "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed." It sounds simple. In practice, almost every business violates it — and most don't even know it.
Data minimisation is the GDPR principle that gets the least attention. Businesses obsess over consent banners and privacy policies while their databases quietly accumulate years of unnecessary personal data that they never use, never review, and never delete.
This is a practical guide to actually applying data minimisation — not as a compliance checkbox, but as an operational discipline that reduces your risk, your storage costs, and your exposure if something goes wrong.
What Data Minimisation Actually Means
GDPR uses three criteria: data must be adequate, relevant, and limited to what is necessary.
Adequate means you collect enough data to actually fulfil the purpose. If you're running a delivery service, you need an address. That's adequate.
Relevant means the data has a logical connection to the purpose. Collecting a user's date of birth to send them a newsletter isn't relevant unless the newsletter is age-restricted.
Limited to what is necessary is the hard one. "Necessary" doesn't mean "useful" or "might be handy someday." It means you cannot achieve the purpose without this specific data. If you could fulfil the purpose without collecting a phone number, collecting it violates minimisation.
The European Data Protection Board (EDPB) has been clear: necessity is a strict test. "It would be nice to have" doesn't pass.
The Common Failure Modes
Collecting "Just in Case"
This is the most common minimisation failure. A form gets built, someone adds a field that "might be useful for sales," and it never gets reviewed. Years later, you have tens of thousands of records with data you've never looked at.
The "just in case" mindset is a direct contradiction of data minimisation. Under GDPR, you cannot collect data speculatively. You need a specific purpose at the time of collection, and the data you collect must be necessary for that purpose — not a potential future purpose you haven't defined yet.
Legacy Fields That Never Got Deleted
Businesses evolve. A field that was necessary five years ago might be obsolete today. But databases don't shrink automatically. Legacy fields persist indefinitely unless someone actively removes them.
This is particularly common with CRM data. Fields added for a product that no longer exists, campaigns that ran in 2019, integrations that got deprecated. The data stays. Nobody has a process to clean it up.
Form Fields Nobody Reviews
When did you last audit your signup form? Your contact form? Your checkout flow? Most businesses set up forms once and never revisit them. Each unnecessary field is a minimisation violation you're committing at scale, with every new user.
Analytics Tracking Everything
Google Analytics, Mixpanel, Amplitude, Segment — these tools can track an enormous amount of user behaviour. And by default, most businesses let them. Every page view, every click, every scroll depth, every session duration.
Not all of it is necessary. If you're tracking 47 custom events but only ever look at 6 of them, you're violating minimisation on the other 41. Data you collect and never use is data you've collected without a purpose.
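One way to find the unused events is to compare what your tracking plan collects with what your reports actually reference. The sketch below is a minimal illustration, assuming you can export both lists by hand or from your analytics tool; the event names and the `unused_events` helper are hypothetical.

```python
def unused_events(tracked: set[str], used_in_reports: set[str]) -> set[str]:
    """Return events that are collected but never consulted in any report."""
    return tracked - used_in_reports


# Hypothetical event names, for illustration only.
tracked = {
    "page_view",
    "signup_started",
    "signup_completed",
    "scroll_depth_75",
    "hover_pricing_card",
}
used_in_reports = {"page_view", "signup_started", "signup_completed"}

# Each name printed here is a candidate for removal from the tracking plan.
for name in sorted(unused_events(tracked, used_in_reports)):
    print(f"candidate for removal: {name}")
```

Anything this surfaces is a prompt for a conversation with whoever owns the event, not an automatic deletion list.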
Third-Party Tools With Their Own Appetites
Every SaaS tool you install on your website potentially collects its own data. Intercom collects visitor information. Hotjar records sessions. Facebook Pixel collects browsing behaviour. Each of these represents data collection you're accountable for under GDPR, even if you didn't build it.
Minimisation at Collection vs. Minimisation Over Time
There are two distinct minimisation obligations that businesses often conflate.
Minimisation at collection is about what you gather in the first place. Only collect data that is adequate, relevant, and necessary for a specific, defined purpose.
Minimisation over time is the storage limitation principle (Article 5(1)(e)) — keeping data "in a form which permits identification of data subjects for no longer than is necessary." This is a separate but related obligation. Even if your original collection was justified, you must delete or anonymise data when it's no longer needed for the purpose it was collected for.
Both matter. Businesses that get minimisation at collection right but then never delete anything are still in violation. A retention policy is not optional — it's a core GDPR requirement.
Want to see what data your website is currently collecting? Run a free scan at Custodia — results in 60 seconds.
Product Design Implications
Data minimisation isn't just a legal compliance exercise. It has direct implications for how you build products.
Don't build features that require data you don't need. If you're considering a personalisation feature that would require collecting users' precise location, ask yourself: is this feature worth the compliance burden? Is the data actually necessary, or could you achieve the same outcome with less granular data (city instead of GPS coordinates)?
Default to collecting less. When designing a new feature, start with the minimum viable data set. It's much easier to add a field later if you discover you genuinely need it than to justify collecting data you don't use.
Privacy by design requires minimisation thinking. Article 25 of GDPR requires "data protection by design and by default" — which explicitly includes minimisation. This isn't a bolt-on; it's a design principle that should inform every product decision where personal data is involved.
Aggregation over individual tracking. In many cases, aggregate metrics serve the same business purpose as individual tracking without requiring personal data. Do you need to know that specific user Alice viewed the pricing page three times, or do you just need to know that 34% of trial users visit pricing before converting?
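As a rough sketch of the aggregate approach, the counter below answers the "what share of converting users visited pricing first?" question without storing a row per user. It assumes your application can tell, at the moment of conversion, whether the pricing page was visited (for example via a short-lived session flag); the `PricingFunnel` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PricingFunnel:
    """Aggregate counters only; no per-user rows are stored."""

    converted: int = 0
    converted_after_pricing: int = 0

    def record_conversion(self, visited_pricing: bool) -> None:
        # Only the two totals change; the user's identity is never passed in.
        self.converted += 1
        if visited_pricing:
            self.converted_after_pricing += 1

    def share_via_pricing(self) -> float:
        """Fraction of conversions that were preceded by a pricing-page visit."""
        if self.converted == 0:
            return 0.0
        return self.converted_after_pricing / self.converted
```

Bear in mind that very small counts can still point to individuals, so aggregation reduces the personal data you hold rather than guaranteeing anonymity.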
Minimisation in Forms
Forms are the most obvious minimisation opportunity. Here's how to approach them:
For every field, ask: What specific business process requires this data? Who uses it? What happens if we don't collect it?
If you can't answer those questions concretely, remove the field.
Common form fields that fail the necessity test:
- Phone number on a newsletter signup (if you don't do phone outreach)
- Company size on a free trial signup (if you don't use it to segment)
- Job title on a contact form (if you have no sales process that uses it)
- Date of birth on a general user account (unless you have a legal age requirement)
- Full address when only a country or region is needed
Make optional fields genuinely optional. If a field isn't necessary for the purpose, mark it clearly as optional. Don't collect it by default or pressure users to fill it in.
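The same questions can be enforced on the server side, so a stray field never reaches your database even if it lingers in the form. Here is a minimal sketch, assuming a newsletter signup handled in Python; the field names, the purposes and the `minimise_submission` helper are all hypothetical.

```python
# Hypothetical allowlist for a newsletter signup: each accepted field maps
# to the documented purpose that justifies collecting it.
NEWSLETTER_FIELDS = {
    "email": "deliver the newsletter",
    "first_name": "personalise the greeting (optional)",
}
REQUIRED = {"email"}


def minimise_submission(submitted: dict[str, str]) -> dict[str, str]:
    """Keep only allowlisted, non-empty fields; fail if a required one is missing."""
    kept = {
        key: value.strip()
        for key, value in submitted.items()
        if key in NEWSLETTER_FIELDS and value.strip()
    }
    missing = REQUIRED - set(kept)
    if missing:
        raise ValueError(f"missing required field(s): {sorted(missing)}")
    return kept
```

Because the purpose sits next to the field name, adding a new field forces someone to write down why it is needed.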
Minimisation in APIs
APIs are a frequently overlooked minimisation surface. When your API returns a user record, does it return only the fields the calling service needs, or does it return everything?
Over-permissive API responses are a minimisation problem. If your front-end only needs a user's name and email address but your API returns 40 fields including purchase history, IP address, and internal identifiers, you're sharing more data than necessary within your own systems — and potentially exposing more in the event of a breach.
Apply the principle of least privilege: APIs should return only the fields actually needed by the caller, for the purpose of that specific call.
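A simple way to sketch this is a per-caller field allowlist applied just before the response is serialised. The caller names, the field names and the `project_user` helper below are hypothetical, and a real service would typically tie the caller to an authenticated identity rather than a string.

```python
# Hypothetical per-caller views: each caller maps to the fields its purpose needs.
CALLER_FIELDS: dict[str, frozenset[str]] = {
    "frontend_profile": frozenset({"name", "email"}),
    "billing_service": frozenset({"id", "email", "billing_country"}),
}


def project_user(user: dict[str, object], caller: str) -> dict[str, object]:
    """Return only the fields registered for this caller; unknown callers get nothing."""
    allowed = CALLER_FIELDS.get(caller, frozenset())
    return {key: value for key, value in user.items() if key in allowed}
```

Defaulting unknown callers to an empty view means a new integration has to declare what it needs before it receives any personal data.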
Marketing Data and Minimisation
There's a persistent belief in marketing that more demographic data equals better targeting. It often doesn't — and it always increases compliance risk.
Demographic data you collect for targeting but never use is a minimisation violation. Collecting age ranges, income brackets, or interest categories because your marketing platform supports it — not because you have a defined use for it — fails the necessity test.
Segment carefully. The question isn't "could we use this to segment our list?" but "do we have a specific, defined campaign purpose that requires this segmentation, and can we achieve it with less granular data?"
Review your email data regularly. Unsubscribed contacts, bounced addresses, contacts from campaigns that have ended — these should be purged according to your retention policy. Keeping them indefinitely "in case we need them later" violates both minimisation and storage limitation.
Minimisation and AI/ML Training Data
AI and machine learning create a particular minimisation tension. Models generally benefit from more data. But "more data is better for the model" is not a valid GDPR purpose for collecting personal data.
If you're training models on personal data, you need to:
- Define the specific purpose of the model and what data is genuinely necessary to achieve it.
- Anonymise or pseudonymise training data where possible. If a model trained on pseudonymised data performs comparably to one trained on fully identified data, minimisation requires you to use the pseudonymised data.
- Apply data minimisation to training sets. Don't feed a model every field in your database because it's convenient. Determine what's actually predictive or necessary, and use only that.
- Apply retention to training data the same way you would to operational data. Models shouldn't be trained on data you're no longer entitled to hold.
The ICO has published detailed guidance on AI and data protection that is worth reading if you're building ML-dependent products.
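The pseudonymisation and field-selection points above might look like this in a data preparation step. This is a sketch under stated assumptions: the feature columns, the record layout and the helper names are hypothetical, and the keyed hash (HMAC-SHA256 from the Python standard library) is one common pseudonymisation technique, not the only one.

```python
import hashlib
import hmac

# Hypothetical feature columns judged necessary for the model's purpose.
FEATURES = ("plan", "sessions_last_30d")


def pseudonymise_id(user_id: str, key: bytes) -> str:
    """Replace a user ID with a keyed hash; the key must be stored separately."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_training_row(record: dict[str, object], key: bytes) -> dict[str, object]:
    """Copy only the selected features plus a pseudonymous subject reference."""
    row: dict[str, object] = {name: record[name] for name in FEATURES}
    row["subject"] = pseudonymise_id(str(record["user_id"]), key)
    return row
```

Pseudonymised data is still personal data under GDPR, because whoever holds the key can re-identify it, so retention and access controls continue to apply to the training set.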
How Minimisation Interacts With Purpose Limitation
Minimisation and purpose limitation (Article 5(1)(b)) are closely related but distinct.
Purpose limitation says you can only use data for the specific purpose you stated when collecting it.
Minimisation says you can only collect data that is necessary for that stated purpose.
They reinforce each other: if you've properly defined your purpose (purpose limitation), it becomes easier to evaluate what data is necessary for that purpose (minimisation). Vague purposes ("improving our services") make minimisation almost impossible to evaluate — which is another reason the EDPB consistently emphasises specificity in purpose definition.
If you find yourself unable to determine whether a field is "necessary," that's often a sign your purpose isn't defined clearly enough.
The Data Minimisation Audit Process
Here's a practical process for auditing your data collection against the minimisation principle.
Step 1: Map What You Collect
Start with a complete inventory. For each system that handles personal data:
- What personal data does it collect or receive?
- At what point is data collected (signup, purchase, cookie, API, form)?
- Who has access to it?
If you don't know what you're collecting, you can't assess minimisation. Use a tool like Custodia's website scanner to identify what's being collected on your website automatically.
Step 2: Map Each Field to a Purpose
For every data field in your inventory:
| Field | Purpose | Who uses it? | Necessary? | Action |
|---|---|---|---|---|
| Email | Account login, transactional email | Auth system, email platform | Yes | Keep |
| Phone | SMS alerts (if opted in) | SMS provider | Conditional | Only collect if opted in for SMS |
| Job Title | Sales segmentation | CRM | Review | Remove if sales team doesn't use it |
| IP Address (stored) | Fraud detection | Security | Review | Delete after 30 days if fraud checks complete |
| Birthday | Birthday promotion email | Email platform | Marginal | Remove unless campaign is active and proven |
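If a spreadsheet becomes unwieldy, the same table can live in code or configuration next to the systems it describes. The sketch below mirrors the columns above; the `FieldRecord` type and the `needs_follow_up` helper are hypothetical names, and the rule "anything not marked Yes needs follow-up" is an assumption about how you might triage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRecord:
    name: str
    purpose: str
    used_by: str
    necessary: str  # "Yes", "Conditional", "Review" or "Marginal", as in the table
    action: str


INVENTORY = [
    FieldRecord("Email", "Account login, transactional email",
                "Auth system, email platform", "Yes", "Keep"),
    FieldRecord("Phone", "SMS alerts (if opted in)",
                "SMS provider", "Conditional", "Only collect if opted in for SMS"),
    FieldRecord("Job Title", "Sales segmentation",
                "CRM", "Review", "Remove if sales team doesn't use it"),
    FieldRecord("IP Address (stored)", "Fraud detection",
                "Security", "Review", "Delete after 30 days if fraud checks complete"),
    FieldRecord("Birthday", "Birthday promotion email",
                "Email platform", "Marginal", "Remove unless campaign is active and proven"),
]


def needs_follow_up(inventory: list[FieldRecord]) -> list[FieldRecord]:
    """Return every field that is not an unconditional 'Yes' or has no stated purpose."""
    return [f for f in inventory if f.necessary != "Yes" or not f.purpose.strip()]
```

Keeping the inventory in version control also gives you a history of when a field was added and who justified it.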
Step 3: Challenge Every "Yes"
Don't accept "we use it" as sufficient justification. "We use it" must become "we use it specifically for [defined purpose], and we cannot achieve that purpose without it."
Step 4: Remove or Stop Collecting
For fields that fail the necessity test: stop collecting them (for new data) and delete historical records that are no longer justified.
This requires coordination between product, engineering, and whoever owns your data retention policy.
Step 5: Set Retention Periods
For every category of data you decide to keep, set a maximum retention period and implement automated deletion. "We keep it until someone asks us to delete it" is not a retention policy — it's a violation waiting to happen.
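Automated deletion usually starts with a lookup of retention periods per data category and a check that runs on a schedule. The following is a minimal sketch, assuming each record carries a category and a timezone-aware collection timestamp; the category names and the `is_expired` helper are hypothetical, and the periods are placeholders for whatever your retention policy specifies.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods; real values belong in your retention policy.
RETENTION = {
    "ip_address_log": timedelta(days=30),
    "support_ticket": timedelta(days=365),
}


def is_expired(category: str, collected_at: datetime, now: datetime) -> bool:
    """True if the record is older than its category's retention period.

    Raises KeyError for a category with no defined period, because an
    undefined retention period is itself a finding to fix.
    """
    return now - collected_at > RETENTION[category]


# Example check for a single record, using the current UTC time.
now = datetime.now(timezone.utc)
print(is_expired("ip_address_log", now - timedelta(days=45), now))
```

A scheduled job would run this check over each table and delete or anonymise the records it flags, and would log how many records were removed so the process can be audited.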
Step 6: Build Minimisation Into Your Process
The audit is a one-time cleanup. The ongoing discipline is harder. Add minimisation review to:
- Your product development process (before collecting any new data, justify it)
- Your vendor onboarding process (what data does this tool collect?)
- Your quarterly data review cadence
Minimisation Audit Template
Use this as a starting checklist for your audit:
Forms
- [ ] List every form on your website and in your product
- [ ] For each field, document the purpose and business owner
- [ ] Remove or make optional any field without a clear, current purpose
Analytics
- [ ] List every analytics and tracking tool installed
- [ ] Document what events/properties each tool collects
- [ ] Remove events/properties not actively used in reporting or product decisions
CRM / Database
- [ ] List all personal data fields in your CRM
- [ ] Identify fields not updated in the last 12 months
- [ ] Review with sales/marketing: which fields are actively used?
Third-Party Tools
- [ ] List all third-party SaaS tools that receive personal data
- [ ] Review their data collection scope (what do they collect beyond what you send them?)
- [ ] Remove tools whose data collection isn't justified
Retention
- [ ] Do you have a documented retention policy?
- [ ] Is deletion automated or manual?
- [ ] When was it last reviewed?
APIs
- [ ] Do your internal APIs return only necessary fields to callers?
- [ ] Do data exports or reports include more fields than recipients need?
The Bottom Line
Data minimisation is not a bureaucratic formality. It's a principle that, when applied properly, makes your business less risky, leaner, and more defensible — both to regulators and to your own users.
The businesses that get this right don't do it through annual compliance audits. They build minimisation thinking into every product decision, every form design, every analytics implementation. "Do we need this?" becomes a reflex, not an afterthought.
If you don't know what your website is currently collecting, that's the right place to start.
Run a free privacy scan at Custodia — we'll show you what's being collected on your site in 60 seconds, so you can make informed decisions about what to keep, what to stop collecting, and where your minimisation gaps are.