As an admin who's managed Salesforce for 10+ enterprises (including healthcare, SaaS, and manufacturing), I've seen data quality issues cost companies millions. Bad data isn't just messy—it causes failed campaigns, compliance risks, and broken integrations. You don't need to be a data scientist to fix it. Here's the battle-tested checklist I use on day one of every new org.
1. Master Data Foundation: The Non-Negotiables
Before anything else, verify your master data is clean. I once inherited a retail org where 42% of accounts had duplicate names ("Acme Inc" vs "Acme Inc. Ltd"). This caused revenue leakage in their analytics. Start here:
- Account/Contact Duplicates: Run this SOQL to find duplicates by name (adjust for your industry):
SELECT Name, COUNT(Id) FROM Account GROUP BY Name HAVING COUNT(Id) > 1
For contacts, check for matching emails across accounts:
SELECT Email, COUNT(Id) FROM Contact GROUP BY Email HAVING COUNT(Id) > 1
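Exact-match grouping will not catch near-duplicates such as "Acme Inc" vs "Acme Inc. Ltd". Here is a minimal Python sketch of one way to surface them, assuming the account Ids and names have already been exported as pairs. The suffix list and the `normalize_name` and `find_near_duplicates` helpers are illustrative assumptions, not Salesforce features:

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore; extend it for your own data.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "co", "gmbh"}

def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def find_near_duplicates(accounts):
    """Group (id, name) pairs by normalized name; keep groups with 2+ records."""
    groups = defaultdict(list)
    for account_id, name in accounts:
        groups[normalize_name(name)].append(account_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Made-up sample rows standing in for an Account export.
accounts = [
    ("001A", "Acme Inc"),
    ("001B", "Acme Inc. Ltd"),
    ("001C", "Globex Corp"),
]
duplicates = find_near_duplicates(accounts)
```

Treat the resulting groups as candidates for manual review rather than automatic merges, since aggressive normalization can also group unrelated companies.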
2. Critical Field Validation: Where Compliance Lives
Missing or invalid data in key fields triggers regulatory nightmares. At a healthcare client, missing NPI numbers (physician IDs) blocked 30% of claims submissions. Audit these fields:
- Required Fields: Ensure all mandatory fields (e.g., Account Type, Industry) have 100% population. Use SELECT COUNT(Id) FROM Account WHERE Industry = null to spot gaps.
Standardized Values: If your org uses "MFG" for manufacturing in Industry, but sales entered "Manufacturing," reports are skewed. Check with:
SELECT Industry FROM Account WHERE Industry NOT IN ('MFG', 'Retail', 'Tech')
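Phone/Email Format: Invalid formats break SMS/email campaigns. SOQL's LIKE operator only supports the % and _ wildcards, not character classes such as [0-9], and SOQL has no -- comments. Pull the populated values and check their format in a validation rule or script:
SELECT Id, Phone FROM Contact WHERE Phone != null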
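One way to script the phone check is sketched below in Python, assuming the contact Ids and phone values have been exported as pairs. The rule itself (an optional leading "+" followed by 10 to 15 digits once separators are removed) is an assumption for illustration, so adjust it to the formats your campaigns tool accepts:

```python
import re

# Assumed rule: after removing common separators, a valid phone is an
# optional leading "+" followed by 10 to 15 digits.
PHONE_PATTERN = re.compile(r"\+?\d{10,15}")
SEPARATORS = re.compile(r"[\s().-]")

def is_valid_phone(phone):
    """Return True if the phone matches the assumed pattern once separators are stripped."""
    cleaned = SEPARATORS.sub("", phone)
    return PHONE_PATTERN.fullmatch(cleaned) is not None

def invalid_phones(contacts):
    """Return the Ids of (id, phone) pairs whose phone fails the check."""
    return [contact_id for contact_id, phone in contacts if not is_valid_phone(phone)]

# Made-up sample rows standing in for a Contact export.
contacts = [
    ("003A", "(555) 123-4567"),
    ("003B", "+1 555 123 4567"),
    ("003C", "call me"),
    ("003D", "555-1234"),
]
bad_ids = invalid_phones(contacts)
```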
3. Integration Health: The Silent Killer
Integrations fail when source data is bad. A manufacturing client's ERP sync failed daily because 25% of Product SKUs had spaces in their names ("P-100" vs "P-100 "). Fix this before going live:
Unique Identifiers: Verify external IDs (e.g., ERP Product ID) have no duplicates.
SELECT External_ID__c, COUNT(Id) FROM Product2 GROUP BY External_ID__c HAVING COUNT(Id) > 1
- Field Mappings: Check if mapped fields (like "Account Owner" to ERP) have clean values. Because OwnerId always references an existing user, the practical risk is records owned by deactivated users. Use SELECT Id, OwnerId FROM Account WHERE Owner.IsActive = false to find them.
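Stray whitespace like the "P-100 " SKU problem described in this section is easy to catch before a sync runs. Here is a minimal Python sketch, assuming the SKUs come from an ERP export as (row id, SKU) pairs. The helper names and sample rows are hypothetical:

```python
def has_stray_whitespace(value):
    """True if the value has leading, trailing, or repeated internal whitespace."""
    return value != " ".join(value.split())

def find_dirty_skus(rows):
    """Return the (row_id, sku) pairs whose SKU contains stray whitespace."""
    return [(row_id, sku) for row_id, sku in rows if has_stray_whitespace(sku)]

# Made-up sample rows standing in for an ERP product export.
rows = [
    ("1", "P-100"),
    ("2", "P-100 "),
    ("3", " P-200"),
]
dirty = find_dirty_skus(rows)
```

Running the check on the source export, rather than inside Salesforce, catches the mismatch on the side where it originates.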
4. Historical Data: The Ghosts in the Machine
Old data buried in archives causes errors. A financial services client had 10K inactive leads from 2018 that were still triggering email campaigns. Audit ruthlessly:
Inactive Records: Find old records:
SELECT Id FROM Lead WHERE IsConverted = false AND CreatedDate < LAST_YEAR
Orphaned Data: Check for contacts without accounts:
SELECT Id FROM Contact WHERE AccountId = null
Final Reality Check
Don't just scan: act. My rule: If a field isn't used in 80% of reports or workflows, kill it. And always, always validate before deleting. I once deleted a "Lead Source" option that had 200 records, only to realize it was the only one used by a critical partner. Use SELECT LeadSource, COUNT(Id) FROM Lead GROUP BY LeadSource ORDER BY COUNT(Id) DESC LIMIT 5 to find your top 5 sources before pruning, and check who depends on the low-volume values too.
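The query above ranks the most-used values, but the partner anecdote is about a low-volume one. Here is a small Python sketch that counts every value and lists the rare ones for manual review before anything is deleted. It assumes the LeadSource values have been exported as a plain list, and the `max_count` threshold is an arbitrary illustration:

```python
from collections import Counter

def rare_values(values, max_count):
    """Return (value, count) pairs used at most max_count times, sorted by value."""
    counts = Counter(v for v in values if v is not None)  # skip empty picklist values
    return sorted((v, n) for v, n in counts.items() if n <= max_count)

# Made-up sample data standing in for a Lead export.
lead_sources = ["Web"] * 5 + ["Partner Referral"] * 2 + ["Trade Show"] + [None]
review_list = rare_values(lead_sources, max_count=2)
```

Anything on the review list deserves a conversation with its owner before it is pruned.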
Quality data isn't a project—it's a daily habit. The checklist above is your shield against chaos. But let's be real: even the best admins miss something. That's why I use OrgScanner to run a 5-minute health scan on every org I touch. It catches duplicates, missing fields, and integration risks I’d otherwise overlook.
Get your free Salesforce health scan today—no strings, no sales pitch. Just actionable insights to fix the 3 data issues that’ll cost you the most this quarter.