In the previous step, I added company-specific CSV import profiles so the intake flow could understand different source formats:
[Previous article: company-specific CSV import profiles]
https://dev.to/fastapier/i-added-company-specific-csv-import-profiles-to-my-intake-console-42eh
That solved one operational problem:
different companies send different column names.
But it still left another one unsolved.
Different companies also define duplicates differently.
Some imports should match on email.
Some should match on company name.
Some only become duplicates when company name + phone match together.
That is where CSV intake gets dangerous.
If one universal duplicate rule is applied to every file, valid rows get blocked, bad rows slip through, and operators stop trusting the workflow.
So in this update, I added profile-specific duplicate rules to the intake console.
Now a profile controls not just column mapping, but also how duplicate detection works.
For example:
- default → exact email
- regional_ops → exact company name
- partner_directory → company name + phone
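The mapping above can be sketched as a small lookup from profile name to the fields that form a row's duplicate key. This is a minimal sketch with assumed names and structure, not the console's actual code:

```python
# Hypothetical sketch: duplicate rules keyed by import profile.
# Profile names come from the article; everything else is an assumption.
DUPLICATE_RULES = {
    "default": ("email",),
    "regional_ops": ("company_name",),
    "partner_directory": ("company_name", "phone"),
}

def duplicate_key(profile: str, row: dict) -> tuple:
    """Build a normalized comparison key for a row under the active profile's rule."""
    fields = DUPLICATE_RULES.get(profile, DUPLICATE_RULES["default"])
    return tuple((row.get(f) or "").strip().lower() for f in fields)
```

Because the rule lives on the profile, two rows only collide when every field in that profile's key matches.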
That makes duplicate handling part of the import contract instead of a hidden assumption.
In this example, the partner_directory profile correctly understood the CSV shape, then blocked only the row that collided on the active rule, `company_name_phone`.
That is the behavior I want.
A different source format should not be treated as bad data.
A row should stop only when the active business rule says it should stop.
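One way to express that behavior is an evaluation pass that blocks a row only when its key under the active rule collides with an existing record or an earlier row in the same file. This is a sketch with assumed names, not the actual intake code:

```python
def evaluate(rows, rule_fields, existing):
    """Stage rows; block only those whose key under the active rule collides.

    `rule_fields` is the active profile's duplicate rule, e.g.
    ("company_name", "phone") for partner_directory. All names are assumptions.
    """
    def key(row):
        return tuple((row.get(f) or "").strip().lower() for f in rule_fields)

    seen = {key(r) for r in existing}
    staged, blocked = [], []
    for row in rows:
        k = key(row)
        if k in seen:
            blocked.append(row)   # collides under the active business rule
        else:
            seen.add(k)
            staged.append(row)    # a different format is not bad data
    return staged, blocked
```

Rows that merely look unusual pass through; only a genuine key collision stops a row.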
Then I fixed the blocked row inside the UI, changed the phone value, and re-ran the evaluation in place.
One detail that matters here:
The original CSV file is not rewritten. Fixes happen on staged import rows, and apply/revert works from that tracked state.
That makes the flow much safer for operational teams.
The intake engine can absorb and repair dirty imports without mutating the source file someone uploaded.
After that, the run applied cleanly, and the audit trail preserved the full path:
- which profile was active
- which duplicate rule was used
- which row was blocked
- what was changed
- when the run was applied
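A run-level audit entry covering those five points could look like this; the field names are hypothetical and the console's actual schema may differ:

```python
from datetime import datetime, timezone

def audit_record(profile, rule, blocked_rows, changes):
    """Capture the full path of an applied run in one record (hypothetical schema)."""
    return {
        "profile": profile,            # which profile was active
        "duplicate_rule": rule,        # which duplicate rule was used
        "blocked_rows": blocked_rows,  # which rows were blocked
        "changes": changes,            # what was changed on staged rows
        "applied_at": datetime.now(timezone.utc).isoformat(),  # when it was applied
    }
```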
This project is turning into something much stronger than “upload a CSV and hope for the best.”
It is becoming a defensive intake engine that can:
- stage dirty operational data
- understand company-specific CSV formats
- apply company-specific duplicate rules
- let operators repair blocked rows in place
- preserve audit evidence for what happened
If you work on messy CSV onboarding, operational imports, or audit-ready intake workflows, feel free to reach out.