DEV Community

Ahmed HAMMAMI
Ahmed HAMMAMI

Posted on • Updated on

EXTRACTING DUPLICATES USING SOQL QUERIES

1: Simple find account duplicates example

SELECT Count(id), name,ShippingCity FROM account GROUP BY ShippingCity, name having Count(ID) > 1
Enter fullscreen mode Exit fullscreen mode

Explanation: we can effectively detect duplicates in Salesforce. This query groups records based on specific criteria, such as name and city in the case of accounts, and then counts the number of records within each group. By filtering the results to include only groups where the count is greater than 1, we identify instances where multiple records share the same name and city, indicating potential duplicates. This approach allows us to quickly identify and address duplicate data, helping to maintain data integrity and accuracy within the Salesforce platform.

Now to get the IDs of those records (because in the query results it would be “aggregate results”) we'd typically need to run two separate queries. The first query, identifies the duplicates based on specific criteria. Then, we use the information from that query to construct a second query to fetch the IDs of those duplicate records, in this case we could use the name values exported with the first query, we can copy them in an Excel file and then put them in SOQL query,
example:

SELECT id, name,ShippingCity FROM account WHERE Name in (‘’)
Enter fullscreen mode Exit fullscreen mode

So in summary:

1_ Run our initial aggregate query to identify the duplicates based on
name and shipping city.

2_ Retrieve the names and shipping cities of the duplicate records.

3_ Construct a new query using the retrieved names and shipping cities
to fetch the IDs of those duplicate records.

2: Real life scenario with detailed explanation

We’ll now examine a real life scenario example; looking for duplicate ContactOpportunityRole records created by user has “hammami” in name, for these records to be duplicates they MUST have the same OpportunityId and ContactId.
I find this query was a good example to show how we could add a lot of logic to quickly determine duplicates using Salesforce Inspector or any other Salesforce tool.
Query for finding duplicates:

SELECT Opportunity.Name AS Op_Name, 
        Opportunity.Id, 
        Contact.Name AS Cnt_Name, 
        COUNT(OpportunityId) AS NB_OP 
 FROM OpportunityContactRole 
 WHERE CreatedBy.Name LIKE '%hammami%' 
 GROUP BY Contact.Name, Opportunity.Name, Opportunity.Id 
 HAVING COUNT(OpportunityId) > 2
Enter fullscreen mode Exit fullscreen mode

Query breakdown;
This SOQL query retrieves data from the OpportunityContactRole object in Salesforce and performs the following actions:

  1. It selects the following fields: _Opportunity Name (renamed as "Op_Name") _Opportunity ID _Contact Name (renamed as "Cnt_Name") _The count of Opportunity IDs associated with each record (renamed as "NB_OP")
  2. It filters the records based on the name of the user who created them. The filter condition is that the "CreatedBy" field should contain the name '%hammami%', meaning it should match any user whose name contains "hammami".
  3. It groups the records by the Contact name, Opportunity name, and Opportunity ID. This means that records with the same combination of Contact name, Opportunity name, and Opportunity ID will be grouped together.
  4. It applies a filter condition using the "HAVING" clause to include only groups where the count of Opportunity IDs is greater than 2. How this query can determine duplicates;

In this SOQL query, we're using the "GROUP BY" clause to group the records by Contact name, Opportunity name, and Opportunity ID. This allows us to aggregate the data and perform calculations (such as counting the number of Opportunity IDs) for each unique combination of Contact and Opportunity.
We're also using the "HAVING" clause to filter the grouped results. Specifically, we're filtering to include only groups where the count of Opportunity IDs is greater than 2. This means that we're interested in finding instances where a Contact is associated with the same Opportunity more than twice.
So, by using the "GROUP BY" and "HAVING" clauses in this query, we're effectively extracting and filtering data to identify Contacts that are associated with the same Opportunity multiple times or in other words, duplicates.

Top comments (0)