Got dupes?!?! You’re not alone.
Duplicate problems are among the more common pain points we hear from new and prospective clients. Even with the tightest integrations and smartest users, you’ll always have duplicates. It’s something we all have to deal with and it’s not glamorous, but the investment to fix it isn’t huge.
We have a standard process for duplicate management with all of our clients and it starts with ownership and culture around data management. Dealing with duplicates is a pain and no one enjoys doing it—however, it is critical to keep a clean database. Duplicates = lack of confidence in data quality = undermining your whole database. So like a lot of things in life, the work to get there isn’t very enjoyable but the outcome is.
Create a culture in your organization of dealing with duplicates on a routine basis. Build a dashboard of duplicates (and other data issues) emailed to you weekly as a reminder to go clean them up. If you are starting with a lot of duplicates, buy lunch for a few people and knock them out in an afternoon.
Tip: There is no tool out there that can dedupe records better than a person comparing the data side by side. Having eyes on every record is best practice and the best route to confidence in the data. A tool is useful when you have a large number of low-quality duplicates and/or don’t have duplicate rules configured, but it’s best used as a one-time historical cleanup rather than an ongoing practice.
Build out Salesforce matching and duplicate rules to catch duplicates. Begin with loose criteria (like fuzzy First Name and exact Last Name, without requiring an Email match) to catch as many as possible. You can tighten it up later depending on your scenario and include other fields like phone or address. Anytime the rule catches a possible duplicate, it creates Duplicate Record Item records, which are grouped together via a Duplicate Record Set.
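If you want to see how those objects hang together, here’s a small Anonymous Apex sketch. It uses the standard DuplicateRecordSet and DuplicateRecordItem objects and the usual relationship names, but verify the field and relationship names in your own org before relying on it:

```apex
// Sketch: list recent Duplicate Record Sets created by your rules,
// along with the individual records that were flagged in each set.
for (DuplicateRecordSet drs : [
        SELECT Id, RecordCount, DuplicateRule.MasterLabel,
               (SELECT RecordId FROM DuplicateRecordItems)
        FROM DuplicateRecordSet
        ORDER BY CreatedDate DESC
        LIMIT 50]) {
    System.debug('Rule: ' + drs.DuplicateRule.MasterLabel +
                 ' | Records in set: ' + drs.RecordCount);
    for (DuplicateRecordItem dri : drs.DuplicateRecordItems) {
        System.debug('  Flagged record Id: ' + dri.RecordId);
    }
}
```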
Tip: If you have any integrations running, you may want to use only the ‘Report’ action on your Duplicate Rules. The rule will still create the Duplicate Record Item/Set records, but it won’t alert the user or block the save, which is what breaks the API connection.
Build reports to display your Duplicate Record Sets and add them to a Dashboard! You’ll need a custom report type and can filter by Duplicate Rule Name. With the report, you’ll be able to identify who/what is causing your duplicates.
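If you’d rather spot-check from the Developer Console than wait on the report, a quick aggregate query over the standard CreatedById field (a sketch, not a replacement for the dashboard) shows which users or integration accounts are generating the flagged records:

```apex
// Sketch: count flagged duplicate records by the user (or integration user)
// who created them, to see where your duplicates are coming from.
for (AggregateResult ar : [
        SELECT CreatedById ownerId, COUNT(Id) flagged
        FROM DuplicateRecordItem
        GROUP BY CreatedById]) {
    System.debug('User ' + String.valueOf(ar.get('ownerId')) +
                 ' created ' + String.valueOf(ar.get('flagged')) + ' flagged records');
}
```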
Identifying your duplicates = step 1 in addressing the problem. Set up and subscribe to a data quality dashboard for a weekly snapshot of what needs to be addressed.
Permissions: Only grant merge permissions to one person or a small group of people to manage duplicates. They should understand how Contacts work with Accounts and be able to identify ways to reduce duplicates through integration improvements, user training, etc. Once you merge records, the action is irreversible.
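For context on what that merge permission actually does, here is a minimal Apex sketch using the standard Database.merge method (the record Ids are placeholders). The losing record is deleted and its related records are reparented onto the winner, which is why there is no undo:

```apex
// Minimal merge sketch (Contact shown; Lead and Account merges work the same way).
// Replace the placeholder Ids with real record Ids before running.
Id masterId = '003000000000001AAA'; // the record you keep
Id dupId    = '003000000000002AAA'; // the record that gets merged and deleted
Database.MergeResult result = Database.merge(new Contact(Id = masterId), dupId);
if (result.isSuccess()) {
    // The duplicate is gone and its related records now hang off the master.
    System.debug('Kept ' + result.getId() + ', merged away ' +
                 String.valueOf(result.getMergedRecordIds()));
} else {
    System.debug(result.getErrors());
}
```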
Nonprofits: Use the NPSP Contact Merge! It allows you to compare the records side by side and pick which values you’d like to keep or override, and it’s built to handle Households properly, which the native Salesforce Contact merge is not. More information here.
Marking duplicates as not duplicates: Salesforce doesn’t have a native way to handle this. The Salesforce method is to delete the Duplicate Record Set so it no longer appears in your report; however, if you later update one of the records, the Duplicate Rule fires again and creates another Duplicate Record Set. Instead, we mark the set as Not a Duplicate (a custom checkbox) and use a Flow to uncheck the box if a new Duplicate Record Item is added to the set. This method isn’t perfect: Salesforce will still show potential duplicates on the record page layout and in NPSP Contact Merge. We generally believe it’s better to have the (incorrect) alert on a record page than to try to permanently exclude a record from ever matching as a duplicate.
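If you’d rather express that same logic in Apex instead of Flow, the trigger below sketches the idea. It assumes a custom checkbox we’re calling Not_a_Duplicate__c on DuplicateRecordSet (name yours however you like) and that triggers are enabled for DuplicateRecordItem in your org, which is worth verifying before you build:

```apex
// Sketch only: when new Duplicate Record Items are added to a set that was
// previously marked "Not a Duplicate", re-open the set for review by
// unchecking the custom Not_a_Duplicate__c checkbox.
trigger ReopenReviewedDuplicateSets on DuplicateRecordItem (after insert) {
    Set<Id> setIds = new Set<Id>();
    for (DuplicateRecordItem item : Trigger.new) {
        setIds.add(item.DuplicateRecordSetId);
    }

    List<DuplicateRecordSet> toReopen = [
        SELECT Id, Not_a_Duplicate__c
        FROM DuplicateRecordSet
        WHERE Id IN :setIds AND Not_a_Duplicate__c = true
    ];
    for (DuplicateRecordSet drs : toReopen) {
        drs.Not_a_Duplicate__c = false;
    }
    update toReopen;
}
```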
To wrap it up, the takeaways are:
Duplicates are the worst but you can’t have a database without ‘em
The best way to dedupe is manually and regularly
Use Duplicate Rules to identify duplicates and duplicate sources
Duplicates will undermine user confidence in the data = total chaos