Duplicate Record Remover Help

 

The process of removing duplicates from your database

 

There are three steps you need to undertake:

1)    Configure the Duplicate Record Remover for your data and have it examine the data for possible duplicates.  This is done with the Setup and Examination Wizard.

2)    Manually process any matches that couldn’t be automatically merged.  This is done with the Processing Tool.

3)    Export the cleaned data back into your live systems.  The Processing Tool gives you many options for doing this.  Depending on your database and your own level of expertise, uploading the changes back into your live database can sometimes require specialist advice.  Precision Data can help you with this should you need it.

 

Step 1: Set Up and Examine Your Data

 

This step involves importing your data into the Duplicate Record Remover (which works on its own internal copy of your data), setting the various examination and merging options, and running the examination process to look for duplicates.  You do this with the Setup and Examination Wizard.

Once the data is examined for duplicates, you no longer need to work with this tool and can do everything else with the Processing Tool in the next step.
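
To give a feel for what "examining the data for possible duplicates" means, here is a minimal Python sketch of one common approach: grouping records whose key fields are identical after simple normalization. This is an illustration only and is not the Duplicate Record Remover's own matching method. The function names, the field names (name, email) and the sample records are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(value):
    """Lower-case the value and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def find_candidate_duplicates(records, key_fields):
    """Group records whose normalized key fields are identical.

    Hypothetical helper for illustration; a real tool would typically
    offer more matching options than exact normalized equality.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalize(record[field]) for field in key_fields)
        groups[key].append(record)
    # Only groups holding more than one record are possible duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample data.
records = [
    {"id": 1, "name": "Jane Smith", "email": "jane@example.com"},
    {"id": 2, "name": "jane  smith", "email": "JANE@example.com"},
    {"id": 3, "name": "John Doe", "email": "john@example.com"},
]

candidates = find_candidate_duplicates(records, ["name", "email"])
for group in candidates:
    print([record["id"] for record in group])
```

In this sample, records 1 and 2 fall into the same group because they differ only in letter case and spacing, while record 3 stands alone and is not reported.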

 

Step 2: Manually Merge any Duplicates Found

 

Any clear duplicates with no conflicting data will be merged automatically.  However, there are likely to be many duplicates with conflicting data that needs manual editing before they can be merged.  For example, if two records differ only by a misspelled name, you will need to decide which spelling is correct before merging those records.

You do this with the Processing Tool.
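
The difference between a duplicate that can be merged automatically and one that needs a manual decision can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the product's actual merge logic: merge_pair and the field names are hypothetical, and an empty value is assumed never to conflict with a filled one.

```python
def merge_pair(a, b, ignore=("id",)):
    """Try to merge two duplicate records (hypothetical helper).

    Returns (merged, conflicts). When conflicts is non-empty, merged is
    None and a person has to decide which values to keep.
    """
    merged = {}
    conflicts = {}
    for field in a.keys() | b.keys():
        if field in ignore:
            continue
        value_a, value_b = a.get(field), b.get(field)
        if value_a == value_b or not value_b:
            merged[field] = value_a      # same value, or only a has one
        elif not value_a:
            merged[field] = value_b      # only b has a value
        else:
            conflicts[field] = (value_a, value_b)  # both filled, different
    if conflicts:
        return None, conflicts
    return merged, {}


# Made-up sample data.
first = {"id": 1, "name": "Jane Smith", "phone": ""}
second = {"id": 2, "name": "Jane Smith", "phone": "555-0100"}
third = {"id": 3, "name": "Jane Smyth", "phone": "555-0100"}

auto_merged, auto_conflicts = merge_pair(first, second)
manual_merged, manual_conflicts = merge_pair(first, third)
```

Here the first pair merges cleanly because the only difference is a missing phone number. The second pair is held back with a conflict on name, which is the misspelled-name case described above.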

 

Step 3: Export the cleaned data back into your live systems

 

Once your data has been cleaned of all its duplicates, you must move the cleaned data back into your live systems.  This can be done in a number of ways, including:

  • Exporting the cleaned data itself in the form of CSV, Text, Excel or XML.  This can then be used to simply overwrite your live data.
  • Exporting the edits and deletes made (sometimes called the Delta, or the Change Set) in the form of XML or T-SQL statements.  These changes can be run against your database by a database administrator or developer to update your live data.
  • Exporting the edits and deletes made in the form of a printable report.  This can be used to manually update the live data when complex business rules or company policy disallow direct access to live data.
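
To illustrate what a Change Set looks like, the Python sketch below compares an original set of records with its cleaned version and produces one UPDATE statement per edited row and one DELETE statement per removed row. It does not reproduce the format the Processing Tool actually exports. The table name Customers, the key column id and the helper names are assumptions made for the example.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal, doubling single quotes."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def change_set(original, cleaned, table, key="id"):
    """Return the statements that turn `original` into `cleaned`.

    Hypothetical helper: rows missing from `cleaned` become DELETEs,
    rows with changed columns become UPDATEs.
    """
    cleaned_by_key = {row[key]: row for row in cleaned}
    statements = []
    for row in original:
        where = f"WHERE {key} = {sql_literal(row[key])}"
        new = cleaned_by_key.get(row[key])
        if new is None:
            statements.append(f"DELETE FROM {table} {where};")
            continue
        changed = {
            column: value
            for column, value in new.items()
            if column != key and row.get(column) != value
        }
        if changed:
            assignments = ", ".join(
                f"{column} = {sql_literal(value)}"
                for column, value in changed.items()
            )
            statements.append(f"UPDATE {table} SET {assignments} {where};")
    return statements


# Made-up sample data: record 2 was merged into record 1.
original = [
    {"id": 1, "name": "Jane Smith", "phone": None},
    {"id": 2, "name": "Jane Smyth", "phone": "555-0100"},
]
cleaned = [
    {"id": 1, "name": "Jane Smith", "phone": "555-0100"},
]

statements = change_set(original, cleaned, "Customers")
for statement in statements:
    print(statement)
```

A script of this kind is meant to be reviewed by a database administrator or developer before it is run, ideally against a backup or test copy first.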

 

IMPORTANT RECOMMENDATION: Do test runs of your de-duplication process to become familiar with the examination and processing features.

We strongly recommend that you run the Setup Wizard several times and examine the resulting matches to ensure you have configured it optimally for your set of data. 

We also recommend that you become familiar with the Processing Tool and how it merges and edits data before you undertake your full database merging exercise.  This can save you considerable time when processing the manual merges, and it reduces the risk of losing data or merging it incorrectly.

Finally, we recommend you also test the exporting features to get a good understanding of the best way to export the changes or cleaned data back into your live systems.

If you are in doubt about how to prepare, examine and process your data, Precision Data provides a data-cleaning bureau service and can provide all the advice you need to get started, overcome a difficult problem, or complete the job for you from start to finish.

 

 

Related Topics

Introduction

 

Duplicate Record Remover
Copyright (c) 2009 Precision Data, All Rights Reserved.