Data Matching & Merging
Matching and merging is a method of dealing with duplicate data: records that describe the same entity, often with conflicting or incomplete information. Once you discover duplicates (by matching new data against existing data), you can use data quality tools to merge them (combine the duplicate records into one) or discard the redundant ones, depending on the use case.
For example:
Suppose you have two records labeled “John Smith,” one with a phone number attached and the other with an email address. You could match those records by name and merge them into one master record for John Smith containing both his email address and phone number.
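The John Smith example can be sketched in a few lines of Python. This is a minimal illustration, not how any particular data quality tool works: the function name merge_by_name, the field names, and the sample values are all hypothetical, and it assumes that a case-insensitive, whitespace-trimmed name is a good enough match key and that the first non-empty value seen for a field should win.

```python
def merge_by_name(records):
    """Group records by a normalized name and combine their non-empty fields."""
    merged = {}
    for record in records:
        # Assumption: a lowercased, trimmed name is a sufficient match key.
        key = record["name"].strip().lower()
        master = merged.setdefault(key, {"name": record["name"].strip()})
        for field, value in record.items():
            # Assumption: the first non-empty value seen for a field is kept.
            if field != "name" and value and field not in master:
                master[field] = value
    return list(merged.values())


# Hypothetical sample data: two partial records for the same person.
records = [
    {"name": "John Smith", "phone": "555-0100"},
    {"name": "john smith ", "email": "john.smith@example.com"},
]

masters = merge_by_name(records)
print(masters)
```

With this sample input, the two partial records collapse into a single master record that carries the name, the phone number, and the email address. In practice a name alone is rarely a safe match key, which is why real matching usually relies on additional rules.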
Matching and merging is usually applied to stored data (e.g., customer records), using rules, algorithms, metadata, and machine learning to determine which records should be kept, combined, or deleted.
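As a rough sketch of how a rule and an algorithm might work together, the Python snippet below combines one exact-match rule (identical email addresses) with a fuzzy name comparison from the standard library's difflib module. The function names, the field names, and the 0.85 threshold are illustrative assumptions, not values taken from any specific product.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score for two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_match(rec_a, rec_b, threshold=0.85):
    """Decide whether two records likely describe the same entity."""
    # Rule: identical, non-empty email addresses count as a match.
    if rec_a.get("email") and rec_a.get("email") == rec_b.get("email"):
        return True
    # Algorithm: otherwise fall back to fuzzy name similarity.
    # Assumption: 0.85 is an arbitrary illustrative threshold.
    return similarity(rec_a["name"], rec_b["name"]) >= threshold


# Hypothetical sample records.
typo = is_match({"name": "John Smith"}, {"name": "Jon Smith"})
different = is_match({"name": "John Smith"}, {"name": "Mary Jones"})
same_email = is_match(
    {"name": "J. Smith", "email": "js@example.com"},
    {"name": "John Smith", "email": "js@example.com"},
)
print(typo, different, same_email)
```

The threshold controls the trade-off: a lower value catches more misspellings but also risks merging records for different people, so it would normally be tuned against known duplicates.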
Duplicate data can lead to inaccurate reporting, skew your metrics, and make it difficult to decide which record is the most reliable. It can also introduce errors into machine learning and AI applications that rely on that data. That’s why matching and merging is an essential data quality capability. For more advanced matching and merging, especially when data enters from multiple input points, it is best to use a master data management (MDM) solution.