Lesson 21: Creating a Find Duplicates Query
A find duplicates query allows you to search for and identify duplicate records within a table or tables. A duplicate record is a record that refers to the same thing or person as another record.
Not all records containing similar information are duplicates. For instance, records of two orders that were placed on different dates but that contained identical items would not be duplicate records. Likewise, not all duplicate records contain completely identical information. For example, two customer records could refer to the same person but include different addresses. The record with the out-of-date address would be the duplicate record.
Why is getting rid of duplicate records so important? Consider the example above. If we had multiple records for one customer, it would be difficult to view an order history for him because the information would be spread across several unlinked records. We might even deliver his order to the wrong address if the person entering the order information selects an outdated record. It's easy to see how having duplicate records can undermine the integrity and usefulness of your database.
Fortunately, Access makes it easy to search for and locate potential duplicate records. Note that Access won't delete the records for you or help you figure out which one is current—you'll have to do those things for yourself. If you're familiar with the data in your database, though, getting rid of duplicate records will be a manageable task.
To create a find duplicates query:
- Select the Create tab on the Ribbon, locate the Queries group, and click the Query Wizard command.
Clickign the Query Wizard Command
- The New Query dialog box will appear. Select Find Duplicates Query Wizard from the list of queries, then click OK.
Choosing to create a find duplicates query
- Select the table you want to search for duplicate records, then click Next. We're searching for duplicate customer records, so we'll select the Customers table.
Selecting the table to search for duplicates
- Choose the fields you want to search for duplicate information by selecting them and then clicking the right arrow button. Only select fields that should not be identical in nonduplicate records. For instance, because we're searching for duplicate customers we'll only select the First Name and Last Name fields because it's unlikely that multiple people with the exact same first and last names would place orders at our bakery.
- When you've added the desired fields, click Next.
Selecting the fields to search for duplicate information
- Select additional fields to view in the query results. Choose fields that will help you distinguish between the duplicate records, and choose which one you want to keep. In our example, we'll add all of the fields relating to customer addresses, plus the Phone Number field because records with identical customer names might contain nonidentical information in this field. When you're satisfied, click Next.
Selecting the other fields to view in the results
- Access will suggest a name for your query, but you can type a different name if you want. When you're satisfied with the query name, click Finish to run your query.
Naming the query
- If Access found any duplicate records in your query, they will be displayed in the query results. Review the records and delete any outdated or incorrect records as needed.
Duplicate records in the query results
Tips for resolving duplicate records
- Save your duplicate records queries, and run them often.
- Investigate potential duplicate records by looking at linked data in other tables. You can do this by searching for these records' record ID numbers in related tables. Is one record linked to mostly old orders while another contains recent ones? The latter is likely to be the current one.
- Once you decide which record to delete, make sure you won't be losing any information you might need. In our example, before we deleted our duplicate record we found all of the orders linked to that record's ID number and replaced them with the ID number of the record we decided to keep.