Deleting Duplicate Records in a Query

B

Burton

I know there are numerous posts about deleting duplicate records, but I am
too big of a novice in sql and Access to adapt their questions/answers to my
problem. I have 1 table called structure_existing_area_3 that has duplicate
records. I need to get rid of these duplicates.

They are considered duplicates if two fields, "right_fid" and "length" are
both the same. The other fields may have different values, but if these two
fields match for two different records, I need to delete one of the records.

Just to be clear (and redundant), following is an attempt to re-create an
example of a duplicate record.

left_id---right_fid---length
---1--------2----------3
---5--------2----------3

Because both fields match at the same time, it is considered a duplicate. In
this instance, I want to get rid of one of the records (doesn't matter which
one)

This is not a duplicate record:

left_id---right_fid---length
---1--------5----------3
---5--------2----------3

If anyone could explain, step-by-step, how to do this in design view, or
write out the sql statement for me, it would be greatly appreciated. Again, I
am fairly inept at Access and sql, so please be detailed and clear.

Thanks in advance
 
R

Roger Carlson

Burton said:
I know there are numerous posts about deleting duplicate records, but I am
too big of a novice in sql and Access to adapt their questions/answers to
my
problem. I have 1 table called structure_existing_area_3 that has
duplicate
records. I need to get rid of these duplicates.

They are considered duplicates if two fields, "right_fid" and "length" are
both the same. The other fields may have different values, but if these
two
fields match for two different records, I need to delete one of the
records.

Just to be clear (and redundant), following is an attempt to re-create an
example of a duplicate record.

left_id---right_fid---length
---1--------2----------3
---5--------2----------3

Because both fields match at the same time, it is considered a duplicate.
In
this instance, I want to get rid of one of the records (doesn't matter
which
one)

This is not a duplicate record:

left_id---right_fid---length
---1--------5----------3
---5--------2----------3

If anyone could explain, step-by-step, how to do this in design view, or
write out the sql statement for me, it would be greatly appreciated.
Again, I
am fairly inept at Access and sql, so please be detailed and clear.

Thanks in advance
 
R

Roger Carlson

Sorry, send the last one by mistake. Try this:

1) Create a BACKUP COPY of your table!

2) If the table does not have one, create an autonumber field called sea3ID.
(The name, of course, is irrelevant, but it needs to be used in the query
below.) If you want, you can delete it later, but it is essential that you
have a single, unique identifier on the record.

3) Create and run this query:

DELETE *
FROM structure_existing_area_3 AS A
WHERE A.sea3ID>
(SELECT MIN(B.sea3ID)
FROM structure_existing_area_3 B WHERE A.right_fid= B.right_fid AND
A.length= B.length);

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 
B

Burton

Roger,

Thanks. That worked great! I was wondering if you could explain that code to
me in layman's terms. What is each command doing. I don't understand the A.
and B. language.

Thanks again,

Burton
 
B

Burton

Roger,

Thanks. That worked great! I was wondering if you could explain the code to
me in layman's terms. I don't understand the A. B. language in the code.

Thanks,

Burton
 
R

Roger Carlson

The A and B are aliases for the table so the query can compare the table to
itself. It could have been "structure_existing_area_3_1" and
"structure_existing_area_3_2", which might have been clearer, but the
convention of using single characters is quite common to reduce the amount
of typing needed.

This type of query is called a Correlated Subquery. Here are some articles
explaining them better than I can:

http://msdn.microsoft.com/en-us/library/aa217680(office.11).aspx

http://books.google.com/books?id=Fd...=book_result&resnum=2&ct=result#PRA2-PA203,M1

http://books.google.com/books?id=4u...X&oi=book_result&resnum=5&ct=result#PPA155,M1

(I hope these links survive. If not, Google on "correlated subquery
access".

In a correlated subquery, the subquery is executed for every line in the
main query, so it can sometimes take a long time to execute.

What it's doing, is that the subquery is returning the Minimum unique id for
all the duplicated records and deletes all of those whose unique id is
greater. This leaves just one instance of each of the duplicated rows.

Hope that makes sense.


--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top