Finding and deleting dupe queries across two fields

E

efandango

How can I display duplicate addresses with matching postcodes. At the moment
my SQL will only find duplicate Street Names, which will only be duplicate if
they also have a matching postcode. Because like a lt of big cities, you can
have more than one identical street name, but it is when combined with the
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;
 
J

John Spencer

You can try the following modification. I think it may give you the result
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
E

efandango

Thanks John,

But how do I delete the dupes leaving a single record of each entry?



John Spencer said:
You can try the following modification. I think it may give you the result
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

efandango said:
How can I display duplicate addresses with matching postcodes. At the
moment
my SQL will only find duplicate Street Names, which will only be duplicate
if
they also have a matching postcode. Because like a lt of big cities, you
can
have more than one identical street name, but it is when combined with the
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;
 
J

John Spencer

Do you have a primary key field in the table street names? And if so is
it a single field?

For instance if StreetNameID is unique for every record in the table,
you can use the query below to delete selected records.

DELETE *
FROM [Street Names]
WHERE StreetNameID in
(SELECT Max([Street Names].StreetNameID)
FROM [Street Names] as Tmp2
WHERE Tmp2.StreetName In
(SELECT StreetName
FROM [Street Names] As Tmp
GROUP BY StreetName, PostCode
HAVING Count(*)>1 AND Tmp.PostCode = Tmp2.PostCode)
GROUP BY Tmp2.StreetName)

That may fail due to the table name being [Street Names] instead of
StreetNames (no spaces). If it does, you will have to build a series of
queries to do this.


'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Thanks John,

But how do I delete the dupes leaving a single record of each entry?



John Spencer said:
You can try the following modification. I think it may give you the result
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

efandango said:
How can I display duplicate addresses with matching postcodes. At the
moment
my SQL will only find duplicate Street Names, which will only be duplicate
if
they also have a matching postcode. Because like a lt of big cities, you
can
have more than one identical street name, but it is when combined with the
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;
 
E

efandango

Thanks John,

Yes, I do have a primary key field in the table street names. it's called
[StreetNameID].

Here's a sample of the data:


StreetNameID StreetName Postcode
1 Abberley Mews SW4
2 Abbevelle Mews SW4
3 Abbeville Mews SW4
4 Abbeville Road SW4
5 Abbey Business Centre SW8

I will try your suggested SQL and let you know. It's late here now (UK) and
I've been trying to sort this huge table of adresses all day and slowly going
out of my mind in the process...






John Spencer said:
Do you have a primary key field in the table street names? And if so is
it a single field?

For instance if StreetNameID is unique for every record in the table,
you can use the query below to delete selected records.

DELETE *
FROM [Street Names]
WHERE StreetNameID in
(SELECT Max([Street Names].StreetNameID)
FROM [Street Names] as Tmp2
WHERE Tmp2.StreetName In
(SELECT StreetName
FROM [Street Names] As Tmp
GROUP BY StreetName, PostCode
HAVING Count(*)>1 AND Tmp.PostCode = Tmp2.PostCode)
GROUP BY Tmp2.StreetName)

That may fail due to the table name being [Street Names] instead of
StreetNames (no spaces). If it does, you will have to build a series of
queries to do this.


'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Thanks John,

But how do I delete the dupes leaving a single record of each entry?



John Spencer said:
You can try the following modification. I think it may give you the result
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

How can I display duplicate addresses with matching postcodes. At the
moment
my SQL will only find duplicate Street Names, which will only be duplicate
if
they also have a matching postcode. Because like a lt of big cities, you
can
have more than one identical street name, but it is when combined with the
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;
 
E

efandango

JOhn,

That didn't work. (it just slowly goes through small progress bar botom left
of screen and eventually results nothing)

When you mention 'failure due to the table name being [Street Names] instead
of StreetNames (no spaces).' Can I just rename the query (it is independent
of everything else). I don't really understand "If it does, you will have to
build a series of queries to do this." I can't envision this series of
queries.





John Spencer said:
Do you have a primary key field in the table street names? And if so is
it a single field?

For instance if StreetNameID is unique for every record in the table,
you can use the query below to delete selected records.

DELETE *
FROM [Street Names]
WHERE StreetNameID in
(SELECT Max([Street Names].StreetNameID)
FROM [Street Names] as Tmp2
WHERE Tmp2.StreetName In
(SELECT StreetName
FROM [Street Names] As Tmp
GROUP BY StreetName, PostCode
HAVING Count(*)>1 AND Tmp.PostCode = Tmp2.PostCode)
GROUP BY Tmp2.StreetName)

That may fail due to the table name being [Street Names] instead of
StreetNames (no spaces). If it does, you will have to build a series of
queries to do this.


'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================

Thanks John,

But how do I delete the dupes leaving a single record of each entry?



John Spencer said:
You can try the following modification. I think it may give you the result
you are looking for.

SELECT [Street Names].StreetName
, [Street Names].StreetNameID
, [Street Names].Postcode
FROM [Street Names]
WHERE [Street Names].StreetName In
(SELECT [StreetName]
FROM [Street Names] As Tmp
GROUP BY [StreetName], PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [Street Names].PostCode)
ORDER BY [Street Names].StreetName;

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

How can I display duplicate addresses with matching postcodes. At the
moment
my SQL will only find duplicate Street Names, which will only be duplicate
if
they also have a matching postcode. Because like a lt of big cities, you
can
have more than one identical street name, but it is when combined with the
postcode that it becomes unique, but with possible duplicates.

Below is the SQL to find duplicate Street Names, but I need it to match
against the corresponding postcode to find the duplicate addresses.

How can i find the dupes across both fields; [StreetName] - [Postcode] and
then delete just the dupes, leaving the one copy of each record intact?




SELECT [Street Names].StreetName, [Street Names].StreetNameID, [Street
Names].Postcode
FROM [Street Names]
WHERE ((([Street Names].StreetName) In (SELECT [StreetName] FROM [Street
Names] As Tmp GROUP BY [StreetName] HAVING Count(*)>1 )))
ORDER BY [Street Names].StreetName;
 
J

John Spencer

STANDARD ADVICE: Backup your data before doing this. There is no undo
available when you run the last query.

First query is one you used to identify the duplicate records (save it for
example as - qDeleteA)
SELECT *
FROM [Street Names]
WHERE StreetName in (
SELECT StreetName
FROM [Street Names] As Tmp
GROUP BY StreetName, PostCode
HAVING Count(*)>1 AND Tmp.PostCode = [StreetNames].PostCode)

qDeleteA should give you all the duplicates. Now use that in another query
that you save as qDeleteB
SELECT StreetName, PostCode, Max(StreetNameID) as MaxStreetID
FROM qDeleteA
GROUP BY StreetName, PostCode

This should give you the records to keep

Now use that in the last query
DELETE DISTINCT ROW [A].*
FROM [Street Names] as A INNER JOIN qDeleteB
ON A.StreetName = qDeleteB.StreetName
And A.PostCode = qDeleteB.PostCode
WHERE A.StreetNameID <> B.MaxStreetID

STANDARD ADVICE: Backup your data before doing this. There is no undo
available when you run the last query.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..

efandango said:
JOhn,

That didn't work. (it just slowly goes through small progress bar botom
left
of screen and eventually results nothing)

When you mention 'failure due to the table name being [Street Names]
instead
of StreetNames (no spaces).' Can I just rename the query (it is
independent
of everything else). I don't really understand "If it does, you will have
to
build a series of queries to do this." I can't envision this series of
queries.





John Spencer said:
Do you have a primary key field in the table street names? And if so is
it a single field?

For instance if StreetNameID is unique for every record in the table,
you can use the query below to delete selected records.

DELETE *
FROM [Street Names]
WHERE StreetNameID in
(SELECT Max([Street Names].StreetNameID)
FROM [Street Names] as Tmp2
WHERE Tmp2.StreetName In
(SELECT StreetName
FROM [Street Names] As Tmp
GROUP BY StreetName, PostCode
HAVING Count(*)>1 AND Tmp.PostCode = Tmp2.PostCode)
GROUP BY Tmp2.StreetName)

That may fail due to the table name being [Street Names] instead of
StreetNames (no spaces). If it does, you will have to build a series of
queries to do this.


'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top