Find duplicates

R

Ron Hinds

I'm building an appointment scheduling feature and I'm having trouble trying
to figure out if a user has created scheduling conflicts. Here is the
scenario: The user will choose from a list of their customers the day(s) of
the week they wish to call that customer, a call time, and optionally
another call time, in five minute increments. The actual schedule will be
built from that template nightly rolling 90 days out. I want to determine if
there are conflicts in the template. The template table is laid out as
follows:

[SalesID] Text (50)
[CustID] Text (50)
[CallTime1] Date/Time
[CallTime2] Date/Time
[Monday] Boolean
[Tuesday] Boolean
[Wednesday] Boolean
[Thursday] Boolean
[Friday] Boolean
[Periodicity] Number (Long)

The Primary Key is [SalesID] and [CustID]

I tried using the Find Duplicates wizard, selecting only the columns that
may have dupes (i.e. everything except the SalesID/CustID). In the next
step, it asks which columns to return in addition to those that might be
dupes. I ask for the CustID field. But the resulting query doesn't return
anything when I run a test, even though every field except CustID is
identical in two records. Here is the SQL the wizard generated:

SELECT DISTINCTROW tblSalesmanApptsTemp.CallTime1,
tblSalesmanApptsTemp.CallTime2, tblSalesmanApptsTemp.Monday,
tblSalesmanApptsTemp.Tuesday, tblSalesmanApptsTemp.Wednesday,
tblSalesmanApptsTemp.Thursday, tblSalesmanApptsTemp.Friday,
tblSalesmanApptsTemp.Periodicity, tblSalesmanApptsTemp.CustID
FROM tblSalesmanApptsTemp
WHERE (((tblSalesmanApptsTemp.CallTime1) In (SELECT [CallTime1] FROM
[tblSalesmanApptsTemp] As Tmp GROUP BY
[CallTime1],[CallTime2],[Monday],[Tuesday],[Wednesday],[Thursday],[Friday],[
Periodicity] HAVING Count(*)>1 And [CallTime2] =
[tblSalesmanApptsTemp].[CallTime2] And [Monday] =
[tblSalesmanApptsTemp].[Monday] And [Tuesday] =
[tblSalesmanApptsTemp].[Tuesday] And [Wednesday] =
[tblSalesmanApptsTemp].[Wednesday] And [Thursday] =
[tblSalesmanApptsTemp].[Thursday] And [Friday] =
[tblSalesmanApptsTemp].[Friday] And [Periodicity] =
[tblSalesmanApptsTemp].[Periodicity])))
ORDER BY tblSalesmanApptsTemp.CallTime1, tblSalesmanApptsTemp.CallTime2,
tblSalesmanApptsTemp.Monday, tblSalesmanApptsTemp.Tuesday,
tblSalesmanApptsTemp.Wednesday, tblSalesmanApptsTemp.Thursday,
tblSalesmanApptsTemp.Friday, tblSalesmanApptsTemp.Periodicity;

If I remove CustID then it finds the dupes no problem. But, that isn't very
useful to me! BTW I'm using Access 97.
 
A

Allen Browne

Ron the difficulty you are experiencing is due to the fact that you are
using a non-normalized table structure. CallTime1 and CallTime2 are
repeating fields, and so are the days of the week.

You could solve the problem easily by creating a table with these fields:
SalesID - the salesman
CustID
CallDay - 1 = Monday; 2 = Tuesday, ...
CallTime - A number for each 5-minute slot in a workday.
Then it's dead simple to see if you have a duplicate.

If you are determined to go ahead with your existing structure anyway, you
will be facing monstrously absurd queries, like this one which will not run
in the query window, but just might if you execute it programmatically as an
ADO recordset:

SELECT tblSalesmanApptsTemp .* FROM tblSalesmanApptsTemp
WHERE EXISTS (SELECT SalesID FROM tblSalesmanApptsTemp AS Dupe
WHERE (Dupe.SalesID = tblSalesmanApptsTemp.SalesID)
AND (Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime1
OR Dupe.CallTime2 = tblSalesmanApptsTemp.CallTime2
OR Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime2)
AND ((IIf(tblSalesmanApptsTemp.[Monday],1)
+ IIf(tblSalesmanApptsTemp.[Tuesday], 2)
+ IIf(tblSalesmanApptsTemp.[Wednesday], 4)
+ IIf(tblSalesmanApptsTemp.[Thursday], 8)
+ IIf(tblSalesmanApptsTemp.[Friday], 16)
BAND IIf(Dupe.[Monday],1)
+ IIf(Dupe.[Tuesday], 2)
+ IIf(Dupe.[Wednesday], 4)
+ IIf(Dupe.[Thursday], 8)
+ IIf(Dupe.[Friday], 16)) <> 0));

Just noticed the end of your post, which says you are using A97, so that's
useless anyway. The BAND operator is only available in JET 4. Go back to the
simple, normalized structure suggested above, and you can find the
duplicates just by dragging a 2nd copy of the table into query design, and
joining to your original on all fields.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Ron Hinds said:
I'm building an appointment scheduling feature and I'm having trouble
trying
to figure out if a user has created scheduling conflicts. Here is the
scenario: The user will choose from a list of their customers the day(s)
of
the week they wish to call that customer, a call time, and optionally
another call time, in five minute increments. The actual schedule will be
built from that template nightly rolling 90 days out. I want to determine
if
there are conflicts in the template. The template table is laid out as
follows:

[SalesID] Text (50)
[CustID] Text (50)
[CallTime1] Date/Time
[CallTime2] Date/Time
[Monday] Boolean
[Tuesday] Boolean
[Wednesday] Boolean
[Thursday] Boolean
[Friday] Boolean
[Periodicity] Number (Long)

The Primary Key is [SalesID] and [CustID]

I tried using the Find Duplicates wizard, selecting only the columns that
may have dupes (i.e. everything except the SalesID/CustID). In the next
step, it asks which columns to return in addition to those that might be
dupes. I ask for the CustID field. But the resulting query doesn't return
anything when I run a test, even though every field except CustID is
identical in two records. Here is the SQL the wizard generated:

SELECT DISTINCTROW tblSalesmanApptsTemp.CallTime1,
tblSalesmanApptsTemp.CallTime2, tblSalesmanApptsTemp.Monday,
tblSalesmanApptsTemp.Tuesday, tblSalesmanApptsTemp.Wednesday,
tblSalesmanApptsTemp.Thursday, tblSalesmanApptsTemp.Friday,
tblSalesmanApptsTemp.Periodicity, tblSalesmanApptsTemp.CustID
FROM tblSalesmanApptsTemp
WHERE (((tblSalesmanApptsTemp.CallTime1) In (SELECT [CallTime1] FROM
[tblSalesmanApptsTemp] As Tmp GROUP BY
[CallTime1],[CallTime2],[Monday],[Tuesday],[Wednesday],[Thursday],[Friday],[
Periodicity] HAVING Count(*)>1 And [CallTime2] =
[tblSalesmanApptsTemp].[CallTime2] And [Monday] =
[tblSalesmanApptsTemp].[Monday] And [Tuesday] =
[tblSalesmanApptsTemp].[Tuesday] And [Wednesday] =
[tblSalesmanApptsTemp].[Wednesday] And [Thursday] =
[tblSalesmanApptsTemp].[Thursday] And [Friday] =
[tblSalesmanApptsTemp].[Friday] And [Periodicity] =
[tblSalesmanApptsTemp].[Periodicity])))
ORDER BY tblSalesmanApptsTemp.CallTime1, tblSalesmanApptsTemp.CallTime2,
tblSalesmanApptsTemp.Monday, tblSalesmanApptsTemp.Tuesday,
tblSalesmanApptsTemp.Wednesday, tblSalesmanApptsTemp.Thursday,
tblSalesmanApptsTemp.Friday, tblSalesmanApptsTemp.Periodicity;

If I remove CustID then it finds the dupes no problem. But, that isn't
very
useful to me! BTW I'm using Access 97.
 
R

Ron Hinds

Allen - thanks for the reply. Could you elaborate on this statement please?
Go back to the
simple, normalized structure suggested above, and you can find the
duplicates just by dragging a 2nd copy of the table into query design, and
joining to your original on all fields.

How exactly is joining to the original on all fields going to show me the
duplicates? Could you show me some SQL using your example? TIA!

Allen Browne said:
Ron the difficulty you are experiencing is due to the fact that you are
using a non-normalized table structure. CallTime1 and CallTime2 are
repeating fields, and so are the days of the week.

You could solve the problem easily by creating a table with these fields:
SalesID - the salesman
CustID
CallDay - 1 = Monday; 2 = Tuesday, ...
CallTime - A number for each 5-minute slot in a workday.
Then it's dead simple to see if you have a duplicate.

If you are determined to go ahead with your existing structure anyway, you
will be facing monstrously absurd queries, like this one which will not run
in the query window, but just might if you execute it programmatically as an
ADO recordset:

SELECT tblSalesmanApptsTemp .* FROM tblSalesmanApptsTemp
WHERE EXISTS (SELECT SalesID FROM tblSalesmanApptsTemp AS Dupe
WHERE (Dupe.SalesID = tblSalesmanApptsTemp.SalesID)
AND (Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime1
OR Dupe.CallTime2 = tblSalesmanApptsTemp.CallTime2
OR Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime2)
AND ((IIf(tblSalesmanApptsTemp.[Monday],1)
+ IIf(tblSalesmanApptsTemp.[Tuesday], 2)
+ IIf(tblSalesmanApptsTemp.[Wednesday], 4)
+ IIf(tblSalesmanApptsTemp.[Thursday], 8)
+ IIf(tblSalesmanApptsTemp.[Friday], 16)
BAND IIf(Dupe.[Monday],1)
+ IIf(Dupe.[Tuesday], 2)
+ IIf(Dupe.[Wednesday], 4)
+ IIf(Dupe.[Thursday], 8)
+ IIf(Dupe.[Friday], 16)) <> 0));

Just noticed the end of your post, which says you are using A97, so that's
useless anyway. The BAND operator is only available in JET 4. Go back to the
simple, normalized structure suggested above, and you can find the
duplicates just by dragging a 2nd copy of the table into query design, and
joining to your original on all fields.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

I'm building an appointment scheduling feature and I'm having trouble
trying
to figure out if a user has created scheduling conflicts. Here is the
scenario: The user will choose from a list of their customers the day(s)
of
the week they wish to call that customer, a call time, and optionally
another call time, in five minute increments. The actual schedule will be
built from that template nightly rolling 90 days out. I want to determine
if
there are conflicts in the template. The template table is laid out as
follows:

[SalesID] Text (50)
[CustID] Text (50)
[CallTime1] Date/Time
[CallTime2] Date/Time
[Monday] Boolean
[Tuesday] Boolean
[Wednesday] Boolean
[Thursday] Boolean
[Friday] Boolean
[Periodicity] Number (Long)

The Primary Key is [SalesID] and [CustID]

I tried using the Find Duplicates wizard, selecting only the columns that
may have dupes (i.e. everything except the SalesID/CustID). In the next
step, it asks which columns to return in addition to those that might be
dupes. I ask for the CustID field. But the resulting query doesn't return
anything when I run a test, even though every field except CustID is
identical in two records. Here is the SQL the wizard generated:

SELECT DISTINCTROW tblSalesmanApptsTemp.CallTime1,
tblSalesmanApptsTemp.CallTime2, tblSalesmanApptsTemp.Monday,
tblSalesmanApptsTemp.Tuesday, tblSalesmanApptsTemp.Wednesday,
tblSalesmanApptsTemp.Thursday, tblSalesmanApptsTemp.Friday,
tblSalesmanApptsTemp.Periodicity, tblSalesmanApptsTemp.CustID
FROM tblSalesmanApptsTemp
WHERE (((tblSalesmanApptsTemp.CallTime1) In (SELECT [CallTime1] FROM
[tblSalesmanApptsTemp] As Tmp GROUP BY
[CallTime1],[CallTime2],[Monday],[Tuesday],[Wednesday],[Thursday],[Friday],[
Periodicity] HAVING Count(*)>1 And [CallTime2] =
[tblSalesmanApptsTemp].[CallTime2] And [Monday] =
[tblSalesmanApptsTemp].[Monday] And [Tuesday] =
[tblSalesmanApptsTemp].[Tuesday] And [Wednesday] =
[tblSalesmanApptsTemp].[Wednesday] And [Thursday] =
[tblSalesmanApptsTemp].[Thursday] And [Friday] =
[tblSalesmanApptsTemp].[Friday] And [Periodicity] =
[tblSalesmanApptsTemp].[Periodicity])))
ORDER BY tblSalesmanApptsTemp.CallTime1, tblSalesmanApptsTemp.CallTime2,
tblSalesmanApptsTemp.Monday, tblSalesmanApptsTemp.Tuesday,
tblSalesmanApptsTemp.Wednesday, tblSalesmanApptsTemp.Thursday,
tblSalesmanApptsTemp.Friday, tblSalesmanApptsTemp.Periodicity;

If I remove CustID then it finds the dupes no problem. But, that isn't
very
useful to me! BTW I'm using Access 97.
 
A

Allen Browne

This example assumes you created the table using the fields described, added
an AutoNumber named AppointID, and saved the table with the name Appoint:

SELECT Count(Appoint.AppointID) AS HowMany,
Min(Appoint.AppointID) AS MinAppointID,
Max(Appoint.AppointID) AS MaxAppointID
FROM Appoint INNER JOIN Appoint AS Dupe
ON (Appoint.SalesID = Dupe.SalesID)
AND (Appoint.CallDay = Dupe.CallDay)
AND (Appoint.CallTime = Dupe.CallTime)
WHERE Appoint.AppointID <> Dupe.AppointID
HAVING Count(Appoint.AppointID) > 1;

By joining the table to itself (a second copy, aliased here as Dupe), and
matching on the salesman, callday, and calltime, you identify the
duplicates. The WHERE clause stops a record finding itself as a duplicate.
The HAVING clause chooses only the record where the count is more than 1
(i.e. there are duplicates.) The SELECT clause outputs the number of
duplictes, and the lowest and highest value of the primary key field to help
you identify which ones the duplicates are.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Ron Hinds said:
Allen - thanks for the reply. Could you elaborate on this statement
please?
Go back to the
simple, normalized structure suggested above, and you can find the
duplicates just by dragging a 2nd copy of the table into query design,
and
joining to your original on all fields.

How exactly is joining to the original on all fields going to show me the
duplicates? Could you show me some SQL using your example? TIA!

Allen Browne said:
Ron the difficulty you are experiencing is due to the fact that you are
using a non-normalized table structure. CallTime1 and CallTime2 are
repeating fields, and so are the days of the week.

You could solve the problem easily by creating a table with these fields:
SalesID - the salesman
CustID
CallDay - 1 = Monday; 2 = Tuesday, ...
CallTime - A number for each 5-minute slot in a workday.
Then it's dead simple to see if you have a duplicate.

If you are determined to go ahead with your existing structure anyway,
you
will be facing monstrously absurd queries, like this one which will not run
in the query window, but just might if you execute it programmatically as an
ADO recordset:

SELECT tblSalesmanApptsTemp .* FROM tblSalesmanApptsTemp
WHERE EXISTS (SELECT SalesID FROM tblSalesmanApptsTemp AS Dupe
WHERE (Dupe.SalesID = tblSalesmanApptsTemp.SalesID)
AND (Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime1
OR Dupe.CallTime2 = tblSalesmanApptsTemp.CallTime2
OR Dupe.CallTime1 = tblSalesmanApptsTemp.CallTime2)
AND ((IIf(tblSalesmanApptsTemp.[Monday],1)
+ IIf(tblSalesmanApptsTemp.[Tuesday], 2)
+ IIf(tblSalesmanApptsTemp.[Wednesday], 4)
+ IIf(tblSalesmanApptsTemp.[Thursday], 8)
+ IIf(tblSalesmanApptsTemp.[Friday], 16)
BAND IIf(Dupe.[Monday],1)
+ IIf(Dupe.[Tuesday], 2)
+ IIf(Dupe.[Wednesday], 4)
+ IIf(Dupe.[Thursday], 8)
+ IIf(Dupe.[Friday], 16)) <> 0));

Just noticed the end of your post, which says you are using A97, so
that's
useless anyway. The BAND operator is only available in JET 4. Go back to the
simple, normalized structure suggested above, and you can find the
duplicates just by dragging a 2nd copy of the table into query design,
and
joining to your original on all fields.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

I'm building an appointment scheduling feature and I'm having trouble
trying
to figure out if a user has created scheduling conflicts. Here is the
scenario: The user will choose from a list of their customers the
day(s)
of
the week they wish to call that customer, a call time, and optionally
another call time, in five minute increments. The actual schedule will be
built from that template nightly rolling 90 days out. I want to determine
if
there are conflicts in the template. The template table is laid out as
follows:

[SalesID] Text (50)
[CustID] Text (50)
[CallTime1] Date/Time
[CallTime2] Date/Time
[Monday] Boolean
[Tuesday] Boolean
[Wednesday] Boolean
[Thursday] Boolean
[Friday] Boolean
[Periodicity] Number (Long)

The Primary Key is [SalesID] and [CustID]

I tried using the Find Duplicates wizard, selecting only the columns that
may have dupes (i.e. everything except the SalesID/CustID). In the next
step, it asks which columns to return in addition to those that might
be
dupes. I ask for the CustID field. But the resulting query doesn't return
anything when I run a test, even though every field except CustID is
identical in two records. Here is the SQL the wizard generated:

SELECT DISTINCTROW tblSalesmanApptsTemp.CallTime1,
tblSalesmanApptsTemp.CallTime2, tblSalesmanApptsTemp.Monday,
tblSalesmanApptsTemp.Tuesday, tblSalesmanApptsTemp.Wednesday,
tblSalesmanApptsTemp.Thursday, tblSalesmanApptsTemp.Friday,
tblSalesmanApptsTemp.Periodicity, tblSalesmanApptsTemp.CustID
FROM tblSalesmanApptsTemp
WHERE (((tblSalesmanApptsTemp.CallTime1) In (SELECT [CallTime1] FROM
[tblSalesmanApptsTemp] As Tmp GROUP BY
[CallTime1],[CallTime2],[Monday],[Tuesday],[Wednesday],[Thursday],[Friday],[
Periodicity] HAVING Count(*)>1 And [CallTime2] =
[tblSalesmanApptsTemp].[CallTime2] And [Monday] =
[tblSalesmanApptsTemp].[Monday] And [Tuesday] =
[tblSalesmanApptsTemp].[Tuesday] And [Wednesday] =
[tblSalesmanApptsTemp].[Wednesday] And [Thursday] =
[tblSalesmanApptsTemp].[Thursday] And [Friday] =
[tblSalesmanApptsTemp].[Friday] And [Periodicity] =
[tblSalesmanApptsTemp].[Periodicity])))
ORDER BY tblSalesmanApptsTemp.CallTime1,
tblSalesmanApptsTemp.CallTime2,
tblSalesmanApptsTemp.Monday, tblSalesmanApptsTemp.Tuesday,
tblSalesmanApptsTemp.Wednesday, tblSalesmanApptsTemp.Thursday,
tblSalesmanApptsTemp.Friday, tblSalesmanApptsTemp.Periodicity;

If I remove CustID then it finds the dupes no problem. But, that isn't
very
useful to me! BTW I'm using Access 97.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top