Ignore duplicate records

K

Keith Woollacott

I have a database with two tables; one which holds details of customers
(tbl_Customers) and the other has details of insurance policies
(tbl_Insurance). Over time the tbl_Insurance gets multiple entries for one
customer (the relationship is on the Customer_ID field).

Using the criteria "<Now()" on the Policy_Date field pulls out lapsed
policies, but I need the query to ignore expired policies if there is another
record with a duplicated Customer_ID that has a current policy.

I hope this makes sense to somebody!
 
J

John Spencer

One method that should be fine with data sets that are not too large

SELECT CustomerID
FROM tbl_Insurance
WHERE Policy_Date <Date()
AND NOT Exists
(SELECT CustomerID
FROM tbl_Insurance
WHERE Policy_Date >= Date())

With really large data sets that may be slow. If so, post back and I will
post an alternative that is faster but may be more difficult to comprehend.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
K

Keith Woollacott

Excuse my lack of knowledge, but do I enter that as a criteria for the
CustomerID field? When I do I get an error message telling me that the syntax
is incorrect and to put the subquery in parentheses.
 
J

John Spencer

What I posted would be a complete query and it was wrong anyway since I
forgot to correlate it with the main query on the customerid

If you are not used to entering in the SQL window, then try this.

In the query grid add a calculated field:

Field: Exists (SELECT * FROM tbl_Insurance as I WHERE I.Policy_Date >=
Date() AND I.CustomerID = tbl_Insurance.CustomerID)
Criteria: False

The SQL would look like the following to return just the customer ID
SELECT CustomerID
FROM tbl_Insurance
WHERE Policy_Date <Date()
AND NOT Exists
(SELECT *
FROM tbl_Insurance as I
WHERE I.Policy_Date >= Date() AND
I.CustomerID = tbl_Insurance.CustomerID)

If you cannot work this out, post the SQL of your current query (Menu
View:SQL) and someone (probably me) should be able to modify it to do what
you want.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
K

Keith Woollacott

John, It seems to have worked - many thanks. You mentioned about only working
on small data sets; mine is currently small (about 6,500 records in the
tbl_Insurance). This will increase by about 5,000 each year - how long until
it becomes a large data set and slows down?
 
J

John Spencer

Hard to say. It slows down when you decide the performance is too slow. I
wouldn't worry about it until you get 50,000 or more records (pure guess).

Faster? probably
You could use SQL for that might look like the following and cannot be built
in the query grid

SELECT tbl_Customers.*
FROM (tbl_Customers INNER JOIN tbl_Insurance
ON tbl_Customers.CustomerID = tbl_Insurance.CustomerID)
LEFT JOIN
(SELECT I2.CustomerID
FROM tbl_Insurance as ID
WHERE I2.PolicyDate >= Date()) as I
ON tbl_Customers.CustomerID = I.CustomerID
WHERE tbl_Insurance.PolicyDate < Date()
AND I.CustomerId is Null


You can build something like this using nested queries.
QueryOne identifies all those customers with a policydate after the cutoff
date
Then you use that query and the customers table to build an unmatched query
using the query wizard
and then use this query along with tbl_Insurance to get the final result.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top