Need query to filter out duplicates based on another column

M

mscertified

I need a query that will select records from a table based on a GroupId
column. For a given GroupId there can be many different Invoices (InvoiceNo
column). The InvoiceNo needs to be unique. However, the InvoiceNo may not be
unique and in those cases I need to look an another column BatchDate and take
the record for the InvoiceNo with the latest date ignoring any earlier dates.
I'm unsure as to how to structure the SQL to handle the last part of
eliminating earlier versions of invoices based on the date.
 
K

KARL DEWEY

Try this --
SELECT GroupId, InvoiceNo, Max([BatchDate ]) AS InvoiceDate
FROM YourTable
GROUP BY GroupId, InvoiceNo;
 
M

mscertified

Thanks for the response.
Yes, that would work if those columns were all I needed but there are a host
of other columns relating to the individual invoice. As soon as I add the
other columns it wants an aggregate function to know which to give me. I
think I need some kind of self-join to select what I want but so far exactly
how to do it has defeated me.


KARL DEWEY said:
Try this --
SELECT GroupId, InvoiceNo, Max([BatchDate ]) AS InvoiceDate
FROM YourTable
GROUP BY GroupId, InvoiceNo;
--
KARL DEWEY
Build a little - Test a little


mscertified said:
I need a query that will select records from a table based on a GroupId
column. For a given GroupId there can be many different Invoices (InvoiceNo
column). The InvoiceNo needs to be unique. However, the InvoiceNo may not be
unique and in those cases I need to look an another column BatchDate and take
the record for the InvoiceNo with the latest date ignoring any earlier dates.
I'm unsure as to how to structure the SQL to handle the last part of
eliminating earlier versions of invoices based on the date.
 
K

KARL DEWEY

Ok, save that query and join it in another query on those three field to your
table and include the other fields you need.
If you need more help with this post yourr select query and someone can
marry it up for you.
--
KARL DEWEY
Build a little - Test a little


mscertified said:
Thanks for the response.
Yes, that would work if those columns were all I needed but there are a host
of other columns relating to the individual invoice. As soon as I add the
other columns it wants an aggregate function to know which to give me. I
think I need some kind of self-join to select what I want but so far exactly
how to do it has defeated me.


KARL DEWEY said:
Try this --
SELECT GroupId, InvoiceNo, Max([BatchDate ]) AS InvoiceDate
FROM YourTable
GROUP BY GroupId, InvoiceNo;
--
KARL DEWEY
Build a little - Test a little


mscertified said:
I need a query that will select records from a table based on a GroupId
column. For a given GroupId there can be many different Invoices (InvoiceNo
column). The InvoiceNo needs to be unique. However, the InvoiceNo may not be
unique and in those cases I need to look an another column BatchDate and take
the record for the InvoiceNo with the latest date ignoring any earlier dates.
I'm unsure as to how to structure the SQL to handle the last part of
eliminating earlier versions of invoices based on the date.
 
M

mscertified

hjere is what I ended up with:

SELECT DISTINCT RTRIM(IH.InvoiceNumber) AS ESIInvoice,
Convert(datetime,IH.CreationDate,101) AS ESIDate,
IH.InvoiceTotalAmount AS ESIAmount
FROM dbo.InvoiceHeaderInterface AS IH
INNER JOIN
(SELECT InvoiceNumber, max(CycleId) As MaxCycle
FROM dbo.InvoiceHeaderInterface
GROUP By InvoiceNumber) As IM
ON (IH.InvoiceNumber = IM.InvoiceNumber
AND IH.CycleId = IM.MaxCycle)
WHERE LTRIM(IH.VendorNumber) = 'G11111111'

Now I'm wondering if it would be more efficient to have the WHERE clause
inside each of the SELECTS rather than outside the SELECTS.


KARL DEWEY said:
Ok, save that query and join it in another query on those three field to your
table and include the other fields you need.
If you need more help with this post yourr select query and someone can
marry it up for you.
--
KARL DEWEY
Build a little - Test a little


mscertified said:
Thanks for the response.
Yes, that would work if those columns were all I needed but there are a host
of other columns relating to the individual invoice. As soon as I add the
other columns it wants an aggregate function to know which to give me. I
think I need some kind of self-join to select what I want but so far exactly
how to do it has defeated me.


KARL DEWEY said:
Try this --
SELECT GroupId, InvoiceNo, Max([BatchDate ]) AS InvoiceDate
FROM YourTable
GROUP BY GroupId, InvoiceNo;
--
KARL DEWEY
Build a little - Test a little


:

I need a query that will select records from a table based on a GroupId
column. For a given GroupId there can be many different Invoices (InvoiceNo
column). The InvoiceNo needs to be unique. However, the InvoiceNo may not be
unique and in those cases I need to look an another column BatchDate and take
the record for the InvoiceNo with the latest date ignoring any earlier dates.
I'm unsure as to how to structure the SQL to handle the last part of
eliminating earlier versions of invoices based on the date.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top