GROUP BY vs DISTINCT

W

Warrio

Hello!

Which of GROUP BY and DISTINCT is the quickest? and why?

Thanks for any relevant answer!

PS: Please don't tell me that there is no difference because it's a matter
of one millionth second.
 
W

Warrio

Hi Baz!

So Would you please Tell me if there is difference in the result between
these 2 queries:

"SELECT DINSTINC Field1 FROM Table1"
and
"SELECT Field1 FROM Table1 GROUP BY Field1"

??? in this case, the result will be the same, but I'm asking which one is
the most efficient?
 
B

Baz

Just to explain further, suppose that Table1 has 50 columns, and you do
this:

SELECT DISTINCT * FROM Table1

You *could* get the same results with your GROUP BY trick, but it's not a
query I would want to create, life's too short. Your little academic
exercise might seem meaningful to you with a query that returns only one
column, but it doesn't have much practical use. Not what GROUP BY is
designed for, you see.
 
M

Michel Walsh

GROUP BY will NOT be slower, since it forces you to specify which field(s)
you really need to define 'unique-ness', it will be faster, or take the same
time, at worst, than DISTINCT, but note that GROUP BY is more rigid than
DISTINCT, since any expression in the SELECT clause must then be either in
the GROUP BY, either aggregated.


SELECT DISTINCT lastName, firstName, telNumber FROM somewhere

will be slower than

SELECT lastName, firstName, MAX(telNumber) FROM somewhere GROUP BY lastName,
firstName



since not only distinct will uselessly compare the telNumber field, but it
will also reserve memory to keep it as 'group identifier'; the second
solution neither does the comparison, neither has to reserve that extra
memory.



Vanderghast, Access MVP
 
J

John Spencer

I am not sure you are correct in your evaluation. I think the only way to
know would be to test with data and to run the tests multiple times.


--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
M

Michel Walsh

Unless a DISTINCT is effectively implemented as a GROUP BY ... :)



SELECT DISTINCT lastName, firstName, telNumber FROM somewhere


is 'evaluated' as if it was:


SELECT lastName, firstName, telNumber FROM somewhere
GROUP BY lastName, firstName, telNumber




Vanderghast, Access MVP
 
W

Warrio

Great Thanks Michel for your answer! exactly what I was looking for! :)

I'll have probably to think about it again once or twice so I can really
have a vision of how the memory is allocated!

John, the best way would be to have the code that's behind these GROUP BY
and DISTINCT

the test won't concern a big time difference, plus there is too many
variables within the same machine, you'd have only an average.

thanks again.
 
W

Warrio

Were you the one who said that life is too short?? appearently you have a
lot of time to waste ;)
and about your test, maybe when you were testing the query with GROUP BY,
you had your windows update downloading files

try it 100 times during different times a day and please get back to me ;)
 
M

Michel Walsh

Both queries plan being exactly the same, with:


SELECT DISTINCT au_fname FROM authors

and

SELECT au_fname FROM authors GROUP BY au_fname


*if* you experience some difference in time execution, that is probably due
to some exterior cause.



Vanderghast, Access MVP
 
M

Michel Walsh

and here is the query plan, again, the same, for both statements (excuse the HTML format, I suspect that without it, it turns out un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname] ASC))

|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))




Vanderghast, Access MVP
 
W

Warrio

3stone said:
|
| Vanderghast, Access MVP
|

Un grand bonjour amical à Michel !

(dommage qu'il ne vient plus sur ce forum...)

--
A+
Pierre (3stone) Access MVP
Perso: http://www.3stone.be/
MPFA: http://www.mpfa.info/ (infos générales)
"Michel Walsh" <[email protected]> a écrit dans le message de %[email protected]...
and here is the query plan, again, the same, for both statements (excuse the HTML format, I suspect that without it, it turns out un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname] ASC))

|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))




Vanderghast, Access MVP
 
M

Michel Walsh

"Warrio", c'est un alias pour "3stone" ?


Je renvoie le grand bonjour à Pierre, en tout cas.



Vanderghast, Access MVP




3stone said:
|
| Vanderghast, Access MVP
|

Un grand bonjour amical à Michel !

(dommage qu'il ne vient plus sur ce forum...)

--
A+
Pierre (3stone) Access MVP
Perso: http://www.3stone.be/
MPFA: http://www.mpfa.info/ (infos générales)
"Michel Walsh" <[email protected]> a écrit dans le
message de %[email protected]...
and here is the query plan, again, the same, for both statements (excuse the
HTML format, I suspect that without it, it turns out un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname] ASC))
|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))



Vanderghast, Access MVP
 
M

Michel Walsh

but there is a property in the query's property sheet: "Unique Values", set
to no, by default. {:?)
(the property is reset to no if you set "Unique Records" to yes.)


Vanderghast, Access MVP
 
W

Warrio

Non du tout!
warrio c'est le méchant sur nintendo...

Michel Walsh said:
"Warrio", c'est un alias pour "3stone" ?


Je renvoie le grand bonjour à Pierre, en tout cas.



Vanderghast, Access MVP




3stone said:
|
| Vanderghast, Access MVP
|

Un grand bonjour amical à Michel !

(dommage qu'il ne vient plus sur ce forum...)

--
A+
Pierre (3stone) Access MVP
Perso: http://www.3stone.be/
MPFA: http://www.mpfa.info/ (infos générales)
"Michel Walsh" <[email protected]> a écrit dans le
message de %[email protected]...
and here is the query plan, again, the same, for both statements (excuse
the HTML format, I suspect that without it, it turns out un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname] ASC))
|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))



Vanderghast, Access MVP
 
D

David W. Fenton

I am not sure you are correct in your evaluation. I think the
only way to know would be to test with data and to run the tests
multiple times.

Well, surely the way to tell if they are parsed the same is to use
SHOWPLAN and find out. If they are parsed the same, then there isn't
any need to test with data. If they aren't, then you can only answer
the question with testing.
 
D

David W. Fenton

Were you the one who said that life is too short??

The test demonstrated exactly what he had forecast, that DISTINCT is
the logical way to get DISTINCT values, and that GROUP BY is for a
different purpose (and would thus very likely not be as fast).

He only did the test because you stop you from continuing to natter
on about it.
 
D

David W. Fenton

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))

Er, that's a SQL Server query plan, not a Jet one.

Who said we were using SQL Server as the back end?
 
D

David W. Fenton

and here is the query plan, again, the same, for both statements
(excuse the HTML format, I suspect that without it, it turns out
un-readable):

|--Sort(DISTINCT ORDER BY:([pubs].[dbo].[authors].[au_fname]
|ASC))

|--Index Scan(OBJECT:([pubs].[dbo].[authors].[aunmind]))

And if you run it in Jet, the SHOWPLAN gives this (Jet 3.5 or 4.0,
run both wiht LastName indexed and indexed (non-unique)):

--- Query1 ---
[SELECT DISTINCT tblPerson.LastName
FROM tblPerson;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

store result in temporary table


--- Query2 ---
[SELECT tblPerson.LastName
FROM tblPerson
GROUP BY tblPerson.LastName;]

- Inputs to Query -
Table 'tblPerson'
- End inputs to Query -

01) Group table 'tblPerson'

This shows that the answer to the question depends on the database
engine being used, and, as expected, Jet handles the two completely
differently.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top