Median and Quartiles - Speeding Up The Calculations

  • Thread starter ScottMSP via AccessMonster.com
  • Start date
S

ScottMSP via AccessMonster.com

Hello,

With a lot of help from Mike Walsh (Vanderghast), I was able to successfully
calculate the quartiles (25th, Median, and 75th) within Access.

The issue I am having now is that when I run the queries, it takes a really
long time to produce the results. I am wondering if there is a way to speed
up the queries to calculate the information.

I have posted below the SQL for the three queries that run to produce the
results. When I test the first two, they run quickly (within seconds), but
the last one (the final one), it is the one that takes too long to produce
the results.

Thanks in advance.

Query1
SELECT tEmployeeMaster.DataEffectiveDate, tEmployeeMaster.JobCode,
tEmployeeMaster.HourlyRate, (SELECT Count (*) FROM tEmployeeMaster AS M1
WHERE M1.JobCode = tEmployeeMaster.JobCode AND M1.DataEffectiveDate =
tEmployeeMaster.DataEffectiveDate AND M1.HourlyRate < tEmployeeMaster.
HourlyRate)+(SELECT Count (*) FROM tEmployeeMaster AS M1 WHERE M1.JobCode =
tEmployeeMaster.JobCode AND M1.DataEffectiveDate = tEmployeeMaster.
DataEffectiveDate AND M1.HourlyRate = tEmployeeMaster.HourlyRate AND M1.
Position <= tEmployeeMaster.Position) AS Rank
FROM tEmployeeMaster
GROUP BY tEmployeeMaster.DataEffectiveDate, tEmployeeMaster.JobCode,
tEmployeeMaster.HourlyRate, tEmployeeMaster.Position
HAVING (((tEmployeeMaster.DataEffectiveDate) Between #4/14/2010# And
#4/16/2010#))
ORDER BY tEmployeeMaster.JobCode, tEmployeeMaster.HourlyRate;

Query2
SELECT qryQuartiles_Sub_Rank.JobCode, Count(qryQuartiles_Sub_Rank.JobCode) AS
CountOfJobCode, 0.25*([CountOfJobCode]+3) AS InterpolQ1, Int([InterpolQ1]) AS
N0Q1, 1-([InterpolQ1]-[N0Q1]) AS Frac0Q1, [N0Q1]+1 AS N1Q1, 1-[Frac0Q1] AS
Frac1Q1, 0.5*([CountOfJobCode]+1) AS InterpolQ2, Int([InterpolQ2]) AS N0Q2, 1-
([InterpolQ2]-[N0Q2]) AS Frac0Q2, [N0Q2]+1 AS N1Q2, 1-[Frac0Q2] AS Frac1Q2, 0.
25*(([CountOfJobCode]*3)+1) AS InterpolQ3, Int([InterpolQ3]) AS N0Q3, 1-(
[InterpolQ3]-[N0Q3]) AS Frac0Q3, [N0Q3]+1 AS N1Q3, 1-[Frac0Q3] AS Frac1Q3
FROM qryQuartiles_Sub_Rank
GROUP BY qryQuartiles_Sub_Rank.JobCode;

Query3
SELECT QryQuartiles_Sub_Interpolation.JobCode, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank].[HourlyRate]*[Frac0Q1]+
[qryQuartiles_Sub_Rank_1].[hourlyRate]*[Frac1Q1]) AS Incumbent_Q1, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank_2].[HourlyRate]*[Frac0Q2]+
[qryQuartiles_Sub_Rank_3].[hourlyRate]*[Frac1Q2]) AS Incumbent_Q2, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank_5].[HourlyRate]*[Frac0Q3]+
[qryQuartiles_Sub_Rank_4].[hourlyRate]*[Frac1Q3]) AS Incumbent_Q3
FROM (((((QryQuartiles_Sub_Interpolation LEFT JOIN qryQuartiles_Sub_Rank ON
(QryQuartiles_Sub_Interpolation.jobCode = qryQuartiles_Sub_Rank.jobCode) AND
(QryQuartiles_Sub_Interpolation.N0Q1 = qryQuartiles_Sub_Rank.Rank)) LEFT JOIN
qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_1 ON
(QryQuartiles_Sub_Interpolation.jobCode = qryQuartiles_Sub_Rank_1.jobCode)
AND (QryQuartiles_Sub_Interpolation.N1Q1 = qryQuartiles_Sub_Rank_1.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_2 ON
(QryQuartiles_Sub_Interpolation.JobCode = qryQuartiles_Sub_Rank_2.JobCode)
AND (QryQuartiles_Sub_Interpolation.N1Q2 = qryQuartiles_Sub_Rank_2.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_3 ON
(QryQuartiles_Sub_Interpolation.JobCode = qryQuartiles_Sub_Rank_3.JobCode)
AND (QryQuartiles_Sub_Interpolation.N0Q2 = qryQuartiles_Sub_Rank_3.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_4 ON
(QryQuartiles_Sub_Interpolation.N1Q3 = qryQuartiles_Sub_Rank_4.Rank) AND
(QryQuartiles_Sub_Interpolation.JobCode = qryQuartiles_Sub_Rank_4.JobCode))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_5 ON
(QryQuartiles_Sub_Interpolation.N0Q3 = qryQuartiles_Sub_Rank_5.Rank) AND
(QryQuartiles_Sub_Interpolation.JobCode = qryQuartiles_Sub_Rank_5.JobCode);
 
G

GeoffG

Please confirm:

1. Is Query 1 named "qryQuartiles_Sub_Rank"?

2. Is Query 2 named "QryQuartiles_Sub_Interpolation"?

3. Did you copy the SQL statements of Queries 1 & 2
correctly?
I'm getting the following error message:
Invalid use of '.', '!' or '()' in 'M1.JobCode =
tEmployeeMaster.JobCode AND M1.DataEffectiveDate =
tEmployeeMaster.DataEffectiveDate AND M1.HourlyRate <
tEmployeeMaster. HourlyRate)'

4. Did you mean to use "<" in above subquery.
You use "=" in second subquery.

5. Are the field data types as follows?
DataEffectiveDate (Date/Time)
JobCode (Text)
HourlyRate (Currency)
Position (Number, Long)

Geoff


ScottMSP via AccessMonster.com said:
Hello,

With a lot of help from Mike Walsh (Vanderghast), I was
able to successfully
calculate the quartiles (25th, Median, and 75th) within
Access.

The issue I am having now is that when I run the queries,
it takes a really
long time to produce the results. I am wondering if there
is a way to speed
up the queries to calculate the information.

I have posted below the SQL for the three queries that run
to produce the
results. When I test the first two, they run quickly
(within seconds), but
the last one (the final one), it is the one that takes too
long to produce
the results.

Thanks in advance.

Query1
SELECT tEmployeeMaster.DataEffectiveDate,
tEmployeeMaster.JobCode,
tEmployeeMaster.HourlyRate, (SELECT Count (*) FROM
tEmployeeMaster AS M1
WHERE M1.JobCode = tEmployeeMaster.JobCode AND
M1.DataEffectiveDate =
tEmployeeMaster.DataEffectiveDate AND M1.HourlyRate <
tEmployeeMaster.
HourlyRate)+(SELECT Count (*) FROM tEmployeeMaster AS M1
WHERE M1.JobCode =
tEmployeeMaster.JobCode AND M1.DataEffectiveDate =
tEmployeeMaster.
DataEffectiveDate AND M1.HourlyRate =
tEmployeeMaster.HourlyRate AND M1.
Position <= tEmployeeMaster.Position) AS Rank
FROM tEmployeeMaster
GROUP BY tEmployeeMaster.DataEffectiveDate,
tEmployeeMaster.JobCode,
tEmployeeMaster.HourlyRate, tEmployeeMaster.Position
HAVING (((tEmployeeMaster.DataEffectiveDate) Between
#4/14/2010# And
#4/16/2010#))
ORDER BY tEmployeeMaster.JobCode,
tEmployeeMaster.HourlyRate;

Query2
SELECT qryQuartiles_Sub_Rank.JobCode,
Count(qryQuartiles_Sub_Rank.JobCode) AS
CountOfJobCode, 0.25*([CountOfJobCode]+3) AS InterpolQ1,
Int([InterpolQ1]) AS
N0Q1, 1-([InterpolQ1]-[N0Q1]) AS Frac0Q1, [N0Q1]+1 AS
N1Q1, 1-[Frac0Q1] AS
Frac1Q1, 0.5*([CountOfJobCode]+1) AS InterpolQ2,
Int([InterpolQ2]) AS N0Q2, 1-
([InterpolQ2]-[N0Q2]) AS Frac0Q2, [N0Q2]+1 AS N1Q2,
1-[Frac0Q2] AS Frac1Q2, 0.
25*(([CountOfJobCode]*3)+1) AS InterpolQ3,
Int([InterpolQ3]) AS N0Q3, 1-(
[InterpolQ3]-[N0Q3]) AS Frac0Q3, [N0Q3]+1 AS N1Q3,
1-[Frac0Q3] AS Frac1Q3
FROM qryQuartiles_Sub_Rank
GROUP BY qryQuartiles_Sub_Rank.JobCode;

Query3
SELECT QryQuartiles_Sub_Interpolation.JobCode, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank].[HourlyRate]*[Frac0Q1]+
[qryQuartiles_Sub_Rank_1].[hourlyRate]*[Frac1Q1]) AS
Incumbent_Q1, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank_2].[HourlyRate]*[Frac0Q2]+
[qryQuartiles_Sub_Rank_3].[hourlyRate]*[Frac1Q2]) AS
Incumbent_Q2, IIf(
[QryQuartiles_Sub_Interpolation].[CountOfJobCode]=1,[qryQuartiles_Sub_Rank].
[HourlyRate],[qryQuartiles_Sub_Rank_5].[HourlyRate]*[Frac0Q3]+
[qryQuartiles_Sub_Rank_4].[hourlyRate]*[Frac1Q3]) AS
Incumbent_Q3
FROM (((((QryQuartiles_Sub_Interpolation LEFT JOIN
qryQuartiles_Sub_Rank ON
(QryQuartiles_Sub_Interpolation.jobCode =
qryQuartiles_Sub_Rank.jobCode) AND
(QryQuartiles_Sub_Interpolation.N0Q1 =
qryQuartiles_Sub_Rank.Rank)) LEFT JOIN
qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_1 ON
(QryQuartiles_Sub_Interpolation.jobCode =
qryQuartiles_Sub_Rank_1.jobCode)
AND (QryQuartiles_Sub_Interpolation.N1Q1 =
qryQuartiles_Sub_Rank_1.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_2
ON
(QryQuartiles_Sub_Interpolation.JobCode =
qryQuartiles_Sub_Rank_2.JobCode)
AND (QryQuartiles_Sub_Interpolation.N1Q2 =
qryQuartiles_Sub_Rank_2.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_3
ON
(QryQuartiles_Sub_Interpolation.JobCode =
qryQuartiles_Sub_Rank_3.JobCode)
AND (QryQuartiles_Sub_Interpolation.N0Q2 =
qryQuartiles_Sub_Rank_3.Rank))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_4
ON
(QryQuartiles_Sub_Interpolation.N1Q3 =
qryQuartiles_Sub_Rank_4.Rank) AND
(QryQuartiles_Sub_Interpolation.JobCode =
qryQuartiles_Sub_Rank_4.JobCode))
LEFT JOIN qryQuartiles_Sub_Rank AS qryQuartiles_Sub_Rank_5
ON
(QryQuartiles_Sub_Interpolation.N0Q3 =
qryQuartiles_Sub_Rank_5.Rank) AND
(QryQuartiles_Sub_Interpolation.JobCode =
qryQuartiles_Sub_Rank_5.JobCode);
 
S

ScottMSP via AccessMonster.com

GeoffG said:
Please confirm:

1. Is Query 1 named "qryQuartiles_Sub_Rank"?

2. Is Query 2 named "QryQuartiles_Sub_Interpolation"?

3. Did you copy the SQL statements of Queries 1 & 2
correctly?
I'm getting the following error message:
Invalid use of '.', '!' or '()' in 'M1.JobCode =
tEmployeeMaster.JobCode AND M1.DataEffectiveDate =
tEmployeeMaster.DataEffectiveDate AND M1.HourlyRate <
tEmployeeMaster. HourlyRate)'

4. Did you mean to use "<" in above subquery.
You use "=" in second subquery.

5. Are the field data types as follows?
DataEffectiveDate (Date/Time)
JobCode (Text)
HourlyRate (Currency)
Position (Number, Long)

Geoff
[quoted text clipped - 112 lines]
(QryQuartiles_Sub_Interpolation.JobCode =
qryQuartiles_Sub_Rank_5.JobCode);

Geoff,

Thanks for reading my post.

In response to your questions:

1. Yes, Query 1 is named qryQuartiles_Sub_Rank
2. Yes, Query 2 is named qry Quartiles_Sub_Interpolation
3. The query SQLs are copied correctly. If it matters, I am using Access
2007.
4. Not sure what you are asking in this question. When I use a much smaller
data set (73 records), the three queries work and produce the results
5. The table fields and types are:
DataEffectiveDate (Date/Time) - Primary Key
Person (Text) - Primary Key
Position (Text) - Primary Key
JobCode (Text)
HourlyRate (Currency)

Thanks in advance.
 
G

GeoffG

Scott:

There is a fast way to calculate the the quartiles.
I have a solution that does not use queries or domain
aggregate functions (like DCount).

I loaded my solution database with 19800 records, using 1800
job codes.
I created 11 records for each job code, giving 11 * 1800 =
19800 records.
There are 3 quartiles for each job code; i.e. 3 * 1800 =
5400 quartiles.
The 5400 results appear in a 32-page Access report.
The first page, containing the three quartiles for each of
the first 54 job codes, appears in 1 second.
If you then forward to the last page of the report, it
appears in 8 seconds (on my slow computer), having
calculated the 5400 quatiles in that time.
Would that be fast enough?

The calculations are performed as the report is created.
The report can be be previewed on screen or printed.
If you need a more sophisticated report - for example, that
will preview just one or a range of job codes - then that is
entirely possible, but I haven't taken my solution that far.

As you may know, Microsoft Excel uses the Gumbell method for
calculating quartiles.
From reading your posts in the AccessMonster forum, I note
you are using the Freund and Perles method.
The Gumbell method seems to produce the same results as
those you posted.

I'm not sure where to begin the explanation.
I cannot send you my database via the newsgroup, as the
Microsoft newsgroup won't accept mdb file attachments.

There is another way I could send the report as a text file
via the newsgroup. But it will be fiddly.

Alternatively, you can email me and I can return my database
directly to you.
Remove NO SPAM from the following email address.
(e-mail address removed)
We can then discuss any issues you may have.

Geoff
 
G

GeoffG

Scott:

Further to my previous post:

I have added two further reports:

1. A report that prints the quartiles for one or more
JobCodes.
(This report looks like a spreadsheet with columns for
JobCode, Q1, Q2 and Q3 and missing hourly rates.)

2. An advanced report that calculates the quartiles for
each JobCode, puts each member of the population into its
appropriate quartile (1-4), and then creates a report that
prints each JobCode once as a heading, followed by rows of
data grouped into quartiles, showing DataEffectiveDate,
Person, Position and HourlyRate.

I don't know whether these additional reports will be
useful, but they demonstrate Access's capabilities and may
give you some ideas.

BTW:
Microsoft will be closing this newsgroup on June 1st, 2010.
For more information, see Microsoft's post in this
newsgroup, entitled:
"Reminder - Microsoft Responds to the Evolution of
Community"
Dated: May 20, 2010

Geoff
 
G

GeoffG

Special Message to John:
I replied to your email but my reply was returned to me:
Undeliverable. Blocked by Barracuda Reputation. Unknown
Address error 554.
Very happy to copy you in to solution.
Please let me know by email if I need do anything to
overcome email issue.
Geoff
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top