Help with creative thinking needed

TonyT

Apologies if this appears as a repost; the first went south.

I have a fairly complex design situation in which I need to produce part
order quantities based on 'just in time' ordering, using historical sales
data. I'll explain a bit more to help fill in the picture:

The setup is an accdb front end to an accdb back end on XP, everything in DAO.

I have a test setup with 90,000 parts; about 9,000 of these parts are
designated stock items and have a '# of Days to Stock' field. There are
about 150,000 records in the invoice parts table for the 9,000 stock-line parts.

This is used for seasonal sales, so I have a function that creates a totals
query (currently including 3 subqueries) to get sales info for each stock
part: the last 4 weeks' sales, last year's sales for the coming 4 weeks (i.e.
-52 wks to -48 wks), and the same again for the preceding year.
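To give an idea of the shape of that query (the table and field names below are simplified stand-ins, not my real schema), it's along these lines:

SELECT p.PartID, p.QtyInSt, p.DaysToStock,
    (SELECT Sum(i.Qty) FROM tblInvoiceParts AS i
     WHERE i.PartID = p.PartID
       AND i.InvDate > Date() - 28) AS SoldLast4Wks,
    (SELECT Sum(i.Qty) FROM tblInvoiceParts AS i
     WHERE i.PartID = p.PartID
       AND i.InvDate > Date() - 364 AND i.InvDate <= Date() - 336) AS SoldNext4WksLastYr,
    (SELECT Sum(i.Qty) FROM tblInvoiceParts AS i
     WHERE i.PartID = p.PartID
       AND i.InvDate > Date() - 728 AND i.InvDate <= Date() - 700) AS SoldNext4WksYrBefore
FROM tblParts AS p
WHERE p.StockItem = True;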

I then work out the quantity required and also the number of days that the
current stock will last, using 2 separate formulae, and put this data into a
temporary table for the user to override/edit its recommendations before it
gets transferred to the order detail table.

The whole process is quite slow. Originally it was taking 90 seconds to
populate the temporary table with the order quantities and other data, but
with loads of tweaking and testing I've got it down to 22 seconds. BUT, this
is on my quick un-networked PC, not my customer's ancient tat on slooow
networks.

I was surprised to see, during all my testing and re-working, that using
a series of nested subqueries to give me 3 different sets of date-related
totals was considerably faster than using separate queries, a single crosstab
query or a single joined query between the sales and parts tables. I've tried
moving the data into more tables and parsing it both more times and fewer, but
I can't improve on my current speed.

So my questions are these:

What should be the outright quickest method for getting 3 different sum
totals of items from a single table with moving date ranges as the sum
criteria? (I also require 4 fields of data from the parts table to complete
the calculations without having to multi-nest queries further down the line.)

Is there a better way to get accurate 'just in time' order info without the
overhead of my current nested query design? Has anyone come up with a decent
model for this in an Access environment?

Are there other ways of shaving milliseconds from calculations and queries?
For example, in my function, if I open a recordset based on the nested
subquery design and iterate through it, referring to the same field (of the
same record) repeatedly (i.e. the same field QtyInSt appears multiple times in
the calculations), is it quicker or slower to assign this value to a variable
and then refer to that, rather than the recordset field rcd!QtyInSt? Does it
matter if I use SELECT * FROM rather than naming the individual fields if I
want ALL of the fields returned?
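To make that concrete, this is the sort of pattern I mean (the query and variable names are just placeholders for my real ones):

Dim db As DAO.Database
Dim rcd As DAO.Recordset
Dim lngQtyInSt As Long

Set db = CurrentDb
Set rcd = db.OpenRecordset("qryStockTotals", dbOpenSnapshot)

Do While Not rcd.EOF
    ' Option A: read the field once into a variable...
    lngQtyInSt = rcd!QtyInSt
    ' ...and use lngQtyInSt in each of the calculations,
    ' rather than touching rcd!QtyInSt several times per record (option B).
    rcd.MoveNext
Loop

rcd.Close
Set rcd = Nothing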

I know that I'm into the realms of theory here, but I'm interested to learn,
and I've done all the usual speed improvement tweaks (subdatasheets off,
short db names in the root folder, persistent back end connections, etc.).

I've already sacrificed a calculated stock quantity for a stored one to save
the additional overhead; I just can't think of a better approach to solving
this. The only other approach I've considered is to use PC idle
time to calculate the figure that I need the most - how many parts will I
sell per day at this particular point in time - and then store that as a
figure within the parts table, so that I can get much quicker
calculations of qty required and remaining days of the current stock holding.
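As a rough sketch of that idea (the SalesPerDay field and the other names below are invented for the example, and the rate shown is simply the last 4 weeks divided by 28 rather than my real weighted figure):

Dim db As DAO.Database
Dim rcdTot As DAO.Recordset

Set db = CurrentDb

' Idle-time pass: derive a daily sales rate per stock part from the
' date-range totals and stamp it onto the parts table.
Set rcdTot = db.OpenRecordset("qryStockTotals", dbOpenSnapshot)
Do While Not rcdTot.EOF
    db.Execute "UPDATE tblParts SET SalesPerDay = " & _
               Str(Nz(rcdTot!SoldLast4Wks, 0) / 28) & _
               " WHERE PartID = " & rcdTot!PartID, dbFailOnError
    rcdTot.MoveNext
Loop
rcdTot.Close

' At order time the two figures I need then become trivial per part:
'   DaysLeft = QtyInSt / SalesPerDay
'   QtyReqd  = SalesPerDay * DaysToStock - QtyInSt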

Your thoughts are most welcomed,

TonyT
 
Stuart McCall

Are there other ways of shaving milliseconds from calculations and queries?
For example, in my function, if I open a recordset based on the nested
subquery design and iterate through it, referring to the same field (of the
same record) repeatedly (i.e. the same field QtyInSt appears multiple times in
the calculations), is it quicker or slower to assign this value to a variable
and then refer to that, rather than the recordset field rcd!QtyInSt? Does it
matter if I use SELECT * FROM rather than naming the individual fields if I
want ALL of the fields returned?

Wow, that's some question. I'll take this little bit. I won't discuss db
theory because I'm self-taught, but one thing springs to mind that you
didn't mention (at least not directly): indexes. In this kind of situation
it's been my experience that some judicious index-tweaking can work wonders.
I tend to find that concentrating on fields used in criteria and making the
indexing as 'tight' as possible (my terminology - I mean indexed with no
duplicates as opposed to allowing dupes, etc.) has the most beneficial effect.
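For instance (the table and field names here are only examples, not your actual schema), the sort of thing I mean can be added in code with a bit of DDL:

' A unique ('no duplicates') index on the key field of the parts table
CurrentDb.Execute "CREATE UNIQUE INDEX idxPartID ON tblParts (PartID)"

' A plain index covering the fields used as criteria in the date-range totals
CurrentDb.Execute "CREATE INDEX idxPartDate ON tblInvoiceParts (PartID, InvDate)"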

Re the matter of SELECT *: if the SQL in question is in a named query, I
believe Access (or rather Jet) has special optimizations for these, so they
are slightly quicker.

HTH
 
Jack Leach

Have you tried taking a look at JetShowPlan? I haven't done much more than peek
out of curiosity myself, but this will show you how Jet has optimized the
query, and allow you to make some (possibly) better decisions on how to
present the data to Jet. If you are able to do this (and utilize indexes per
Stuart's comment), you may be able to do yourself some large favors in
performance.

hth

--
Jack Leach
www.tristatemachine.com

"I haven't failed, I've found ten thousand ways that don't work."
-Thomas Edison (1847-1931)
 
TonyT

Thank you both for your replies. I know it was a long question, but the
technical aspects interest me and I feel I ought to know more, yet I always
seem to draw a blank when searching on such subjects.

The indexing should be fine; the work is being done on ID and indexed
fields. The table inserts are pretty rapid at 15-20ms per record (faster as
rcd.AddNew/Update than db.Execute by 5ms or so); it's just the quantity of
records being manipulated that makes it all add up to a large time, so every
millisecond counts.
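(For clarity, by those two insert methods I mean roughly the following; the table, field names and values are placeholders:)

Dim db As DAO.Database
Dim rcdOrd As DAO.Recordset

Set db = CurrentDb

' Method 1: DAO AddNew/Update on the temporary order table
Set rcdOrd = db.OpenRecordset("tblTempOrder", dbOpenDynaset)
rcdOrd.AddNew
rcdOrd!PartID = 12345
rcdOrd!QtyToOrder = 6
rcdOrd.Update
rcdOrd.Close

' Method 2: an INSERT statement run through the engine
db.Execute "INSERT INTO tblTempOrder (PartID, QtyToOrder) " & _
           "VALUES (12345, 6)", dbFailOnError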

Is there a version of ShowPlan for 2007 and its renamed engine (no longer
called Jet)? Also, I note that it doesn't handle subqueries.

I think I've got a lot more head scratching to do on this one.
 
Jack Leach

Is there a version of ShowPlan for 2007 and its renamed engine (no longer
called Jet)? Also, I note that it doesn't handle subqueries.

I don't know.

Another specific name that comes to mind for query performance is Joe
Celko... if I understand correctly this guy is an SQL genius, and that's
pretty much a direct translation to performance. The few bits I've read
from him are amazing (and far beyond my near non-existent SQL skills).

He has a book titled Joe Celko's SQL For Smarties... you might see about
finding it on Google to read an intro to it. You'll probably end up buying
it.

--
Jack Leach
www.tristatemachine.com

"I haven''t failed, I''ve found ten thousand ways that don''t work."
-Thomas Edison (1847-1931)
 
David H

1) The fact that you're re-querying sales information for the last 4 weeks and
historical sales tells me that you haven't considered using summary tables at
all.

Each week that you calculate sales for the past four weeks, you're
duplicating work. The sales for the last week of August will be exactly the
same in the 2nd, 3rd and 4th weeks of September. Why recalculate the value
each time you need to review sales for the past 4 weeks?

Database normalization is a theoretical ideal; there is a time and place for
tables that contain summary information. For example, if you need to know the
dollar amount that customers in a bank withdrew in December 2000, it's
entirely appropriate to have a summary table that contains that specific
value. The information isn't going to change. It is common in banking
applications to use summary tables to hold the current balance of an
account; obviously this changes as debits and credits hit, but having the
information in a summary table speeds up retrieving the value as opposed to
totalling it all. While the value is updated as the transactions hit, there
will be a job at night that validates it or updates it based on the actual
detail of the account.
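As a rough sketch of how that might look in your case (the table and field names are just examples):

' One-off: a summary table keyed by part and week
CurrentDb.Execute _
    "CREATE TABLE tblWeeklySales " & _
    "(PartID LONG, WeekStart DATETIME, QtySold LONG, " & _
    "CONSTRAINT pkWeekly PRIMARY KEY (PartID, WeekStart))"

' Weekly job (run once the week has closed): append last week's totals.
' Those rows never change again, so they are never recalculated.
CurrentDb.Execute _
    "INSERT INTO tblWeeklySales (PartID, WeekStart, QtySold) " & _
    "SELECT PartID, DateValue(InvDate) - Weekday(InvDate, 2) + 1, Sum(Qty) " & _
    "FROM tblInvoiceParts " & _
    "WHERE InvDate >= Date() - 7 AND InvDate < Date() " & _
    "GROUP BY PartID, DateValue(InvDate) - Weekday(InvDate, 2) + 1", dbFailOnError

' The 4-week totals then come from summing at most 4 small summary rows
' per part instead of re-scanning the 150,000-row invoice detail table.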

2) Read up on the concepts of data warehousing and Online Analytical
Processing (OLAP). What you're doing in terms of the historical data falls
within that area.

3) Any time you're dealing with volume, it's going to take time. You can
certainly optimize, but optimization only goes so far. Back when I worked
at Walt Disney World, the property management system ran on an AS/400 (aka
iSeries) and was lightning fast. However, there was one job, KILLERG, that would
take nearly half a day to complete. The job cycled through past reservations
that had passed the purge date and purged them. It wasn't that the server was
slow; it was that there was a huge number of records across numerous files
that had to go, and that took time.
 
TonyT

Thank you for that response, David. I'll do some reading up and educate myself
a bit.

It's unnatural for me to think in terms of deliberate non-normalisation,
especially as my usual work deals with 'normal' data, and I only use it for
financial transactions to do with sales and accounting.

I'll stay with what I have for the time being, as sub-25 seconds is bordering
on acceptable and I have more pressing issues and a deadline to work to.
Hopefully after doing some research I'll find a better approach that suits
the task at hand.

thanks again,

TonyT
 
David H

You'll find that if you go with summary tables the performance is greatly
enhanced. It also has the added benefit of greatly improving the performance
of any historical reports that you have in mind.

When you do read up on DW/OLAP, keep in mind that it's the concepts behind
them that are key.
 
