Random number generator expression used

R

roger Avery

My final year (Statistics degree) project is concerned
with the PRNG in MS Excel.

It was stated by the lecturer in Simulation Studies that
this PRNG was no good for simulation purposes .. the
cycle length is not long enough.

So that I can test this statement I need to be able to
run the in-built PRNG so that it repeats the sequence of
generated random numbers. This cannot be done in situ as
there are not enough cells available (max 16k ?).

I would propose to run the PRNG as a VBA macro - or
stand_alone in VB6 - but for that I need to obtain the
mathematical expression that is used by Excel.

Can you tell me what it is, or where it may be published,
or who to ask?
 
M

Michael R Middleton

Roger -

A long-ago knowledge base article said that Excel 4's RAND() is calculated
by multiplying 9821 times the previous value, adding 0.211327, and taking
the fractional part.

RAND() in more recent versions is likely different, and the one in Excel
2003 has been grreatly improved.

- Mike Middleton, www.usfca.edu/~middleton
 
T

Tushar Mehta

[This followup was posted to microsoft.public.excel.misc and a copy was
sent to the cited author.]

How do you get a limit of 16K? A *single* worksheet has 65536 x 256
cells. [Older versions of XL that had 16K rows would still support 16K
* #columns cells on one worksheet.]

Also, keep in mind that XL's random number generator -- RAND() -- is
unavailable in VBA, which has its own -- Rnd().

--
Regards,

Tushar Mehta
MS MVP Excel 2000-2003
www.tushar-mehta.com
Excel, PowerPoint, and VBA tutorials and add-ins
Custom Productivity Solutions leveraging MS Office
 
D

David J. Braden

Roger,
Your instructor is on the right track. There are *three* distinct PRNGs
in Excel: the worksheet function RAND(), the VBA funtion Rnd(), and the
function used by the Analysis ToolPak, which is so bad that I'm sorta
curious about how the heck it was implemented, and who was having issues
with halucinagens when it was coded, and approved.

The cycle length is just the tip of the iceberg on problems with all
three generators. As for the ATP version, there is no deterministic
number for the cycle length that I can fing, independent of the seed
value.
Try this: Use the ATP to generate 10,000 uniform variates in a column.
Then sort them. Insert/Name/Define the range something like "MyRange" or
"Bozo"; whatever. Then use the FREQUENCY function as follows:
select a column of cells as long as your data set, and ctrl-shift-enter
=FREQUENCY(Bozo,Bozo). Copy / Paste-special(values) that data to
another column, and sort it. You will see how many values are repeated;
I have a number of cases where some are repeated up to 8 times; less
than half of the numbers are unique. It is *so* bad that I am pretty
curious about the whole thing. Worse, it will generate both 0 *and* 1,
which amazes me. Ideally, the generator avoids both values, though many
reasonable ones generate o *or* 1, but never both. You can make a
histogram of *that* histogram to summarize the appalling results.

For the worksheet RAND() function, it is vastly improved in Excel 11
over earlier versions (which comprise at least 98% of used
installations, by my guess). The algorithm used has not been made
public, yet. To check out its cycle length, as well as that for the VBA
Rnd() function, just run a short VBA loop where you grab the first
value, then to a DO-UNTIL loop which tests for an identical value. Might
as well do it when you are away from the computer for a while,
especially for testing RAND(), because you need to do a
activesheet.calculate (be sure to do Application.Screenupdating = False
before you start the loop.)

This is only the tip of the iceberg as regards Excel's unbelievably poor
implementation of PRNG's. While the cycle lengths are amazingly short,
there are many other problems; assuming Excel development started in
1981, then the folks clearly chose to ignore the widely available, open
source knowledge of the time. And We've learned a lot about PRNG since
then; only in Excel 11 has *some* of that sunk in, and nopt throughout
the product (such as the ATP).

For more insight, Google up Marsaglia, a prof over in Florida, who has
written extensively about PRNG for many years. I don't think he has
commented directly on Excel's functions, but he doesn't need to: they
are widely known to be simply awful, in many dimensions.

HTH
Dave Braden
MVP - Excel
 
K

Ken Schmidt

David: Thanks for the heads up on the ATP random generator. I ran 10,000
numbers then sorted them. I found 8 sets of duplicate
numbers in the the first 48 entries! Wow. Also thanks for the tip on Prof
Marsaglia. I will check out his work.

Ken
 
D

David J. Braden

Ken,
Thanks for the thanks. I've a macro that I wrote that really brings this
stuff out, but it didn't seem like it was worth clogging up bandwidth
with the darn thing. Glad you got the idea. It seems to be not as well
known, though developers within MS have been given the hard cold facts,
along with an easy way to show them <veg>. To MS's credit, they made
some strides in improving the stat functions when going from v 10 to v11
*within* then worksheet; perhaps the ATP end of things will be fixed by
v12. Believe me, I'm really working on it, much to the annoyance of many.

Dr. Marsaglia's stuff has really taken off in the last 17 years or so;
he seems to have built up a neat cadre of support (starving PhD
candidates, etc.), and the importance of his work is certainly fueled by
the impact that computer-intensive Bayesian and non-parametric stat
methods (such as bootstrapping) have taken on since about 1980, and how
seriously useful they are, through a spreadsheet interface, to folks
working high-end problems in insurance, pharma, petro-chem,
portfolio-assessment, etc.. Hi,s and that of Ripley's, work is really
cool (and of course that by Knuth) in this specialty. Given that it is
out there, published, well-documented, and *free*, I take it as a
slightly-more-than-bare minimum I hope to see MS implement. For another
slant, check out the free PopTools for a very solid PRNG; site is
http://www.cse.csiro.au/poptools/
and its generator is seriously well tested and documented.

In contrast, the ATP is *so* pathological that I (hey, I *am* sick) am
sometimes tempted to figure out just how the heck it could have been so
badly messed up. I admit to have having nightmares trying to work
through an algorithm that could produce what the ATP does wrt uniform
variate generation.

Anyway, please *do* know that this particular issue has been
communicated to MS since version 9 (at the latest); perhaps they might
resolve it before I lose the rest of my hair.

Regards,
Dave Braden
MVP _ Excel
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top