Problem with RAND and RANDBETWEEN?

Iggles14 · May 5, 2010

Version: 2008 Operating System: Mac OS X 10.6 (Snow Leopard) Processor: Intel I've got a simple spreadsheet set up to do a simulation of dice rolls. The formula I am using is this:

=RANDBETWEEN(1,6)

I have this formula repeated in four columns and 1000 rows for a total of 4000 trials. This should be more than enough for the central limit theorem to come into play. I would expect each number from 1 to 6 to occur with roughly equal frequency, say between 660-670 times.

In my last run, I had 647 occurrences of 2 and 696 of 6. Every time I have run the simulation, there has been a difference of at least 30 between the count for the most and least frequent roll, and the difference has been as high as 61.

Any ideas? Am I doing something wrong?

Mike Middleton · May 5, 2010

Iggles14 -

Any ideas? <

Search for "law of large numbers," not "central limit theorem."

Am I doing something wrong? <

Probably not.

But you've tried only 4,000 trials.

And you're already typically within 5% of the expected relative frequency.

- Mike
http://www.MikeMiddleton.com

Version: 2008 Operating System: Mac OS X 10.6 (Snow Leopard) Processor: Intel I've got a simple spreadsheet set up to do a simulation of dice rolls. The formula I am using is this:

=RANDBETWEEN(1,6)

I have this formula repeated in four columns and 1000 rows for a total of 4000 trials. This should be more than enough for the central limit theorem to come into play. I would expect each number from 1 to 6 to occur with roughly equal frequency, say between 660-670 times.

In my last run, I had 647 occurrences of 2 and 696 of 6. Every time I have run the simulation, there has been a difference of at least 30 between the count for the most and least frequent roll, and the difference has been as high as 61.

Any ideas? Am I doing something wrong?

Iggles14 · May 5, 2010

Mike -- you're right about CLT vs. large numbers, of course. I was distracted because I am also trying to produce a distribution curve for when 2 dice are rolled. I would expect that with 3000 trials (6000 rolls) that the mean of that distribution would reliably be 7 (CLT). It is not. It is between 6.95 and 7.05 less than 50% of the time.

I just ran a series of trials for RANDBETWEEN (1,2), and even with 40,000 trials (a1:an1000), the error is almost always at least 0.1% (0.001) and is sometimes as high as 0.6%.

The general rule of thumb for a Monte Carlo trial is that 1000 trials should be plenty.

Mike Middleton · May 6, 2010

Iggles14 -

I suggest you learn about the "standard error of the mean," e.g., in
Wikipedia
http://en.wikipedia.org/wiki/Standard_error_(statistics)
or other sources, like a basic statistics book or web searches.

Although a Monte Carlo simulation of 1,000 trials may be satisfactory for
determining the general shape of a distribution, the appropriate number of
trials when estimating the mean depends on the confidence interval you wish
to construct. For example, to reduce the width of a specific confidence
interval in half, you need a sample size four times as large.

- Mike
http://www.MikeMiddleton.com

Mike -- you're right about CLT vs. large numbers, of course. I was
distracted because I am also trying to produce a distribution curve for when
2 dice are rolled. I would expect that with 3000 trials (6000 rolls) that
the mean of that distribution would reliably be 7 (CLT). It is not. It is
between 6.95 and 7.05 less than 50% of the time.

I just ran a series of trials for RANDBETWEEN (1,2), and even with 40,000
trials (a1:an1000), the error is almost always at least 0.1% (0.001) and is
sometimes as high as 0.6%.

The general rule of thumb for a Monte Carlo trial is that 1000 trials should
be plenty.

Iggles14 · May 7, 2010

Mike and Bob -- I'm not dealing with samples or normal distributions. I'm trying to generate random numbers in a flat distribution.

I did some additional research, and it turns out that Excel generates "pseudo-random" numbers. Excel is not able to generate true random numbers that would duplicate coin flips or dice rolls. Nor is any computer. So it seems that my gut feel was correct.

I googled "random number generator" and found a couple of sites that use different algorithms to produce random number sets that are closer to true randomness.

Mike Middleton · May 7, 2010

Iggles14 -

This may be a matter of terminology, but I suggest that when you are using
RAND the way you described in your previous messages you are "sampling."

When you sample a total (sum of two dice) or a mean (based on two or more
random numbers) as you described, the central limit theorem applies and the
normal distribution becomes relevant.

There is variation from sample to sample, even with a sample size of 4,000
or 40,000, but that variation can be described using the standard error and
normal distribution.

As far as I know, the random number generator in Mac Excel 2008 is the same
one that was first implemented in Excel 2003, and it passes the standard
tests for randomness.

For more information, see the Microsoft Knowledge Base article 828795
"Description of the RAND function in Excel 2007 and in Excel 2003"

http://support.microsoft.com/kb/828795

- Mike
http://www.MikeMiddleton.com

Mike and Bob -- I'm not dealing with samples or normal distributions. I'm
trying to generate random numbers in a flat distribution.

I did some additional research, and it turns out that Excel generates
"pseudo-random" numbers. Excel is not able to generate true random numbers
that would duplicate coin flips or dice rolls. Nor is any computer. So it
seems that my gut feel was correct.

I googled "random number generator" and found a couple of sites that use
different algorithms to produce random number sets that are closer to true
randomness.

RAND and RANDBETWEEN Workaround........An A Comment	7	Jan 15, 2004
Need Help With YIELDMAT	6	May 4, 2012
I need assistance on Rand() and Combo box in '97	5	Nov 20, 2005
Problem with getting numbers in some range values!?	2	Jul 9, 2007
problem with formulas (2)	7	Apr 12, 2009
Query calculation problem based on 1st and last operation.	1	Mar 13, 2008
Problem with extending list formats and formulas	2	Apr 12, 2007
Mutual Fund Problem	2	Oct 11, 2004

Problem with RAND and RANDBETWEEN?

Iggles14

Mike Middleton

Iggles14

Mike Middleton

Iggles14

Mike Middleton

Ask a Question

Similar Threads