Generate Correlated variable to an existing data set

H

hdf

I would like to generate a column of random variables (-1 to 1) that
are correlated to an existing column of values that are not -1 to 1.

I know how to generate two sets of random correlated variables, but I
can't seem to find a way to generate a column of random variables that
is correlated to an existing column of numbers as opposed to one that
was generated at the same time. It is probably something very simple
or impossible, but I'm no stats expert and thus need help in
identifying a function or formula that will allow me to do this.

To be clear, I have a column of values ranging from 100 to 175. I
want to generate a second column of random variables (-1 to 1) which
are correlated with this column, to whatever correlation coefficient I
choose.

I hope this makes sense.
 
J

joeu2004

I know how to generate two sets of random correlated
variables

What do you mean by that? Do you mean that you know how to generate
two random sets of data and compute their correlation? Or do you mean
that you know how to generate one random set of data and generate
another data set over a different range of numbers that is exactly
correlated (non-random)?

An example would help clarify your meaning.
To be clear, I have a column of values ranging from
100 to 175.  I want to generate a second column of random
variables (-1 to 1) which are correlated with this column,
to whatever correlation coefficient I choose.

Few things are "impossible" to do with computers. It depends on how
long you are willing to wait for a solution.

Off-hand, if I understand your requirements correctly, I believe this
is a very difficult task to accomplish in a reasonable finite amount
of time. The troubleshome requirement is "whatever correlation
coefficient I choose".

You could generate n-1 random values in the range [-1,1], then solve
for the n-th value that gives the required correlation coefficient.

Even that might be difficult due to the nature of the correlation
formula, if you use the same one as that found on the Excel CORREL
help page.

Moreover, there is no guarantee that the n-th value that meets the
correlation requirement will be in the range [-1,1].

Of course, you can always generate n random values in the range [-1,1]
repeatedly until you stumble upon a set that meets your criteria. But
such repetition can take a long time, and there is no guarantee of
"ever" finding a solution in your lifetime.

For additional ideas and broader participation, you might want to
repost your inquiry using the MS Answers Forums at
http://social.answers.microsoft.com/Forums/en-US/category/officeexcel.

It's not that I like that forum. It's just that MS has ceased to
support the Usenet newsgroups; see http://www.microsoft.com/communities/newsgroups/default.mspx.
Hence, participation here is limited to the sites that share a common
newsgroup mirror, which is no longer centralized at MS. Many of the
diehard contributors no longer pay attention to the Usenet newsgroups.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top