Generating 2 variables with different correlations

Generating 2 variables with different correlations

Jeff A

Is there a straightforward way to generate example data with several variables that have different degrees of correlation? Having each variable normally distributed would work well.

E.g., I want X1 and X2 to be correlated at .1, X1 and X3 to be correlated at .3, etc.

I can figure out how to do this by creating a few variables, sorting, adding random error, etc., but I’m hoping that there is a more straightforward way.

Best,
Jeff

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Re: Generating 2 variables with different correlations

Jon Peck
There is a custom dialog, File > New > Data with Cases, that generates random data with any of several correlation patterns.  For the arbitrary pattern, you enter the lower triangle of the correlation matrix you want.  The correlations can be approximate or exact.  See the help in the dialog for exactly how to enter this.

You can find this dialog on the old SPSS Community website, but I can send you the file directly if you want to go this way.  If you do, send me an email, and I will reply with the dialog.

On Mon, Feb 11, 2019 at 3:36 PM Jeff <[hidden email]> wrote:
> Is there a straightforward way to generate example data with several
> variables having different degrees of correlation? [...]


--
Jon K Peck
[hidden email]

Re: Generating 2 variables with different correlations

Ware, William B
In reply to this post by Jeff A
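In R, check the mvtnorm package (function rmvnorm). The mvrnorm function in the MASS package is another option; its empirical = TRUE argument reproduces the requested correlations exactly in the sample.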

William B. Ware, Professor Emeritus
Learning Sciences and Psychological Studies
Educational Psychology, Measurement, and Evaluation
School of Social Work, Adjunct Professor
University of North Carolina at Chapel Hill



Re: Generating 2 variables with different correlations

Kirill Orlov
In reply to this post by Jeff A
In the MATRIX - END MATRIX collection on my page, http://www.spsstools.net/en/KO-spssmacros, the function !mvnorm generates normal data from a population with the correlations you specify, and !tocov transforms existing data so that it has exactly the correlations you want.
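
For readers who want to see the "generate from a correlated population" idea outside SPSS, here is a minimal sketch in plain Python. It is not the !mvnorm macro; it only illustrates the usual Cholesky approach, under these assumptions: the function names (cholesky_lower, correlated_normals, pearson) are made up for this example, and the 0.6 correlation between X2 and X3 is an arbitrary illustrative value added to the .1 and .3 from the question.

```python
import math
import random


def cholesky_lower(r):
    """Return lower-triangular L with L * L^T == r (r must be positive definite)."""
    k = len(r)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                d = r[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (r[i][j] - s) / low[j][j]
    return low


def correlated_normals(r, n, seed=None):
    """Draw n rows from a multivariate normal with correlation matrix r."""
    rng = random.Random(seed)
    low = cholesky_lower(r)
    k = len(r)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]  # independent N(0, 1) draws
        # x = L z has population covariance L L^T = r
        rows.append([sum(low[i][m] * z[m] for m in range(i + 1)) for i in range(k)])
    return rows


def pearson(rows, a, b):
    """Sample Pearson correlation between columns a and b."""
    n = len(rows)
    ma = sum(row[a] for row in rows) / n
    mb = sum(row[b] for row in rows) / n
    sab = sum((row[a] - ma) * (row[b] - mb) for row in rows)
    saa = sum((row[a] - ma) ** 2 for row in rows)
    sbb = sum((row[b] - mb) ** 2 for row in rows)
    return sab / math.sqrt(saa * sbb)


# Hypothetical target: X1-X2 = .1, X1-X3 = .3, X2-X3 = .6 (the last is illustrative)
r = [[1.0, 0.1, 0.3],
     [0.1, 1.0, 0.6],
     [0.3, 0.6, 1.0]]
data = correlated_normals(r, n=20000, seed=120219)
print(round(pearson(data, 0, 1), 2),
      round(pearson(data, 0, 2), 2),
      round(pearson(data, 1, 2), 2))
```

The sample correlations will be close to the targets but not equal to them, because this step only samples from the population; forcing exact sample correlations is a separate transformation.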


Re: Generating 2 variables with different correlations

Bruce Weaver
Administrator
In reply to this post by Jon Peck
Here is an old-fashioned "no Python required" (NPR) solution.  (Jon, please
note that I did try Data with Cases first, but a website needed to install
part of it was down.)  

Jeff, the main things you need to modify are the lines that set n and the
desired correlation matrix.  HTH.


/* Source:  https://www.uvm.edu/~dhowell/StatPages/More_Stuff/CorrGen2.html */

/* This program generates a multivariate random normal sample */
/* of size n from a population described by covariance matrix r */
/* Setting exact to 1 yields a sample that exactly reproduces */
/* the population matrix.  Setting exact to any value other than 1 */
/* produces a sample from the population, which will be subject */
/* to random sample error, meaning that sample covariance */
/* matrix will not be exactly equal to the population matrix */
/* keep the seed constant to reproduce the data from run to run */

/* Written by Andrew F. Hayes */
/* School of Communication */
/* The Ohio State University */
/* [hidden email] */
/* Version 1.1, Sept 15, 2010 */

*set seed = 12343.
* UPDATE:  Use MT and MTINDEX.
SET RNG=MT MTINDEX=120219.

matrix.
  compute n = 500.
  compute exact = 1.
* E.g., if I want X1 and X2 to be correlated at .1, X1 and X3 to be correlated at .3, etc.
  compute r =
  {1, .1, .3;
  .1,  1, .6;
  .3, .6,  1}.
  compute rn = nrow(r).
  compute x1 = sqrt(-2*ln(uniform(n,rn)))&*cos((2*3.14159265358979)*uniform(n,rn)).
  compute x1=x1*chol(r).
  compute ones = make(n,1,1).
  compute sigma = (t(x1)*(ident(n)-(1/n)*ones*t(ones))*x1)*(1/(n-1)).
  do if (exact = 1).
    call eigen(r, vc, vl).
    compute sqrtr = vc*sqrt(mdiag(vl))*t(vc).
    call eigen(sigma, vc, vl).
    compute sqrts = vc*sqrt(mdiag(vl))*t(vc).
    compute x1 = x1*inv(sqrts)*sqrtr.
    compute ones = make(n,1,1).
    compute sigma = (t(x1)*(ident(n)-(1/n)*ones*t(ones))*x1)*(1/(n-1)).
  end if.
  print r/title = "Population Matrix"/format = F16.4.
  print sigma/title = "Sample Matrix"/format = F16.4.
  print n/title = "number of cases created"/format = F16.0.
  save x1/outfile = *.
end matrix.

CORRELATIONS all.
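
A note on the first "compute x1" line in the syntax above: it is the Box-Muller transform, which turns two independent uniform draws into one standard normal draw, applied elementwise to two n-by-rn matrices of uniforms. The sketch below shows the same formula for a single value in plain Python. The function name box_muller is made up for this example, and the shift from [0, 1) to (0, 1] is a guard added here so the logarithm is always defined; it is not part of the SPSS code.

```python
import math
import random


def box_muller(rng):
    """Return one standard normal draw built from two independent uniforms."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    # Same formula as the MATRIX line: sqrt(-2 ln u1) * cos(2 pi u2)
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


rng = random.Random(12343)
draws = [box_muller(rng) for _ in range(50000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(round(mean, 2), round(var, 2))
```

The printed mean and variance should land near 0 and 1. Multiplying a matrix of such draws by chol(r), as the syntax does next, is what introduces the correlations.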

I know there are some other old threads in this forum discussing the same
problem.  You should be able to find them with appropriate search terms.  



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Generating 2 variables with different correlations

Kirill Orlov
Bruce, it looks like Hayes' code does almost the same thing as my two functions mentioned above, but I think my functions are better implemented. It is also better to keep the "generate from a population" and "transform to exact covariances" parts separate, because they are in fact independent tasks.
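
To illustrate why the "transform to exact covariances" step can stand on its own, here is a minimal plain-Python sketch that takes any data set with a nonsingular sample covariance and rescales it to match a target matrix exactly. It is neither the !tocov macro nor Hayes' code: Hayes' syntax uses symmetric matrix square roots obtained from eigendecompositions, whereas this sketch swaps in Cholesky factors, which gives the same sample covariance but different data values. All function names are made up for this example, and the target matrix is illustrative.

```python
import math
import random


def cholesky_lower(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    k = len(a)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def covariance(rows):
    """Return (column means, sample covariance matrix with n - 1 denominator)."""
    n, k = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(k)] for i in range(k)]
    return means, cov


def to_exact(rows, target):
    """Linearly transform rows so their sample covariance equals target.

    The output is centered at zero. Requires more cases than variables.
    """
    means, cov = covariance(rows)
    ls = cholesky_lower(cov)     # sample covariance = ls * ls^T
    lt = cholesky_lower(target)  # target = lt * lt^T
    k = len(target)
    out = []
    for row in rows:
        x = [row[j] - means[j] for j in range(k)]
        # Forward substitution: solve ls * w = x, so w has identity sample covariance
        w = []
        for i in range(k):
            w.append((x[i] - sum(ls[i][m] * w[m] for m in range(i))) / ls[i][i])
        # Recolor: y = lt * w has sample covariance lt * lt^T = target
        out.append([sum(lt[i][m] * w[m] for m in range(i + 1)) for i in range(k)])
    return out


target = [[1.0, 0.1, 0.3],
          [0.1, 1.0, 0.6],
          [0.3, 0.6, 1.0]]
rng = random.Random(1)
# Input here is uncorrelated noise; it did not come from any particular generator
raw = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
exact = to_exact(raw, target)
_, cov_exact = covariance(exact)
for line in cov_exact:
    print([round(v, 4) for v in line])
```

Because the target has ones on the diagonal, matching the covariance matrix also matches the correlations. The input deliberately comes from an unrelated source to show that this step does not depend on how the data were generated.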


On 13.02.2019 at 1:54, Bruce Weaver wrote:
> Here is an old-fashioned "no Python required" (NPR) solution.  (Jon, please
> note that I did try Data with Cases first, but a website needed to install
> part of it was down.)


