Multicollinearity


(no subject)

Paola Chivers-2

Hi,

 

I am preparing data for structural equation modelling (data screening).  I want to test for multicollinearity (identifying correlations higher than .85 so that I can drop redundant variables).  Easy enough – except that I have 200+ variables to run this for.

 

Apart from the huge resulting matrix, or running the collinearity diagnostics through regression (again a huge task), is there a simpler way?  Has anyone written syntax that would output only the pairs with correlations greater than .85?
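
Something along the following lines is the kind of thing I am hoping exists – a rough, untested sketch, assuming the variables sit consecutively in the file as var001 to var200 and using 'corrmat.sav' as a placeholder file name:

* Write the correlation matrix out, then keep only pairs with |r| > .85.
* (GET replaces the active file, so save the working data first.)
CORRELATIONS VARIABLES=var001 TO var200
  /MATRIX=OUT('corrmat.sav').

GET FILE='corrmat.sav'.
SELECT IF (ROWTYPE_ = 'CORR').
VARSTOCASES
  /MAKE r FROM var001 TO var200
  /INDEX=with_var(r).
* Keep one triangle only (this also drops the diagonal of 1s).
SELECT IF (VARNAME_ LT with_var AND ABS(r) GT .85).
LIST VARNAME_ with_var r.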

 

Any help or advice most appreciated.

 

Regards,

Paola

 

“Ours has become a time-poor society, fatigued by non-physical demands and trying to compartmentalize daily living tasks.  It is small wonder that physical activity is discarded in this environment” p126 (Steinbeck, 2001)

 


 

 


(no subject)

Art Kendall
Look up RELIABILITY in the help.
Use the /SUMMARY=TOTAL option.
If the 200 variables are designed to be used as subscales, you can also run them as separate scales.

That will tell you each item's SMC (squared multiple correlation) with the remaining items; note that a pairwise correlation of .85 corresponds to an R-squared of .85**2 = .7225.
This is a more thorough search for collinearity than pairwise correlations.

You might wrap OMS around the RELIABILITY run so you can save the output from the /SUMMARY=TOTAL option to a data file.
If you then sort that file on the SMC, items that share the same high SMC are candidates for further checking.  You might have more than one subset with a common SMC of 1.00:
variables var001, var002, var003, var198, var199, and var200 might all have an SMC of 1.00 yet form two separate sets of multicollinear variables, {1,2,3} and {198,199,200}.
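
A rough sketch of what I mean (untested; it assumes the items are consecutive in the file as var001 to var200, that 'itemtotal.sav' is a placeholder path, and that the OMS table subtype label matches your version):

OMS
  /SELECT TABLES
  /IF COMMANDS=['Reliability'] SUBTYPES=['Item Total Statistics']
  /DESTINATION FORMAT=SAV OUTFILE='itemtotal.sav'.

RELIABILITY
  /VARIABLES=var001 TO var200
  /SCALE('ALL ITEMS') ALL
  /MODEL=ALPHA
  /SUMMARY=TOTAL.

OMSEND.

* If the Squared Multiple Correlation column does not appear in your
* version, you may also need to request inter-item statistics
* (e.g. add /STATISTICS=CORR).
* The variable names in the saved file are generated from the table's
* column labels (something like SquaredMultipleCorrelation); open the
* file, sort descending on that column, and inspect items that tie
* near 1.00.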


Art Kendall
Social Research Consultants


serial correlation

Keli Saporta

 

Hi,

Is there any procedure or other way to calculate serial correlation?
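
For example, would something along these lines be on the right track?  (Untested sketch: y is a placeholder name for the series, x1 a placeholder predictor, and the cases are assumed to be already sorted in time order.)

* Lag-1 serial correlation: correlate the series with its own lag.
COMPUTE y_lag1 = LAG(y).
EXECUTE.
CORRELATIONS VARIABLES=y y_lag1.

* Or, for the residuals of a regression, request the Durbin-Watson statistic.
REGRESSION
  /STATISTICS COEFF R
  /DEPENDENT y
  /METHOD=ENTER x1
  /RESIDUALS=DURBIN.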

I would appreciate any help.

Thanks,

Kelly


Re: Multicollinearity

David Greenberg
In reply to this post by Paola Chivers-2
Looking at zero-order correlations of variables is an imperfect procedure for identifying multicollinearity. Variables can be highly collinear as a set even if no zero-order correlation between any pair of them is high. You could better identify collinear variables by estimating a regression of each one against the others and examining the collinearity diagnostics (such as the variance inflation factor and the tolerance).

David Greenberg, Sociology Department, New York University
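
P.S. In syntax, the sort of run I mean looks something like this (var001 to var200 stand in for your actual variables; TOL prints tolerance and VIF, COLLIN prints the eigenvalues, condition indices, and variance proportions):

REGRESSION
  /STATISTICS COEFF R TOL COLLIN
  /DEPENDENT var001
  /METHOD=ENTER var002 TO var200.

* The model R-squared here is var001's squared multiple correlation with
* the rest; the tolerance column flags which of the other variables are
* themselves nearly redundant given the remaining predictors.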


Re: Multicollinearity

E. Bernardo
David wrote:
 
>>> Looking at zero-order correlations of variables is an imperfect procedure for identifying multicollinearity. Variables can be highly collinear as a set even if no zero-order correlation between any pair of them is high. You could better identify collinear variables by estimating a regression of each one against the others and examining the collinearity diagnostics (such as the variance inflation factor and the tolerance).
 
Do you have sample data showing that variables can be highly collinear as a set even if no correlation between any pair of them is high?
 



Re: Multicollinearity

Anthony Babinec

Here is a small example. X3 is the sum of X1 and X2; therefore you cannot regress Y on X1, X2, and X3 together. Yet, if you obtain the correlations of the Xs, the largest pairwise correlation is about 0.835. So the point is that inspection of pairwise correlations of the Xs is not sufficient.

 

X1     X2     X3     Y
-.91   -.04   -.95   -1.01
-.36   -1.86  -2.22  -4.25
-.32   1.63   1.32   2.93
-.32   -.34   -.66   -1.11
1.43   -.87   .56    -.36
-1.79  -1.52  -3.31  -4.73
.74    .58    1.32   1.90
-.88   1.52   .64    2.13
-.01   1.36   1.35   2.69
-.32   .70    .38    .97
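
To check it yourself, something like this should do it (re-entering the rows above; with the linear dependence among the Xs, REGRESSION should report a tolerance at or near zero, or exclude one of them):

DATA LIST FREE / x1 x2 x3 y.
BEGIN DATA
-.91  -.04  -.95  -1.01
-.36  -1.86 -2.22 -4.25
-.32  1.63  1.32  2.93
-.32  -.34  -.66  -1.11
1.43  -.87  .56   -.36
-1.79 -1.52 -3.31 -4.73
.74   .58   1.32  1.90
-.88  1.52  .64   2.13
-.01  1.36  1.35  2.69
-.32  .70   .38   .97
END DATA.

CORRELATIONS VARIABLES=x1 x2 x3.

REGRESSION
  /STATISTICS COEFF R TOL COLLIN
  /DEPENDENT y
  /METHOD=ENTER x1 x2 x3.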

 

Tony Babinec

 

 



 


Re: Multicollinearity

David Greenberg
In reply to this post by E. Bernardo
I have encountered this in the past but do not have an example right at hand. Textbook discussions of this phenomenon are not hard to find, and you could easily confirm it for yourself by creating data in which strong dependence occurs among several variables without a high correlation between any particular pair of variables.

David Greenberg, Sociology Department, New York University
