Can anyone offer comments or a reference on the conditions under which rank-order correlations will differ from Pearson correlations? By differ, I mean by at least 20%, not 2nd or 3rd digit differences. Maybe high skewness, or outlier points even if they lie on the least squares regression line. I suppose this touches on robust methods, which I don't know anything about.
Thanks, Gene Maguin
Much can be said on this topic, but in a nutshell the Spearman is used to measure a monotonic relationship while the Pearson is used to measure a linear relationship (a type of monotonic relationship). I assume the OP knows how one calculates a Pearson and Spearman as this is taught in introductory stats courses. I'll refrain from discussing issues surrounding outliers, skewness etc. for the moment but they clearly can affect the difference in coefficients. Here is a little simulation I just created which should illuminate the *general* point I made above.

*Generate data.
SET SEED 65923454.
NEW FILE.
INPUT PROGRAM.
LOOP ID = 1 TO 100.
COMPUTE x = RV.NORMAL(0,1).
COMPUTE y = x**3.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

GRAPH /SCATTERPLOT(BIVAR)=x WITH y.
CORRELATIONS /VARIABLES=x y /PRINT=TWOTAIL NOSIG.
NONPAR CORR /VARIABLES=x y /PRINT=SPEARMAN TWOTAIL NOSIG.
On Tue, May 6, 2014 at 3:54 PM, Maguin, Eugene <[hidden email]> wrote:
In reply to this post by Maguin, Eugene
Plot the data.
I can envision any number of situations where the ranked data are perfectly correlated yet the Pearson is only moderate. Consider X = {1, ..., 10} and Y = EXP(X): Spearman's rho = 1, but Pearson's r = 0.716870.
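For readers who want to check the X = 1..10, Y = EXP(X) example outside SPSS, here is a minimal sketch in plain Python. The helper names `pearson`, `ranks` and `spearman` are my own for illustration and not part of any package; Spearman is computed here simply as a Pearson r on the ranks.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result

def spearman(xs, ys):
    """Spearman rho, computed as Pearson r on the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))

x = list(range(1, 11))            # X = 1, 2, ..., 10
y = [math.exp(v) for v in x]      # Y = EXP(X): monotone but far from linear
r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r = {r:.6f}, Spearman rho = {rho:.6f}")
```

Because EXP is strictly increasing, the ranks of Y equal the ranks of X, so rho is 1 no matter how curved the relationship is, while r should come out near the 0.7169 quoted above.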
In reply to this post by Ryan
Thank you, all of you.
In reply to this post by David Marso
Neatly demonstrated.
Here are a few logical points about Spearman/Pearson.
1) The simple computing relationship is that the Spearman is what you get when you compute a Pearson r on the rank-transformed versions of the scores.
2) A difference between the two demonstrates, in my experience, that one or both measures should be transformed before later statistical analysis, if that is at all reasonable.
3) When there is extreme skew in both measures, a plot will show you almost all of the data in one corner of the plot. In this sort of example, it should be easy to recognize that the size of the correlation depends on the few scores away from that corner ... and thus, in effect, *those* make up all the "degrees of freedom" of the relationship, regardless of how many scores are in that corner. This matters because an r with smaller d.f. has a correspondingly *larger* standard error, and a larger value needed for statistical significance.
--
Rich Ulrich
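Points 1 and 3 can be illustrated with a small made-up data set: eight essentially unrelated points in one corner of the plot plus a single point far away. This is only a sketch in plain Python. The numbers are invented for illustration, and the helper names `pearson` and `ranks` are hypothetical, not from any package.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n; assumes no tied values (true for the data below)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

# Invented data: a cluster of eight points with almost no relationship ...
cluster_x = [1, 2, 3, 4, 5, 6, 7, 8]
cluster_y = [4, 7, 2, 8, 1, 5, 3, 6]
# ... plus one point far away from that corner.
x = cluster_x + [50]
y = cluster_y + [60]

# Point 1: Spearman is Pearson on the ranks. With no ties this agrees with
# the textbook formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
rx, ry = ranks(x), ranks(y)
rho_ranks = pearson(rx, ry)
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho_formula = 1 - 6 * d2 / (n * (n * n - 1))

# Point 3: the single distant point supplies nearly all of the Pearson r.
r_all = pearson(x, y)
r_cluster = pearson(cluster_x, cluster_y)

print(f"Spearman via ranks   = {rho_ranks:.4f}")
print(f"Spearman via formula = {rho_formula:.4f}")
print(f"Pearson, all 9 points     = {r_all:.4f}")
print(f"Pearson, cluster of 8 only = {r_cluster:.4f}")
```

With these numbers the Pearson r for all nine points is about 0.98, the Spearman is about 0.27, and the Pearson within the cluster is close to zero. The large r rests on one observation, which is the "degrees of freedom" concern in point 3.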
In reply to this post by Maguin, Eugene
Hi:
I normally teach the differences between Pearson's and Spearman's correlation coefficients using the Anscombe quartet (Google will give a lot of hits, including graphs and datasets).
Regards,
Marta GG
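As a rough sketch of how the Anscombe quartet can be used for this comparison, the plain-Python snippet below computes both coefficients for the four data sets. The data values are typed in as they are commonly reproduced from Anscombe (1973), so please check them against a trusted copy before relying on them. The helper names `pearson`, `ranks` and `spearman` are my own for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result

def spearman(xs, ys):
    """Spearman rho, computed as Pearson r on the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))

# Anscombe's quartet as commonly reproduced (assumed correct; verify before use).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {}
for name, (xs, ys) in quartet.items():
    results[name] = (pearson(xs, ys), spearman(xs, ys))
    print(f"Set {name:>3}: Pearson r = {results[name][0]:.3f}, "
          f"Spearman rho = {results[name][1]:.3f}")
```

The four Pearson values should be nearly identical (about 0.816) while the Spearman values differ from set to set. That makes the quartet a handy teaching case, and plotting each set remains the best first step.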