SPSSX Discussion

rank order (spearman) correlation vs pearson correlations

Maguin, Eugene

rank order (spearman) correlation vs pearson correlations

Can anyone offer comments or a reference on the conditions under which rank-order correlations will differ from Pearson correlations? By differ, I mean by at least 20%, not differences in the 2nd or 3rd digit. Maybe high skewness, or outlier points even if they lie on the least squares regression line. I suppose this touches on robust methods, which I don't know anything about.

Thanks, Gene Maguin

Ryan

Re: rank order (spearman) correlation vs pearson correlations

Much can be said on this topic, but in a nutshell, the Spearman measures the strength of a monotonic relationship, while the Pearson measures the strength of a linear relationship (a special case of a monotonic relationship). I assume the OP knows how to calculate a Pearson and a Spearman, as this is taught in introductory stats courses. I'll refrain from discussing outliers, skewness, etc. for the moment, but they clearly can affect the difference between the coefficients.

Here is a little simulation I just created which should illuminate the *general* point I made above.

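*Generate data: 100 cases, x standard normal, y a monotonic but nonlinear function of x.
SET SEED 65923454.
NEW FILE.
INPUT PROGRAM.
LOOP ID = 1 TO 100.
COMPUTE x = RV.NORMAL(0,1).
COMPUTE y = x**3.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

*Plot y against x.
GRAPH
  /SCATTERPLOT(BIVAR)=x WITH y.

*Pearson correlation (linear association).
CORRELATIONS
  /VARIABLES=x y
  /PRINT=TWOTAIL NOSIG.

*Spearman correlation (monotonic association).
NONPAR CORR
  /VARIABLES=x y
  /PRINT=SPEARMAN TWOTAIL NOSIG.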

On Tue, May 6, 2014 at 3:54 PM, Maguin, Eugene <[hidden email]> wrote:

Can anyone offer comments or a reference on under what conditions rank order correlations will differ from Pearson correlations? By differ, I mean, by at least 20%, not 2nd or 3rd digit differences. Maybe high skewness, outlier points even if on the least squares regression line. I suppose this touches on robust methods, which I don't know anything about.

Thanks, Gene Maguin

David Marso

Re: rank order (spearman) correlation vs pearson correlations

Administrator

In reply to this post by Maguin, Eugene

Plot the data.
I can envision any number of situations where the ranked data are perfectly correlated yet the Pearson correlation is only moderate.
Consider X = 1 to 10 and Y = EXP(X): Spearman's rho = 1, Pearson's r = 0.716870.
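
For readers who want to check those two figures outside SPSS, here is a minimal sketch in Python (not SPSS syntax), using only the standard library. The helper names pearson, ranks and spearman are illustrative and not part of the thread; spearman is simply the Pearson correlation of the rank-transformed values, with tied values given their average rank.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho computed as the Pearson r of the ranks."""
    return pearson(ranks(x), ranks(y))

x = list(range(1, 11))            # X = 1 to 10
y = [math.exp(v) for v in x]      # Y = EXP(X)

pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)
print(f"Pearson r    = {pearson_r:.6f}")
print(f"Spearman rho = {spearman_rho:.6f}")
```

Because EXP is strictly increasing, the ranks of Y match the ranks of X exactly, so rho is 1 regardless of how far the relationship is from a straight line.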

Maguin, Eugene

Re: rank order (spearman) correlation vs pearson correlations

In reply to this post by Ryan

Thank you, all of you.

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Ryan Black
Sent: Tuesday, May 06, 2014 5:02 PM
To: [hidden email]
Subject: Re: rank order (spearman) correlation vs pearson correlations

Rich Ulrich

Re: rank order (spearman) correlation vs pearson correlations

In reply to this post by David Marso

Neatly demonstrated.

Here are a few logical points about Spearman/Pearson.

1) The simple computing relationship is that the Spearman is what you
get when you compute a Pearson r on the rank-transformed versions of
the scores.
2) A difference between the two demonstrates, in my experience, that
one or both measures should be transformed before later statistical
analysis, if that is at all reasonable.
3) When there is extreme skew in both measures, a plot will show you
almost all of the data in one corner of the plot. In this sort of example,
it should be easy to recognize that the size of the correlation depends
on the few scores away from that corner ... and thus, in effect, *those*
make up all the "degrees of freedom" of the relationship, regardless of
how many scores are in that corner. This matters because an r with smaller
d.f. has a correspondingly *larger* standard error, and a larger value
needed for statistical significance.
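
Points 1 and 3 above can be sketched in a few lines of Python (standard library only, not SPSS syntax). The data are made up purely for illustration: nine points bunched in one corner plus one point far away. The helper names pearson and ranks are hypothetical, and the shortcut formula is the usual textbook one for data without ties.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

# Made-up data for illustration only: nine points in one corner
# (a permutation of 1..9 on each axis) plus one far point at (100, 100).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
y = [5, 3, 8, 1, 9, 2, 7, 4, 6, 100]

# Point 1: Spearman's rho is the Pearson r of the rank-transformed scores.
rho_from_ranks = pearson(ranks(x), ranks(y))

# With no ties, the shortcut 1 - 6*sum(d^2) / (n*(n^2 - 1)) gives the same value.
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
rho_shortcut = 1 - 6 * d2 / (n * (n ** 2 - 1))

# Point 3: Pearson r with and without the single point away from the corner.
r_all = pearson(x, y)
r_corner = pearson(x[:-1], y[:-1])

print(f"Spearman rho (Pearson on ranks) = {rho_from_ranks:.4f}")
print(f"Spearman rho (shortcut formula) = {rho_shortcut:.4f}")
print(f"Pearson r, all ten points       = {r_all:.4f}")
print(f"Pearson r, corner points only   = {r_corner:.4f}")
```

In this toy example the Pearson r is above 0.99 with the far point included and about 0.10 without it, while the Spearman rho for all ten points is roughly 0.35. The single far point carries nearly all of the Pearson correlation, which is the "degrees of freedom" concern in point 3.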

--
Rich Ulrich

> Date: Tue, 6 May 2014 14:06:47 -0700

> From: [hidden email]
> Subject: Re: rank order (spearman) correlation vs pearson correlations
> To: [hidden email]

Marta Garcia-Granero

Re: rank order (spearman) correlation vs pearson correlations

In reply to this post by Maguin, Eugene

Hi:

I normally teach the differences between Pearson's and Spearman's correlation coefficients using the Anscombe quartet (Google will give a lot of hits, including graphs and datasets).
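
As a rough sketch of how that comparison might look, the following Python snippet (standard library only, not SPSS syntax) computes both coefficients for the four Anscombe datasets. The data values are the published Anscombe (1973) quartet; the helper names pearson and ranks are illustrative and not part of the thread.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

# Anscombe's quartet: sets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# For each set, store (Pearson r, Spearman rho); rho is Pearson r on the ranks.
results = {}
for name, (x, y) in quartet.items():
    results[name] = (pearson(x, y), pearson(ranks(x), ranks(y)))
    print(f"Set {name:>3}: Pearson r = {results[name][0]:.3f}, "
          f"Spearman rho = {results[name][1]:.3f}")
```

The Pearson r is close to 0.816 in all four sets, while the Spearman rho changes from set to set (for example, it is near 0.99 in set III, where one outlier sits above an otherwise exact line, and 0.5 in set IV, where a single point determines the fit). Plotting each set, as suggested earlier in the thread, shows why.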

Regards,
Marta GG

On 06/05/2014 23:29, Maguin, Eugene wrote:

Thank you, all of you.