SPSSX Discussion

Correlated or What?!

MaxJasper

Correlated or What?!

Hi folks,

Here is a scatterplot of 3 variables, one of which is the difference of the other two.

G1 = glucose before breakfast
G2 = glucose 2 hrs after breakfast
G3 = G2 – G1

The plot shows that G1 & G2 have hardly any correlation. But then G1 shows some correlation with G3. And finally, G2 shows the best correlation with G3.

My questions are:
• Is the correlation of G1 with G3 a true correlation?
• Is the correlation of G2 with G3 a true correlation?
• Where do the two correlations come from?
• Is this a kind of false or biased correlation?
• Is there a stat technical name for such correlations?

Thanks a lot in advance.

Max.

[Scatterplot: Plasma Glucose: Before/After Breakfast]

Andy W

Re: Correlated or What?!

The negative and positive correlations of G1 and G2 with the change score, G2 - G1, would happen with random data and are generically a result of regression to the mean.

It looks to me from your plots that G1 and G2 have about the same variance, and let's pretend they are mean-centered as well (so that each covariance is just the expectation of a product). What is the expected covariance between the variables in levels and the change scores?


Cov(G2-G1,G2) = E[(G2 - G1)*(G2)]       //E[.] is the expectation operator
              = E[(G2*G2) - (G1*G2)]
              = E[(G2*G2)] - E[(G1*G2)] //Linearity of expectation
              = Var(G2) - Cov(G1,G2)    //Both means are zero

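Assuming equal variances (as in a stationary series), the covariance can never exceed the variance (this is the Cauchy-Schwarz inequality), and so Var(G2) - Cov(G1,G2) will be positive unless G1 and G2 are perfectly correlated. That holds even with random data (i.e. if the covariance between the levels were zero)!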
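To put a number on this, the covariance can be turned into a correlation. This is only a sketch under the same simplifying assumptions as above: G1 and G2 share a common variance, written here as sigma^2, and have correlation rho < 1 (both symbols are introduced for illustration and do not appear in the original post).

```latex
\begin{align*}
\operatorname{Cov}(G_2 - G_1,\, G_2) &= \sigma^2 - \rho\sigma^2 = \sigma^2(1-\rho) \\
\operatorname{Var}(G_2 - G_1) &= \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1-\rho) \\
\operatorname{Corr}(G_2 - G_1,\, G_2) &= \frac{\sigma^2(1-\rho)}{\sqrt{2\sigma^2(1-\rho)}\,\sqrt{\sigma^2}} = \sqrt{\frac{1-\rho}{2}}
\end{align*}
```

With rho = 0 (no correlation between the levels) this gives about 0.71, so a strong positive correlation between G2 and the change score is expected even when G1 and G2 are unrelated.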

The same exercise with Cov(G2-G1,G1) produces the result Cov(G1,G2) - Var(G1). So again, even with random data the covariance between G2-G1 and G1 would be negative (as the covariance cannot exceed the variance). Using the same logic I show here why differencing a time series will typically introduce a negative autocorrelation.
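The argument above can be checked with a quick simulation. The following is a minimal sketch using made-up, independent normal draws (not Max's glucose data); the means, standard deviation, sample size, and the `pearson` helper are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so repeated runs give the same numbers
n = 20_000

# Hypothetical readings: G1 and G2 are drawn independently with equal spread.
g1 = [random.gauss(90.0, 10.0) for _ in range(n)]
g2 = [random.gauss(120.0, 10.0) for _ in range(n)]
g3 = [after - before for before, after in zip(g1, g2)]  # change score G2 - G1

r12 = pearson(g1, g2)  # theory: 0
r13 = pearson(g1, g3)  # theory: -1/sqrt(2), about -0.71
r23 = pearson(g2, g3)  # theory: +1/sqrt(2), about +0.71
print(f"corr(G1,G2)={r12:.3f}  corr(G1,G3)={r13:.3f}  corr(G2,G3)={r23:.3f}")

# Same idea for a time series: difference pure white noise and look at the
# lag-1 autocorrelation of the differences (theory: -0.5).
e = [random.gauss(0.0, 1.0) for _ in range(n)]
d = [e[t] - e[t - 1] for t in range(1, n)]
r_lag1 = pearson(d[:-1], d[1:])
print(f"lag-1 autocorrelation of differenced noise={r_lag1:.3f}")
```

The sample correlations should land close to the theoretical values in the comments, with small deviations from sampling noise. Giving G1 and G2 unequal variances, or a nonzero correlation, would change the magnitudes; the signs should survive as long as neither covariance exceeds the corresponding variance.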

Campbell and Kenny's book A Primer on Regression Artifacts is really a book about regression to the mean. They may not have this exact example, but it is largely applicable to evaluations of observational panel data designs.

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

MaxJasper

Re: Correlated or What?!

Many thanks Andy for your wonderful explanation & references.

Max.