Correlated or What?!

MaxJasper
Hi folks,

Here is a scatterplot of 3 variables, the third of which is the difference of the first two.

G1 = glucose before breakfast
G2 = glucose 2 hrs after breakfast
G3 = G2 - G1

The plot shows that G1 & G2 have hardly any correlation. But then G1 shows some correlation with G3. And finally, G2 shows the best correlation with G3.

My questions are:
• Is the corr of G1 and G3 a true correlation?
• Is the corr of G2 and G3 a true correlation?
• Where do the two correlations come from?
• Is this a kind of false or biased correlation?
• Is there a technical stats name for such correlations?

Thanks a lot in advance.

Max.

[Figure: Plasma Glucose: Before/After Breakfast]


Re: Correlated or What?!

Andy W

The negative and positive correlations of G1 and G2 with the change score, G2 - G1, would happen with random data and are generically a result of regression to the mean.

It looks to me from your plots that G1 and G2 have about the same variance, and let's pretend they are mean centered as well. What is the expected covariance between the variables in levels and the change scores?


Cov(G2-G1,G2) = E[(G2 - G1)*(G2)]       //E[X] is the expectation of X
              = E[(G2*G2) - (G1*G2)]
              = E[(G2*G2)] - E[(G1*G2)] //Linearity of expectation
              = Var(G2) - Cov(G1,G2)    //Both variables are mean centered

Assuming a stationary series (so that Var(G1) = Var(G2)), the covariance can never exceed the variance, and equals it only when G1 and G2 are perfectly correlated. So Var(G2) - Cov(G1,G2) will be positive, even with random data (i.e. if the covariance between the levels were zero)!
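As a quick check of this argument, here is a minimal simulation sketch in Python with NumPy. It is not part of the original analysis: the function name, sample size, and seed are illustrative assumptions, and G1 and G2 are drawn as independent standard normal variables rather than real glucose readings. With equal variances and zero covariance between the levels, the algebra above gives Corr(G2-G1, G2) = Var / (sqrt(2*Var) * sqrt(Var)) = 1/sqrt(2), about 0.71, and by the same reasoning about -0.71 for G1.

```python
import numpy as np


def change_score_correlations(n=100_000, seed=0):
    """Correlate two independent series with their difference.

    Hypothetical helper for illustration only: g1 and g2 are simulated,
    mean centered, equal variance, and independent of each other.
    """
    rng = np.random.default_rng(seed)
    g1 = rng.normal(loc=0.0, scale=1.0, size=n)  # stand-in for G1
    g2 = rng.normal(loc=0.0, scale=1.0, size=n)  # stand-in for G2
    g3 = g2 - g1                                 # change score G3 = G2 - G1

    # Rows are treated as variables, so corr is a 3x3 correlation matrix.
    corr = np.corrcoef([g1, g2, g3])
    return {
        "g1_g2": corr[0, 1],  # should be near 0
        "g1_g3": corr[0, 2],  # should be near -1/sqrt(2)
        "g2_g3": corr[1, 2],  # should be near +1/sqrt(2)
    }


result = change_score_correlations()
for pair, value in result.items():
    print(f"{pair}: {value:+.3f}")
```

Even though the two simulated levels are unrelated, both show a sizeable correlation with the change score, which is the pattern in the plot.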

The same exercise with Cov(G2-G1,G1) produces the result Cov(G1,G2) - Var(G1). So again, even with random data the covariance between G2-G1 and G1 would be negative (as the covariance cannot exceed the variance). Using the same logic, I have shown elsewhere why differencing a time series will typically introduce a negative autocorrelation.
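The time series remark can be illustrated the same way. The sketch below is only an illustration under assumed conditions (simulated white noise, a hypothetical helper name `lag1_autocorr`, an arbitrary seed), not the analysis referred to above. For white noise e_t, the differenced series d_t = e_t - e_(t-1) has Cov(d_t, d_(t-1)) = -Var(e) and Var(d_t) = 2*Var(e), so its lag-1 autocorrelation should be close to -0.5.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation: correlation of x[t] with x[t+1]."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]


rng = np.random.default_rng(1)
noise = rng.normal(size=100_000)  # white noise, no autocorrelation by design
diffed = np.diff(noise)           # first difference: noise[t+1] - noise[t]

print(f"lag-1 autocorrelation, original:    {lag1_autocorr(noise):+.3f}")
print(f"lag-1 autocorrelation, differenced: {lag1_autocorr(diffed):+.3f}")
```

Each observation enters two adjacent differences with opposite signs, which is the same mechanism that makes G1 negatively related to G2 - G1.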

Campbell and Kenny's book A Primer on Regression Artifacts is really a book about regression to the mean. They may not have this exact example, but it is largely applicable to evaluations of observational panel data designs.

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Correlated or What?!

MaxJasper
Many thanks Andy for your wonderful explanation & references.

Max.