I am trying to find out why there is no information regarding the computation of r-squared once you've obtained your Pearson r coefficient. It appears to me that if I have found a significant correlation between two variables, the next logical question is which one weighs more or has greater strength in the relationship. Am I asking the wrong question? Does it stop at significance?
Also, the student I am working with is producing very creative ways to report her findings. I have found (but am not sure which to advise) many ways of reporting correlation coefficients that are significant. For example, she has a .562 coefficient with a significance of p < .01. She is looking at the coefficient as low because of its actual value rather than the statistical significance. The literature shows anything above .7 or .8 to be statistically significant. So, how do I explain to her that this .562 is also significant even though it's below the recommended level?

I apologize if this is remedial; I think I am just mixing things up and need help with clarification. Thank you in advance.

Sharon D. Voirin, RhD
Survey Design Services
Carbondale, IL 62903
618-559-2507

"Our lives begin to end the day we become silent about things that matter." --The Reverend Martin Luther King
The closer the absolute value of r is to 1, the stronger the association between the two variables: you need not square r to determine that. r = .562 is not really "low"; if anything it would indicate a substantial linear association between the two variables. Finally, you can have an r = .1 that is significant, if your sample size is large enough. In other words, "statistical significance" (i.e., the correlation between the two variables is unlikely to be due to chance) and "strength" of the correlation are two different issues.

Dominic Lusinchi
Statistician
Far West Research Statistical Consulting
San Francisco, California
415-664-3032
www.farwestresearch.com
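To see Dominic's last point numerically, here is a minimal sketch (mine, not from the original thread) in Python with scipy; it uses the standard t transform for a correlation coefficient to show how the same r = .1 moves from nonsignificant to significant as n grows.

from scipy import stats
import math

def r_pvalue(r, n):
    # Two-tailed p-value for H0: rho = 0, via t = r*sqrt(n-2)/sqrt(1-r^2)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
    return 2 * stats.t.sf(abs(t), df=n - 2)

for n in (20, 100, 500):
    print(n, round(r_pvalue(0.1, n), 3))

At n = 20 the two-tailed p is about .67; by n = 500 it is about .025, under the usual .05 cutoff, even though the strength of the association has not changed at all.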
In reply to this post by Sharon-10
Sharon,
What Dominic is referring to is effect size. An r of .5 is a large effect size, not to be discounted. You can find more information about the use and reporting of effect size by looking for the keywords 'effect size'.

Mark

Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
In reply to this post by Sharon-10
There are a relatively small number of texts that focus on r-squared to the (near) exclusion of r. However, most focus on r.
As others pointed out, statistical significance (unlikely due to chance) and practical significance (a finding worth attending to) are different things. I believe that is what your student is getting at. I am not sure why they are focusing on an r > .7 -- there is simply no single value that demarcates practical significance. (Maybe they are thinking about reliability, or about getting 50% of the variance accounted for?)

"the next logical question is which one weighs more or has greater strength in the relationship. Am I asking the wrong question? Does it stop at significance?"

I'm assuming that this question refers to situations with two or more predictors. The importance of predictors is an interesting question, but it is also very difficult to answer without considering things like causal structure -- and even then it is hardly trivial. You certainly cannot look at correlations or r-squareds and answer this type of question.

Michael

Michael Granaas [hidden email]
Assoc. Prof. Phone: 605 677 5295
Dept. of Psychology FAX: 605 677 3195
University of South Dakota
414 E. Clark St.
Vermillion, SD 57069
In reply to this post by Sharon-10
Sharon wrote:
I am trying to find out why there is no information regarding the computation of r-squared once you've obtained your Pearson r coefficient. It appears to me that if I have found a significant correlation between two variables, the next logical question is which one weighs more or has greater strength in the relationship. Am I asking the wrong question? Does it stop at significance?

Sharon: If you have a correlation between two variables, neither of them weighs more than the other. The correlation means that part of the variance of X is correlated with part of the variance of Y. And R2, or r squared, is simply Pearson's correlation coefficient r to the second power, so once you have one you can easily have the other.

Sharon also wrote:

Also, the student I am working with is producing very creative ways to report her findings. I have found (but am not sure which to advise) many ways of reporting correlation coefficients that are significant. For example, she has a .562 coefficient with a significance of p < .01. She is looking at the coefficient as low because of its actual value rather than the statistical significance. The literature shows anything above .7 or .8 to be statistically significant. So, how do I explain to her that this .562 is also significant even though it's below the recommended level?

Sharon: Significance as indicated with p<0.01 is STATISTICAL significance. It means that there is less than a 1% chance of obtaining that result by chance, with a sample of that size, if the actual correlation is zero in the total population. With other sample sizes, the same result will show a different probability (a higher probability with smaller samples, and conversely). This is not to be confused with SUBSTANTIVE significance, or MEANINGFULNESS.

Suppose, for the sake of argument, that you have a very large sample, and you find a correlation of r=0.005 between, say, breast cancer and repeatedly watching the entire Sopranos DVD set. Since your sample is very large, perhaps millions of cases, a correlation of 0.005 leaves you 99% confident (p<0.01) that the true correlation in the entire population is not zero. But at the same time the correlation found is very low: it means that R2 = 0.005 x 0.005 = 0.000025; watching the Sopranos may statistically explain only about 2.5 parts per 100,000 of the variance in breast cancer, which is not substantively much (especially because you do not have a clue about what on Earth might be the causal link between the two variables). Or perhaps, given the severity and lethality of the disease, even reducing it by that narrow margin is a legitimate research goal.

In your student's case, she has probably read that 0.7-0.8 means that something is substantively significant. Or perhaps she has heard that for the small samples usually used in certain studies, only results as high as 0.7-0.8 are found to be (statistically) significant. But none of this can be generalized, as ANY level of R2 or r can be statistically significant if you have enough cases in your sample; and ANY level can be substantively meaningful if it fits your research goals.

Hector
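Hector's arithmetic is easy to verify; here is a quick check (mine, in Python) of both his Sopranos figure and the student's coefficient, which also shows why no separate "computation of r-squared" ever gets documented:

r = 0.562
print(r**2)        # 0.315844: about 32% of variance shared

r_tiny = 0.005
print(r_tiny**2)   # 2.5e-05: Hector's 2.5 per 100,000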
In reply to this post by Sharon-10
Huh! I'm relatively new to the list, but questions much like yours seem to
come up all the time. There's significance, and then there's significance. I am, or at least was, principally an economic time series/forecasting guy.

First of all, in a two-variable model I do not know what you mean by weight in the relationship, that one of them has a greater weight than the other in determining their mutual correlation. You would have to explain that to me. I would only start considering weights when you have 2+ variables on the Right Hand Side (RHS) of some equation.

If you are looking at a temporal relationship, where one variable predicts the other, in economics there is something called a Granger Causality test, the purpose of which is to determine which variable is really predicting which, if either. If there are any economists out there, one of the more interesting findings is that, contrary to many popular notions and economic theories, rising inflation Granger Causes rising wages, rather than the other way around; at least in the ten-to-twenty-years-ago range this was the common finding. The term causality is used, but really what the test is telling you is which variable gives information on which (who should be on the RHS versus the LHS). BTW, there are in fact economic theories that support this proposition.

Significance amongst us "experts" is frequently statistical significance of some coefficient or model; typically, something statistically different from zero with a high level of certainty. In economics, as in other fields, it is not unusual to produce a coefficient with very high t-stats but, in practical terms, virtually no measured impact on the real world.

Take an economic example: wealth effects and GDP. In particular, if the stock markets go up, increasing the accumulated wealth of stockholders, does this induce an increase in consumption and hence GDP? In the about-ten-years-ago period I found (and to my knowledge this was typical of the findings of the day) that the answer is yes, rising wealth induces increased consumption, but not much. Something on the order of: every $1 trillion increase in stock market capitalization induces about a $2 billion increase in GDP. (These things are typically done in logs/growth rates.) That means a late-1990s change in the stock market of about 20% was raising GDP by about $6 billion, but GDP in the US is a $10 trillion variable; do the math yourself, it is very small. However, the coefficient on the stock markets was highly significant statistically, although the coefficient was very small. So: statistically significant in t-stat terms, but in terms of real-world impact, small. It is as if the drug problem in this country were that once a week someone smoked a joint in St. Louis; would we want to spend billions of dollars on a war on drugs?

Also, there is a likelihood ratio test, amongst others, for comparing two models to see if they are statistically significantly different, and the model we used with a wealth-effect formulation was not different from the model without. If you graphed out the forecasts, the lines were indistinguishable by eye, and the forecast values were rarely if ever different by even round-off error; GDP growth is usually reported to the nearest tenth of a percentage point, like 3.7%. We reported our findings but continued using the model without the wealth effect. My recollection is that my findings were typical of those of the time, mid 1990s.
I still receive various research from the Federal Reserve et al., and in some more recent research some have found that the "wealth effect" is not only statistically significant, but also "significant" in other terms. Why is that? Better techniques, maybe, or maybe the amounts of stocks held and the characteristics of stockholders changed, etc.

My last penny: the "significance" of your findings may be peculiar to the topic under consideration. As someone else on this list said, if the correlation between planes that took off and planes that landed were 0.7, is that good? And if your model forecasts GDP with an R2 of 0.7, but the typical model floating around in the literature gets 0.8, well then you are below the state of the art. So the "significance" of your results depends on what others have found. The fact that you find a weak relationship where a strong one is expected may be interesting, but if you are in a horse race your stats should be around, or hopefully better than, what is already out there, or who cares. Then again, your model may be much worse than other models at, say, forecasting GDP, but if none of those other models have a wealth effect, or exchange rates, or interest rates, etc., then your model may be interesting for that fact alone. For many economic time series, an ARMA(p,q) model (ARMA or ARIMA as appropriate) can produce forecasts that are excellent in terms of technically getting close to the correct values, but these models do not tell a policy maker how to set exchange rate or interest rate policy, and so are in some ways analytically useless.

Once again, hope that was sufficiently confusing for you. If not, send another question and I will taunt you some more.
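For anyone who wants to try the Granger test mentioned above, here is a rough sketch using Python's statsmodels package (my choice of tool, not the poster's; the wage/inflation series below are synthetic, built so that wages lag inflation by two periods):

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
inflation = rng.normal(size=200).cumsum()
# wages follow inflation with a two-period lag, plus noise
wages = np.r_[0.0, 0.0, inflation[:-2]] + rng.normal(scale=0.5, size=200)

# Column order matters: the test asks whether the SECOND column helps
# predict the FIRST beyond the first column's own lags.
data = np.column_stack([wages, inflation])
res = grangercausalitytests(data, maxlag=4)  # dict of per-lag F tests, etc.

Swapping the column order tests the reverse direction, which is how one checks "who is really predicting whom."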
In reply to this post by Hector Maletta
Hector beat me to the draw. I would say that what Hector said is pretty
much exactly what I said, but perhaps he is clearer.
In reply to this post by Sharon-10
sharon--
as many responders have mentioned, when interpreting correlations it is important to differentiate between "practical" or "meaningful" and "statistical" significance. it is also important to heed the results of previously-conducted research. while your student's result of r = 0.562 is "statistically" significant, it is lower than the results reported in the literature. you can interpret this quite simply to say that there is a positive, moderate, significant relationship between the two variables; however, the strength of this relationship is less than that reported by previous research. the computed correlation isn't necessarily low, only low when compared to previous values. at this point, the question is subjective: what does it mean in the context of the current study to have a statistically significant correlation that is lower than previously reported values? did the other studies have larger/smaller sample sizes? was my sample size adequate? is there a floor or ceiling effect in my data which might deflate my correlation?

best of luck,
--matthew

.............................................
matthew m. gushta, m.ed.
research associate
american institutes for research
[hidden email] -- 202.403.5079
In reply to this post by Sharon-10
Stephen Brand
www.statisticsdoc.com

Sharon,

A great deal of value has been said on this thread, particularly concerning the difference between statistical significance and meaningfulness. Effect size is found by squaring the correlation. I would just add a few observations.

If your student is interested in knowing whether the correlation she found was lower than levels previously reported in the literature, she should look into testing the difference between correlations using the Fisher r-to-z transformation. This provides a structured way to test the hypothesis that her sample came from a population in which the association between variables was not equal to the association reported in other studies.

If your student has obtained a somewhat smaller correlation, it is important to look at factors that can attenuate the size of a correlation. Some common factors include restriction of range in the variables due to sampling, and ceiling/floor effects. A very important attenuating factor that is often overlooked is the reliability of the measures involved. To the extent that scores on one or both of the variables are affected by random measurement error, the correlation between them will be reduced. If you are correlating scores between two multi-item scales, inspect the alpha coefficients in your data to assess how internally consistent the measures are in your sample. There are many reasons why measures that are generally quite reliable might not attain adequate reliability in a specific sample.

Your student may also want to consider the issue of utility. Correlations below .3 in the context of predicting financial markets may be very useful. In other contexts, nothing less than a very large correlation may be useful.

Well, those are just a few things to mull over....

HTH,

Stephen Brand

For personalized and professional consultation in statistics and research design, visit www.statisticsdoc.com
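A minimal sketch (mine) of the Fisher r-to-z comparison Stephen describes, in Python with scipy. Testing a sample r against a single fixed value from the literature uses a standard error of 1/sqrt(n-3); the n = 50 and literature value of .70 below are hypothetical:

import math
from scipy.stats import norm

def fisher_z_test(r_sample, n, rho_reported):
    # z-statistic for H0: population rho equals the reported value
    z = (math.atanh(r_sample) - math.atanh(rho_reported)) * math.sqrt(n - 3)
    return z, 2 * norm.sf(abs(z))  # two-tailed p

z, p = fisher_z_test(0.562, n=50, rho_reported=0.70)
print(z, p)   # is .562 significantly below the literature's .70 at this n?

Note that comparing two sample correlations (rather than a sample r against a fixed value) would instead use sqrt(1/(n1-3) + 1/(n2-3)) as the standard error.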
In reply to this post by Gushta, Matthew
Hello all,
Using macro syntax, I produce SPSS output containing upwards of 100 custom tables and charts. Up to now I've used page titles to help structure the printed report. Now, in SPSS 15, one has the ability to export output as a PDF file, with the option of embedding bookmarks in the PDF file. These bookmarks are the exact text of the headings ("Custom Tables", "Frequencies"), output types ("Notes", "Log"), and, for example, the table title ("Current Age") which appear in the outline pane of the output viewer. It's easy enough to remove the Notes, Logs, etc. with the OMS panel, but I can't seem to locate a means to specify outline headings in a macro environment, much less organize specific output under specific headings.

The macro I'm using in this specific instance selects cases on the basis of a Provider ID and then produces a set of tables and charts associated with that provider. There are multiple providers.

Does anyone have a solution for using variable labels as Level I or Level II headings in the outline pane? Secondly, is there a way to organize output so that it is nested under a given heading? Even if the output is printed rather than exported as a PDF, it is nice to be able to use the output headings as page headers in the SPSS Page Setup options. In the example below, I want to have a Level I heading for the provider name and Level II headings for the variables reported for that provider. These typically would be a custom table and a chart under the same heading. The SPSS default is to list each under a separate heading. Hopefully the example below makes this clearer.

Now, a typical outline (after stripping out extraneous stuff) looks like this:

Custom Table (Heading Level I)
  table 1
Graph (Heading Level I)
  chart 1
Custom Table (Heading Level I)
  table 2
Graph (Heading Level I)
  chart 2

and so on for many pages. I would like to end up with something like this in the outline pane:

Provider A (Heading Level I)
  Length of Stay (Heading Level II)
    table 1
    chart 1
  Discharge Status (Heading Level II)
    table 2
    chart 2
Provider B (Heading Level I)
  Length of Stay (Heading Level II)
    table 1
    chart 1
  Discharge Status (Heading Level II)
    table 2
    chart 2
etc.

I'm guessing that the answer lies with Python. I'm just starting to work with Python, and it looks like there is a way to pass values from a Python program block to SPSS syntax. Even if a full-blown answer doesn't exist in SPSSX-L land, I'd appreciate any thoughts on where the answer might lie.

Thanks,
Victor Kogler
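Not a full answer, but the Python-to-syntax bridge mentioned above does exist: with the SPSS Python plug-in, spss.Submit runs generated syntax from inside a BEGIN PROGRAM block, so a loop can at least stamp each provider's output with its own TITLE. Whether a TITLE can be promoted to a Level I outline heading is exactly the open question here, so treat this as a sketch only; the provider IDs and variable names (provider_id, los, dischstat) are hypothetical.

import spss

providers = [('P001', 'Provider A'), ('P002', 'Provider B')]  # hypothetical IDs

for pid, pname in providers:
    # Generate and run one block of syntax per provider
    spss.Submit("""
TITLE '%(name)s'.
TEMPORARY.
SELECT IF (provider_id = '%(id)s').
FREQUENCIES VARIABLES=los dischstat.
""" % {'name': pname, 'id': pid})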