Hello,
I have a question about the interpretation of individual variables in a principal components regression (PCR). Because PCR requires a different interpretation procedure, I would like to ask how the following results should be read.

First, my DV (revenue) is metric, and I have 6 metric IVs. The correlation matrix of these 6 IVs shows very high Pearson correlations, some above .90. To remedy the multicollinearity, I ran a principal component analysis with the VARIMAX rotation method to transform the correlated variables into uncorrelated principal components (factor scores); in sum, the 6 IVs can be explained by 3 components.

Next I ran an OLS multiple regression with revenue as the dependent variable and the three factor scores as predictors. The R2 change is significant each time a factor score is added, and the full model with 3 factor scores shows an adjusted R2 of .875.
*Is there a reason why this R2 is so high, based on the use of PCA?*

In addition, all factor scores have large t-values, ranging from 2.875 to 14.505, significant at p < .01.

Now I come to the point of interpretation. I understand that the beta coefficients only tell me, for example, that a one-unit increase in factor 1 will increase revenue by .892. *I would like to go further and interpret the effects of the individual IVs included in the factors.* I believe I need the factor loadings in order to do so; the results are provided below.

The beta coefficients for the factors are as follows:
Factor score 1 = .892 (significant at p < .001)
Factor score 2 = -.246 (significant at p < .01)
Factor score 3 = .177 (significant at p < .001)

The factor loadings for factor 1 are as follows:
IV1 = .971
IV2 = .985
IV3 = -.952

Example interpretation: Factor score 1 is positively related to revenue, so an increase in factor score 1 will increase revenue by .892. In addition, the positive loadings for IV1 and IV2 indicate that an increase in IV1 or IV2 will increase revenue, while the negative loading for IV3 indicates that a decrease in IV3 will increase revenue. *Is this interpretation correct?*

In addition, I would like to conclude that a one-unit increase in IV1 (or IV2, IV3) will cause an increase (decrease) in revenue of .???? Is it possible to make such an interpretation, and if so, how can I do this in SPSS?

Thanks in advance for your help!
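The pipeline described above (standardize the IVs, extract principal components, regress the DV on the component scores) can be sketched outside SPSS. The sketch below is a minimal NumPy illustration on simulated data, not the poster's: three collinear IVs and one retained component rather than six and three, and no VARIMAX rotation. It shows how implied per-variable coefficients can be recovered by multiplying the loadings by the component coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Three highly correlated predictors driven by one underlying signal.
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.1 * rng.normal(size=n),   # stands in for IV1
    base + 0.1 * rng.normal(size=n),   # stands in for IV2
    -base + 0.1 * rng.normal(size=n),  # stands in for IV3 (reversed)
])
y = 2.0 * base + rng.normal(size=n)    # stands in for revenue

# Standardize, then extract principal components via the SVD.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = Xs @ Vt.T                     # component scores, n x p

# Keep the first component and regress centered y on it (OLS).
k = 1
Z = scores[:, :k]
beta, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)

# Implied coefficient for each ORIGINAL standardized variable:
# loadings (columns of V) times the component coefficients.
implied = Vt.T[:, :k] @ beta
print(np.round(implied, 3))            # +, +, - pattern mirrors the loadings
```

Because the fitted values depend only on the product of loadings and component coefficients, the implied per-variable coefficients are unaffected by the arbitrary sign of each component.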
In reply to this post by RuudM123
I don't have time to tackle your questions below. This post is just to acquaint you with an article by Hadi & Ling (1998) that highlights potential problems with PC regression. If you have institutional access to JSTOR, you can download it here:
http://www.jstor.org/stable/10.2307/2685559

For those without access, here's the abstract:

Many textbooks on regression analysis include the methodology of principal components regression (PCR) as a way of treating multicollinearity problems. Although we have not encountered any strong justification of the methodology, we have encountered, through carrying out the methodology in well-known data sets with severe multicollinearity, serious actual and potential pitfalls in the methodology. We address these pitfalls as cautionary notes, numerical examples that use well-known data sets. We also illustrate by theory and example that it is possible for the PCR to fail miserably in the sense that when the response variable is regressed on all of the p principal components (PCs), the first (p - 1) PCs contribute nothing toward the reduction of the residual sum of squares, yet the last PC alone (the one that is always discarded according to PCR methodology) contributes everything. We then give conditions under which the PCR totally fails in the above sense.

HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
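The failure mode in the Hadi & Ling abstract can be reproduced in a few lines. The toy simulation below (illustrative, not taken from the paper) builds two collinear predictors whose response depends only on the last, smallest-variance principal component, i.e. the one PCR methodology discards first:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Two underlying components with very different variances.
pc1 = rng.normal(scale=10.0, size=n)
pc2 = rng.normal(scale=0.1, size=n)
# Rotate them into two observed, highly collinear predictors.
X = np.column_stack([pc1 + pc2, pc1 - pc2])
y = pc2 + 0.01 * rng.normal(size=n)    # response driven by the SMALL component

Xs = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = Xs @ Vt.T                     # columns ordered by decreasing variance

def r2(z, y):
    # R^2 from a no-intercept OLS of centered y on one centered score.
    yc = y - y.mean()
    b, *_ = np.linalg.lstsq(z.reshape(-1, 1), yc, rcond=None)
    resid = yc - z * b
    return 1 - resid @ resid / (yc @ yc)

print(round(r2(scores[:, 0], y), 3))   # first PC: near 0
print(round(r2(scores[:, 1], y), 3))   # last PC: near 1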
In reply to this post by RuudM123
Your interpretation is roughly correct. I might suggest that you consider the CATREG procedure in SPSS instead, though. It will allow you to deal with the multicollinearity using ridge regression, the lasso, or elastic net regularization, and it doesn't suffer from the problems mentioned in the article Bruce Weaver shared. You can look at the variables independently or, if you prefer, in factor form (you need to do that beforehand, just as you did with the PCR).
The coefficients can be thought of as standardized coefficients in this case, so you can treat them as standard deviation units. If you can work out what one standard deviation unit is, you simply multiply the two. As far as I know, there is no way to force SPSS to spit that out for you; you will need to hand-calculate the values. However, you can't really make as clear an interpretation as you hope with the principal component scores you have created. All you can easily say is that a 1-unit increase in that component equals a .892*(SD) change in revenue. It always ends up a bit more ambiguous. Hope that helps.

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr. Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]
In reply to this post by Bruce Weaver
You might want to consider Partial Least Squares for this situation. That is available as an extension command for Statistics. Or perhaps ridge, lasso, or elastic net regression, available in the Categories option.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621
In reply to this post by RuudM123
I thought people (should have) stopped doing this sort of thing about 25 years ago in favor of SEM?
--
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
In reply to this post by RuudM123
I missed this initially in reviewing your email. I think you should carefully review your steps and make sure you haven't done something silly. The large R2 and large t-values all suggest that you may be trying to predict your DV with an IV, or set of IVs, that are essentially the same thing as the DV. The amount of variance explained certainly isn't impossible, but it's high enough that I would start checking. An R2 that large implies a multiple correlation over .9. Any chance you included the DV, or something nearly equal to the DV, somewhere in there? You didn't include it in the PCA, right?

It seems to me that the correlations among all your IVs, and even with the DV, are very high. While this could be great (you have found the perfect predictors), it could also mean they are all essentially measuring the same thing. For instance, if you know all the factors associated with someone's paid taxes, you can predict their salary to a very high degree; but then they are so interrelated, what's the point? If you know someone's BMI and height, you can predict their weight to a very high degree; again, not really so useful. Just make sure you haven't done something like that.

Matthew J Poes
In reply to this post by RuudM123
In a nutshell, the problem with principal components regression is that the principal components are formed without taking into account the association between the predictors and the target variable.

As others have noted, you might consider PLS, or methods such as ridge regression, the lasso, or the elastic net. For a reference on these, see The Elements of Statistical Learning, 2nd edition, by Hastie, Tibshirani, and Friedman. As you learn about these methods, consider whether standardizing the variables makes a difference in the answer you get.

A newer method that works well in your situation is correlated component regression. This method is implemented in CORExpress and in the Excel add-in XLSTAT. For tutorials and background papers, see the Statistical Innovations website.

Tony Babinec
[hidden email]
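For readers without the Categories option, ridge regression itself is a one-line closed form. Below is a minimal NumPy sketch on simulated collinear predictors (standing in for the poster's six IVs); the shrinkage parameter and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
base = rng.normal(size=n)
# Six nearly collinear predictors, as in the poster's setup.
X = np.column_stack([base + 0.05 * rng.normal(size=n) for _ in range(6)])
y = 2.0 * base + rng.normal(size=n)

# Standardize first: penalized fits depend on predictor scale.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

def ridge(Xs, yc, lam):
    # Closed-form ridge solution: (X'X + lam*I)^-1 X'y.
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)

b_ols = ridge(Xs, yc, 0.0)    # ordinary least squares (unstable here)
b_rr = ridge(Xs, yc, 10.0)    # ridge: shrunken, stable coefficients
print("OLS coefficient spread:  ", round(float(np.ptp(b_ols)), 2))
print("ridge coefficient spread:", round(float(np.ptp(b_rr)), 2))
```

The lasso and elastic net have no closed form, but the same standardize-first caveat applies to them as well.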
In reply to this post by RuudM123
No, the PCA is not the main reason that R2 is so high.
From what you describe, the first three variables are "measuring" almost the same thing as each other, and any one of them alone would probably give a very high R2, even ignoring the others. But I expect you ought to go back to the start and re-figure the units or measures of your analysis. Why are the r's so high among the predictors? (Can you do something about that from logic, without the mess of PCA?)

Hmm... revenue... I can imagine that the SIZE of the enterprises, or what-have-you, is producing big correlations, and that you could be seeing big r's only as artifacts of failing to transform your measures to remove the uninteresting component of size. Businesses sometimes use "revenue per square foot" when comparing shops. Super-high intercorrelation of predictors generally means that you should be measuring *something* differently, if you want to interpret something like "separate aspects" of prediction.

But -- do you get a simple, interpretable factor from your PCA? If the units are the same for two similar (r > .95) items, you would have a simpler story to tell if you take their simple average. And then you preserve the "second dimension" of the two by using their difference as another predictor. Or, if these were something like "total floor space" and "storage floor space", you might want to use one of those scores alone (rather than the average) and then the difference. Or "percent used as storage" would show a difference in a less size-dependent way. Of course, going back to what I mentioned before, if you were predicting revenues from floor space, you might be better advised to divide revenues *by* floor space in order to get a size-independent outcome.

I'll repeat: re-figure your units. Eliminate "size" as an artifact, if that's what is accounting for the high r's. Look at the measures rationally, and first consider simple averages or ratios that are familiar or easy to make sense of.
--
Rich Ulrich
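The size-artifact point above is easy to demonstrate. In this toy simulation (hypothetical variable names, not the poster's data), two per-unit rates that are unrelated become highly correlated once each is multiplied by a common size variable:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
size = rng.lognormal(mean=3.0, sigma=1.0, size=n)   # e.g. floor space
rate_a = rng.normal(10, 1, size=n)                  # unrelated per-unit rates
rate_b = rng.normal(5, 1, size=n)

total_a = rate_a * size                             # e.g. total revenue
total_b = rate_b * size                             # e.g. total costs

# Totals look strongly related, but only because both scale with size;
# the underlying per-unit rates are uncorrelated.
print(round(np.corrcoef(total_a, total_b)[0, 1], 2))   # large
print(round(np.corrcoef(rate_a, rate_b)[0, 1], 2))     # near zero
```

Dividing each total by the size variable (the "revenue per square foot" move) removes the artifact entirely.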
In reply to this post by RuudM123
Dear All,
Thank you for all your explanatory comments. I changed my dependent variable for one that didn't correlate so highly with the independent variables, and the results are good.

For illustration purposes I would like to make sure that I am not doing anything wrong, so I wanted to ask the following question based on the PCR.

Results:
Factor 1 - beta coefficient of .452, significant at p < .001
IV1 - factor loading of .985
IV3 - factor loading of -.952

Factor 2 - beta coefficient of -.456, significant at p < .001
IV5 (interaction of IV1*IV3) - factor loading of .854

Conclusions:
- IV1 (in factor 1) has a significant beta coefficient of .452 (for the factor) and a factor loading of .985 (for the variable), which means that IV1 has a positive linear relation with the dependent.
- IV3 (in factor 1) has a significant beta coefficient of .452 (for the factor) and a factor loading of -.952 (for the variable), which means that IV3 has a negative linear relation with the dependent.
- IV5, the interaction term IV1*IV3 (in factor 2), has a significant beta coefficient of -.456 (for factor 2) and a factor loading of .854 (for the variable). Because IV5 is the interaction of IV1 and IV3, does this mean that the relation of IV1 to the dependent starts off positive and significant, but as IV3 increases, the initial relation of IV1 to the dependent decreases (at a negative rate that PCR cannot specify for IV5)? That would mean IV3 has a moderator effect: as it increases, it forces the initial direct relation of IV1 to the dependent to shrink.

Do you have any comments on this final interpretation? I am not 100% sure it is correct, hence this final question. Thank you very much in advance.
In reply to this post by RuudM123
Getting in on the end of this thread, so perhaps my question isn't appropriate, but... You said: "I changed my dependent for one that didn't correlate so high with the independent variables and results are good." This sounds to me as though you changed your hypothesis so as to get the results you are looking for, or at least to get 'better' results. That is not a viable research approach, so do explain more about the reasoning and its soundness.

W
In reply to this post by RuudM123
Okay. It sounds like you took my recommendation of examining the units and simplifying the problem.

You originally had predictors 1-6, with high intercorrelations. Now you have two predictor variables, created out of IV1 and IV3 plus their interaction; and these are entered and considered for their role in factors, which (?) include some fairly trivial weights for the other original IVs? No, I would not be satisfied with that description. What happened to the other variables? Should you focus on saying something -- or, as a first step, *describing* what you have -- in the best terms that you can, for IV1 and IV3, and incorporate everything else as a subsequent step? What you have presented does not seem like revealing description, even if you could fill in the terms.

I'd say that it is pretty much impossible to use coefficients alone to interpret the effects of IV1, IV3, and their interaction, given the *high* correlation between them. "Scaling" peculiarities (like basement or ceiling effects, or other differences in intervals) could account for a vast range of results.

Explore this. Look at outcomes for various explicit combinations, and look at the *fitted*, predicted outcomes similarly. This will show you what you are asking about: a description of what you have. It may also (by the deviations of fit, if they are systematic) suggest where there is still a problem of fit.

It might be possible to advise more concretely if you described your variables concretely.

--
Rich Ulrich
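The "look at fitted outcomes for explicit combinations" advice can be sketched with an ordinary interaction model. The coefficients and data below are simulated, not the poster's; the grid of predictions at low/mid/high values of each predictor is what would reveal a moderation pattern:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
# True model with a negative interaction (moderation by x3).
y = 1.0 + 0.5 * x1 - 0.3 * x3 - 0.4 * x1 * x3 + rng.normal(scale=0.5, size=n)

# OLS with an explicit interaction column.
X = np.column_stack([np.ones(n), x1, x3, x1 * x3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predicted y at -1 SD, mean, and +1 SD of each predictor.
grid = [-1.0, 0.0, 1.0]
for v1 in grid:
    row = [b[0] + b[1] * v1 + b[2] * v3 + b[3] * v1 * v3 for v3 in grid]
    print(v1, np.round(row, 2))
```

Reading down a column (or across a row) shows how the slope of one predictor changes with the level of the other, which is exactly the kind of description the coefficients alone cannot give when the predictors are highly correlated.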