|
Hello, everybody.
I have some questions about interactions. Here is the model:

Y (Y1-Y4) = b0 + b1X1 + b2X2 + b3X1*X2 + e

In one model, both X1 (4 levels) and X2 (5 levels) are categorical and Y is continuous. PROC GLM gives me many lines of output, one for each combination of levels. For illustration purposes I thought it might be better to show one estimate rather than estimates for all combinations of levels, so I entered X1 and X2 as continuous variables. I am not sure whether this is the right approach.

In another model, Y and X1 are continuous and X2 is categorical (5 levels). When I fit this model without telling SAS that X2 is categorical, the p-values for every Y (Y1-Y4) were significant (p-values based on Type III SS). However, when I model X2 as categorical, all but one Y were not significant. When I looked at the data and plotted them, the latter result looks more sensible. But, to be consistent with the previous model in the presentation, I would prefer to have one (overall) estimate.

So my questions are:

1) Is it correct to enter a categorical variable as continuous in order to create an interaction term, and if the results differ, which is correct?
2) When an interaction term involves categorical variable(s), can the p-value from the Type III SS be used for an overall assessment of the interaction term?
3) If (2) is the case, what would be a better way to display so many estimates, and is there any alternative?

Any suggestions and pointers to relevant references will be appreciated.

Thanks in advance.

Myung Ki, PhD
University College London

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
|
If X1 and X2 are categorical, then you need to recode them in order to enter them into a linear regression, using dummy coding or effect coding. Otherwise you're treating the categories in X1 and X2 as if they were continuous intervals on a scale, which probably doesn't make sense for categorical variables. Then, to look at interactions, you'd look at the products of each dummy/effect-coded variable for X1 with each dummy/effect-coded variable for X2.

My favorite reference for interaction effects in regression is Jaccard & Turrisi (2003). It's a little green Sage University Paper. Very thorough.

Good luck.

matt

Matthew Pirritano, Ph.D.
Research Analyst IV
Medical Services Initiative (MSI)
Orange County Health Care Agency
(714) 568-5648
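To make the dummy-coding advice above concrete, here is a minimal sketch in plain Python (not SAS or SPSS syntax) of how one design-matrix row could be built for a 4-level X1 and a 5-level X2. The level labels and the helper names `dummy_code` and `design_row` are hypothetical, chosen only for illustration, and the first level of each factor is arbitrarily taken as the reference category.

```python
from itertools import product

def dummy_code(value, levels):
    """Return 0/1 indicators for every level except the first (the reference)."""
    return [1 if value == lev else 0 for lev in levels[1:]]

def design_row(x1, x2, x1_levels, x2_levels):
    """Build one row: intercept, main-effect dummies, and all dummy products."""
    d1 = dummy_code(x1, x1_levels)
    d2 = dummy_code(x2, x2_levels)
    # Interaction columns: every X1 dummy multiplied by every X2 dummy.
    interaction = [a * b for a, b in product(d1, d2)]
    return [1] + d1 + d2 + interaction

# Hypothetical level labels: X1 has 4 levels, X2 has 5 levels.
X1_LEVELS = ["a", "b", "c", "d"]
X2_LEVELS = [1, 2, 3, 4, 5]

row = design_row("c", 4, X1_LEVELS, X2_LEVELS)
n_interaction = (len(X1_LEVELS) - 1) * (len(X2_LEVELS) - 1)
print(len(row), n_interaction)  # prints "20 12"
```

With this coding the interaction occupies (4-1)*(5-1) = 12 columns, so an overall assessment of the interaction is a joint test of those 12 coefficients rather than a single estimate. Entering both variables as continuous collapses them into one product column, which is a different and much more restrictive model.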
|
Myung seems to be using *SAS*/GLM. SPSS/GLM can handle categorical variables directly, without the need for dummy coding. Whether it is appropriate to treat the independent variables as continuous depends on what they represent and how they are encoded. Often, Likert scale responses (which, strictly speaking, are ordinal) are treated as continuous if there are 5 or more categories, and no one seems to object.

Garry
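One way to see why the encoding matters is to compare the residual sum of squares from a straight line fitted to the numeric codes with the residual sum of squares when each category gets its own mean. The sketch below uses made-up toy data (not the original poster's) and hypothetical helper names. When the category means happen to be evenly spaced, the two fits agree; when they are not, the linear coding fits much worse.

```python
from statistics import mean

def sse_group_means(codes, y):
    """Residual SS when each category gets its own mean (categorical coding)."""
    groups = {}
    for c, v in zip(codes, y):
        groups.setdefault(c, []).append(v)
    return sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)

def sse_linear(codes, y):
    """Residual SS from a straight line fitted to the numeric codes."""
    xb, yb = mean(codes), mean(y)
    sxx = sum((c - xb) ** 2 for c in codes)
    sxy = sum((c - xb) * (v - yb) for c, v in zip(codes, y))
    slope = sxy / sxx
    intercept = yb - slope * xb
    return sum((v - (intercept + slope * c)) ** 2 for c, v in zip(codes, y))

# Made-up toy data: five categories coded 1..5, two observations each.
codes = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
evenly_spaced = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2, 4.0, 4.2, 5.0, 5.2]
not_monotone = [1.0, 1.2, 4.0, 4.2, 2.0, 2.2, 5.0, 5.2, 1.0, 1.2]

for name, y in [("evenly spaced", evenly_spaced), ("not monotone", not_monotone)]:
    print(name, sse_linear(codes, y), sse_group_means(codes, y))
```

Plotting the category means against the codes, as the original poster did, gives the same information visually: if the means do not fall near a line, the continuous coding is imposing a shape the data do not have.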
|
In reply to this post by mpirritano
A nit: there is also the more general case of orthogonal contrast coding that should be considered. Off the top of my head, Keppel and Wickens have a two-chapter introduction that is highly readable. There should be other texts that do a nice job with the topic as well.
Michael
**************************************************** Michael Granaas [hidden email] Assoc. Prof. Phone: 605 677 5295 Dept. of Psychology FAX: 605 677 3195 University of South Dakota 414 E. Clark St. Vermillion, SD 57069 *****************************************************
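As a rough illustration of the orthogonal contrast coding mentioned above, the following sketch builds Helmert-style contrasts for a k-level factor and checks the two defining properties: each contrast sums to zero, and every pair of contrasts has a zero dot product. The function names are hypothetical. The naming and sign conventions for these contrasts differ between packages, and the orthogonality check shown here assumes equal group sizes.

```python
from itertools import combinations

def helmert_contrasts(k):
    """Return k-1 contrasts; each compares one level with all later levels."""
    contrasts = []
    for i in range(k - 1):
        remaining = k - i - 1
        # Weight `remaining` on level i, -1 on each later level, 0 on earlier ones.
        contrasts.append([0] * i + [remaining] + [-1] * remaining)
    return contrasts

def is_orthogonal_set(contrasts):
    """True if every contrast sums to zero and all pairs have zero dot product."""
    sums_zero = all(sum(c) == 0 for c in contrasts)
    dots_zero = all(
        sum(a * b for a, b in zip(u, v)) == 0
        for u, v in combinations(contrasts, 2)
    )
    return sums_zero and dots_zero

helmert = helmert_contrasts(4)
# Dummy coding for comparison: indicator columns for levels 2..4.
dummy = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(helmert)                     # [[3, -1, -1, -1], [0, 2, -1, -1], [0, 0, 1, -1]]
print(is_orthogonal_set(helmert))  # True
print(is_orthogonal_set(dummy))    # False
```

Dummy codes fail the check because their weights do not sum to zero, which is one reason orthogonal contrasts are treated as a separate, more general coding scheme.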
