|
Hi,
I'm running a linear regression model using categorical dummies (0/1 coding) as well as continuous predictors. My question is regarding the interpretation of standardized beta coefficients and using them to compare the effects of dummies and continuous predictors. I understand the standardized beta values tell us the number of standard deviations the outcome will change as a result of a one standard deviation change in the predictor. Now, is this applicable to categorical/dummy variables as well? My categorical and continuous predictors are measured in different units, so standardized betas would be great for comparing their impact on the outcome; however, I'm not sure that, say, a .51 standardized beta for a categorical predictor and a .51 standardized beta for a continuous predictor have the same "impact". Can anyone confirm? Thanks

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
I wouldn't rely on standardized betas to judge the difference in impacts across variables. Too many things have to be perfect before they give you good answers. Secondly, the standard deviation of a categorical variable is not of much use, since the meaningful difference is in the raw unit, i.e., you're in one group or the other. Assuming equal groups and a large n, the standard deviation of a 0/1 variable will be about .5. So how much do you care about the change in the outcome associated with being halfway between the two groups? Why not just look at the difference in the adjusted means for each group? That controls for all the other variables in the model.
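Paul's half-a-standard-deviation point is easy to check numerically. Below is a small sketch (in Python/NumPy with simulated data, purely illustrative — the thread itself is about SPSS): for a balanced 0/1 dummy the standard deviation is sqrt(.5 × .5) = .5, and it shrinks as the split becomes more unbalanced.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Balanced 0/1 dummy: half the cases in each group.
d = rng.integers(0, 2, size=n)

# With p = .5 the population SD is sqrt(p*(1-p)) = .5 exactly.
print(round(d.std(), 3))                  # close to 0.5

# Unbalanced groups shrink the SD: p = .05 gives sqrt(.05*.95),
# about .218, so the same raw group difference yields a smaller
# standardized beta for the rarer group.
d_rare = (rng.random(n) < 0.05).astype(int)
print(round(d_rare.std(), 3))             # close to 0.218
```

This is one reason equal standardized betas need not mean equal "impact": a dummy's beta is tied to its group split, not only to the raw group difference.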
Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center
Houston, TX 77038

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of paul wilson
Sent: Thursday, October 09, 2008 10:48 AM
To: [hidden email]
Subject: Dummy vs Continuous predictors in regression
|
In reply to this post by paul wilson-7
With CATREG in the SPSS CATEGORIES add-on module you can analyze categorical variables (unordered and ordered) as well as numeric variables. For categorical variables you do not need dummies, and you will obtain a beta coefficient for the variable as a whole. (CATREG is in the menu under Regression, Optimal Scaling.)
If you do not have CATEGORIES, you can compute CATREG results for a numeric dependent variable and nominal (unordered categorical) and numeric independent variables as follows: use the standardized dependent variable in the regression with dummies, then recode the categories of a variable with the B coefficients. For example, for variable v1 with 5 categories:

compute quantv1 = v1.
recode quantv1 (1=B1) (2=B2) (3=B3) (4=B4) (5=0).

Then run descriptives, saving the standardized recoded variables:

DESCRIPTIVES VARIABLES=quantv1 /STATISTICS=STDDEV /SAVE.

The Std. dev. is the beta for the variable. The saved standardized variable zquantv1 is the quantified variable: category values replaced with optimal nominal quantifications. To create a transformation plot:

GRAPH /LINE(SIMPLE)=MEAN(zquantv1) BY v1.

NB: for one category you don't need a dummy. The left-out category is recoded to 0. It does not matter which category is left out. You obtain different b's with a different left-out category, but the quantifications (standardized B's) will not be different.

Regards,
Anita van der Kooij
Data Theory Group
Leiden University

________________________________
From: SPSSX(r) Discussion on behalf of paul wilson
Sent: Thu 09/10/2008 17:47
To: [hidden email]
Subject: Dummy vs Continuous predictors in regression
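Anita's recipe can be sanity-checked outside SPSS. The sketch below (Python/NumPy, with made-up data and hypothetical variable names — it only illustrates the algebra behind the recode) regresses a standardized outcome on the dummies plus a numeric predictor, recodes categories with the B's (left-out category = 0), and confirms that the standard deviation of the recoded variable equals the standardized coefficient of the quantified variable.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
v1 = rng.integers(1, 6, size=n)          # 5-category nominal predictor (hypothetical)
x = rng.normal(size=n)                   # numeric predictor
y = np.array([2.0, -1.0, 0.5, 3.0, 0.0])[v1 - 1] + 1.5 * x + rng.normal(size=n)

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)   # SPSS-style z-scores (ddof=1)

zy = zscore(y)

# Dummy regression on the standardized outcome, category 5 left out.
dummies = np.column_stack([(v1 == k).astype(float) for k in range(1, 5)])
X = np.column_stack([np.ones(n), dummies, x])
coef, *_ = np.linalg.lstsq(X, zy, rcond=None)

# "recode quantv1 (1=B1) (2=B2) (3=B3) (4=B4) (5=0)." in SPSS terms:
b = {k: coef[k] for k in range(1, 5)}    # columns 1..4 are the dummy B's
b[5] = 0.0                               # left-out category recoded to 0
quantv1 = np.array([b[int(k)] for k in v1])

# The Std. dev. of the recoded variable is the beta for v1 as a whole:
X2 = np.column_stack([np.ones(n), zscore(quantv1), zscore(x)])
coef2, *_ = np.linalg.lstsq(X2, zy, rcond=None)
print(np.isclose(coef2[1], quantv1.std(ddof=1)))   # True
```

The equality holds because the quantified variable enters the original fit with coefficient exactly 1, so standardizing it rescales that coefficient to its own standard deviation.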
|
Hi all,
I hope that someone will be able to help me here. I am trying to import a big Excel spreadsheet into SPSS using the "create a new query using Database Wizard" tool. The following warning appears:

GET DATA /TYPE=ODBC
 /CONNECT= 'DSN=Excel Files;DBQ=L:\INF\SOD details\INF_1 ASF monitoring SOD rats\Digi'+
 'gait\SPSS-Cohort 1 and 2.xls;DriverId=790;'
 'MaxBufferSize=2048;PageTimeout=5;'
 /SQL = "SELECT `Animal ID` AS Animal_ID, Genotype AS Genotype_, Sex, P"+
 "ND_wk, Limb, Swing AS `@_Swing`, Brake AS `@_Brake`, "
 " Propel AS `@_Propel`, Stance AS `@_Stance`, Stride AS `@_Stride`, `#"+
 "Steps` AS `@_#Steps`, `Gait Symmetry` AS "
 "`@_Gait_Symmetry`, `Belt Speed` AS `@_Belt_Speed` FROM `'15cms$'`"
 /ASSUMEDSTRWIDTH=255 .

>Warning. Command name: GET DATA
>SQLExecDirect failed :[Microsoft][ODBC Excel Driver] Too few parameters. Expected 8.

CACHE.
DATASET NAME DataSet1 WINDOW=FRONT.

I can import other Excel spreadsheets, but this one seems to be a problem. If anyone could help, that would be greatly appreciated.

Thanks,
Valerie
|
Have you tried to just read the file with GET DATA /TYPE=XLS (or File/Open/Data)?
HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Valerie Guille
Sent: Thursday, October 09, 2008 10:24 PM
To: [hidden email]
Subject: [SPSSX-L] Import problem (from excel into SPSS)
|
In reply to this post by Kooij, A.J. van der
This is just a minor bugbear of mine with regard to dummy variables and people using them as a fix-all solution (I agree they are useful, but only if you understand why you are using them). Honestly, I don't mean to rant...

"NB: for one category you don't need a dummy. The left out category is recoded to 0. It does not matter which category is left out. You obtain different b's when different left out category, but the quantifications (standardized B's) will not be different."

Mathematically speaking, it is very important to understand which category you are leaving out, even more so with ordinal data.

Let's use a specification that uses dummies for test scores up to 10 (an ordinal variable). As mentioned above, we need to drop one of the dummies. Let's drop the dummy for a score of 1. Note that the constant now includes both the baseline and the lift due to a score of 1, since we cannot separately identify those. The analysis may lead to interesting insights, however: the results may show that only having scored 4 or 7 leads to a significant increase in pay, say.

Now let's instead drop the 10th dummy, which will keep the model identified and enable us to quantify the effect of a score of 1 (say scores of 1 make up more than half the scores in our dataset, while a score of 10 accounts for 0.5%). Excluding the dummy with infrequent 1's brings the covariate matrix close to singularity, which pushes up the standard errors of the parameters and can make the estimates insignificant.

It's just another case of think about what you are doing rather than just doing it.

Mike

--
Michael Pearmain
Senior Statistical Analyst
Google UK Ltd
Belgrave House, 76 Buckingham Palace Road
London SW1W 9TQ, United Kingdom
t +44 (0) 2032191684
[hidden email]
|
Michael:
I, in turn, beg your pardon if I am not understanding you correctly. But I believe that Anita meant just what she said, and that she is correct if you only have 2 categories (e.g., 0,1) comprising one variable. Your example used a 10-category (or 10-level) variable; what you say makes perfect sense there. Clearly, if one codes the 2-level dummy as 0,1 versus 1,0, then the beta is reversed in sign. Hence, a 2-level variable is already dummied. That is my impression of what she meant.

Joe Burleson
|
In reply to this post by Michael Pearmain-2
I have to disagree a bit with this post. Mathematically it makes absolutely no difference which dummy you omit (absent exact singularity in a procedure not using a generalized inverse). Computationally, this is also almost true, since the estimation procedures use numerically stable algorithms to control computational error that could be introduced by near singularity.
Substantively, of course, it makes a big difference, because it changes the meaning of the coefficients. But you can calculate any estimable effect from any parameterization, and while it may be a little harder to get the standard errors if the effect isn't directly represented in the equation, calculating the standard errors from the coefficient covariance matrix will give the same result as any other variant.

Regards,
Jon Peck
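Jon's claim can be demonstrated with a small numerical sketch (Python/NumPy, simulated data — an illustration of the algebra, not SPSS output): fitting the same one-factor model with two different omitted categories yields identical fitted values, and an estimable contrast, together with its standard error computed from the coefficient covariance matrix, is the same under either parameterization.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
g = rng.integers(1, 5, size=n)            # 4-category factor
y = np.array([0.0, 1.0, 3.0, -2.0])[g - 1] + rng.normal(size=n)

def fit(drop):
    """OLS on dummies with category `drop` omitted; returns the
    kept categories, design, coefficients, and coefficient covariance."""
    cats = [k for k in range(1, 5) if k != drop]
    X = np.column_stack([np.ones(n)] + [(g == k).astype(float) for k in cats])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return cats, X, beta, cov

# Same model, two different reference categories.
cats_a, Xa, ba, cov_a = fit(drop=1)
cats_b, Xb, bb, cov_b = fit(drop=3)

# Identical fitted values -> mathematically the same model.
print(np.allclose(Xa @ ba, Xb @ bb))       # True

def contrast(cats, beta, cov, i, j):
    """Estimate and SE of the estimable contrast 'category i minus j',
    via c'beta and sqrt(c' cov c)."""
    c = np.zeros(len(cats) + 1)
    if i in cats: c[1 + cats.index(i)] = 1.0
    if j in cats: c[1 + cats.index(j)] = -1.0
    return c @ beta, np.sqrt(c @ cov @ c)

print(np.allclose(contrast(cats_a, ba, cov_a, 2, 4),
                  contrast(cats_b, bb, cov_b, 2, 4)))  # True
```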
|
In reply to this post by Burleson,Joseph A.
Hi Joe,
Yes, you are right w.r.t Anita's example, i think it was a case of me reading the post to quickly, Apologies all, However it holds, as Joe suggest for multi-level variables. Mike On Fri, Oct 17, 2008 at 3:21 PM, Burleson,Joseph A. <[hidden email]>wrote: > Michael: > > I, in turn, beg your pardon if I am not understanding you correctly. > > But I believe that Anita meant just what she said, and that she is > correct: If you only have 2 categories (e.g., 0,1) comprising one > variable. > > Your example used a 10-categories (or 10-level) variable. What you say > makes perfect sense. > > Clearly if one codes the 2-level dummy as 0,1 versus 1,0, then the beta > is reversed in value. > > Hence, for a 2-level variable, it is already dummied. That is my > impression of what she meant. > > Joe Burleson > > -----Original Message----- > From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of > Michael Pearmain > Sent: Friday, October 17, 2008 10:00 AM > To: [hidden email] > Subject: Re: Dummy vs Continuous predictors in regression > > This is just a minor bug bear of mind with regard to dummy variables, > and > people using them as a fixing solution, ( I agree they are useful, but > if > you understand why you are using them) > Honestly don't mean to rant... > > > "NB: for one category you don't need a dummy. The left out category is > recoded to 0. It does not matter which category is left out. You obtain > different b's when different left out category, but the quantifications > (standardized B's) will not be different." > > Mathematically speaking it is very important to understand which > category > you are leaving out, even more so in ordinal data. > > Let's use a specification that uses dummies for test scores up to 10 (an > ordinal variable). As mentioned above, we need to drop one of the > dummies. > Let's drop the dummy which takes on the value 1. 
> Note again that the constant includes both baseline test scores and the > lift > due to the test scores of 1, since we cannot separately identify those. > The analysis may lead to interesting insights, however. The results may > show that only having scored 4 or 7 lead to a significant increase in > pay > say. > > Now let's drop the 10th, which will keep the model identified and enable > us > to quantify the effect of scores of one (which generate more than half > the > scores in our dataset lets say and the score 10 accounts for 0.5%). > Exclusion of the dummy with infrequent 1's bring the covariate matrix to > near singularity, which pushes up the standard errors of paramaters and > can > make the estimates insignificant. > > It's just another case of think what you are doing rather than just > doing it > > Mike > > > > On Fri, Oct 10, 2008 at 2:22 AM, Kooij, A.J. van der < > [hidden email]> wrote: > > > With CATREG in the SPSS CATEGORIES add-on module you can analyze > > categorical variables (unordered and ordered) as well as numeric > variables. > > For categorical variables you do not need dummies and you will obtain > beta > > coeffcient for the variable as a whole. (CATREG is in menu under > Regression, > > Optimal Scaling). > > > > If you do not have CATEGORIES, you can compute CATREG results for > numeric > > dependent variable and nominal (unordered categorical) and numeric > > independent variables as follows: > > > > Use standardized dependent variable in the regression with dummies. > Recode > > the categories of a variable with the B coefficients. > > > > For example for variable v1 with 5 categories: > > > > compute quantv1 = v1. > > > > recode quantv1 (1=B1) (2=B2) (3=B3) (4=B4) (5=0). > > > > Then run descriptives, saving the standardized recoded variables. > > > > DESCRIPTIVES VARIABLES=quantv1 /STATISTICS=STDDEV /SAVE. > > > > The Std. dev. is the beta for the variable. 
> The saved standardized variable zquantv1 is the quantified variable: category values replaced with optimal nominal quantifications. To create a transformation plot:
>
> GRAPH /LINE(SIMPLE)=MEAN(zquantv1) BY v1.
>
> NB: for one category you don't need a dummy. The left-out category is recoded to 0. It does not matter which category is left out. You obtain different b's with a different left-out category, but the quantifications (standardized B's) will not be different.
>
> Regards,
>
> Anita van der Kooij
> Data Theory Group
> Leiden University

--
Michael Pearmain
Senior Statistical Analyst
Google UK Ltd
|
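Anita's NB (a different left-out category changes the b's but not the quantifications) can be checked numerically. The following is a minimal sketch in Python rather than SPSS syntax, with invented data: it fits the dummy regression three times, each time leaving out a different category, and verifies that the standard deviation of the recoded ("quantified") variable is identical in all three runs.

```python
from statistics import stdev

def ols(X, y):
    """Least squares via normal equations (X'X)b = X'y, Gaussian elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         for i in range(k)]
    v = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    for col in range(k):                      # forward elimination, partial pivoting
        p = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[p] = A[p], A[col]
        v[col], v[p] = v[p], v[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            v[r] -= f * v[col]
    b = [0.0] * k                             # back substitution
    for i in reversed(range(k)):
        b[i] = (v[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

# Invented data: one categorical predictor with 3 levels and an outcome y.
cat = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
y = [6.0, 7.0, 5.0, 4.0, 5.0, 4.0, 10.0, 8.0, 9.0, 9.0]

my, sy = sum(y) / len(y), stdev(y)            # standardize y (sample SD,
zy = [(v - my) / sy for v in y]               # as DESCRIPTIVES /SAVE does)

def quant_sd(ref):
    """Dummy-code leaving out `ref`; return the SD of the recoded variable."""
    kept = [l for l in (1, 2, 3) if l != ref]
    X = [[1.0] + [1.0 if c == l else 0.0 for l in kept] for c in cat]
    b = ols(X, zy)
    coef = {l: b[1 + i] for i, l in enumerate(kept)}
    coef[ref] = 0.0                           # left-out category recoded to 0
    return stdev([coef[c] for c in cat])      # SD of the quantified variable

# Different b's per run, but the same SD (the "beta for the variable as a
# whole") whichever category is left out.
sds = [quant_sd(r) for r in (1, 2, 3)]
assert max(sds) - min(sds) < 1e-9
```

The invariance is no accident: with an intercept in the model, changing the reference category shifts every category coefficient by the same constant, and a constant shift leaves the standard deviation unchanged.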
|
No, Joe was not right; I didn't mean variables with just two categories. To illustrate that it does not matter which category is left out, I inserted syntax below for two predictors, with 3 and 4 categories. In the two regressions with dummies, different categories were left out for each predictor. Both runs result in the same betas: the std. dev. of the recoded standardized variables (called quantified or transformed variables in CATREG). The transformed variables themselves are also the same.
btw: if you use dummy regression for an ordinal variable, you are treating the variable as nominal. To use dummies and still treat it as ordinal, you need "non-negative least squares" regression. With CATREG you can choose to treat variables as nominal or ordinal.

Regards,
Anita

data list free/a b y.
begin data.
1 1 6  1 1 7  1 1 5  1 2 7
1 2 6  1 2 8  1 3 4  1 3 5
1 4 5  1 4 3  1 4 2  1 4 4
2 1 4  2 1 5  2 2 4  2 2 3
2 2 5  2 3 9  2 3 8  2 3 7
2 4 7  2 4 8  3 1 10 3 1 8
3 1 9  3 2 5  3 2 3  3 3 7
3 3 6  3 4 10 3 4 8  3 4 9
end data.

VECTOR a(3F8.0).
LOOP #i = 1 TO 3.
COMPUTE a(#i) = (a = (#i)).
END LOOP.
EXECUTE.

VECTOR b(4F8.0).
LOOP #i = 1 TO 4.
COMPUTE b(#i) = (b = (#i)).
END LOOP.
EXECUTE.

DESCRIPTIVES VARIABLES=y /SAVE.

REGRESSION
 /MISSING LISTWISE
 /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10)
 /NOORIGIN
 /DEPENDENT zy
 /METHOD=ENTER a1 a2 b1 b2 b3.

compute quanta3 = a.
recode quanta3 (1=-1.0297) (2=-.6323) (3=0).
compute quantb4 = b.
recode quantb4 (1=.1874) (2=-.4767) (3=.1267) (4=0).
DESCRIPTIVES VARIABLES=quanta3 quantb4 /SAVE /STATISTICS STDDEV.

REGRESSION
 /MISSING LISTWISE
 /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10)
 /NOORIGIN
 /DEPENDENT zy
 /METHOD=ENTER a1 a3 b1 b2 b4.

compute quanta2 = a.
recode quanta2 (1=-.3974) (2=0) (3=.6323).
compute quantb3 = b.
recode quantb3 (1=.0607) (2=-.6034) (3=0) (4=-.1267).
DESCRIPTIVES VARIABLES=quanta2 quantb3 /SAVE /STATISTICS STDDEV.

*transformation plots.
GRAPH /LINE(SIMPLE)=MEAN(Zquanta3) BY a.
GRAPH /LINE(SIMPLE)=MEAN(Zquanta2) BY a.
GRAPH /LINE(SIMPLE)=MEAN(Zquantb4) BY b.
GRAPH /LINE(SIMPLE)=MEAN(Zquantb3) BY b.

________________________________
From: SPSSX(r) Discussion on behalf of Michael Pearmain
Sent: Fri 17/10/2008 16:30
To: [hidden email]
Subject: Re: Dummy vs Continuous predictors in regression

Hi Joe,

Yes, you are right w.r.t. Anita's example; I think it was a case of me reading the post too quickly. Apologies, all. However, as Joe suggests, the point holds for multi-level variables.
Mike

On Fri, Oct 17, 2008 at 3:21 PM, Burleson, Joseph A. <[hidden email]> wrote:

> Michael:
>
> I, in turn, beg your pardon if I am not understanding you correctly.
>
> But I believe that Anita meant just what she said, and that she is correct, if you only have 2 categories (e.g., 0, 1) comprising one variable.
>
> Your example used a 10-category (or 10-level) variable. What you say makes perfect sense.
>
> Clearly, if one codes the 2-level dummy as 0,1 versus 1,0, then the beta is reversed in value.
>
> Hence, a 2-level variable is already dummied. That is my impression of what she meant.
>
> Joe Burleson
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Michael Pearmain
> Sent: Friday, October 17, 2008 10:00 AM
> To: [hidden email]
> Subject: Re: Dummy vs Continuous predictors in regression
>
> This is just a minor bugbear of mine with regard to dummy variables and people using them as a quick fix (I agree they are useful, if you understand why you are using them). Honestly, I don't mean to rant...
>
> "NB: for one category you don't need a dummy. The left out category is recoded to 0. It does not matter which category is left out. You obtain different b's when different left out category, but the quantifications (standardized B's) will not be different."
>
> Mathematically speaking, it is very important to understand which category you are leaving out, even more so with ordinal data.
>
> Let's use a specification that uses dummies for test scores up to 10 (an ordinal variable). As mentioned above, we need to drop one of the dummies; let's drop the dummy which takes on the value 1. Note that the constant then includes both the baseline and the lift due to a test score of 1, since we cannot separately identify those. The analysis may still lead to interesting insights: the results may show that, say, only scores of 4 or 7 lead to a significant increase in pay.
>
> Now let's drop the 10th dummy instead, which will keep the model identified and enable us to quantify the effect of a score of 1 (say scores of 1 generate more than half the observations in our dataset, while the score 10 accounts for 0.5%). Excluding the dummy with infrequent 1's brings the covariate matrix close to singularity, which pushes up the standard errors of the parameters and can make the estimates insignificant.
>
> It's just another case of thinking about what you are doing rather than just doing it.
>
> Mike
>
> On Fri, Oct 10, 2008 at 2:22 AM, Kooij, A.J. van der <[hidden email]> wrote:
>
> > With CATREG in the SPSS CATEGORIES add-on module you can analyze categorical variables (unordered and ordered) as well as numeric variables. For categorical variables you do not need dummies, and you obtain a beta coefficient for the variable as a whole. (CATREG is in the menu under Regression, Optimal Scaling.)
> >
> > If you do not have CATEGORIES, you can compute CATREG results for a numeric dependent variable and nominal (unordered categorical) and numeric independent variables as follows:
> >
> > Use the standardized dependent variable in the regression with dummies. Recode the categories of a variable with the B coefficients. For example, for variable v1 with 5 categories:
> >
> > compute quantv1 = v1.
> > recode quantv1 (1=B1) (2=B2) (3=B3) (4=B4) (5=0).
> >
> > Then run DESCRIPTIVES, saving the standardized recoded variables:
> >
> > DESCRIPTIVES VARIABLES=quantv1 /STATISTICS=STDDEV /SAVE.
> >
> > The std. dev. is the beta for the variable. The saved standardized variable zquantv1 is the quantified variable: category values replaced with optimal nominal quantifications. To create a transformation plot:
> >
> > GRAPH /LINE(SIMPLE)=MEAN(zquantv1) BY v1.
> >
> > NB: for one category you don't need a dummy.
> > The left-out category is recoded to 0. It does not matter which category is left out. You obtain different b's with a different left-out category, but the quantifications (standardized B's) will not be different.
> >
> > Regards,
> >
> > Anita van der Kooij
> > Data Theory Group
> > Leiden University
|
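Michael's warning about which dummy you drop can also be checked numerically. The following is a hypothetical Python illustration (not SPSS; the data are invented for the example): when the reference (left-out) category has only two cases, the remaining dummy coefficients are contrasts against a poorly estimated reference mean, so their standard errors inflate.

```python
def invert(A):
    """Invert a small matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def dummy_se(cat, y, ref):
    """OLS of y on an intercept plus dummies (category `ref` left out);
    return the standard error of each dummy coefficient, keyed by level."""
    kept = [l for l in sorted(set(cat)) if l != ref]
    X = [[1.0] + [1.0 if c == l else 0.0 for l in kept] for c in cat]
    n, k = len(X), len(X[0])
    XtX = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    Xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(XtX)
    b = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * b[j] for j in range(k)) for r in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)   # residual variance
    return {l: (s2 * inv[1 + i][1 + i]) ** 0.5 for i, l in enumerate(kept)}

# Invented data: level 3 occurs only twice; levels 1 and 2 have ten cases each.
cat = [1] * 10 + [2] * 10 + [3] * 2
y = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7, 5.0, 4.9,
     6.1, 5.9, 6.2, 6.0, 5.8, 6.1, 6.3, 5.9, 6.0, 6.2,
     7.0, 7.4]

se_common_ref = dummy_se(cat, y, ref=1)  # contrast against a well-estimated mean
se_rare_ref = dummy_se(cat, y, ref=3)    # contrast against a mean based on n=2

# The SE of the level-2 coefficient is much larger when the rare category is
# the reference: the ratio is sqrt((1/10 + 1/2) / (1/10 + 1/10)) = sqrt(3).
assert se_rare_ref[2] > 1.5 * se_common_ref[2]
```

In a one-way dummy model the variance of each coefficient is sigma^2 * (1/n_level + 1/n_ref), which is why a tiny reference group hurts every contrast, exactly as described in the thread.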
