I've run into a mystery running a Generalized Linear Model under SPSS 15.0.1.

I have a grouped binomial dependent trial/response variable (Event of Trial) and a logit link function. I have two numerical scale variables, C and L, and two dummy variables, S and P, coded 1 or 0 based on presence or absence of certain characteristics. The model I'm checking has both main effects and interactions. To handle over-dispersion, I'm estimating parameters using a scale parameter based on the model's Pearson Chi-Square. See command syntax below.

The problem is that I get slightly *different results* when I run the model using ascending category order for factors (the default) and using a descending category order (this makes the output show the effect of the presence of the factor, rather than of its absence).

Specifically, the coefficients and the standard errors on the numerical scale variables are different. As a result, the confidence intervals, Wald Chi-Squares, and significances also vary.

The coefficients and standard errors for the factors and for the interaction variables do not change (other than to change sign). The Intercept also changes (no doubt, in order to account for the change in sign of the factor variable included as a main effect).

Also, neither the Goodness of Fit table nor the calculated scale parameter changes.

Oddly enough, for the *ascending* model, the entries in the Test of Model Effects table (ToME) (Wald Chi-Square and significance)

   --are different from the corresponding entries in the Parameter Estimates table (PE), but

   --are identical to ToME and PE tables for the *descending* model.

So. Why is this happening? Is it me or have I found an "easter-egg" in SPSS 15.0.1?

Thanks.

Gary Rosin <[hidden email]>

-----------------------------------

Command Syntax: the same in both models, except for (ORDER=DESCENDING):

* Generalized Linear Models.
GENLIN
  Event OF Trial
  BY S P
   (ORDER=ASCENDING)
  WITH C L
 /MODEL
  S C S*C P*C L P*L
  INTERCEPT=YES
  DISTRIBUTION=BINOMIAL
  LINK=LOGIT
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON
  COVB=MODEL MAXITERATIONS=100 MAXSTEPHALVING=5
  PCONVERGE=1E-006(ABSOLUTE)
  SINGULAR=1E-012
  ANALYSISTYPE=3 CILEVEL=95 LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT CPS DESCRIPTIVES MODELINFO FIT
  SUMMARY SOLUTION(EXPONENTIATED) COVB CORB
  HISTORY(1).
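For readers unfamiliar with the Pearson-based scale parameter mentioned in the post above, here is a minimal Python sketch of the usual calculation: the Pearson chi-square divided by its residual degrees of freedom, with standard errors then inflated by the square root of that ratio. The group counts, fitted probabilities, parameter count, and the standard error are all made up for illustration, and `pearson_scale` is a hypothetical helper, not an SPSS function. Under this approach the coefficients themselves are left alone and only their standard errors are rescaled.

```python
import math


def pearson_scale(events, trials, fitted_p, n_params):
    """Pearson chi-square / residual df for grouped binomial data."""
    chi_square = 0.0
    for y, m, p in zip(events, trials, fitted_p):
        expected = m * p                 # fitted number of events in the group
        variance = m * p * (1.0 - p)     # binomial variance of the count
        chi_square += (y - expected) ** 2 / variance
    residual_df = len(events) - n_params
    return chi_square / residual_df


# Hypothetical grouped data: events out of trials, plus assumed fitted
# probabilities from some two-parameter model (all values invented).
events = [12, 30, 45, 8]
trials = [40, 60, 70, 50]
fitted_p = [0.25, 0.55, 0.60, 0.20]

scale = pearson_scale(events, trials, fitted_p, n_params=2)

# A model-based standard error (hypothetical) is inflated by sqrt(scale).
model_based_se = 0.10
scaled_se = model_based_se * math.sqrt(scale)

print(f"scale = {scale:.4f}, scaled SE = {scaled_se:.4f}")
```

Because the scale is computed from fitted values, which do not depend on how the factor categories are ordered, it is consistent with the observation that the scale parameter is the same in both runs.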
Hi Gary,

What you've found is an "Easter Egg" of model design. If A is a 0-1 factor and X is a covariate, then the model:

   Intercept A X A*X

produces two redundant parameters in the estimates table:

   [A=1]
   [A=1]*X

The [A=1] is identified with the intercept and [A=1]*X is identified with the X term. When you change the order of A, then the model produces two redundant parameters:

   [A=0]
   [A=0]*X

The [A=0] is identified with the intercept and [A=0]*X is identified with the X term. The factor and factor-covariate interaction coefficients simply change sign, but the coefficient for the X term in the second model is the sum of the coefficients for X and [A=0]*X terms in the first model. For example:

GET FILE='1991 U.S. General Social Survey.sav'.
select if (happy=1 or happy=2).

* Ascending.
GENLIN happy BY sex (ORDER=ASCENDING) WITH life
 /MODEL sex life sex*life
  INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT.
* For SEX=1: .905 + .186 - 1.029*LIFE - .175*LIFE = 1.091 - 1.204*LIFE.
* For SEX=2: .905 - 1.029*LIFE.

* Descending.
GENLIN happy BY sex (ORDER=DESCENDING) WITH life
 /MODEL sex life sex*life
  INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT.
* For SEX=1: 1.091 - 1.204*LIFE.
* For SEX=2: 1.091 - .186 - 1.204*LIFE + .175*LIFE = .905 - 1.029*LIFE.

Note that this will happen in any procedure on any statistical product; we've just made it easier to reproduce this result in Genlin. Best look to the Tests of Model Effects table in any model to help you determine whether the model term is significant.

Cheers,
Alex

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gary Rosin
Sent: Thursday, January 11, 2007 4:16 PM
To: [hidden email]
Subject: Problems with GzLM Category Order for Factors

I've run into a mystery running a Generalized Linear Model under SPSS 15.0.1.
[...]
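The bookkeeping in Alex's reply (factor and interaction coefficients flip sign, while the intercept and covariate coefficient absorb them) can be checked with a few lines of arithmetic. The Python sketch below uses only the coefficients quoted in his General Social Survey example; `flip_reference` and `linear_predictor` are hypothetical helper names introduced for illustration and are not part of SPSS.

```python
def flip_reference(intercept, a, x, ax):
    """Re-express (Intercept, [A], X, [A]*X) with the other level of a
    two-level factor A as the reference category."""
    return {
        "intercept": intercept + a,  # old non-reference group becomes baseline
        "a": -a,                     # factor effect changes sign
        "x": x + ax,                 # slope of the new baseline group
        "ax": -ax,                   # interaction changes sign
    }


def linear_predictor(coefs, in_nonreference_group, x_value):
    """Logit for one case under a given parameterization."""
    d = 1.0 if in_nonreference_group else 0.0
    return (coefs["intercept"] + coefs["a"] * d
            + (coefs["x"] + coefs["ax"] * d) * x_value)


# Ascending-order estimates from the example: SEX=2 is the reference.
ascending = {"intercept": 0.905, "a": 0.186, "x": -1.029, "ax": -0.175}
descending = flip_reference(**ascending)

for name, value in descending.items():
    print(f"{name}: {value:.3f}")

# Both parameterizations give the same logit for every case.
for life in (1.0, 2.0, 3.0):
    sex1_asc = linear_predictor(ascending, True, life)    # SEX=1, non-reference
    sex1_desc = linear_predictor(descending, False, life)  # SEX=1, reference
    assert abs(sex1_asc - sex1_desc) < 1e-12
```

The fitted values are identical either way, which is why the Goodness of Fit table does not move. What changes is which group's slope the "X" row of the Parameter Estimates table refers to.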
Thanks, Alex. A couple of questions, if you would.

1. If the significances given in the parameter estimates are for testing the hypothesis that the parameter (coefficient) is 0, what does the Test of Model Effects show (test?)

2. I notice that the Wald Chi-squared given for the covariate effects and parameters are different in both ascending and descending models. That said, the values in the descending model are closer to those in the ToME than those in the ascending model. Is that why descending is the default? By extension, does that make descending the preferred, or at most usual, approach?

Gary

At 01:03 PM 1/12/2007, you wrote:
>Hi Gary,
>
>What you've found is an "Easter Egg" of model design.
>[...]
1. Add a /PRINT SUMMARY SOLUTION LMATRIX to the example GENLIN commands from my earlier message (or add the LMATRIX keyword to your PRINT subcommand) and see the output under "Type III Estimable Functions". These are the contrasts used for the Tests of Model Effects.

2. Both tests use a chi-square statistic, but they're testing different things. Neither descending nor ascending is preferred, AFAIK.

Alex

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gary Rosin
Sent: Friday, January 12, 2007 3:09 PM
To: [hidden email]
Subject: Re: Problems with GzLM Category Order for Factors

Thanks, Alex. A couple of questions, if you would.
[...]
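One way to see why the Wald chi-square on the covariate row depends on category order: in the Parameter Estimates table the "X" row tests the slope within the reference category only, so switching the reference means testing a different linear combination of the same parameters. The Python sketch below illustrates this with the two slope coefficients from Alex's example and an invented covariance matrix. The variances and covariance are hypothetical, not SPSS output, and `wald_chi_square` is a helper name introduced here.

```python
def wald_chi_square(estimate, variance):
    """One-degree-of-freedom Wald statistic: estimate^2 / variance."""
    return estimate ** 2 / variance


# Ascending-order slope coefficients from the example in this thread.
b_x = -1.029    # slope for the reference group (SEX=2)
b_ax = -0.175   # difference in slope for SEX=1

# Hypothetical variances and covariance for those two estimates.
var_x = 0.040
var_ax = 0.065
cov_x_ax = -0.040

# Parameter-table test in the ascending run: H0 is b_x = 0.
wald_ascending = wald_chi_square(b_x, var_x)

# The "X" row in the descending run is the contrast b_x + b_ax,
# whose variance includes the covariance term.
slope_other_group = b_x + b_ax
var_sum = var_x + var_ax + 2.0 * cov_x_ax
wald_descending = wald_chi_square(slope_other_group, var_sum)

print(f"ascending  X row: b = {b_x:.3f}, Wald = {wald_ascending:.2f}")
print(f"descending X row: b = {slope_other_group:.3f}, Wald = {wald_descending:.2f}")
```

Neither statistic is wrong. Each answers a question about one group's slope, which is a narrower question than whether the covariate term matters in the model as a whole. The LMATRIX output mentioned in point 1 shows the contrast that the Tests of Model Effects table actually uses.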