Percent Change Crosstabs


Percent Change Crosstabs

DKUKEC
Greetings, Is it possible to calculate percent change in a 2x2 crosstab? If yes, can you please point me in the right direction? I have a crosstab with two vars (treatment group (1) vs. control group (0)) x (recidivist (1) vs. non-recidivist (0)). I would like to be able to generate the percent difference in recidivism between the treatment group and the control group, and then the percent change between the two. I have reviewed the archives and the internet; however, I was unsuccessful in my search. Thank you, Damir

Re: Percent Change Crosstabs

Bruce Weaver
Administrator
It sounds to me as if you want the risk difference (rather than the odds ratio, for example).  If so, you might want to look at this article by Cheung (2007):

   http://aje.oxfordjournals.org/content/166/11/1337.full

I've also included below some syntax I wrote after looking at that article myself a couple years ago.  

HTH.



* ==================================================================
*  File:   risk difference with robust SEs.SPS .
*  Date:   12-May-2011 .
*  Author:   Bruce Weaver, bweaver@lakeheadu.ca
* ================================================================== .

* Compute Huber-White robust variance when using
  linear regression to obtain the risk difference.

* Macro to define folder with SPSS sample data files.
define !spssdata () "C:\SPSSdata\" !enddefine.

new file.
dataset close all.
get file = !spssdata + "bankloan.sav".

* Variables in model:
  Y = default (1=Yes, 0=No)
  X1 = Age
  X2 = Education (3 indicator variables)
  X3 = Debt to Income Ratio
.

select if nmiss(default, age, ed, debtinc) EQ 0.

freq ed.
compute ed1 = (ed EQ 1).
compute ed2 = (ed EQ 2).
compute ed3 = (ed EQ 3).
compute ed45 = any(ed,4,5).
format ed1 to ed45 (f1.0).
crosstabs ed by ed1 to ed45.

graph histogram age.
* Center AGE on 20 (value near the minimum), and make unit 5 years.
compute AGE.20.5 = (age - 20)/5.
graph histogram age.20.5 .

graph histogram debtinc.
* Set unit = 5 for DEBTINC .
compute debtinc.0.5 = debtinc/5.
graph histogram debtinc.0.5 .


* First try GENLIN with LINK=IDENTITY and ERROR=BINOMIAL.

* Generalized Linear Models.
GENLIN default (REFERENCE=FIRST) WITH age ed2 ed3 ed45 debtinc
  /MODEL age ed2 ed3 ed45 debtinc INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=IDENTITY
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(LR) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL    
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

* There are problems, which is not unusual according to Cheung (2007).

* Now try OLS linear regression -- save fitted values & residuals.

REGRESSION
  /STATISTICS COEFF
  /DEPENDENT default
  /METHOD=ENTER age.20.5 ed2 ed3 ed45 debtinc.0.5
  /SAVE PRED (fitted_y) RESID (residual).

* The constant = the risk (of Y being 1) when all explanatory variables = 0.
* The coefficients for the other variables are risk differences associated
* with a one-unit increase in that explanatory variable, controlling for the others.
* But because the Y-variable is conditionally dichotomous (rather than conditionally
* normal), the standard errors are not correct. This is why the Huber-White robust
* standard errors are needed.

* Run the same model via GENLIN.
GENLIN default WITH AGE.20.5 ed2 ed3 ed45 debtinc.0.5
  /MODEL AGE.20.5 ed2 ed3 ed45 debtinc.0.5 INTERCEPT=YES
 DISTRIBUTION=NORMAL LINK=IDENTITY
  /CRITERIA SCALE=MLE COVB=MODEL PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(LR)
    CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
  /SAVE MEANPRED RESID.

* Results match those from REGRESSION very closely--slight
  differences are due to use of MLE rather than OLS.
* Note that this model still has incorrect standard errors.

* ---------------------------------------------------------------------- .
* Same model, but with COVB=ROBUST.
* This should give the Huber-White standard errors.
* Reference: https://www-304.ibm.com/support/docview.wss?uid=swg21477323 .
* ---------------------------------------------------------------------- .

GENLIN default WITH AGE.20.5 ed2 ed3 ed45 debtinc.0.5
  /MODEL AGE.20.5 ed2 ed3 ed45 debtinc.0.5 INTERCEPT=YES
 DISTRIBUTION=NORMAL LINK=IDENTITY
  /CRITERIA SCALE=MLE COVB=ROBUST PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(LR)
    CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
  /SAVE MEANPRED RESID.

* The coefficients are the same as before, but
* now we have the correct, robust standard errors.

* ================================================================== .



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Percent Change Crosstabs

Ryan
Bruce,
With the simple 2X2 table discussed by the OP and a decent sample size, I would expect a standard binary logistic regression model fit via the GENLIN procedure to yield a reasonable standard error. The risk difference test of interest is provided in the EMMEANS Table. A little simulation experiment is provided below my name.
 
Admittedly, I haven't given this a tremendous amount of thought. I'll examine the article and your code when time permits.
Ryan
--
*Generate Data.
SET SEED 98765432.
NEW FILE.
INPUT PROGRAM.
LOOP ID= 1 to 100.
COMPUTE x= rv.bernoulli(0.5).
COMPUTE y = rv.bernoulli(exp(-1.5 + 0.9*x) / (1+ exp(-1.5 + 0.9*x))).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

* Generalized Linear Models.
GENLIN y (REFERENCE=LAST) BY x (ORDER=ASCENDING)
/MODEL x INTERCEPT=YES
DISTRIBUTION=BINOMIAL LINK=LOGIT
/EMMEANS TABLES=x SCALE=ORIGINAL COMPARE=x CONTRAST=DIFFERENCE PADJUST=LSD
/MISSING CLASSMISSING=EXCLUDE
/PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.


Re: Percent Change Crosstabs

DKUKEC
In reply to this post by DKUKEC
Thank you Ryan & Bruce, very much appreciate your replies and suggested syntax. I am looking for the following computations, for example:

% Difference in recidivism rate = Treatment Recidivism % - Control Recidivism %
% Change in recidivism = % Difference / Control Recidivism %

***************************************************************************
2X2 CROSSTAB EXAMPLE
             Recidivist   Non-Recidivist   Total
Treatment        816           1133         1949
Control          936           1013         1949

             Row %        Row %            Row %
Treatment    41.9%        58.1%            100.0%
Control      48.0%        52.0%            100.0%

Would like to compute:
  % Difference   -6.1%
  % Change      -12.7%
***************************************************************************

Sincerely,
Damir
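A minimal sketch of that arithmetic in SPSS syntax (an illustration with made-up variable names, not part of Damir's original post). Note that working from the raw counts gives about -6.2% and -12.8%; the -6.1% and -12.7% above come from subtracting the already-rounded row percentages.

* Hypothetical illustration: enter the four cell counts as one record,
* then compute the percent difference and percent change directly.
DATA LIST FREE / t_recid t_total c_recid c_total.
BEGIN DATA
816 1949 936 1949
END DATA.
COMPUTE p_treat    = 100 * t_recid / t_total.
COMPUTE p_control  = 100 * c_recid / c_total.
COMPUTE pct_diff   = p_treat - p_control.
COMPUTE pct_change = 100 * pct_diff / p_control.
FORMATS p_treat TO pct_change (F8.1).
LIST.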

Re: Percent Change Crosstabs

Ryan
In reply to this post by Ryan
Hi Bruce,
 
I decided to fit the final model you suggested in your post using a single dichotomous IV ("x"), and as far as I can tell, the results are identical to those produced by the standard binary logistic regression model. Take a look at the line for "x" in the Parameter Estimates output, and compare those results to EMMEANS output from the binary logistic regression analysis. The simulation experiment is below.
 
Curious. Did I set up the model you suggested correctly? Are you finding the same results as I am when you run the simulation code below?
 
Best,
 
Ryan
 
*Generate Data.
 SET SEED 98765432.
NEW FILE.
INPUT PROGRAM.
 LOOP ID= 1 to 100.
 COMPUTE x= rv.bernoulli(0.5).
 COMPUTE y = rv.bernoulli(exp(-1.5 + 0.9*x) / (1+ exp(-1.5 + 0.9*x))).
 END CASE.
 END LOOP.
 END FILE.
END INPUT PROGRAM.
EXECUTE.
 
GENLIN y (REFERENCE=LAST) BY x (ORDER=ASCENDING)
 /MODEL x INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOGIT
 /EMMEANS TABLES=x SCALE=ORIGINAL COMPARE=x CONTRAST=DIFFERENCE PADJUST=LSD
/MISSING CLASSMISSING=EXCLUDE
 /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.
 
GENLIN y BY x
   /MODEL x INTERCEPT=YES
  DISTRIBUTION=NORMAL LINK=IDENTITY
   /CRITERIA SCALE=MLE COVB=ROBUST PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
 ANALYSISTYPE=3(LR)
     CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
   /MISSING CLASSMISSING=EXCLUDE
   /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.





Re: Percent Change Crosstabs

Bruce Weaver
Administrator
Hi Ryan.  Yes, for the 2x2 case, the Contrast Estimate and SE shown in the Individual Test Results box from your logistic regression model match the coefficient and SE for X from the second model.  

But I don't see any great advantage to using your approach if one wants the risk difference as the measure of effect.  One advantage of the second approach (with DISTRIBUTION=NORMAL, LINK=IDENTITY and COVB=ROBUST) is that it gives the CI for the coefficient.  (One could calculate the CI easily enough from the Individual Test Results output, but it's extra work, and an opportunity for error to slip in.)  I suspect the second approach is also much more straightforward when additional predictors are added to the model.
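(For instance, as an illustrative formula rather than a quote from the thread: a 95% Wald interval from the Individual Test Results box is just the contrast estimate ± 1.96 × its SE; letting GENLIN report it directly simply avoids that extra step.)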

Cheers,
Bruce


Re: Percent Change Crosstabs

Ryan
In reply to this post by DKUKEC
Damir,
 
% change in risk = (RR - 1)*100 = [(41.9/48.0) - 1]*100 = -12.7
 
As you can see from the formula above, the test on the [absolute] risk difference, which is what I brought up in previous posts, is not the same as the test on % change in risk.
 
Ryan



Re: Percent Change Crosstabs

Ryan
In reply to this post by Bruce Weaver
Thanks for confirming, Bruce. The fact that they are identical in this situation makes sense to me, but I wanted to be certain I had not made an error in the code.
 
When time permits, perhaps we can have further conversation regarding this issue. I've spent a great deal of time in my work grappling with these issues. I'm interested in learning and sharing ideas on this topic.
 
Best,
 
Ryan




Re: Percent Change Crosstabs

John F Hall
In reply to this post by DKUKEC

Forgive a naive question, but with a 2 x 2 table why can’t you simply calculate epsilon (% diff) by hand?  This technique is called elaboration; see M. Rosenberg, The Logic of Survey Analysis (Basic Books, 1968).

 

 

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/spss-without-tears.html



Re: Percent Change Crosstabs

Ryan
John,
 
You certainly can. I thought the OP was interested in statistical testing/confidence intervals. 
 
With that said, one can use the approach outlined by Bruce or myself to test if the absolute difference is significantly different from zero.  
 
However, if one is interested in a statistical test regarding percent change in risk, then I would argue that it would be more appropriate to test whether the relative risk (RR) is significantly different from 1.0. With the model parameterized correctly, one could directly obtain the RR confidence limits, which could be converted to % change in risk.
 
Best,
 
Ryan





Re: Percent Change Crosstabs

Bruce Weaver
Administrator
I too thought that the OP was interested in having confidence intervals -- and I agree with Ryan's final paragraph above.  If one is limited to the 2x2 situation, then CROSSTABS will give the RR and its confidence interval.  And whatever transformation one applies to the RR to change the way of expressing the results can also be applied to the limits of the CI.
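For example, a rough sketch of doing that with Damir's table entered as weighted cases (this bit is illustrative and was not in the original post); the Risk Estimate table reports the odds ratio and both cohort relative risks with 95% confidence intervals, and applying (RR - 1)*100 to the estimate and to each limit converts them to percent change in risk.

* Hypothetical illustration: Damir's 2x2 table as weighted cases.
DATA LIST FREE / group recid count.
BEGIN DATA
1 1 816
1 0 1133
0 1 936
0 0 1013
END DATA.
VALUE LABELS group 0 "Control" 1 "Treatment" / recid 0 "Non-recidivist" 1 "Recidivist".
WEIGHT BY count.
* Which cohort RR is treatment-vs-control (and which is its reciprocal)
* depends on row order, so check the row labels in the output.
CROSSTABS group BY recid /CELLS=COUNT ROW /STATISTICS=RISK.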

If one wishes to go beyond the 2x2 case, then GENLIN can be used as in the syntax pasted below.

HTH.


* =============================================================
*  File:   RR via GENLIN with log-link.SPS .
*  Date:   17-Feb-2010 .
*  Author:  Bruce Weaver, bweaver@lakeheadu.ca .
* ============================================================= .

* This file shows how to obtain the Risk Ratio (aka Relative Risk)
* by using a Generalized Linear Model with a binary outcome
* variable, a log link function, and a binomial error distribution.

* ---------------------------------------------------------------- .

new file.
dataset close all.

GET FILE='C:\Program Files\SPSSInc\PASWStatistics17\Samples\bankloan.sav'.

freq ed default.
select if nmiss(ed, default) EQ 0.
exe.

recode ed
 (1 2 = 1)
 (3 4 5 = 2) into ed2.
recode default (0=2) (else=copy) into default2.

var lab
 default2 "Defaulted on loan"
 ed2 "Education level"
.
val lab
 ed2 1 "High school or less"
     2 "Some post-secondary" /
 default2 1 "Yes" 2 "No"
.
crosstabs ed by ed2 / default by default2.

crosstabs ed2 by default2 / stat = risk.

* Generalized Linear Model with a logit link function & binomial error.
* This should give the same odds ratio as above.

GENLIN default2 (REFERENCE=LAST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(LR) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL    
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* As expected, this model does give me the same OR as above.

* Generalized Linear Model with log-link & binomial error .
* This should give me the 1st relative risk shown above (.699) .

GENLIN default2 (REFERENCE=LAST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* To get the second RR shown above (1.159), I think I need
* to set the referent to the FIRST category of DEFAULT2.

GENLIN default2 (REFERENCE=FIRST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* Yes, that's done it.

crosstabs ed2 by default2 / stat = risk.

* In another file, I'll see if I can model the Risk Difference
* by using GENLIN with an Identity link function and a
* binomial error .

* ============================================================= .




Re: Percent Change Crosstabs

Rich Ulrich
In reply to this post by Ryan
This whole thread seems far too complicated for discussing
a 2x2 table.  You have a table, you have a test, and you can
invert the p-level of the simple test, as needed.  Thus, you may
describe a CI in whatever contrastive terms you choose.

The fact you have to face is that "relative risk" tends to be a really
crappy contrast to generalize from, since  it is so strongly affected
by base-rates.  That is why it seldom is to be preferred over the Odds
Ratio.  But its lousy generality does not justify using a lousy computation
to describe its "significance" for a particular table.

If you get opposite results from looking at "relative risk" when you
swap 0 and 1 ... as someone did, earlier in this thread ... that outcome
should be used to disqualify the method that gives those results.
There is *still* only one basic hypothesis about the differences. 

The test of the basic hypothesis?
For moderately large N in all cells, Fisher's Exact Test and both the
corrected and uncorrected Pearson chi-squared all give near-identical
results.  For 2x2, I think that the Likelihood Chi squared is also the same.
If you have tiny cells or extreme proportions, you can get some
different results from different ways to approximate computations
or corrections.  Continuity correction?  Unequal variances, and use d.f.?
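As a concrete illustration (mine, not Rich's), assuming the weighted 2x2 dataset entered earlier in the thread is still active:

* For a 2x2 table, CHISQ prints the Pearson chi-square, the continuity-corrected
* chi-square, the likelihood-ratio chi-square, and Fisher's exact test.
CROSSTABS group BY recid /STATISTICS=CHISQ.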

--
Rich Ulrich


Date: Wed, 29 May 2013 12:31:40 -0400
From: [hidden email]
Subject: Re: Percent Change Crosstabs
To: [hidden email]

John,
 
You certainly can. I thought the OP was interested in statistical testing/confidence intervals. 
 
With that said, one can use the approach outlined by Bruce or myself to test if the absolute difference is significantly different from zero.  
 
However, if one is interested in a statistical test regarding percent change in risk, then I would argue that it would be more appropriate to test whether the relative risk (RR) is significantly different from 1.0. With the model parameterized correctly, one could directly obtain the RR confidence limits, which could be converted to % change in risk.
 
Best,
 
Ryan


On Wed, May 29, 2013 at 12:22 PM, John F Hall <[hidden email]> wrote:

Forgive a naive question, but with a 2 x 2 table why can’t you simply calculate epsilon (% diff) by hand?  This technique is called elaboration,  See: M Rosenberg,  The Logic of Survey Analysis (Basic Books, 1968)

 

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/spss-without-tears.html

  

  

 

 

 

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of DKUKEC
Sent: 29 May 2013 14:53
To: [hidden email]
Subject: Re: Percent Change Crosstabs

 

Thank you Ryan & Bruce, Very much appreciate your replies and suggested syntax. I am looking for the following computations... for example: % Difference in recidivism rate = Treatment Recidivism % - Control Recidivism %. % Change in the recidivism = Difference % / Control Recidivism %. *************************************************************************** 2X2 CROSSTAB EXAMPLE Recidivist Non-Recidivist Total Treatment 816 1133 1949 Control 936 1013 1949 Row % Row % Row % Treatment 41.9% 58.1% 100.0% Control 48.0% 52.0% 100.0% ****** Would like to compute %************************* Difference -6.1% % Change -12.7% ***************************************************** Sincerely, Damir





Re: Percent Change Crosstabs

Maguin, Eugene

I enjoyed this discussion because I learned about risk difference and risk ratio, neither of which I’d heard of before. So, thanks to Bruce, Ryan, and Rich.

 

Gene Maguin

 

 


Re: Percent Change Crosstabs

Bruce Weaver
Administrator
Thanks Gene.  For completeness, the reciprocal of the absolute risk reduction (at least in studies involving treatment vs control) is called the "number needed to treat" (NNT).  Clinicians seem to like it as a measure of effect.  It tells them how many patients they need to treat in order for one to benefit.  There is also number needed to harm (NNH) if one is talking about unwanted side effects, for example.
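For illustration, using the OP's figures (just a back-of-the-envelope sketch):

   ARR = 48.0% - 41.9% = 6.1% = .061
   NNT = 1/.061 = about 16.4

i.e., roughly 17 people would need to receive the treatment to prevent one additional case of recidivism.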

@Rich -- I agree that the thread probably did become more complicated than necessary if the situation is limited to 2x2 tables.  It's my fault it got that complicated.  My excuse (if you will) is that I was pretty certain the OP wanted the risk difference with its CI, and they won't get that (at least not easily) from CROSSTABS output.  

@Jon -- Is there any possibility (in a future release) of having the risk difference (and maybe NNT) added to the output one gets when including RISK on the STATISTICS sub-command?  That would be quite useful, I think.

Cheers,
Bruce


Maguin, Eugene wrote
I enjoyed this discussion because I learned about risk difference and risk ratio, neither of which I'd heard of before. So, thanks to Bruce, Ryan, and Rich.

Gene Maguin
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Percent Change Crosstabs

Anthony Babinec
In the 2x2 table, the measure Somers' d equals the difference in proportions.  You have to choose the right Somers' d, since Crosstabs prints several calculated versions, but you do get accompanying standard errors and significance levels.


Tony Babinec
[hidden email]


Re: Percent Change Crosstabs

bdates
In reply to this post by Bruce Weaver
I've been using NNT for years, together with the Common Language Effect Size and Cohen's U1, U2, and U3 because they all speak to people who are more statistically naïve.  It's too much to ask for everything, so I'll second Bruce's motion that NNT be added in a future release.

Brian


Re: Percent Change Crosstabs

Bruce Weaver
Administrator
In reply to this post by Anthony Babinec
Thanks Tony, I didn't know that.  Here's a demo using the OP's data.  

DATA LIST list / Group Recid kount (3f5.0).
BEGIN DATA
1 1 816
1 2 1133
2 1 936
2 2 1013
END DATA.

WEIGHT by kount.

* Get the risk difference via GENLIN first.

GENLIN Recid BY Group
   /MODEL Group INTERCEPT=YES
  DISTRIBUTION=NORMAL LINK=IDENTITY
   /CRITERIA SCALE=MLE COVB=ROBUST PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
 ANALYSISTYPE=3(LR)
     CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
   /MISSING CLASSMISSING=EXCLUDE
   /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

* Now use CROSSTABS, with RISK and Somers' d reported.

CROSSTABS
  /TABLES=Group BY Recid
  /FORMAT=AVALUE TABLES
  /STATISTICS=D RISK
  /CELLS=COUNT ROW
  /COUNT ROUND CELL.

RESULTS

GENLIN: B = .062, SE = .0159, 95% CI .030 to .093

CROSSTABS gives the following values for d and its SE:
   d     SE
-.062 .0160 -- Symmetric
-.062 .0161 -- Group Dependent
-.062 .0159 -- Recid Dependent

So, the Recid Dependent result is the one that matches what I got from GENLIN.  (The difference in sign could be fixed by changing the reference category in one of the analyses.)  

Once again, though, I prefer the GENLIN solution, because I don't have to know which of 3 results to choose, and I get a CI for the risk difference (which is not given for Somers' d).
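As a quick hand check against the weighted counts: risk(control) = 936/1949 = .480 and risk(treatment) = 816/1949 = .419, so the risk difference is .480 - .419 = .062, matching the GENLIN B (and, apart from sign, the Somers' d values above).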


Anthony Babinec wrote
In the 2x2 table, the measure Somers' d equals the difference in proportions.  You have to choose the right Somers' d, since Crosstabs prints several calculated versions, but you do get accompanying standard errors and significance levels.


Tony Babinec
[hidden email]

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Percent Change Crosstabs

Ryan
In reply to this post by Bruce Weaver
Bruce raises an important point that should not go unnoticed. It is common practice, when testing

H0: p1 - p2 = 0

to employ a generalized linear model with a "normal" distribution and an "identity" link function.

It is also common practice, when testing

H0: p1/p2 = 1

to fit a model that assumes a "binomial" distribution with a "log" link function via a generalized linear model (a.k.a. a log-binomial regression model), so that one is in effect testing

H0: log(p1/p2) = log(p1) - log(p2) = 0

As a result, the p-values associated with the two tests will likely differ, even if only slightly.

Ryan
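If one wants the RR itself for the OP's table, here is a minimal sketch of that log-binomial parameterization, assuming the weighted Group/Recid dataset from Bruce's demo is active (treat it as a starting point rather than a finished analysis):

* Log-binomial model: REFERENCE=LAST makes Recid = 2 (non-recidivist)
* the reference, so the model is for the probability of recidivism;
* with Group in ascending order, Group = 2 (control) is the redundant
* category, so Exp(B) for [Group=1] estimates the relative risk of
* recidivism for treatment vs. control (about .87 here).
GENLIN Recid (REFERENCE=LAST) BY Group (ORDER=ASCENDING)
  /MODEL Group INTERCEPT=YES
  DISTRIBUTION=BINOMIAL LINK=LOG
  /PRINT MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

(Exp(B) - 1)*100 then gives the percent change in risk directly, and the same transformation can be applied to the exponentiated confidence limits.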


On Wed, May 29, 2013 at 2:32 PM, Bruce Weaver <[hidden email]> wrote:
I too thought that the OP was interested in having confidence intervals --
and I agree with Ryan's final paragraph below.  If one is limited to the 2x2
situation, then CROSSTABS will give the RR and its confidence interval.  And
whatever transformation one applies to the RR to change the way of
expressing the results can also be applied to the limits of the CI.

If one wishes to go beyond the 2x2 case, then GENLIN can be used as in the
syntax pasted below.

HTH.


* =============================================================
*  File:         RR via GENLIN with log-link.SPS .
*  Date:         17-Feb-2010 .
*  Author:  Bruce Weaver, [hidden email] .
* ============================================================= .

* This file shows how to obtain the Risk Ratio (aka Relative Risk)
* by using a Generalized Linear Model with a binary outcome
* variable, a log link function, and a binomial error distribution.

* ---------------------------------------------------------------- .

new file.
dataset close all.

GET FILE='C:\Program Files\SPSSInc\PASWStatistics17\Samples\bankloan.sav'.

freq ed default.
select if nmiss(ed, default) EQ 0.
exe.

recode ed
 (1 2 = 1)
 (3 4 5 = 2) into ed2.
recode default (0=2) (else=copy) into default2.

var lab
 default2 "Defaulted on loan"
 ed2 "Education level"
.
val lab
 ed2 1 "High school or less"
     2 "Some post-secondary" /
 default2 1 "Yes" 2 "No"
.
crosstabs ed by ed2 / default by default2.

crosstabs ed2 by default2 / stat = risk.

* Generalized Linear Model with a logit link function & binomial error.
* This should give the same odds ratio as above.

GENLIN default2 (REFERENCE=LAST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(LR) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* As expected, this model does give me the same OR as above.

* Generalized Linear Model with log-link & binomial error .
* This should give me the 1st relative risk shown above (.699) .

GENLIN default2 (REFERENCE=LAST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* To get the second RR shown above (1.159), I think I need
* to set the referent to the FIRST category of DEFAULT2.

GENLIN default2 (REFERENCE=FIRST) BY ed2 (ORDER=ASCENDING)
  /MODEL ed2 INTERCEPT=YES
 DISTRIBUTION=BINOMIAL LINK=LOG
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
    MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION (EXPONENTIATED).

* Yes, that's done it.

crosstabs ed2 by default2 / stat = risk.

* In another file, I'll see if I can model the Risk Difference
* by using GENLIN with an Identity link function and a
* binomial error .

* ============================================================= .




Ryan Black wrote
> John,
>
> You certainly can. I thought the OP was interested in statistical
> testing/confidence intervals.
>
> With that said, one can use the approach outlined by Bruce or myself to
> test if the absolute difference is significantly different from zero.
>
> However, if one is interested in a statistical test regarding percent
> change in risk, then I would argue that it would be more appropriate
> to test whether the relative risk (RR) is significantly different from
> 1.0.
> With the model parameterized correctly, one could directly obtain the RR
> confidence limits, which could be converted to % change in risk.
>
> Best,
>
> Ryan
>
>


Re: Percent Change Crosstabs

DKUKEC
In reply to this post by Ryan
Thank you all for your contributions.  

Ryan, with respect to RR = -12.7, how would I report this?  Would I report that the treatment group reduced the risk of recidivism by 12.7% when compared to the control group?

Bruce/Tony, I am still unsure how GLM and Crosstabs (Somers' d) compute the difference in proportions...  what does the output mean in the two examples, and how is the difference computed?

I apologize in advance for my ignorance and thank you all for your input.

Sincerely,
Damir

Re: Percent Change Crosstabs

Bruce Weaver
Administrator
Ryan did not say RR = -12.7.  He said:

   % change in risk = (RR - 1)*100 = [(41.9/48.0) - 1]*100 = -12.7

RR = 41.9%/48.0% = 0.873
RR - 1 = 0.873 - 1 = -0.127 = -12.7%
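Equivalently, using Damir's own definitions from the original post: % Change = Difference % / Control % = (41.9 - 48.0)/48.0 = -6.1/48.0 = -12.7%, so the two routes give the same answer.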


Re the GENLIN syntax I gave (see below), the coefficient for Group gives the risk difference (with SE and CI).  Tony pointed out that one can get the same thing from the Somers' d table generated by CROSSTABS.  But as he said, and my example showed, you have to know which of the 3 results you want.  That's one reason why I prefer the GENLIN approach.  (Another reason is that you can add other explanatory variables.)


GENLIN Recid BY Group
   /MODEL Group INTERCEPT=YES
  DISTRIBUTION=NORMAL LINK=IDENTITY
   /CRITERIA SCALE=MLE COVB=ROBUST PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
 ANALYSISTYPE=3(LR)
     CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
   /MISSING CLASSMISSING=EXCLUDE
   /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

HTH.


DKUKEC wrote
Thank you all for your contributions.  

Ryan, with respect to RR = -12.7, how would I report this?  Would I report that the treatment group reduced the risk of recidivism by 12.7% when compared to the control group?

Bruce/Tony, I am still unsure how GLM and Crosstabs (Somers' d) compute the difference in proportions...  what does the output mean in the two examples, and how is the difference computed?

I apologize in advance for my ignorance and thank you all for your input.

Sincerely,
Damir
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).