Hello all,

I was wondering whether someone could enlighten me regarding the following problem. I have data on counts of deaths by drug use and gender. There are four drugs, say H, M, C, B. Deaths is the count variable; all the other variables are binary, i.e. use/no use for the drug variables and male/female for gender. I know that I can use either GENLOG, for which I first weight the cases by the "deaths" variable (the count, i.e. the number of deaths, for each drug-gender combination), or GENLIN, which covers loglinear models for count data. I don't have an offset variable, e.g. population at risk.

I ran both GENLIN and GENLOG. I fit all the main effects, all the 2-way interactions between drugs, all the 2-way interactions between each drug and gender, all the 3-way interactions between drugs, all the 3-way interactions between two drugs and gender, the 4-way interaction between all drugs, all the 4-way interactions between three drugs and gender, and the 5-way interaction between all drugs and gender. I get different results for the interactions and I wonder why. For example, if I use the 0 group in each variable as the reference, GENLIN gives me a parameter estimate for the interaction H (group 1)*gender (group 1) and treats all the other combinations as redundant (01, 10, 00), while GENLOG treats all four combinations in this interaction as redundant (00, 01, 10, 11), i.e. the parameter estimates for this interaction are all zero.

Below is the syntax.

GENLIN deaths BY H M C B sex (ORDER=ASCENDING)
  /MODEL H M C B sex H*sex M*sex C*sex B*sex H*M H*C H*B M*C M*B C*B
    H*M*C H*M*B M*C*B H*C*B H*M*sex H*C*sex H*B*sex M*C*sex M*B*sex C*B*sex
    H*M*C*sex H*M*B*sex H*C*B*sex M*C*B*sex H*M*C*B H*M*C*B*sex
    INTERCEPT=YES DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA METHOD=NEWTON SCALE=1 COVB=MODEL MAXITERATIONS=100 MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD)
    CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL.

and

WEIGHT BY deaths.
GENLOG H M C B sex
  /MODEL=POISSON
  /PRINT=FREQ RESID ADJRESID ZRESID DEV ESTIM CORR COV
  /PLOT=RESID(ADJRESID) NORMPROB(ADJRESID)
  /CRITERIA=CIN(95) ITERATE(20) CONVERGE(0.001) DELTA(.5)
  /DESIGN H M C B sex B*C*H*M*sex B*C B*H B*M B*sex C*H C*M C*sex H*M H*sex M*sex
    B*C*H B*C*M B*C*sex B*H*M B*H*sex B*M*sex C*H*M C*H*sex C*M*sex H*M*sex
    B*C*H*M B*C*H*sex B*C*M*sex B*H*M*sex C*H*M*sex.

(Despite the fact that I entered the drugs in the same order through the interactive mode, the program eventually ordered them its own way.) Is there something in the algorithms used by each procedure that causes the differences in results?

Thank you all
I would guess it's because in the GENLIN
syntax, H*sex
appears before B*C*H*M*sex.
What happens when you run GENLOG with the same design as GENLIN?
That is,
WEIGHT BY deaths.
GENLOG H M C B sex
  /MODEL=POISSON
  /PRINT=FREQ RESID ADJRESID ZRESID DEV ESTIM CORR COV
  /PLOT=RESID(ADJRESID) NORMPROB(ADJRESID)
  /CRITERIA=CIN(95) ITERATE(20) CONVERGE(0.001) DELTA(.5)
  /DESIGN H M C B sex H*sex M*sex C*sex B*sex H*M H*C H*B M*C M*B C*B
    H*M*C H*M*B M*C*B H*C*B H*M*sex H*C*sex H*B*sex M*C*sex M*B*sex C*B*sex
    H*M*C*sex H*M*B*sex H*C*B*sex M*C*B*sex H*M*C*B H*M*C*B*sex.

Alex
In reply to this post by xenia
Why are you treating Deaths as a count? A count regression model, as I understand it, would be used when each individual has a count of events. But death is a binary variable (0 or 1) for each individual -- and when you WEIGHT by DEATHS, you are in essence getting a row for each individual where the variable is either a 0 or a 1, not a count. I should think you want a logistic regression model, or some other model for a binary outcome (e.g., a model yielding the relative risk or risk difference).
HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
DEATHS could be a count of the number of
deaths observed for that covariate pattern.
Alex
It looks to me, too, like DEATHS could be the counts for each pattern. In addition to the counts of DEATHS, the usual epidemiological study would also want the counts, for each pattern, of NOT dying. You have great difficulty drawing many inferences if you don't have the "denominators" for risk. Or, to put it another way, a whole lot of those factors and interactions should be treated as "given" proportions and are not interesting for testing. Maybe that makes it feasible to care about some specific 4-way and 5-way interactions, which are ordinarily too complicated and prone to artifact to be at all interesting.

-- Rich Ulrich
In reply to this post by Alex Reutter
I understand that. But note what the OP said (emphasis added): "I know that I can use either GENLOG, *which I first weight by the "deaths" variable* which is the counts, i.e. the number of deaths, for each drug-gender combination..." When one "WEIGHTS by deaths", a dataset that has one row per covariate pattern in essence becomes a dataset with one case per person, with the total number of rows equal to the sum of the counts. Here's an example of what I think is going on.

NEW FILE.
DATASET CLOSE all.

* Generate a summary data set with counts.
DATA LIST LIST / Male Exposed Disease kount (4f5.0) .
BEGIN DATA.
1 1 1 160
1 1 0 80
1 0 1 440
1 0 0 320
0 1 1 240
0 1 0 330
0 0 1 160
0 0 0 270
END DATA.
DATASET NAME Summary.
VALUE LABELS
 Male 1 'Male' 0 'Female' /
 Exposed 1 'Yes' 0 'No' /
 Disease 1 'Yes (case)' 0 'No (control)' .

WEIGHT by kount.
LOGISTIC REGRESSION VAR=exposed
 /METHOD=ENTER disease male
 /PRINT=CI(95)
 /CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
WEIGHT off.

* Now write a file with one row per person
* and run the same model with no WEIGHTING.
LOOP id = 1 to kount.
- XSAVE OUTFILE = "C:\Temp\Junk.sav" / Keep = id Male to Disease.
END LOOP.
EXECUTE.

GET FILE = "C:\Temp\Junk.sav".
DATASET NAME raw.
DATASET ACTIVATE raw.
LOGISTIC REGRESSION VAR=exposed
 /METHOD=ENTER disease male
 /PRINT=CI(95)
 /CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .

* Clean up the junk.
DATASET ACTIVATE summary.
DATASET CLOSE raw.
ERASE FILE "C:\Temp\Junk.sav".
* End of example.

HTH.
--
Bruce Weaver
In reply to this post by Rich Ulrich
Right. If the patterns are roughly
equally distributed, I think this should be okay, in the sense that you'll
be able to discern the relative effect of each factor on DEATHS.
If they are unequally distributed, then you'd need an offset variable (like "Aggregate months service" in the ship damage example; p. 204 of McCullagh & Nelder's _Generalized Linear Models_; an example showing the use of GENLIN to fit these data is at http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/topic/com.ibm.spss.statistics.cs/genlin_ships_intro.htm).

Alex
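To spell out what an offset variable does in this kind of model (standard Poisson rate-model algebra, not anything specific to the data in this thread): with $n_i$ the exposure for covariate pattern $i$ (population at risk, months of service, and so on),

$$\log E[\mathrm{deaths}_i] = \log n_i + x_i^\top \beta, \qquad\text{equivalently}\qquad E[\mathrm{deaths}_i]/n_i = \exp(x_i^\top \beta),$$

so the coefficients describe log rates. Drop the $\log n_i$ term and the same coefficients describe raw counts instead, which mixes up risk with the size of each group.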
In reply to this post by Bruce Weaver
Based on the OP's statement "I don't have an offset variable, e.g. population at risk" and the subsequent analysis, and using the example in your earlier post, I don't think the OP has the equivalent of cases where Exposed = 0.
Alex
Ah, I see. Well that makes things more difficult, doesn't it! ;-)
--
Bruce Weaver
In reply to this post by Bruce Weaver
Hello, thank you for the reply.

I'm treating deaths as a count because I have aggregated data; maybe I should have said that explicitly. In all the material I've read about general loglinear regression, and in the SPSS tutorial "Using General Loglinear Analysis to Model Accident Rates", it is stated that "Since the accidents (deaths here) have been aggregated, you first need to weight the cases by Accidents. From the menus choose: Data > Weight Cases... Select Weight cases by. Select Accidents as the frequency variable. Click OK." The syntax ends up as:

WEIGHT BY accid .
GENLOG agecat gender
 /CSTRUCTURE = pop
 /MODEL = POISSON
etc.

So, since I have aggregated data, I thought I should weight cases by deaths and run GENLOG, as the example shows. In accidents.sav, the file used for that tutorial, the accidents variable shows how many accidents there were for individuals in each combination of age and gender categories. This is approximately what my file looks like, but with more rows (since I have more factors) and without the population-at-risk variable.
In reply to this post by Alex Reutter
Yes, as I said in my original post, it is the number of deaths for each drug-gender category combination or, as you say, for each specific covariate pattern: e.g. 120 deaths of males using heroin and no other drug, 15 deaths of females using heroin and cocaine and no other drugs, etc.
In reply to this post by Rich Ulrich
Thank you,
however I don't have the case-control design that would include the numbers of those not dying, in which case it would be a matter of making a dead/not-dead binary dependent variable and carrying out logistic regression. I don't have the denominators or population at risk in the data I've been given, and I don't know whether it would be possible to obtain them from when the data were collected, or by some other means.
In reply to this post by Bruce Weaver
CONTENTS DELETED
The author has deleted this message.
That link goes to the wrong tutorial, I think. To find the one you want (with accident rates for ships), I had to navigate to:

Loglinear Modeling > General Loglinear Analysis > Using General Loglinear Analysis to Model Accident Rates

The accidents.sav file has only one row per covariate pattern, which is why it uses "WEIGHT by accid". But if you write the data to a file with one row per accident, you can run the model without using WEIGHT and get exactly the same results. See the example below. Given that the accidents.sav data file is so small, I recreated it here with a DATA LIST command so that folks who don't have ready access to the sample files can play along if they wish.

HTH.

NEW FILE.
DATASET CLOSE all.

* The following DATA LIST command reproduces
* the data in sample file accidents.sav.
DATA LIST list / agecat gender (2f1) accid pop (2f8.0).
BEGIN DATA
1 1 57997 198522
2 1 57113 203200
3 1 54123 200744
1 0 63936 187791
2 0 64835 195714
3 0 66804 208239
END DATA.

* The tutorial suggests the following analysis.
WEIGHT BY accid .
GENLOG agecat gender
 /CSTRUCTURE = pop
 /MODEL = POISSON
 /PRINT = FREQ RESID DEV ADJRESID ESTIM CORR COV
 /CRITERIA = CIN(95) ITERATE(20) CONVERGE(.001) DELTA(.5)
 /DESIGN .
WEIGHT off.

* Use XSAVE to write the data to a file that has one row per accident,
* which is presumably one row per person.
LOOP id = 1 to accid.
- XSAVE OUTFILE = "C:\Temp\Junk.sav" / Keep = agecat gender pop.
END LOOP.
EXECUTE.

* Open the file with one row per accident.
GET FILE = "C:\Temp\Junk.sav".
CROSSTABS agecat by gender.
* Notice that the cell counts match the values
* of variable accid in the original file.

* No WEIGHT command this time.
GENLOG agecat gender
 /CSTRUCTURE = pop
 /MODEL = POISSON
 /PRINT = FREQ RESID DEV ADJRESID ESTIM CORR COV
 /CRITERIA = CIN(95) ITERATE(20) CONVERGE(.001) DELTA(.5)
 /DESIGN .
* Results match those from the tutorial using WEIGHTED data.

* Clean up the junk.
NEW FILE.
DATASET CLOSE all.
ERASE FILE "C:\Temp\Junk.sav".
--
Bruce Weaver
In reply to this post by Alex Reutter
Thank you, I have checked and this was the problem.
In reply to this post by Bruce Weaver
Thank you.
So, let's suppose that my file also has one row per covariate pattern. I should then be able to weight by deaths and run GENLOG. I do not wish to make a file with one row per person or per accident; I want to run the analysis on the aggregated data as it is. So, as long as my aggregated data file has one row per covariate pattern, I can weight by deaths and run GENLOG.
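A minimal sketch of that layout and of the weighting step, with invented counts and only a few of the 2^5 = 32 drug-by-gender covariate patterns listed (a real file would have one row for every pattern), just to make the workflow concrete:

DATA LIST LIST / H M C B sex deaths.
BEGIN DATA
1 0 0 0 1 120
1 1 0 0 0 15
0 0 1 0 1 42
0 1 0 1 0 8
END DATA.

* One row per covariate pattern; deaths holds the aggregated count for that pattern.
WEIGHT BY deaths.
GENLOG H M C B sex
 /MODEL=POISSON
 /DESIGN H M C B sex.
WEIGHT OFF.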
In reply to this post by xenia
Hello all and thank you for all the contributions,
after making sure that the file had one row per covariate pattern, doing WEIGHT by deaths and then GENLOG, and making sure that the model terms were ordered in GENLOG the same way as in GENLIN (as Alex suggested), I did get the same results for the interactions from the two procedures, so my initial query is answered. However, after reading all the posts and everyone's discussion, I have one more question: is it wrong to run GENLOG or GENLIN on count data if I don't have an offset variable? Many thanks to all
I don't claim great expertise in count regression models, and I can't give a nice concise answer to your question. But here's how I would approach trying to understand what the OFFSET value is doing. This is based on the tutorial at:
http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.cs%2Fgenlin_ships_intro.htm

NEW FILE.
DATASET CLOSE all.
GET FILE = "C:\SPSSdata\ships.sav". /* Change path if necessary.

* Run model with OFFSET = log_months_service, as in the tutorial.
GENLIN damage_incidents BY type construction operation (ORDER=DESCENDING)
 /MODEL type construction operation INTERCEPT=YES OFFSET=log_months_service
  DISTRIBUTION=POISSON LINK=LOG
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON COVB=MODEL MAXITERATIONS=100
  MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
  ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT MODELINFO FIT SUMMARY SOLUTION .

* Now see what happens when different OFFSET values are used.

* Run model with OFFSET option removed.
GENLIN damage_incidents BY type construction operation (ORDER=DESCENDING)
 /MODEL type construction operation INTERCEPT=YES
  DISTRIBUTION=POISSON LINK=LOG
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON COVB=MODEL MAXITERATIONS=100
  MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
  ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT MODELINFO FIT SUMMARY SOLUTION .

* Now try various fixed OFFSET values.
DESCRIPTIVES log_months_service .

* OFFSET = MIN(log_months_service), or about 4.
GENLIN damage_incidents BY type construction operation (ORDER=DESCENDING)
 /MODEL type construction operation INTERCEPT=YES OFFSET=4
  DISTRIBUTION=POISSON LINK=LOG
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON COVB=MODEL MAXITERATIONS=100
  MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
  ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT MODELINFO FIT SUMMARY SOLUTION .

* OFFSET = MEAN(log_months_service), or about 7.
GENLIN damage_incidents BY type construction operation (ORDER=DESCENDING)
 /MODEL type construction operation INTERCEPT=YES OFFSET=7
  DISTRIBUTION=POISSON LINK=LOG
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON COVB=MODEL MAXITERATIONS=100
  MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
  ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT MODELINFO FIT SUMMARY SOLUTION.

* OFFSET = MAX(log_months_service), or about 11.
GENLIN damage_incidents BY type construction operation (ORDER=DESCENDING)
 /MODEL type construction operation INTERCEPT=YES OFFSET=11
  DISTRIBUTION=POISSON LINK=LOG
 /CRITERIA METHOD=FISHER(1) SCALE=PEARSON COVB=MODEL MAXITERATIONS=100
  MAXSTEPHALVING=5 PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012
  ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
 /MISSING CLASSMISSING=EXCLUDE
 /PRINT MODELINFO FIT SUMMARY SOLUTION.

HTH.
--
Bruce Weaver
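Some background for reading those comparison runs (a general property of the log link, not a claim about what the syntax above will print): with a constant offset $c$, the model is $\log \mu_i = c + x_i^\top \beta$, which is just a reparameterization of the no-offset model, so the slope estimates stay the same and only the fitted intercept shifts, by exactly $-c$. A case-varying offset such as log_months_service is what actually changes the substantive estimates, because it differs across covariate patterns.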
In reply to this post by xenia
[one more try on posting - through Nabble this time.]
[My four attempts directly to the List on July 5 failed.]

You can run the analysis without an "offset variable" of exposure or counts, but you can't draw many conclusions from "significant" tests. Does a high (or low) count reflect the population size, or does it reflect risk? If most of the population is White and Female, you could see main effects for high Deaths for W and F without any valid implication of higher risk. Or, the excess in those population totals could weaken or mask the evidence that a group has low risk. I presume that risk is the more interesting quantity, but you only have data on frequency.

I think that the most useful report from these data might be the univariate counts: what characterizes deaths? Beyond that, if no interactions show up as significant, you might take that as evidence that the main-effect, univariate results are sufficient; that is the easiest conclusion to defend. Any narrative you want to build around some interaction has to allow for the chance that the at-risk population shows the same disproportion in exposure, in which case there is nothing special about that interaction.

-- Rich Ulrich
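A toy numerical illustration of that point (numbers invented, not from the OP's data): suppose the at-risk population is 80,000 women and 20,000 men, and the death rate is 1 per 1,000 in both groups. Then

$$E[\mathrm{deaths}_F] = 80{,}000 \times 0.001 = 80, \qquad E[\mathrm{deaths}_M] = 20{,}000 \times 0.001 = 20,$$

so a deaths-only loglinear model shows a gender "effect" of $\log(80/20) = \log 4 \approx 1.39$, driven entirely by the population split; with $\log(\text{population at risk})$ as an offset, the same data give a gender coefficient of zero.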