The distribution of the raw scores on INHALER2 is not very important.
Regression assumes that the residuals are not severely discrepant from
normal.
If you treat the IVs and DV as nominal you have a 7 way crosstab as the
basis of your analysis. Necessarily most cells are empty.
If you treat the IVs as nominal and the DV as continuous,
you have a 6 way crosstab as your IV side with 2400 cells
(5 x 3 x 4 x 5 x 4 x 2, a 6 way ANOVA). With only 750 cases, the ANOVA
approach necessarily has empty cells, so higher order interactions need
to be pooled.
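As a rough check on the size of that crossing, the sketch below multiplies the numbers of categories per variable as listed in the Case Processing Summary quoted further down in this thread. The dictionary and variable names are illustrative only; nothing here is SPSS output.

```python
from math import prod

# Number of categories per independent variable, read off the
# Case Processing Summary quoted later in this thread.
iv_levels = {
    "BREATH2": 5,
    "COUGH2": 3,
    "ACTIV2": 4,
    "SPUCOL2": 5,
    "SPUVOL2": 4,
    "WHEZ2": 2,
}
dv_levels = 4    # INHALER2: 0, 1-3, 4-6, 7-9
n_cases = 750

iv_cells = prod(iv_levels.values())   # cells on the IV side (6 way crossing)
all_cells = iv_cells * dv_levels      # cells in the full 7 way crosstab

# Each case occupies exactly one cell, so at most n_cases cells can be
# non-empty; everything beyond that is guaranteed to be empty.
min_empty_iv_cells = max(iv_cells - n_cases, 0)
min_empty_all_cells = max(all_cells - n_cases, 0)

print(iv_cells, all_cells)
print(min_empty_iv_cells, min_empty_all_cells)
```

By this count the IV side has 2400 cells, so with 750 cases at least 1650 of them must be empty however the cases are distributed.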
If the values of all of the IVs are at least ordinal, I would suggest
starting with the most straightforward analysis, i.e., treat them as not
very discrepant from interval.
Do ordinary regression first. Examine the residuals to see if they are
very far from normal looking.
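One way to make "far from normal looking" a little more concrete is to look at the skewness and excess kurtosis of the saved residuals alongside a histogram or Q-Q plot. The sketch below is a generic standard-library illustration, not SPSS output; the function name `residual_shape` and the example residuals are hypothetical.

```python
from statistics import fmean

def residual_shape(residuals):
    """Return (skewness, excess kurtosis) of a sequence of residuals.

    Uses the simple moment-based (population) definitions; both values
    are close to 0 for a normal distribution.
    """
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    mean = fmean(residuals)
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    if m2 == 0:
        raise ValueError("residuals have zero variance")
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical residuals, made up purely to exercise the function.
example = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = residual_shape(example)
print(skew, kurt)
```

Statistical packages often report small-sample-adjusted versions of these statistics, so their values may differ slightly from the ones computed here.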
Then use CATREG and see how much difference it makes in the substantive
conclusions if you relax the level of measurement.
If level of measurement assumptions seem to be important, try an ANOVA
via GLM. There should be info in the archives on ANOVA with empty
design cells.
Art Kendall
Social Research Consultants
SoS Statistical Services wrote:
Thanks, Art.
Below is some additional info about my variables - the dependent
variable is INHALER2, with the 6 independent (ordinal) variables shown
below. The original INHALER variable is coded 0 to 9 with the intention
of using multiple regression, but I decided to categorise it into 4
groups because it was heavily skewed and hence use ordinal regression
instead.
Case Processing Summary

Variable   Value     N   Marginal Percentage
INHALER2   0       328   43.7%
           1-3      62    8.3%
           4-6     141   18.8%
           7-9     219   29.2%
BREATH2    1.00    220   29.3%
           2.00    124   16.5%
           3.00    140   18.7%
           4.00    109   14.5%
           5.00    157   20.9%
COUGH2     1.00    286   38.1%
           2.00    229   30.5%
           3.00    235   31.3%
ACTIV2     1.00    233   31.1%
           2.00    168   22.4%
           3.00    165   22.0%
           4.00    184   24.5%
SPUCOL2    1.00     30    4.0%
           2.00    218   29.1%
           3.00    156   20.8%
           4.00    171   22.8%
           5.00    175   23.3%
SPUVOL2    1.00    227   30.3%
           2.00    207   27.6%
           3.00    136   18.1%
           4.00    180   24.0%
WHEZ2      1.00    382   50.9%
           2.00    368   49.1%
Valid              750  100.0%
Missing              0
Total              750

Evie Gardner
--- On Tue, 3/3/09, Art Kendall [hidden email] wrote:
From: Art Kendall [hidden email]
Subject: Re: ordinal regression - no. cases/independent variables
To: [hidden email]
Date: Tuesday, 3 March, 2009, 1:52 PM
If you had 6 categorical variables each
with exactly 3 values, that would mean that you have a minimum of 3^6 =
729 cells for each value of the DV. Even if you treated the DV as not
very discrepant from interval you still have at least 729 cells on the
independent side.
Are you sure that none of your variables are somewhere near interval
level?
Please describe your study in more detail. Perhaps if you posted the
data definition (variable names, variable labels, values and value
labels) and what level of measurement assumptions you are willing to
make, members of the list could offer suggestions.
Think of the situation as a 7 way crosstab.
Art Kendall
Social Research Consultants
SoS Statistical Services wrote:
I am doing ordinal regression using SPSS and
get the following warning:

There are 1467
(71.5%) cells (i.e., dependent variable levels by combinations of
predictor variable values) with zero frequencies.

I have 750 cases, 6 categorical independent variables (from 3-5 groups
with 100+ per group, often 200+) and my dependent variable has 4
categories with frequencies of 328, 62, 141, 219 respectively.
Does anyone know of a rule-of-thumb or
recommendation regarding no. cases, no. independent variables etc. for
ordinal regression?
Thank you for your help.
Evie
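
The figures in the warning can be unpacked with a little arithmetic. The sketch below assumes, as the wording of the warning suggests, that the cells are the 4 levels of the dependent variable crossed with the predictor combinations that actually occur in the data. That reading is an assumption and is not stated anywhere in this thread; the variable names are illustrative.

```python
empty_cells = 1467     # from the warning
empty_share = 0.715    # 71.5%, rounded to one decimal in the warning
dv_levels = 4          # categories of the dependent variable
n_cases = 750

# Total number of cells implied by the rounded percentage.
approx_total = empty_cells / empty_share

# Nearest total that is a whole multiple of the number of DV levels.
patterns = round(approx_total / dv_levels)   # implied distinct predictor combinations
total_cells = patterns * dv_levels

# Recompute the percentage from the reconstructed total as a consistency check.
check_pct = round(100 * empty_cells / total_cells, 1)
nonempty_cells = total_cells - empty_cells

print(patterns, total_cells, check_pct, nonempty_cells)
```

Under that reading the 750 cases fall into roughly 513 distinct predictor combinations, which is fewer than 1.5 cases per combination on average.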