ordinal regression - no. cases/independent variables

SoS Statistical Services
I am doing ordinal regression using SPSS and get the following warning:

There are 1467 (71.5%) cells (i.e., dependent variable levels by combinations of predictor variable values) with zero frequencies.

I have 750 cases, 6 categorical independent variables (3-5 groups each, with 100+ cases per group, often 200+) and a dependent variable with 4 categories, with frequencies of 328, 62, 141 and 219 respectively.

Does anyone know of a rule of thumb or recommendation regarding the number of cases, number of independent variables, etc. for ordinal regression?

Thank you for your help.
 
Evie
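
A rough way to read the warning, offered only as a sketch: assuming SPSS counts cells as (observed combinations of predictor values) x (levels of the dependent variable), the total number of cells and the number of distinct predictor patterns can be backed out from the two figures in the message. The variable names are illustrative, and the cell-counting rule is an assumption rather than something stated in the output.

```python
# Figures quoted in the warning and in the post
zero_cells = 1467      # cells with zero frequency
zero_percent = 71.5    # reported share of empty cells, rounded to 1 decimal
dv_levels = 4          # categories of the dependent variable
n_cases = 750

# Totals that reproduce the rounded percentage ...
candidates = [t for t in range(zero_cells, 5000)
              if round(100 * zero_cells / t, 1) == zero_percent]
# ... and are a whole multiple of the number of DV levels (assumed cell rule)
totals = [t for t in candidates if t % dv_levels == 0]

total_cells = totals[0]
predictor_patterns = total_cells // dv_levels  # distinct predictor combinations observed
nonzero_cells = total_cells - zero_cells       # cells holding at least one case

print(total_cells, predictor_patterns, nonzero_cells)
```

Under that assumption the 750 cases are spread over several hundred distinct predictor combinations, so most combinations hold only one or two cases, and a large share of empty cells is what you would expect.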

Re: ordinal regression - no. cases/independent variables

SoS Statistical Services

Thanks, Art. Below is some additional info about my variables: the dependent variable is INHALER2, with the 6 independent (ordinal) variables shown below. The original INHALER variable is coded 0 to 9, with the intention of using multiple regression, but I decided to categorise it into 4 groups because it was heavily skewed, and hence to use ordinal regression instead.

 

Case Processing Summary

Variable   Level     N   Marginal Percentage
INHALER2   0       328    43.7%
           1-3      62     8.3%
           4-6     141    18.8%
           7-9     219    29.2%
BREATH2    1.00    220    29.3%
           2.00    124    16.5%
           3.00    140    18.7%
           4.00    109    14.5%
           5.00    157    20.9%
COUGH2     1.00    286    38.1%
           2.00    229    30.5%
           3.00    235    31.3%
ACTIV2     1.00    233    31.1%
           2.00    168    22.4%
           3.00    165    22.0%
           4.00    184    24.5%
SPUCOL2    1.00     30     4.0%
           2.00    218    29.1%
           3.00    156    20.8%
           4.00    171    22.8%
           5.00    175    23.3%
SPUVOL2    1.00    227    30.3%
           2.00    207    27.6%
           3.00    136    18.1%
           4.00    180    24.0%
WHEZ2      1.00    382    50.9%
           2.00    368    49.1%
Valid              750   100.0%
Missing              0
Total              750



Evie Gardner

--- On Tue, 3/3/09, Art Kendall <[hidden email]> wrote:
From: Art Kendall <[hidden email]>
Subject: Re: ordinal regression - no. cases/independent variables
To: [hidden email]
Date: Tuesday, 3 March, 2009, 1:52 PM

If you had 6 categorical variables each with exactly 3 values, that would mean that you have a minimum of 3^6 = 729 predictor cells for each value of the DV. Even if you treated the DV as not very discrepant from interval, you would still have at least 729 cells on the independent side.

Are you sure that none of your variables are somewhere near interval level?

Please describe your study in more detail. Perhaps if you posted the data definition (variable names, variable labels, values and value labels) and what level-of-measurement assumptions you are willing to make, members of the list could offer suggestions.

Think of the situation as a 7-way crosstab.
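
To make the crosstab arithmetic concrete, here is a minimal sketch that counts the cells of the full crossing. The number of levels per predictor is read off the Case Processing Summary quoted earlier in this message; the variable names in the code are illustrative only.

```python
import math

# Levels per predictor, taken from the Case Processing Summary
predictor_levels = {
    "BREATH2": 5, "COUGH2": 3, "ACTIV2": 4,
    "SPUCOL2": 5, "SPUVOL2": 4, "WHEZ2": 2,
}
dv_levels = 4    # INHALER2 categories
n_cases = 750

lower_bound_cells = 3 ** 6                               # six predictors with 3 values each
predictor_cells = math.prod(predictor_levels.values())   # 6-way crossing of the predictors
crosstab_cells = predictor_cells * dv_levels             # 7-way crosstab including the DV
cases_per_predictor_cell = n_cases / predictor_cells     # average, if cases were spread evenly

print(lower_bound_cells, predictor_cells, crosstab_cells, cases_per_predictor_cell)
```

With far fewer cases than cells, most cells of the full crossing cannot contain any case at all, whatever the data look like.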

Art Kendall
Social Research Consultants




Re: ordinal regression - no. cases/independent variables

Art Kendall
It is not very important what the distribution of the raw scores on INHALER2 is. Regression assumes that the residuals are not severely discrepant from normal.

If you treat the IVs and DV as nominal, you have a 7-way crosstab as the basis of your analysis. Necessarily, most cells are empty.
If you treat the IVs as nominal and the DV as continuous, you have a 6-way crosstab on the IV side with 5 x 3 x 4 x 5 x 4 x 2 = 2400 cells (a 6-way ANOVA). The ANOVA approach necessarily has empty cells, so higher-order interactions need to be pooled.
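
To put the pooling of higher-order interactions in numbers, the sketch below compares the number of cell means in the full 6-way layout with the number of coefficients needed when only main effects are kept. The level counts come from the Case Processing Summary; counting (levels - 1) dummy coefficients per factor is the usual convention and is assumed here, not taken from any SPSS output.

```python
import math

# Levels of BREATH2, COUGH2, ACTIV2, SPUCOL2, SPUVOL2, WHEZ2
levels = [5, 3, 4, 5, 4, 2]

# Full factorial: one cell mean per combination of predictor values
full_factorial_cells = math.prod(levels)

# Main effects only, predictors treated as nominal: (levels - 1) dummies per factor
main_effect_coefficients = sum(k - 1 for k in levels)

# Predictors treated as roughly interval: one slope per predictor
interval_coefficients = len(levels)

print(full_factorial_cells, main_effect_coefficients, interval_coefficients)
```

The gap between the first number and the other two is why a model with 750 cases can support main effects but not the full set of interactions.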

If the values of all of the IVs are at least ordinal, I would suggest starting with the most straightforward analysis, i.e., treat them as not very discrepant from interval.
Do ordinary regression first. Examine the residuals to see whether they are very far from normal looking.
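
As a minimal sketch of the residual check, assuming a single predictor and a plain least-squares line (the real analysis would use all six IVs in SPSS): fit the line, take the residuals, and look at a simple shape statistic such as skewness. The function names and the scores below are made up for illustration and are not the poster's data.

```python
from statistics import fmean, pstdev

def ols_residuals(x, y):
    """Residuals from the least-squares line y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric set of values."""
    m, s = fmean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical scores: x is an ordinal predictor coded 1-5, y an outcome coded 0-9
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [0, 1, 0, 3, 2, 5, 4, 7, 6, 9]

res = ols_residuals(x, y)
print(round(skewness(res), 3))
```

A skewness far from zero, or a histogram of the residuals with a long tail, would suggest the residuals are not close to normal. How far is too far is a judgment call, not a fixed cutoff.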

Then use CATREG and see how much difference it makes in the substantive conclusions if you relax the level of measurement.

If level of measurement assumptions seem to be important, try an ANOVA via GLM.  There should be info in the archives on ANOVA with empty design cells.


Art Kendall
Social Research Consultants


