empty cells warning

empty cells warning

txmama
Hello -

I'm running an ordinal logistic regression model and get the following warning

"Warnings
There are 9046 (86.1%) cells (i.e., dependent variable levels by combinations of predictor variable values) with zero frequencies."

I understand this leads to questions of how reliable the model fit may be. What solutions have others used? I've tried modifying the model but it doesn't get me very far without losing substantive value.

Thanks so much!



Re: empty cells warning

Bruce Weaver
Administrator
Let me guess:  One or more of the predictor variables are continuous, right?  

I used to get the same kind of message when running NOMREG with continuous predictors, and I thought it was rather silly to be including continuous variables in crosstabs when defining "cells".  What I typically did in those cases was re-run the model without any continuous predictors, to get a message telling me what percentage of cells were empty when cells are defined by the crosstabulation of all categorical explanatory variables and the outcome variable.  
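
To illustrate why a continuous predictor inflates the empty-cell percentage, here is a minimal Python sketch (not SPSS syntax). It assumes that a "cell" is one outcome level crossed with one combination of predictor values that actually occurs in the data, which is my reading of the warning text rather than documented behaviour. The toy data and the names `y`, `sex` and `hours` are hypothetical.

```python
from collections import Counter

def empty_cell_summary(rows, outcome, predictors):
    """Return (empty, total, share) for cells defined as
    outcome level x observed combination of predictor values."""
    # Frequency of each (predictor combination, outcome level) pair in the data.
    counts = Counter(
        (tuple(row[p] for p in predictors), row[outcome]) for row in rows
    )
    patterns = {pattern for pattern, _ in counts}
    levels = {level for _, level in counts}
    total = len(patterns) * len(levels)
    empty = total - len(counts)  # pairs that never occur together
    share = empty / total if total else 0.0
    return empty, total, share

# Hypothetical toy data: y is ordinal, sex is categorical, hours is continuous.
rows = [
    {"y": 1, "sex": 0, "hours": 2.5},
    {"y": 2, "sex": 0, "hours": 7.0},
    {"y": 3, "sex": 1, "hours": 11.5},
    {"y": 1, "sex": 1, "hours": 4.0},
    {"y": 2, "sex": 1, "hours": 9.0},
    {"y": 3, "sex": 0, "hours": 1.0},
]

with_continuous = empty_cell_summary(rows, "y", ["sex", "hours"])
categorical_only = empty_cell_summary(rows, "y", ["sex"])
print(with_continuous, categorical_only)
```

Because nearly every case has its own value of `hours`, each case forms its own predictor combination and fills only one of the outcome levels for it. Dropping the continuous predictor leaves only the categorical crosstab, which is the count that is actually informative.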

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: empty cells warning

txmama
Thanks, Bruce -

Actually, none of the predictors are continuous. A couple are larger categorical variables, but without a huge number of categories.

Other ideas?

Re: empty cells warning

Bruce Weaver
Administrator
In that case, you do have a problem.  One option would be to reduce the number of categories for variables that have a lot of categories (assuming this can be done in some reasonable and sensible way).  On the other hand, if you do have any variables that are actually continuous, but have been converted into categories, you could revert to the original continuous variables (e.g., if you are using age categories, but have actual age, use actual age instead).  
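
As a rough illustration of the first option, the following Python sketch (not SPSS syntax) collapses a many-category variable into broader groups. The six-bracket variable, the `mapping` dictionary and the helper `collapse` are all hypothetical; which categories can sensibly be merged is a substantive decision that code cannot make.

```python
def collapse(values, mapping):
    """Recode each value through `mapping`.
    An unmapped value raises KeyError instead of passing through silently."""
    return [mapping[v] for v in values]

# Hypothetical six-bracket variable merged into three broader groups.
mapping = {1: "low", 2: "low", 3: "middle", 4: "middle", 5: "high", 6: "high"}
original = [1, 2, 3, 4, 5, 6, 2, 5]

collapsed = collapse(original, mapping)
levels_before = len(set(original))
levels_after = len(set(collapsed))
print(levels_before, levels_after)
```

Since the number of possible cells is the product of the numbers of levels, going from six levels to three on one variable halves the number of possible cells, all else being equal.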

What are the categorical variables, and how many categories do they have?

HTH.


Re: empty cells warning

txmama
the predictors are organized in sections
demographics:
Generation (GENX,Y = 1; Baby Boomers = 0)
Income *was categorical* recoded to (low = 0; middle = 1; high = 2)
Rural (0,1)
Survey administration (0 = Phone; 1 = online)
Education (1 = high; 0 = low)
Male (1; 0)
Technology:
High Speed Internet at home (1= yes; 0 = no)
Hours use internet/wk (my mistake - this is continuous, but the bivariate results and Wald tests don't indicate it's the offender).
Security concerns:
Security concerns (additive index 4 to 20) - no indication this is the offender

Even with a simplified model (just the demographic variables), I'm still getting about 50% empty cells.
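
For scale, the number of possible predictor combinations can be worked out from the level counts listed above. This is a minimal Python sketch of that arithmetic. The number of outcome levels is not stated anywhere in this thread, so `outcome_levels=5` is a placeholder assumption, and the dictionary keys are illustrative names. The results are upper bounds, since the software may only count combinations that occur in the data.

```python
from math import prod

# Levels per demographic predictor, taken from the list above.
demographic_levels = {
    "generation": 2,
    "income": 3,
    "rural": 2,
    "survey_mode": 2,
    "education": 2,
    "male": 2,
}

def possible_cells(levels, outcome_levels):
    """Upper bound on cells: every predictor combination x every outcome level."""
    return prod(levels.values()) * outcome_levels

combos = prod(demographic_levels.values())  # 2 * 3 * 2 * 2 * 2 * 2

# ASSUMPTION: the thread does not give the number of DV levels; 5 is a placeholder.
cells = possible_cells(demographic_levels, outcome_levels=5)

# Adding the security index (integer scores 4 to 20, i.e. 17 values) as a factor.
combos_with_index = combos * len(range(4, 21))

print(combos, cells, combos_with_index)
```

With 96 demographic combinations alone, a modest sample spread unevenly across them can easily leave many outcome-by-combination cells empty, and treating the 17-point index or weekly hours as a factor multiplies that further.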
Re: empty cells warning

Bruce Weaver
Administrator
What's the sample size?  Sounds like it might just be too low to support a model of that complexity.


Re: empty cells warning

txmama
:)

Thanks Bruce!
Re: empty cells warning

Art Kendall
In reply to this post by txmama
Perhaps your model is unduly complex.

What is your DV?  What is its construct and what are the values it can take?
Is a less coarse operationalization of the construct underlying the DV available?
Did you use CATREG to decide that you cannot use it as interval?

Are less coarse measures of the IVs available? E.g., how did you originally measure generation, income, education, etc.?

Across all your variables, are there some where the underlying construct is continuous but only a few values were allowed in the raw data?
Art Kendall
Social Research Consultants