Multinomial regression with missing values

Luca Meyer-3
Hi,
 
I am running NOMREG with some missing values and of course I get the following message:
 
Warnings
Unexpected singularities in the Hessian matrix are encountered. This indicates that either some predictor variables should be excluded or some categories should be merged.
The NOMREG procedure continues despite the above warning(s). Subsequent results shown are based on the last iteration. Validity of the model fit is uncertain.
 
Now, what can I do to fix it? Please note that I can neither aggregate categories further nor eliminate predictors.
 
Should I substitute a very low value (something like 0.01) for the missing observations using weights? Or should I insert one observation in place of the missing values? Or should I use a different procedure altogether?
 
What would you advise doing?
 
Thanks,
Luca
 
Re: Multinomial regression with missing values

Bruce Weaver
Administrator
Why do you have missing data?  Is it a case where you can justify using multiple imputation?

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

Re: Multinomial regression with missing values

Steve Simon, P.Mean Consulting
In reply to this post by Luca Meyer-3
Luca Meyer wrote:

> Now, what could I do to fix it? Please notice that I neither can further
> aggregate nor eliminate predictors.

You can indeed further aggregate and you can indeed eliminate
predictors. What you mean to say is that you can't do this without
losing something that is essential to your data analysis.

Here's some practical advice. Don't invent data with very low weights or
insert a single observation in place of missing. You won't ever trust
such a model.

Singularities mean that you don't have enough data spread throughout key
regions of your data set. You can't investigate the interaction of
race and gender, for example, if all of your males are black and all of
your females are white.
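
To make the race-by-gender example concrete, here is a minimal Python
sketch (not SPSS syntax, and not what NOMREG does internally). It builds
a design matrix with an intercept, two dummy variables, and their
interaction, then compares its rank with the number of columns. The
helper names matrix_rank and design_row are made up for illustration,
and a rank-deficient design matrix is only one of the ways a singular
Hessian can arise.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix, computed by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_row(male, black):
    """Hypothetical coding: intercept, male, black, male-by-black."""
    return [1, male, black, male * black]


# All males are black and all females are white: two cells are empty.
confounded = [design_row(1, 1)] * 3 + [design_row(0, 0)] * 3
# Every combination of race and gender appears at least once.
balanced = [design_row(m, b) for m in (0, 1) for b in (0, 1)]

confounded_rank = matrix_rank(confounded)
balanced_rank = matrix_rank(balanced)
print(confounded_rank, "of 4 columns identifiable (confounded data)")
print(balanced_rank, "of 4 columns identifiable (balanced data)")
```

With the confounded data the four columns carry only two independent
pieces of information, so neither the interaction nor the separate main
effects can be estimated.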

You need to discover which portions of the data set have adequate data
coverage and which portions have inadequate data coverage.
That means doing exactly what you don't want to do. Fit simpler models
with fewer independent variables and/or combine some of the levels of
your dependent variable.
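
A quick way to start looking for thin coverage, before refitting any
models, is to cross-tabulate the outcome against each predictor and
list the cells that are empty or nearly empty. The Python sketch below
is only an illustration of that idea under my own assumptions: the
function name sparse_cells, the variable names, and the toy records are
all hypothetical, and cases with a missing value on either variable are
dropped first, on the assumption that the model would exclude them
anyway.

```python
from collections import Counter
from itertools import product


def sparse_cells(records, var_a, var_b, min_count=1):
    """List (level_a, level_b, count) for cells with fewer than min_count cases."""
    # Drop cases with a missing value (None) on either variable.
    complete = [r for r in records
                if r[var_a] is not None and r[var_b] is not None]
    counts = Counter((r[var_a], r[var_b]) for r in complete)
    levels_a = sorted({r[var_a] for r in complete})
    levels_b = sorted({r[var_b] for r in complete})
    return [(a, b, counts[(a, b)])
            for a, b in product(levels_a, levels_b)
            if counts[(a, b)] < min_count]


# Toy data: outcome level "C" never occurs together with region "south"
# once the case with a missing region is dropped.
records = [
    {"outcome": "A", "region": "north"},
    {"outcome": "A", "region": "south"},
    {"outcome": "B", "region": "north"},
    {"outcome": "B", "region": "south"},
    {"outcome": "C", "region": "north"},
    {"outcome": "C", "region": None},
]

empty = sparse_cells(records, "outcome", "region")
print(empty)
```

Cells that show up in this list are candidates for the levels you may
have to merge or the predictors you may have to drop.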

You won't report these simpler models (though maybe you should), but by
learning which models generate the singularity message and which models
do not, you'll learn where the data allows you to make substantive
predictions and where it does not.

Until you find the particular levels of your outcome variable or the
particular combinations of independent variables that are causing the
singularity, you will not be able to make any progress.

It will take a lot of trial and error (I don't know of any other way to
do this), but eventually, you will be able to find the cause of the
singularity. Then in your paper, state that the data set failed to allow
you to investigate (fill in something here).

You can't invent some white males and black females and then expect the
interaction of race and gender to be anything other than an artefact of
your own creation.

Someone else suggested using multiple imputation. That's a reasonable
suggestion if you have a lot of data outside of the data used in the
model that can help impute missing values. But imputation won't remove a
singularity from your data unless you are very very lucky. If there are
no data values in the white male and black female cells, how can you
impute those values? Imputation will work if you are trying to impute
something like socioeconomic status (SES) and for those observations
where SES is missing you have a lot of good proxy variables hanging
around. I suspect this is not the case.

Sometimes you can perform a sensitivity analysis by making two extreme
assumptions about missing values, one most favorable to the null
hypothesis and another most favorable to the alternative hypothesis and
then comparing the results. If you draw the same conclusion even with
the two extreme assumptions, then hopefully your conclusions are robust
to any assumption in between these two. Sensitivity analysis probably
will not work for a singularity, though, except to confirm that making
two extreme assumptions is likely to produce two very different conclusions.
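
For a simple case, the two extreme assumptions can be sketched in a few
lines of Python. This is a simplified best-case/worst-case variant for
a binary outcome compared across two groups, not a general recipe: the
function name extreme_differences and the counts are invented for
illustration. Each group is given as (successes, failures, missing),
and the missing cases are filled in first in the way that shrinks the
difference in proportions and then in the way that widens it.

```python
from fractions import Fraction


def extreme_differences(group1, group2):
    """Smallest and largest possible p1 - p2 given the missing outcomes.

    Each group is a tuple (successes, failures, missing).
    """
    s1, f1, m1 = group1
    s2, f2, m2 = group2
    n1 = s1 + f1 + m1
    n2 = s2 + f2 + m2
    # Smallest difference: group 1's missing cases all count as failures,
    # group 2's missing cases all count as successes.
    lowest = Fraction(s1, n1) - Fraction(s2 + m2, n2)
    # Largest difference: the reverse assignment.
    highest = Fraction(s1 + m1, n1) - Fraction(s2, n2)
    return lowest, highest


# Hypothetical counts: 5 outcomes are missing in each group.
lowest, highest = extreme_differences((30, 10, 5), (10, 30, 5))
print(lowest, highest)
same_direction = (lowest > 0 and highest > 0) or (lowest < 0 and highest < 0)
print("direction robust to missing data:", same_direction)
```

If both bounds have the same sign, the direction of the difference does
not depend on how the missing outcomes are filled in; if they straddle
zero, it does.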

Sorry that I could not be of more help. You're trying to squeeze blood from
a turnip and there is no statistical trick that can get you what you
want. Eventually you will have to simplify your data analysis, as much
as you hate to do so.
--
Steve Simon, Standard Disclaimer
Second free statistics webinar, Wed, Nov 4, 11am-noon CST.
"The first three steps in data entry, with examples in PASW/SPSS"
Details at www.pmean.com/webinars
