What am I losing using a logistic regression


What am I losing using a logistic regression

Moshe Marko
Hi,
 
I have a dependent variable with a scoring range of 0-40. 50% of the subjects scored 0-5, with most of them actually scoring 0. I decided to dichotomize the outcome, with the cut-off at a score of 5 or above. Because the distribution appears to be non-normal, I thought I should use logistic regression rather than the least squares method. What am I losing by using logistic regression?
 
Thanks
 
Moshe 
 
 
Moshe Marko, PT, DPT, MHS, OCS, CSCS
Assistant Professor
Department of Physical Therapy Education
College of Health Professions
SUNY Upstate Medical University
Room 2232  Silverman Hall
750 Adams Street
Syracuse, NY 13210-1834
315 464 6577
FAX 315 464 6887
[hidden email]

Re: What am I losing using a logistic regression

Hector Maletta

Dichotomizing necessarily involves losing information. In your case, what you appear to have is a sort of Poisson distribution, where the most frequent value is zero, followed by rapidly decreasing counts in the range 1-5 and even fewer at higher values. Thus you may want to use Poisson regression.

On the other hand, if you must dichotomize, why not dichotomize at “zero” versus “1 or more”? That seems more reasonable to me, without knowing the actual content of your research. The value 5 does not seem to have any intrinsic characteristic that makes it the critical value, especially because most of those below 5 are actually zero.
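
To make the information loss concrete, here is a minimal Python sketch. The scores are invented for illustration (they are not the data from the original post), and `dichotomize` is a hypothetical helper. It compares how many distinct outcome values survive each of the two cut-offs discussed above.

```python
# Hypothetical scores on the 0-40 scale, invented for illustration only.
scores = [0, 0, 0, 0, 1, 3, 5, 8, 12, 19, 27, 40]


def dichotomize(values, cutoff):
    """Return 1 where the value is at or above the cutoff, else 0."""
    return [1 if v >= cutoff else 0 for v in values]


at_five = dichotomize(scores, 5)  # cut-off of "5 or above" from the original post
at_one = dichotomize(scores, 1)   # "zero" versus "1 or more"

# Number of distinct outcome values before and after dichotomizing.
print(len(set(scores)), len(set(at_five)), len(set(at_one)))

# After the cut at 5, a score of 5 and a score of 40 are indistinguishable,
# and so are a score of 0 and a score of 3.
print(at_five[scores.index(5)] == at_five[scores.index(40)])
print(at_five[scores.index(0)] == at_five[scores.index(3)])
```

Either cut-off reduces the outcome to two values; the choice only changes which differences are thrown away.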

 

Hector

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of Moshe Marko
Sent: Friday, September 23, 2011 12:56
To: [hidden email]
Subject: What am I losing using a logistic regression

 

[snip, previous]


Re: What am I losing using a logistic regression

Rich Ulrich
I agree with Hector, that zero is very often important, and it makes
sense to at least consider taking it alone, "none" versus "some".
And also, that it is wasteful to dichotomize.  However, "mostly-zero"
with scores running to 40 is not a very likely Poisson.
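
A rough arithmetic check of this point, as a sketch only. It assumes, hypothetically, that 30% of subjects scored exactly 0 (the thread says only that "most" of the lower half did), fits the Poisson rate to that share of zeros, and asks how much probability such a Poisson would leave for the higher scores.

```python
import math

# ASSUMPTION: 30% of subjects score exactly 0. The thread gives no exact
# figure; this value is chosen only to illustrate the argument.
p_zero = 0.30

# For a Poisson variable P(X = 0) = exp(-lam), so lam = -ln(P(X = 0)).
lam = -math.log(p_zero)


def poisson_pmf(k, lam):
    """Poisson probability of exactly k, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


# Probability of scoring above 5 under that Poisson model.
p_above_5 = 1.0 - sum(poisson_pmf(k, lam) for k in range(6))

# Probability of scoring 40 or more. The tail is summed directly to avoid
# cancellation; terms beyond k = 200 are negligible.
p_40_plus = sum(poisson_pmf(k, lam) for k in range(40, 201))

print(f"lambda     = {lam:.3f}")
print(f"P(X > 5)   = {p_above_5:.4f}")
print(f"P(X >= 40) = {p_40_plus:.3e}")
```

With roughly half the sample reported above 5, a Poisson matched to the zeros leaves well under 1% of its mass there, which is the mismatch described above.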

And that reminds me that sometimes there is a reasonable distribution
for the rest, once zero is excluded.  Does the density decrease as
scores increase, or is there some other shape to what is left?

Given a variable that is merely highly skewed, it is my tendency to
look for a reasonable transformation that yields something close to
equal-intervals in the latent quality being assessed.  Is zero
reasonable as a step below 1, or is there something special about
zero?

It could be better to use a second variable to describe non-linearity.
In this case, the simple procedure might be this -- to do one analysis
for none/some, and a second analysis that *excludes* the data with
zero, and uses either the 1-40 score, or a transformation of it.
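
The two-part procedure can be sketched in Python as a data-preparation step. The scores are invented for illustration, and the natural log is only one candidate transformation assumed here; the thread does not prescribe one. Fitting the two models themselves is left out.

```python
import math

# Hypothetical scores on the 0-40 scale, invented for illustration only.
scores = [0, 0, 0, 0, 1, 3, 5, 8, 12, 19, 27, 40]

# Part 1: none/some indicator for every subject (outcome for a logistic model).
any_score = [1 if s > 0 else 0 for s in scores]

# Part 2: keep only the non-zero scores (outcome for a second analysis).
positive_scores = [s for s in scores if s > 0]

# ASSUMPTION: natural log as an example transformation for a right-skewed
# positive score.
log_positive = [math.log(s) for s in positive_scores]

print(sum(any_score), "of", len(scores), "subjects scored above zero")
print([round(x, 2) for x in log_positive])
```

The first outcome uses every subject; the second uses only the subjects with a non-zero score, so the two analyses answer different questions ("any score at all?" and "how high, given some?").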

--
Rich Ulrich




Date: Fri, 23 Sep 2011 13:46:50 -0300
From: [hidden email]
Subject: Re: What am I losing using a logistic regression
To: [hidden email]


[snip, previous]

Re: What am I losing using a logistic regression

Swank, Paul R

How about a negative binomial distribution, or perhaps a zero-inflated negative binomial if the number of zero responses is too large?
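
As a sketch of what "zero-inflated" means here, the Python below writes out the two probability functions by hand. The function names and the parameter values are illustrative assumptions, not estimates from any data, and the sketch ignores the fact that the score is capped at 40.

```python
import math


def nb_pmf(k, mu, size):
    """Negative binomial probability of k, with mean mu and dispersion size."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, size, pi_zero):
    """Zero-inflated NB: extra point mass pi_zero at 0, scaled NB elsewhere."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base


# ASSUMPTION: illustrative parameter values only.
mu, size, pi_zero = 10.0, 1.5, 0.25

print(f"P(0) under NB:   {nb_pmf(0, mu, size):.3f}")
print(f"P(0) under ZINB: {zinb_pmf(0, mu, size, pi_zero):.3f}")
```

The zero-inflated version mixes a point mass at zero with an ordinary negative binomial, so it can place more probability on zero than the negative binomial alone while keeping a long right tail.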

 

Dr. Paul R. Swank,

Children's Learning Institute

Professor, Department of Pediatrics, Medical School

Adjunct Professor, School of Public Health

University of Texas Health Science Center-Houston

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Rich Ulrich
Sent: Friday, September 23, 2011 1:02 PM
To: [hidden email]
Subject: Re: What am I losing using a logistic regression

 

[snip, previous]


Re: What am I losing using a logistic regression

Ryan
In reply to this post by Moshe Marko
Moshe,

Please describe the distribution of your data more precisely--perhaps
you should just provide a frequency distribution table. Also, please
tell us what this variable represents.
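
For anyone unsure what such a table would contain, here is a minimal Python sketch that builds one. The scores are invented for illustration; the real data would replace them.

```python
from collections import Counter

# Hypothetical scores on the 0-40 scale, invented for illustration only.
scores = [0, 0, 0, 0, 1, 3, 5, 5, 8, 12, 19, 27, 40]

counts = Counter(scores)
n = len(scores)

# One row per observed score: count, percent, cumulative percent.
rows = []
cumulative = 0
for value in sorted(counts):
    cumulative += counts[value]
    rows.append((value, counts[value], 100 * counts[value] / n, 100 * cumulative / n))

print(f"{'score':>5} {'n':>3} {'pct':>6} {'cum pct':>8}")
for value, count, pct, cum_pct in rows:
    print(f"{value:>5} {count:>3} {pct:>6.1f} {cum_pct:>8.1f}")
```

A table like this shows the share of zeros and the shape of the non-zero scores at a glance.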

Ryan

On Fri, Sep 23, 2011 at 11:56 AM, Moshe Marko <[hidden email]> wrote:

> [snip, previous]

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: What am I losing using a logistic regression

Ryan
In reply to this post by Moshe Marko
Do not simply dichotomize your data. You have several options which
are partly dependent on the distribution (some of which have already
been mentioned). What is the range? What is the shape across the
entire range? Are those minimum and maximum values absolute limits, in
that no matter what (even with a new sample), those limits could never
be crossed? In the same vein, please provide a detailed explanation as
to what these scores actually represent.

Bottom line--more information would be helpful.

Ryan
