Hi everyone,
I have two income variables: the first one excludes the zero values, and the other includes the zero values. When I run a logistic regression, all the dummies for the variable without the zeros are significant, whereas the dummies for the variable with the zeros included are all non-significant.

Why is this the case? Should I exclude the zero values then? I'd greatly appreciate any advice and/or suggestions.

Regards,
Grigoris Argeros
PhD Candidate in Sociology
Fordham University
|
Grigoris, your explanation is not completely clear. I presume you use income as a predictor for some dichotomous outcome, but your income variable is not an interval variable; you may have a variable defined as income brackets, converted into a series of dummies. This series of dummies may or may not include cases with zero income.

The decision about whether to use zero-income cases depends on what these cases mean. They may represent authentic cases of zero income, or simply cases where the information about income is missing.

Similar to the case of missing information is the case in which the definition of income is too narrow. For instance, in US household surveys "income" used to be defined in such a way that remittances or family help did not count as income, and therefore a parent-supported student living alone (say the young G.W. Bush during his Yale years) appeared as a one-person household without an income, and was thus classified as below the poverty line. Likewise for people living off savings, student loans or remittances.

If a zero income represents missing information (there is an income but it is not reported), or reflects a definition of income that is too narrow, then those cases should be excluded from the sample, because you do not know the actual income. If instead those cases are genuinely without an income, they should be kept in the study.

Who may really be without an income? Hardly a household, if "income" is properly defined, but if the study is about individual persons, workers and non-workers, there would of course be people not earning any income.

Among workers, it is perfectly possible to have unpaid workers (e.g. family help) not getting any monetary income for their efforts. But even in this case there is an indirect income; it can be estimated from the production side, since unpaid family members contribute to the revenue of the family farm or shop, and some imputation might be used to figure out how much revenue is generated by that unpaid work.
In the opposite calculation, from the income side, there is also an income to the worker in the form of sharing in family consumption (shelter, food, clothing, etc.), the value of which could be estimated. (A usual shortcut is assigning unpaid workers the going wage rate for their kind of work.)

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of grigoris
Sent: 01 July 2009 14:39
To: [hidden email]
Subject: logistic regression with zero values

--
View this message in context: http://www.nabble.com/logistic-regression-with-zero-values-tp24294074p24294074.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
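Hector's rule (drop zeros that stand for missing or too narrowly defined income, keep zeros that are genuine) could be sketched roughly as follows. This is only an illustration in Python rather than SPSS syntax; the bracket cut-offs, the labels, and the function names `income_category` and `dummy_code` are hypothetical and do not come from the thread or from Grigoris's data.

```python
from typing import Dict, List, Optional

# Hypothetical bracket upper bounds (exclusive) and labels; not from the thread.
BRACKETS = [(20_000, "low"), (60_000, "middle"), (float("inf"), "high")]


def income_category(income: Optional[float], zero_is_genuine: bool) -> Optional[str]:
    """Return a bracket label, or None when the case should be dropped."""
    if income is None or income < 0:
        return None  # unreported or negative income is treated as missing here
    if income == 0:
        # Keep the zero as its own category only if it is a real zero income.
        return "zero" if zero_is_genuine else None
    for upper, label in BRACKETS:
        if income < upper:
            return label
    return None


def dummy_code(category: str, categories: List[str], reference: str) -> Dict[str, int]:
    """One 0/1 indicator per category, leaving out the reference category."""
    return {c: int(c == category) for c in categories if c != reference}


incomes = [0, 0, 15_000, 35_000, 90_000, None]

# Zeros kept as a genuine category vs. zeros recoded to missing.
kept = [income_category(x, zero_is_genuine=True) for x in incomes]
dropped = [income_category(x, zero_is_genuine=False) for x in incomes]

print(kept)
print(dropped)
print(dummy_code("middle", ["zero", "low", "middle", "high"], reference="zero"))
```

Note that the two codings differ both in which cases enter the model and in which categories are available as the reference for the dummies, so the coefficients from the two runs are not directly comparable.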
|
To add to Hector's comments: I've seen zero (or negative) income reported by survey respondents in secondary data, but of course that is not accurate (in the sense that they have no income). Looking at other parts of the data, the person has assets and other indicators which imply that they aren't, ummm, "the idle rich".

What can it be? In the U.S., it's probably net income after deductions (including losses) reported to the Federal government. This also includes some people whom we would measure as below the poverty line (i.e., people with low reported income) but who are clearly not in that category, judging by other responses in their case.

[Of course, why they would report it like this in a survey is a discussion for another time/place...]

David Chapman, PhD
[hidden email]

On Wed, Jul 1, 2009 at 2:31 PM, Hector Maletta <[hidden email]> wrote:
> Grigoris, your explanation is not completely clear. I presume you use income
> as a predictor for some dichotomous outcome, but your income variable is not
> an interval variable; you may have a variable defined as income brackets,
> converted into a series of dummies. This series of dummies may or may not
> include cases with zero income.
> The decision about using or not using zero income cases depends on what
> these cases mean. They may represent authentic cases of zero income, or
> simply cases where the information about income is missing.
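One way to act on David's observation is to screen for cases whose reported income is zero or negative while other responses show substantial assets. The sketch below is a minimal illustration in Python; the `Respondent` fields, the `ASSET_THRESHOLD` value and the sample records are all made up for the example and would have to be replaced by whatever indicators the actual survey provides.

```python
from typing import List, NamedTuple


class Respondent(NamedTuple):
    case_id: int
    income: float  # reported income; may be zero or negative
    assets: float  # hypothetical asset indicator from elsewhere in the survey


# Hypothetical cut-off above which a zero or negative income looks implausible.
ASSET_THRESHOLD = 50_000.0


def is_suspect(r: Respondent, asset_threshold: float = ASSET_THRESHOLD) -> bool:
    """Flag zero/negative reported income combined with substantial assets."""
    return r.income <= 0 and r.assets >= asset_threshold


# Made-up records for illustration only.
sample: List[Respondent] = [
    Respondent(1, 0.0, 250_000.0),        # zero income, large assets
    Respondent(2, 0.0, 0.0),              # zero income, no assets
    Respondent(3, -12_000.0, 400_000.0),  # net loss reported as income
    Respondent(4, 30_000.0, 10_000.0),    # ordinary case
]

suspect_ids = [r.case_id for r in sample if is_suspect(r)]
print(suspect_ids)
```

Flagged cases are candidates for being treated as missing rather than as genuine zero income; the threshold itself is a judgment call.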
