Hello to everybody!
I have used the following dummies in my regression: marital status (married, single, and divorced as the base dummy), sex (female=0 and male=1), and whether someone has any children (child, other, and nochild as the base dummy). I created an interactive dummy for someone who is a man and married (sex*married) and another for someone who is a man and has children (man*child).

My first question is what I compare each one of them to. For example, in the first case, do I compare married men to unmarried men or to women? Also, should I use both the interactive and the original dummies (sex*married, sex and married) in my regression? Finally, can I use interactive dummies for both sexes at the same time? By this I mean, can I have in my regression male*married, male*child, woman*married and woman*child?

Thank you very much for your valuable help!
Dimitris
Here's another one I get to add two cents on. I'm a time series guy.
Suppose someone decides to adjust for seasonality in monthly data by inserting 0/1 dummies for each month of the year. If you run that together with a regression constant (a vector of 1's), you get an error message and it doesn't run. The error message in my old EViews program was Near Singular Matrix. OK, that's perfect multicollinearity.

The problem with too many dummies is that you start to get near multicollinearity. It will depend on your data, but when you have mutually exclusive categories like months of the year, you get the singularity problem once the number of dummies equals the number of categories. You can then drop your regression constant and it will run, but now what you are actually calculating is not some supposed effect of your dummies but the combined regression constant plus dummy. Essentially it's a multicollinearity/near multicollinearity problem, a too-many-variables problem. What you may want to do for your analysis is run it using just one or a few dummies at a time. Technically you may be able to do something and the machine will calculate it, but it is usually not a good idea, not in anything I have done.

Much of my panel data knowledge comes from the Current Population Survey etc., analysing participation in the labor force by sex, but then we also account for race etc. You have people who are black/white and male/female, so 4 combinations of categories, and then you also have whether someone lives in an SMSA (they don't actually use that word any more). You can put dummies on 3 of the 4 race/sex categories, but you don't necessarily want to. If you put dummies on all 4 categories, which are mutually exclusive and also exhaustive (everyone is in exactly one category), then you have reproduced the regression constant as a linear combination of your dummies: in this case the sum of your dummy vectors is a vector of ones.
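A minimal sketch of the monthly-dummy situation described above, using NumPy rather than EViews. The data are simulated and every name here (`month`, `X_all`, `X_drop_one`, `X_no_const`) is made up for illustration. It checks the rank of the design matrix with all 12 dummies plus a constant, and shows what the coefficients turn into once either one dummy or the constant is dropped.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated monthly series: 10 years, month coded 0..11 (hypothetical data).
n_obs = 120
month = np.arange(n_obs) % 12
y = 5.0 + 0.5 * month + rng.normal(0.0, 1.0, n_obs)

dummies = np.eye(12)[month]          # one 0/1 column per month
const = np.ones((n_obs, 1))          # regression constant

# Constant + all 12 dummies: 13 columns, but the dummies sum to the constant.
X_all = np.hstack([const, dummies])
rank_all = np.linalg.matrix_rank(X_all)

# Two ways to make it estimable: drop one dummy, or drop the constant.
X_drop_one = np.hstack([const, dummies[:, 1:]])   # month 0 is the base
X_no_const = dummies

b_drop_one, *_ = np.linalg.lstsq(X_drop_one, y, rcond=None)
b_no_const, *_ = np.linalg.lstsq(X_no_const, y, rcond=None)

month_means = np.array([y[month == m].mean() for m in range(12)])

print("columns:", X_all.shape[1], "rank:", rank_all)
# Without the constant, each coefficient is just that month's mean of y.
print("no constant  :", np.round(b_no_const[:3], 3))
# With a base month, the intercept is the base mean and the rest are
# differences from the base month.
print("base = month0:", np.round(b_drop_one[:3], 3))
```

Under these assumptions the 13-column matrix has rank 12, which is the singular-matrix error. Dropping the constant gives coefficients equal to the monthly means (constant and dummy effect combined), while dropping one dummy gives differences from the base month.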
The SMSA dummy can in general be used with anything, because it is not perfectly correlated with any of the other dummies or any combination of them, but you may run into the problem that it is very correlated with something else. Many things related to race/sex have issues because race and sex are frequently highly correlated with other variables. One of the problems is that as you use different mixtures of dummies, you can potentially manipulate any particular dummy to be what you want. The other issue is the usual one: when you add a variable you improve your likelihood function or R2 etc., but now your coefficients, which had previously been good, are bad.

What groups you wish to compare depends on what question you are trying to answer, but why not compare all groups? This is, as they say, serendipitous. What are you interested in: the difference between married and unmarried, between men and women, or the interaction of the two, i.e., how the effect of being married is different for men than it is for women? Based on US data that I have seen, marriage is very 'good' for men and somewhat dubious for women. Children tend to be 'good' for married of either sex, and very, very 'bad' for women, where 'good'/'bad' refers to miscellaneous social indicators, especially income/poverty status. Supposedly the age of the woman when she got married has a high positive correlation with 'good', but I have not worked on a data set that had that information since graduate school.

> this I mean, can I have in my regression male*married, male*child,
> woman*married and woman*child

These do not appear to be mutually exclusive and exhaustive categories, so you will not have the perfect no-go problem, but you could have the near no-go problem.
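To make the comparison groups concrete, here is a small sketch on simulated data. Marital status is simplified to a 0/1 `married` dummy (the original question has three categories), `child` is also 0/1, and the outcome `y` and its coefficients are invented purely for illustration. With `male`, `married` and `male*married` all in the regression, the coefficients line up with differences between group means. The second part checks the rank of a design with all four sex-specific interactions, with and without the plain `married` and `child` dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical 0/1 dummies; female is simply the complement of male.
male = rng.integers(0, 2, n)
married = rng.integers(0, 2, n)
child = rng.integers(0, 2, n)
female = 1 - male
const = np.ones(n)

# Invented outcome, only so there is something to regress on.
y = 10 + 2 * male + 1 * married + 3 * male * married + rng.normal(0, 1, n)

# Regression with both main effects and the interaction.
X = np.column_stack([const, male, married, male * married])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

def cell_mean(m, k):
    """Mean of y for sex m (1 = male) and marital status k (1 = married)."""
    return y[(male == m) & (married == k)].mean()

# b[0]: unmarried women (the base group)
# b[1]: unmarried men minus unmarried women
# b[2]: married women minus unmarried women
# b[3]: extra effect of marriage for men over its effect for women
# b[2] + b[3]: married men minus unmarried men
# b[1] + b[3]: married men minus married women
print("coefficients:", np.round(b, 3))
print("married vs unmarried men:", round(b[2] + b[3], 3))

# All four sex-specific interactions, without plain married/child dummies.
X_four = np.column_stack([const, male,
                          male * married, male * child,
                          female * married, female * child])
rank_four = np.linalg.matrix_rank(X_four)

# Adding the plain dummies back makes it exactly collinear, because
# married = male*married + female*married (and likewise for child).
X_four_plus = np.column_stack([X_four, married, child])
rank_four_plus = np.linalg.matrix_rank(X_four_plus)

print("four interactions:", X_four.shape[1], "columns, rank", rank_four)
print("plus main effects:", X_four_plus.shape[1], "columns, rank", rank_four_plus)
```

In this sketch the interaction coefficient is a difference of differences, so the married-man dummy is read relative to both unmarried men and married women, not one or the other. The four interactions are estimable on their own here, but only as long as the plain `married` and `child` dummies are left out.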
In this day and age, as my Grandmother used to say, and especially where I am, Alaska, I would not be surprised if the probability that someone, especially a woman, had a child (a child living with them) was higher for the unmarried than for the married. But if your population is such that all or nearly all of your people with children are also married, you may be hitting yet another multicollinearity-type problem.

I would suggest that in general 4 dummies within the same set of relatively few related categories is a lot. I expect you could technically do it, but whether it is a good idea is another thing. Try the assorted permutations of your regressions and see what kind of results you get, and which coefficients are stable across permutations and which are not. Consider the question you are trying to answer and the known characteristics of your population.

Hope that was sufficiently confusing.

>From: Dimitris Nikolaou <[hidden email]>
>Reply-To: Dimitris Nikolaou <[hidden email]>
>To: [hidden email]
>Subject: interactive dummies
>Date: Thu, 5 Oct 2006 19:25:28 +0300