I just want to confirm: if I have many different independent variables and want to find the best regression model from them, once I find the variables that make the best multiple linear regression model, would those same variables also make the best logistic regression model? Or could there be a case where a different group of variables makes the best logistic regression model? Thanks.
jimjohn says:
>I just want to confirm: if I have many different independent variables and want to find the best regression model from them, would the variables that make the best multiple linear regression model also make the best logistic regression model? Or could a different group of variables make the best logistic regression model?

It depends on what your outcome variables are. If your continuous outcome variable (for the linear regression) and categorical outcome variable (for the logistic regression) are highly related, then you could use the same predictors, though it sounds as if your outcomes are tapping different concepts, so you may want to use two different sets of variables. It is true that either regression will work with the same types of predictors (both can accept dummy variables, interaction terms, and so on), but which variables work best in a model should be derived from theory and empirical evidence. What concepts are your outcomes measuring, and what, according to theory and past research, predicts those outcomes?

Sara

Sara M. House, M.A.
Adjunct Faculty, Loyola University Chicago, Psychology Department
Email: [hidden email]
Teaching: Research Methods, Psychology & Law

Data Analyst, Chicago Public Schools, Department of Program Evaluation
Email: [hidden email]

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
Thanks Sara. To answer your question, the variable I am trying to predict is the funding ratio. In mortgage commitments, the customer is locked in to a rate up to four months in advance. Once the commitment period is up, they have the option to fund the mortgage or to cancel. The funding ratio is the percentage of customers who fund their mortgage. This is the first time this research is being conducted, but I have been given many different variables that should theoretically have some effect on the funding ratio (e.g. expected future interest rates, the percentage of customers switching over to variable mortgages, the difference between two types of rates, etc.).

I have already tried this with multiple linear regression and come up with some good models. However, since the ratio is only between 0 and 1, as suggested here, I'm going to transform my dependent variable (ln(p / (1 - p))) and then run linear regression on that. Then I would compare the two models and see which one provides a better fit. For the first branch, the same variables that were predictors in the normal linear regression model are the predictors in the transformed regression model. I have to do this analysis for many different branches, regions, etc., so I'm just wondering if I need to find the best model again in each case for the transformed variable, or if I can just keep the same variables and use the new regression equation. Thanks!
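A minimal Python sketch of the logit-then-OLS approach described above, on synthetic data (the predictor names and coefficients are invented for illustration; the actual analysis in this thread would be done in SPSS):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical predictors: an expected interest rate and the
# share of customers switching to variable mortgages.
rate = rng.normal(5.0, 1.0, n)
switch_share = rng.uniform(0.0, 1.0, n)

# Simulate a funding ratio that is logistic in the predictors.
eta = -1.0 + 0.4 * rate - 2.0 * switch_share + rng.normal(0.0, 0.3, n)
ratio = 1.0 / (1.0 + np.exp(-eta))          # strictly inside (0, 1)

# Logit transform: ln(p / (1 - p)) maps (0, 1) onto the real line.
y_star = np.log(ratio / (1.0 - ratio))

# Ordinary least squares on the transformed DV.
X = np.column_stack([np.ones(n), rate, switch_share])
beta, *_ = np.linalg.lstsq(X, y_star, rcond=None)

# Back-transformed fitted values can never leave (0, 1).
fitted_ratio = 1.0 / (1.0 + np.exp(-(X @ beta)))
print(beta)
```

Note that the estimated coefficients live on the logit scale, so their effect on the ratio itself is non-linear and is largest near a ratio of 0.5.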
At 02:23 PM 7/2/2008, jimjohn wrote:
>If I have many different independent variables, and I want to find the best regression model from these variables. Once I find the variables that make the best multiple linear regression model, would those same variables make the best logistic regression model?

Absolutely no reason why the models would have the same 'best set'. Why would you think so? Is the dependent variable for the logistic regression closely related to the DV for the linear regression?
Thanks Richard. I was just thinking that, since my new dependent variable is just a transformation of the old one, the same variables that affect the old one would affect the transformed one. But I guess I should run the regressions again, just in case different variables predict my transformed dependent variable better?
At 12:57 PM 7/3/2008, jimjohn wrote:
>I was just thinking that since my new [dichotomous?] dependent variable is just a transformation of the old one, that the same variables that affect the old one would affect the transformed one.

That opens another area for discussion: when it is, and when it is not, advisable to dichotomize (or categorize) a continuous variable. It's been discussed on this and other lists. I'd like to invite list members to respond with general advice, or particular questions. I'm not going to; I'm far from the best person to start this discussion.

>I guess I should run the regressions again, just in case different variables affect my transformed dependent variable better?

I would. Among other things, non-linearities in the effects could mean the dichotomized variable is affected differently -- and if you don't think there may be non-linearities, the logistic regression doesn't make much sense.

By the way, you've written as if you're working with a large set of independent variables, with a good deal of multicollinearity. You'll get sound advice on this list (and elsewhere) against selecting a subset of independent variables based on experience with the data. Collinearity makes the process not only dubious, but unstable. I think you've already been advised about factor analysis and other prior dimension-reduction techniques.

Good luck to you!
Richard
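Richard's warning about multicollinearity can be checked numerically. A small sketch of the variance inflation factor (VIF), computed from scratch with numpy on made-up data; a common rule of thumb flags VIF values above about 10:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Three predictors; x3 is deliberately almost a copy of x1.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(scale=0.05, size=n)
X = np.column_stack([x1, x2, x3])

def vif(X):
    """VIF for each column: 1 / (1 - R^2_j), where R^2_j comes
    from regressing column j on all of the other columns."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

vifs = vif(X)
print(vifs)   # x1 and x3 inflate badly; x2 stays near 1
```

Huge VIFs for x1 and x3 here are exactly the instability Richard describes: either variable alone predicts well, but stepwise selection between them is essentially arbitrary.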
At 08:35 AM 7/3/2008, Richard Ristow wrote:
>>I was just thinking that since my new [dichotomous?] dependent variable is just a transformation of the old one, that the same variables that affect the old one would affect the transformed one.
>
>That opens another area for discussion: when it is, and when it is not, advisable to dichotomize (or categorize) a continuous variable.
>
>It's been discussed on this and other lists. I'd like to invite list members to respond with general advice, or particular questions. I'm not going to; I'm far from the best person to start this discussion.

Richard,

Your question is multiplied because the original question is not simply about *a variable* but about *a relationship between variables*. In that context, it matters a great deal whether one has in mind to alter both variables, or only one to match the other. If the variables are of different types to begin with, it might be a better idea to switch to a mode of analysis, such as analysis of variance, which is constructed for mixed variables in the first place.

>>I guess I should run the regressions again, just in case different variables affect my transformed dependent variable better?
>
>I would. Among other things, non-linearities in the effects could mean the dichotomized variable is affected differently -- and if you don't think there may be non-linearities, the logistic regression doesn't make much sense.

My first caveat is that reducing a variable's measurement level (e.g. from ratio to interval or categorical) always involves throwing information away, and that sounds bad. On the other hand, if the fundamental reason for the reduction is that one had been attempting a kind of analysis that assumes variable characteristics that are not valid, then what one is doing, in effect, is switching from a powerful analysis based on inappropriate assumptions to a less powerful analysis based on appropriate assumptions. This seems to be a good reason for doing so.

Those are my initial thoughts on your excellent questions.

Bob Schacht

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814
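For the analysis-of-variance route Bob mentions (continuous outcome, categorical predictor), the one-way F statistic can be written out by hand. A sketch on made-up data with three groups whose means differ:

```python
import numpy as np

rng = np.random.default_rng(3)

# Continuous outcome measured in three categorical groups.
groups = [rng.normal(loc=m, scale=1.0, size=50) for m in (0.0, 0.5, 1.5)]

n = sum(len(g) for g in groups)
k = len(groups)
grand_mean = np.concatenate(groups).mean()

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = (between-group mean square) / (within-group mean square).
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_stat)   # a large F says the group means differ
```

This is the same comparison a regression with dummy-coded group membership would make, which is why ANOVA is a natural fit when one variable is categorical and the other continuous.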
Thanks guys for all the help! Although my dependent variable is a ratio between 0 and 1, some of my independent variables are also ratios/percentages, while others are continuous variables that can take on any value. I planned to do a logit transformation on my dependent variable and then run linear regression on that, because I thought that otherwise some combinations of my independent variables could produce a predicted value for my DV outside the (0, 1) interval. Also, I heard that without the logit transformation, changes in my independent variables could produce changes in my predicted DV that are larger or smaller than they should be.

Just wondering: there are some cases where the only strong predictors are ratios between 0 and 1, and I guess in those cases I do not need to do a logit transformation (since the intervals of my IVs match the interval of my DV)? Do you agree with this? Also, I am seeing lots of multicollinearity, but since I get high adjusted R^2 values with only 2-3 uncorrelated variables, I will probably leave out a lot of the other highly correlated variables. Any suggestions or thoughts? Thanks!
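On the question of predictions escaping (0, 1): a quick numerical illustration (synthetic, noise-free data for clarity) of why OLS on the raw ratio can predict impossible values while OLS on the logit, back-transformed, cannot. The range of the IVs does not prevent this; it is the linearity of the fit that matters:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

x = rng.uniform(-3.0, 3.0, n)
ratio = 1.0 / (1.0 + np.exp(-3.0 * x))   # DV strictly inside (0, 1)

A = np.column_stack([np.ones(n), x])

# Model 1: OLS directly on the raw ratio.
b_raw, *_ = np.linalg.lstsq(A, ratio, rcond=None)
pred_raw = A @ b_raw

# Model 2: OLS on the logit, back-transformed for comparison.
b_logit, *_ = np.linalg.lstsq(A, np.log(ratio / (1.0 - ratio)), rcond=None)
pred_logit = 1.0 / (1.0 + np.exp(-(A @ b_logit)))

print(pred_raw.min(), pred_raw.max())      # escapes (0, 1) at the extremes
print(pred_logit.min(), pred_logit.max())  # always inside (0, 1)
```

So even when every IV is itself a ratio in (0, 1), a raw linear fit can still predict values outside that interval; the logit transformation is what guarantees bounded predictions.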
