When I was at Amex, we had regular modeling meetings. My models usually had between 7 and 12 variables; others presented theirs with around 18. We implemented and tracked the models, and their performance sometimes suffered after a couple of years. I particularly did not like having too many variables, as some of them were collinear. But since we were interested in prediction, that did not seem to be an issue.
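As a rough illustration of the collinearity concern above, the sketch below flags pairs of predictors whose Pearson correlation is high. The function names, the 0.9 threshold, and the toy columns are all assumptions made for illustration, not anything used in the models described here. Pairwise correlation only catches two-variable collinearity; a variance inflation factor would be needed to detect a variable that is nearly a combination of several others.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return sxy / sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `columns` maps a predictor name to its list of values.
    The threshold is an illustrative choice, not a standard.
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical toy data: "spend" is an exact multiple of "income".
toy = {
    "income": [1, 2, 3, 4, 5],
    "spend": [2, 4, 6, 8, 10],
    "age": [5, 1, 4, 2, 3],
}
flagged = collinear_pairs(toy)
```

Here only the income/spend pair is flagged; whether to drop one of a flagged pair is a modeling decision the check itself does not make.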
-----Original Message-----
From: Ergul, Emel A. [mailto:[hidden email]]
Sent: Wednesday, April 22, 2009 4:10 PM
To: Ornelas, Fermin; [hidden email]
Subject: RE: Re: Logistic Regression

OK. I remember from journal reviewers that the total number of predictors in a logistic regression (LR) should be at most the number of events divided by 10. They say that beyond this number the model becomes unstable. How about that?

-----Original Message-----
From: SPSSX(r) Discussion on behalf of Ornelas, Fermin
Sent: Wed 4/22/2009 4:55 PM
To: [hidden email]
Subject: Re: Logistic Regression

This answer is on (1) and (2). There is no magic number for the set of predictor variables in a model, but once you clean the data and the model itself you could have between 7 and 18 predictors. That was my experience.

It is possible for a model to perform better or worse in a validation sample than in a training sample. However, to ensure that the model performs equally well, you need to make sure that the descriptive statistics of the data are similar in the validation and training samples. A large performance difference could pose a problem when implementing a model, particularly if the performance is worse, which is not your case.

Regarding the test, I cannot give input off the top of my head for fear of getting some uncomfortable feedback. But I used to graph a Lorenz curve plotting both training and validation and calculate the curve lift.

________________________________
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of <R. Abraham>
Sent: Wednesday, April 22, 2009 1:37 PM
To: [hidden email]
Subject: Logistic Regression

I have 2 questions on Predictive Modeling:

1. I am building a logistic regression model with about 480 predictors. The 'training' sample has about 18000 records with about 3000 responders. How many significant predictors can the model have? Is there a suggested number of significant variables for a model?

2. Can a predictive model perform better on the 'validation' sample than on the 'training' sample? The results on my validation sample are at least 15% better than on the 'training' sample for prediction in the top deciles.

3. Does the "Kolmogorov-Smirnov test" help in finding out how much the 'validation' sample results can differ from the 'training' sample results? If so, can someone give me some pointers on how to perform the test?

Thanks.
R. Abraham
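Three of the checks mentioned in this thread can be sketched in a few lines of Python: the events/10 rule of thumb, a two-sample Kolmogorov-Smirnov statistic, and the share of responders captured in the top decile. This is a minimal sketch under stated assumptions, not anyone's production method. All function names are hypothetical. Using the smaller of the event and non-event counts in the rule of thumb is a common variant and an assumption here, since the message only says "events". The thread also does not say which two distributions the KS test was meant to compare; the function accepts any two samples of scores.

```python
from bisect import bisect_right


def max_predictors_by_epv(n_events, n_nonevents, events_per_variable=10):
    """Rule-of-thumb ceiling on predictors: rarer outcome count / EPV."""
    return min(n_events, n_nonevents) // events_per_variable


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def top_decile_capture(scores, labels):
    """Share of all responders (label 1) in the top 10% of records by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, len(ranked) // 10)
    captured = sum(label for _, label in ranked[:cutoff])
    total = sum(labels)
    return captured / total if total else 0.0


# Figures from the original question: about 18000 records, about 3000 responders.
ceiling = max_predictors_by_epv(3000, 18000 - 3000)
```

One way to use these: compute `ks_statistic` between responder and non-responder scores separately on the training and validation samples, compute `top_decile_capture` on each, and compare the two pairs of numbers. The statistic alone is not a significance test; a p-value would need the KS distribution, for example from a statistics library.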
In reply to this post by <R. Abraham>
Yes. “Statistically significant” is not identical to “substantively significant” or “predictively worthwhile”.

Hector
