Improve Your Regression with Modern Regression Analysis Techniques
• Part 1: July 27 @ 10:00 am PDT: Linear, Nonlinear, Regularized, GPS, LARS, LASSO, Elastic Net, MARS®
• Part 2: August 10 @ 10:00 am PDT: TreeNet® Gradient Boosting, RandomForests®, ISLE™ and RuleLearner®
• Alternative link: http://info.salford-systems.com/improve-your-regression

Can't make it? Sign up and receive the recording!

Abstract: Join us for this two-part webinar series on improving your regression using modern regression analysis techniques, presented by Senior Scientist, Mikhail Golovyna. In these webinars you will learn how to drastically improve prediction accuracy in your regression with a new model that addresses common concerns such as missing values, interactions, and nonlinearities in your data. We will demonstrate the techniques using real-world data sets and introduce the main concepts behind Leo Breiman's Random Forests and Jerome Friedman's GPS (Generalized PathSeeker™), MARS® (Multivariate Adaptive Regression Splines), and Gradient Boosting.
Some of these approaches (e.g., Multivariate Adaptive Regression Splines) sound like they're very susceptible to over-fitting. A quick Google search on MARS(r) took me to this set of slides:
http://www.lans.ece.utexas.edu/courses/ee380l_ese/2013/mars.pdf
On slide 2, I find: "MARS is a form of stepwise linear regression". It is well known that stepwise linear regression is great at generating over-fitted models. E.g.,
http://www.stata.com/support/faqs/statistics/stepwise-regression-problems/
I doubt very much that adding adaptive splines to the soup will improve things. It might even make things worse.
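To make the over-fitting concern concrete, here is a minimal sketch (not taken from the slides or the Stata FAQ) of what a single forward-selection step does when the outcome is pure noise. Everything in it is an assumption chosen for illustration: the sample sizes, the number of candidate predictors, and the hypothetical helper names `r_squared` and `simulate`. It uses only the Python standard library and measures fit as the squared correlation between the selected predictor and the outcome.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def simulate(n_train=30, n_test=30, n_candidates=50, n_reps=200, seed=1):
    """Average training vs. holdout r^2 of the 'best' predictor of pure noise."""
    rng = random.Random(seed)
    train_total = test_total = 0.0
    n = n_train + n_test
    for _ in range(n_reps):
        # Outcome and all candidate predictors are independent noise,
        # so no predictor has any real relationship with y.
        y = [rng.gauss(0, 1) for _ in range(n)]
        candidates = [[rng.gauss(0, 1) for _ in range(n)]
                      for _ in range(n_candidates)]
        # First forward-selection step: keep the candidate that fits
        # the training rows best.
        best = max(candidates,
                   key=lambda col: r_squared(col[:n_train], y[:n_train]))
        train_total += r_squared(best[:n_train], y[:n_train])
        test_total += r_squared(best[n_train:], y[n_train:])
    return train_total / n_reps, test_total / n_reps

train_r2, test_r2 = simulate()
print(f"mean training r^2 of the selected predictor: {train_r2:.3f}")
print(f"mean holdout  r^2 of the same predictor:     {test_r2:.3f}")
```

The selected predictor looks useful on the rows used to pick it and no better than chance on the held-out rows. That gap is the selection effect in question, and it arises before any splines are involved.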
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Empirical variable selection is, IMO, the most difficult area of statistical modeling. Stepwise is much maligned, but some very respectable people in statistical learning still approve of it when used properly. I rather like lasso, available in the CATREG procedure. However, all these empirical selection techniques are best used with cross-validation, holdout samples, etc., to combat overfitting and minimize generalization error. Of course, it's nice when you have enough theory or prior evidence not to need empirical selection methods, or when you can use randomized trials, but most of us don't often have such luxuries.

On Fri, Jul 22, 2016 at 9:22 AM, Bruce Weaver <[hidden email]> wrote:
Some of these approaches (e.g., Multivariate Adaptive Regression Splines)
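
As a rough illustration of pairing lasso with a holdout sample, here is a minimal sketch in plain Python. It is not the CATREG procedure and makes no claim about how CATREG works internally. The simulated data, the penalty grid, and the helper names (`soft_threshold`, `lasso_cd`, `mse`, `make_data`) are all assumptions made for this example. The lasso is fitted by coordinate descent without an intercept, which is reasonable here only because the simulated variables have mean zero.

```python
import random

def soft_threshold(z, g):
    """Shrink z toward zero by g; return exactly 0 when |z| <= g."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/(2n))*sum(resid^2) + lam*sum(|b_j|) by coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residuals for the current coefficients (all zero at start)
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that ignores b[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b

def mse(X, y, b):
    """Mean squared prediction error of coefficients b on rows (X, y)."""
    return sum((yi - sum(x * bj for x, bj in zip(row, b))) ** 2
               for row, yi in zip(X, y)) / len(y)

def make_data(n, true_b, rng):
    """Simulated data: standard-normal predictors, unit-variance noise."""
    X = [[rng.gauss(0, 1) for _ in true_b] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_b)) + rng.gauss(0, 1) for row in X]
    return X, y

rng = random.Random(0)
true_b = [3.0, -2.0] + [0.0] * 8      # only the first two predictors matter
X_train, y_train = make_data(80, true_b, rng)
X_hold, y_hold = make_data(40, true_b, rng)

# Fit on the training rows for each penalty; score on the holdout rows only.
holdout_mse = {}
fits = {}
for lam in [2.0, 1.0, 0.3, 0.1, 0.03]:
    fits[lam] = lasso_cd(X_train, y_train, lam)
    holdout_mse[lam] = mse(X_hold, y_hold, fits[lam])

best_lam = min(holdout_mse, key=holdout_mse.get)
best_b = fits[best_lam]
print("penalty chosen on holdout:", best_lam)
print("coefficients:", [round(b, 2) for b in best_b])
```

The penalty is chosen by error on rows the model never saw, so a penalty that merely fits the training noise is not rewarded. With more than one candidate model tuned this way, the holdout error of the winner is itself slightly optimistic, which is why a separate test set or k-fold cross-validation is often used as well.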