In addition to the problems you mention, you are over-fitting your model (i.e., you have too many variables for the amount of data). For a good overview of over-fitting, check out Mike Babyak's nice article:
http://www.psychosomaticmedicine.org/content/66/3/411.short
Have to get to a meeting, so no time to address the other problems right now! HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
In reply to this post by Anter
This topic could get dense, and I’m sure a lot of people have strong opinions on it one way or another. My experience is that with many models and smaller samples, it simply doesn’t make a huge difference precisely how you do this, as long as you follow certain protocols. First, you need to know what kind of variables you have (continuous, categorical, ordinal) and pick the correct way to handle each. If you just want to run a ridge regression with all continuous linear variables, then you want “LEVEL=NUME”, which sets the variable level to numeric. Then you have the issue of how to discretize the variables in the transformation. For a numeric variable you use either ranking or multiplying. Multiplying is the same as a normal linear regression. Ranking is similar to any of the non-parametric ranking procedures, and the same thought process applies: if the values of your variables are non-normally distributed, ranking reduces the influence of skew and outliers. You also pick how to handle missing data; I tend to go with listwise deletion. The key to making this a ridge regression is the regularization process, which deals with the multicollinearity. As seen in my code below, this is “REGULARIZATION=RIDGE”. The parameters after that are the default values. To generate this I literally used the point-and-click interface and pasted the syntax, to show you the defaults you would get.
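To make the ranking vs. multiplying choice concrete, here is a minimal Python sketch (not SPSS, and not CATREG's actual implementation). It assumes, as I understand the CATREG documentation, that MULTIPLYING standardizes the values, multiplies by 10, rounds, and shifts so the lowest value is 1. The tie handling in the ranking function and the use of the population standard deviation are my own assumptions, and the function names and data are hypothetical.

```python
from statistics import mean, pstdev

def rank_discretize(values):
    """Replace each value by its rank (1 = smallest); tied values share a rank.
    Tie handling here is an assumption, not necessarily what CATREG does."""
    distinct = sorted(set(values))
    rank_of = {v: i + 1 for i, v in enumerate(distinct)}
    return [rank_of[v] for v in values]

def multiply_discretize(values):
    """Standardize, multiply by 10, round, then shift so the minimum is 1.
    Uses the population SD (an assumption); values must not all be equal."""
    m, s = mean(values), pstdev(values)
    scaled = [round(10 * (v - m) / s) for v in values]
    shift = 1 - min(scaled)
    return [z + shift for z in scaled]

# Hypothetical skewed variable with one extreme case
skewed = [1, 2, 3, 4, 50]
print("ranking:    ", rank_discretize(skewed))
print("multiplying:", multiply_discretize(skewed))
```

Multiplying keeps the relative spacing of the original values (up to rounding), which is why it behaves like an ordinary linear regression, while ranking keeps only the order, so the extreme case no longer dominates.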
Now you have another problem: you have an arguably over-fitted model, and that creates issues. I would argue you should try to estimate fewer variables in your model, given your sample of only 72, but if theory dictates that all variables are crucial to the model, then I strongly encourage you to add in the bootstrap estimates. As for interpreting the output, I would look at the website linked below, as anything I would say would likely repeat what is there. Basically, remember what you are trying to do: adjust the model via a constant such that the multicollinearity is reduced but the R squared remains roughly the same. I tried to run a sample of data to show you, but unfortunately the variables weren’t correlated enough for ridge regression to be appropriate, and those that were didn’t change as you would want in a ridge regression.
http://www.coe.fau.edu/faculty/morris/STA7114%20Files/Lab%203/Instructions/ridge_regression.htm
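The trade-off described above (add a constant, stabilize the coefficients, lose little R squared) can be sketched in a few lines of Python. This is only an illustration for two standardized predictors, solving (R + kI)b = r directly. The correlations are made-up values, the function name is hypothetical, and the grid of k values merely mirrors the 0.0 to 1.0 by 0.02 range in the syntax; I am not claiming CATREG scales its penalty the same way.

```python
def ridge_two_predictors(r12, r1y, r2y, k):
    """Standardized ridge coefficients and training R^2 for two predictors.

    Solves (R + k*I) b = r_xy, where R is the 2x2 predictor correlation
    matrix and r_xy holds each predictor's correlation with the outcome.
    """
    d = 1.0 + k
    det = d * d - r12 * r12
    b1 = (d * r1y - r12 * r2y) / det
    b2 = (d * r2y - r12 * r1y) / det
    # For standardized variables, R^2 = 2*b'r_xy - b'R b
    fit = 2.0 * (b1 * r1y + b2 * r2y)
    quad = b1 * b1 + b2 * b2 + 2.0 * r12 * b1 * b2
    return b1, b2, fit - quad

# Hypothetical correlations: the two predictors correlate .95 with each other
R12, R1Y, R2Y = 0.95, 0.60, 0.50

# Penalty grid 0.00, 0.02, ..., 1.00
path = [(round(0.02 * i, 2), *ridge_two_predictors(R12, R1Y, R2Y, 0.02 * i))
        for i in range(51)]

for k, b1, b2, r2 in path[::10]:
    print(f"k={k:.2f}  b1={b1:6.3f}  b2={b2:6.3f}  R2={r2:.3f}")
```

At k = 0 this is ordinary least squares, where the collinear pair gets large coefficients of opposite sign. As k grows the coefficients shrink quickly while R squared falls more slowly, which is the pattern you would look for in a ridge trace.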
CATREG VARIABLES=famrewwk sestch sesstr sesneg seshrsh seseffi fambldg famhmwk
  /ANALYSIS=famrewwk(LEVEL=NUME) WITH sestch(LEVEL=NUME) sesstr(LEVEL=NUME) sesneg(LEVEL=NUME)
    seshrsh(LEVEL=NUME) seseffi(LEVEL=NUME) fambldg(LEVEL=NUME) famhmwk(LEVEL=NUME)
  /DISCRETIZATION=famrewwk(MULTIPLYING) sestch(MULTIPLYING) sesstr(MULTIPLYING) sesneg(MULTIPLYING)
    seshrsh(MULTIPLYING) seseffi(MULTIPLYING) fambldg(MULTIPLYING) famhmwk(MULTIPLYING)
  /MISSING=famrewwk(LISTWISE) sestch(LISTWISE) sesstr(LISTWISE) sesneg(LISTWISE) seshrsh(LISTWISE)
    seseffi(LISTWISE) fambldg(LISTWISE) famhmwk(LISTWISE)
  /MAXITER=100
  /CRITITER=.00001
  /PRINT=R COEFF OCORR CORR ANOVA DESCRIP(sestch) REGU
  /INITIAL=RANDOM
  /PLOT=REGU
  /REGULARIZATION=RIDGE(0.0,1.0,0.02)(DataSet2)
  /RESAMPLE=BOOTSTRAP(500).

Matthew J Poes
Research Data Specialist
Center for Prevention Research and Development
University of Illinois
510 Devonshire Dr.
Champaign, IL 61820
Phone: 217-265-4576
email: [hidden email]

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andra Th