The theory: in K-12 education, putting more administrative authority in the state board of education is "better" than leaving it to the local boards.
DVs: I have two ways to measure "better" from 26 states: the % of kids graduating high school within 4 years, and scores by school district on a standardized test. I'll be examining them separately.

IVs: I have 37 different measures of the type of administrative authority, each coded 3 (state has complete control), 2 (shared/split), or 1 (locality has complete control).

So I fire up SPSS, plunk in the % of kids graduating high school within 4 years by state in 26 states as my DV, plunk in the 37 IVs, use "Enter" as my method (I've been told stepwise is evil, evil, evil) and...

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .853   .727       -.041               .148600607125323
This might be "good" if it means the predictors are useless. It is "bad" if I am getting this because my model stinks. How can I determine which?
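[Editor's note: for concreteness, here is a minimal sketch of the kind of syntax behind a run like that. The variable names are hypothetical, and "auth01 TO auth37" assumes the 37 authority measures sit next to each other in the data file.]

REGRESSION
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT grad_rate
  /METHOD=ENTER auth01 TO auth37.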
You only have 26 states and you estimated 37 parameters in the model? I'm surprised SPSS spit out anything! (Did it silently drop some predictors?)
Should read 43 states.
In reply to this post by cynicalflyer
Andy W replied: "You only have 26 states and you estimated 37 parameters in the model? I'm surprised SPSS spit out anything! (Did it silently drop some predictors?)"
And cynicalflyer replied to that: "Should read 43 states."

I assume, then, that the unit of analysis is state. Is that right? Here's why I think it must be: for completely random data, the expected value of the multiple correlation coefficient R is p / (N-1), where p = the number of predictors and N = the sample size. You supplied p = 37 and R = .853, so I rearranged the formula to work out that your sample size must be around 43 (i.e., 37 / .853 = 43.38).

If state is the unit of analysis, your model is grossly over-fitted. See Mike Babyak's nice article for more info on that topic:

http://people.duke.edu/~mababyak/papers/babyakregression.pdf

HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
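[Editor's note: a quick sanity check on those numbers. Plugging the reported R Square into the standard adjustment formula, Adjusted R2 = 1 - (1 - R2)(N - 1)/(N - p - 1), with N = 43 and p = 37 gives about -1.29, not the -.041 SPSS reported. That gap is consistent with Andy W's guess that some predictors were silently dropped. A sketch of the arithmetic in SPSS itself, using the thread's numbers:]

* Check the adjusted R-square arithmetic with the reported values.
DATA LIST FREE / r2 n p.
BEGIN DATA
.727 43 37
END DATA.
COMPUTE adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1).
EXECUTE.
LIST.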
Yes, unit of analysis is state. So since I'm not getting more data than 43 states, the solution per the Babyak article is to reduce the IVs from 37 down to something smaller by combining. One final question on that score, which Babyak doesn't address: if I have 43 states, is there a formula that will tell me what the maximum recommended number of IVs is (i.e., what I should reduce down to)?
A follow-up: my data is a set of scores from 3 years. I was going to average them, but if I treated each year individually I would get 129 observations, which under the rule of 10 would allow for 13 IVs, yes?
Here are some notes on the number of explanatory variables in a linear regression model:
http://www.angelfire.com/wv/bwhomedir/notes/linreg_rule_of_thumb.txt

Re your 3 years of data: the 3 data points for a given state would not be independent of each other, and your analysis would have to take that into account, so it could not be an OLS regression model. Two methods you could consider for handling those dependencies are 1) generalized estimating equations (GEE), or 2) a multilevel model, with years clustered within states.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
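[Editor's note: to make those two options concrete, a rough sketch, assuming a long-format file with one row per state-year, a state identifier, and hypothetical composite predictors comp1 to comp3. This is an illustration, not a prescription.]

* Option 2: multilevel model, years nested within states.
MIXED grad_rate WITH comp1 comp2 comp3
  /FIXED=comp1 comp2 comp3
  /RANDOM=INTERCEPT | SUBJECT(state_id)
  /PRINT=SOLUTION.

* Option 1: GEE version of the same idea.
GENLIN grad_rate WITH comp1 comp2 comp3
  /MODEL comp1 comp2 comp3 DISTRIBUTION=NORMAL LINK=IDENTITY
  /REPEATED SUBJECT=state_id CORRTYPE=EXCHANGEABLE.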
Given that I've never heard of generalized estimating equations (GEE) and have only heard of multilevel modeling maybe once, I'm basically screwed. At my proposal defense I specified OLS regression, and my methodologist didn't bat an eye and never told me this was going to be a problem before I wasted 2 years on data collection.

Great.
In reply to this post by cynicalflyer
Yep, "No prediction". Reason: bad model; too many useless
degrees of freedom. If I had 37 indicators for Types of authority, I would create composite scores. With N of 46, factoring is not very robust, but I'd look at it. Probably, I would retreat to combining the most correlated items, in order to create a few tentative composites scores, and then look at the other correlations with those composites to see what others should be added. There should be no overlap among the items chosen for different scores. In the end, I would hope for maybe two or three composites. I think I almost never saw 37 items that all would be of equal importance; so, they shouldn't be tested as if they were. Among all the variables, I would use "expert judgement" (based on literature, logic, etc.) of which of these items, with their ranges of endorsement as observed in this sample, are apt to be most salient. That would give me two to five items. These might already have been included in the composite scores. Then I would test my original hypotheses when I carry out two OLS regressions on the average graduation rate -- composite scores; and salient items. I suspect that there are regional disparities in graduation rates, so I might include some control for those, as nuisance parameter, if they don't "confound" the original IVs. -- Rich Ulrich ________________________________ > Date: Sun, 9 Feb 2014 12:18:50 -0800 > From: [hidden email] > Subject: Negative Adjusted R Square is a "good" thing? > To: [hidden email] > > The theory: in K-12 education putting more administrative authority in > the state board of education is "better" that leaving it to the local > boards. > > DVs: I have two ways to measure "better" from 26 states: % kids > graduating high school within 4 years and scores by school district on > a standardized test. I'll be examining them separately. > > IVs: I have 37 different measures for the types of administrative > authority: 3 (state has complete control), 2 (shared/split) and 1 > (locality has complete control). > > So I fire up SPSS, plunk in the % kids graduating high school within 4 > years by state in 26 states as my DV, plunk in the 37 IVs, use "Enter" > as my method (I've been told stepwise is evil, evil, evil) and... > > Model Summary > > > Model > > > R > > > R Square > > > Adjusted R Square > > > Std. Error of the Estimate > > > 1 > > > .853a > > > .727 > > > -.041 > > > .148600607125323 > > > > This might be "good" if it means the predictors are useless. It is > "bad" if I am getting this because my model stinks. How can I determine > which? > ________________________________ > View this message in context: Negative Adjusted R Square is a "good" > thing?<http://spssx-discussion.1045642.n5.nabble.com/Negative-Adjusted-R-Square-is-a-good-thing-tp5724399.html> > Sent from the SPSSX Discussion mailing list > archive<http://spssx-discussion.1045642.n5.nabble.com/> at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
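[Editor's note: for what it's worth, a sketch of what that composite-building step could look like in syntax. The item grouping and composite name below are purely hypothetical.]

* Check internal consistency of a tentative item grouping.
RELIABILITY
  /VARIABLES=auth03 auth07 auth12 auth21
  /MODEL=ALPHA.

* If the items hang together, average them into one composite score.
COMPUTE comp_governance = MEAN(auth03, auth07, auth12, auth21).
EXECUTE.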
Thanks to you all. I reduced the 37 indicators down to 5 based on the literature plus my judgment and ran the regressions; I still get all negative Adjusted R Squares, both for the composite scores and for the salient items. My methodologist has told me that the committee will likely rescind its prior approval of the proposal, and that the only reason for getting a negative adjusted R square is data error (read: I screwed up). A ProQuest search of all dissertations finds only about 12 that had negative adjusted R squares. Have to start over.
Maybe you should review your data and the method(s) you used to form composites?
Inspect the correlation matrix and go from there... Maybe there is a data error. Maybe one that can be corrected?
---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
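[Editor's note: a sketch of that screening step in syntax, again with hypothetical variable names, looking for out-of-range codes and suspicious correlations.]

* Range check: every authority item should be coded 1, 2, or 3.
DESCRIPTIVES VARIABLES=auth01 TO auth37
  /STATISTICS=MEAN MIN MAX.

* Full correlation matrix of the items.
CORRELATIONS
  /VARIABLES=auth01 TO auth37
  /PRINT=TWOTAIL NOSIG.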
OK, so I ran the correlation matrix of all 37 IVs against each other, combined those significant at the .001 level, and kept crunching until I had 5 IVs left, then ran that. My R Square = .272 and Adjusted R Square = .168, but none of my combined IVs are significant (best of the bunch is .029). Think that will be good enough?