Negative Adjusted R Square is a "good" thing?

Negative Adjusted R Square is a "good" thing?

cynicalflyer
The theory: in K-12 education, putting more administrative authority in the state board of education is "better" than leaving it to the local boards.

DVs: I have two ways to measure "better" from 26 states: % kids graduating high school within 4 years and scores by school district on a standardized test. I'll be examining them separately.

IVs: I have 37 different measures of administrative authority, each coded 3 (state has complete control), 2 (shared/split), or 1 (locality has complete control).

So I fire up SPSS, plunk in the % kids graduating high school within 4 years by state in 26 states as my DV, plunk in the 37 IVs, use "Enter" as my method (I've been told stepwise is evil, evil, evil) and...

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .853a   .727       -.041               .148600607125323

This might be "good" if it means the predictors are useless. It is "bad" if I am getting this because my model stinks. How can I determine which?
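For reference, adjusted R Square is 1 - (1 - R Square)(N - 1)/(N - p - 1), so when p approaches N it can fall far below zero even when R Square looks impressive. A minimal Python sketch with illustrative numbers (not necessarily SPSS's effective N and p, since collinear predictors may have been silently dropped):

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# illustrative: R^2 = .727 with 37 predictors and 43 cases
print(adjusted_r2(0.727, 43, 37))   # strongly negative despite a large R^2
# with only 2 predictors, the same R^2 barely shrinks
print(adjusted_r2(0.727, 43, 2))
```

A large gap between R Square and adjusted R Square is itself a warning sign of overfitting.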

Re: Negative Adjusted R Square is a "good" thing?

Andy W
You only have 26 states and you estimated 37 parameters in the model? I'm surprised SPSS spit out anything! (Did it silently drop some predictors?)
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
That should read 43 states, not 26.

Re: Negative Adjusted R Square is a "good" thing?

Bruce Weaver
Administrator
In reply to this post by cynicalflyer
Andy W replied:  "You only have 26 states and you estimated 37 parameters in the model? I'm surprised SPSS spit out anything! (Did it silently drop some predictors?)"

And cynicalflyer replied to that:  "Should read 43 states."

I assume, then, that the unit of analysis is state.  Is that right?  Here's why I think it must be:

For completely random data, the expected value of the multiple correlation coefficient R is  p / (N-1), where p = the number of predictors and N = the sample size.  You supplied p = 37 and R = .853, so I rearranged the formula to work out that your sample size must be around 43 (i.e., 37 / .853 = 43.38).  
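As a side note, the textbook form of this null result is usually stated for R Square: when the DV is unrelated to every predictor, E[R Square] = p/(N - 1). That expectation is easy to check by simulation; the sizes below are illustrative, not the poster's data:

```python
import numpy as np

rng = np.random.default_rng(42)
N, p, reps = 50, 10, 300   # illustrative sizes

r2s = []
for _ in range(reps):
    X = np.column_stack([np.ones(N), rng.normal(size=(N, p))])
    y = rng.normal(size=N)            # DV is pure noise, unrelated to any IV
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    tss = ((y - y.mean()) ** 2).sum()
    r2s.append(1 - (u @ u) / tss)

# the Monte Carlo mean should sit near p / (N - 1), i.e., near 0.2
print(round(np.mean(r2s), 3), round(p / (N - 1), 3))
```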

If state is the unit of analysis, your model is grossly over-fitted.  See Mike Babyak's nice article for more info on that topic.  

  http://people.duke.edu/~mababyak/papers/babyakregression.pdf

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
Yes, the unit of analysis is state. Since I'm not getting more data than 43 states, the solution per the Babyak article is to reduce the IVs from 37 down to something smaller by combining. One final question on that score, which Babyak doesn't address: if I have 43 states, is there a formula that will tell me the maximum recommended number of IVs (i.e., what I should reduce down to)?




=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
A followup: my data are a set of scores from 3 years. I was going to average them, but if I treated each year individually I would get 129 observations, which under the rule of 10 would allow for about 13 IVs, yes?
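Taking the rule-of-10 heuristic at face value, that arithmetic is easy to check (it gives roughly 12 rather than 13, and it is only a rough guide in any case):

```python
n_states, n_years = 43, 3
n_obs = n_states * n_years        # 129 rows if each year is kept separate
max_ivs = n_obs // 10             # rule-of-10 ceiling on predictors
print(n_obs, max_ivs)             # 129 observations -> roughly 12 predictors
```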

Re: Negative Adjusted R Square is a "good" thing?

Bruce Weaver
Administrator
Here are some notes on the number of explanatory variables in a linear regression model:

   http://www.angelfire.com/wv/bwhomedir/notes/linreg_rule_of_thumb.txt

Re your 3 years of data, the 3 data points for a given state would not be independent of each other, and your analysis would have to take that into account.  So it could not be an OLS regression model.  Two methods you could consider to handle those dependencies are 1) generalized estimating equations (GEE) or 2) a multilevel model, with years clustered within states.  
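To see why the within-state dependence matters, here is a minimal numpy sketch on simulated data (all numbers hypothetical): with a predictor that varies only between states, naive OLS standard errors that treat all 129 rows as independent come out markedly smaller than cluster-robust (sandwich) standard errors clustered on state.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_years = 43, 3

# simulated data: a state-level predictor, a state random effect,
# and year-level noise, so the 3 rows per state are correlated
x_state = rng.normal(size=n_states)
state_eff = rng.normal(scale=1.0, size=n_states)
x = np.repeat(x_state, n_years)
g = np.repeat(np.arange(n_states), n_years)
y = 0.5 * x + np.repeat(state_eff, n_years) + rng.normal(scale=0.5, size=x.size)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ beta
bread = np.linalg.inv(X.T @ X)

# naive OLS SE: pretends all 129 rows are independent
s2 = u @ u / (X.shape[0] - X.shape[1])
se_naive = np.sqrt(s2 * bread[1, 1])

# cluster-robust (sandwich) SE: sums score outer products by state
meat = sum(np.outer(X[g == k].T @ u[g == k], X[g == k].T @ u[g == k])
           for k in range(n_states))
se_cluster = np.sqrt((bread @ meat @ bread)[1, 1])

print(round(se_naive, 3), round(se_cluster, 3))
```

GEE and multilevel models address the same dependence in a more principled way; the sandwich comparison just makes the problem visible.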


cynicalflyer wrote
A followup: my data is a set of scores from 3 years. I was going to average them, but if I treated each individually I would get 129 observations which under the rule of 10 would allow for 13 IVs, yes?
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/


Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
Given that I've never heard of generalized estimating equations (GEE) and have only heard of multilevel modeling once, I think, I'm basically screwed. At my proposal defense I specified OLS regression, and my methodologist didn't bat an eye; he never told me this was going to be a problem before I wasted 2 years on data collection.
Great.

Re: Negative Adjusted R Square is a "good" thing?

Rich Ulrich
In reply to this post by cynicalflyer
Yep, "No prediction".  Reason: bad model; too many useless
degrees of freedom.

If I had 37 indicators for types of authority, I would
create composite scores.  With an N of 43, factoring is not
very robust, but I'd look at it.

Probably, I would retreat to combining the most correlated
items in order to create a few tentative composite
scores, and then look at the other items' correlations with
those composites to see which others should be added.  There
should be no overlap among the items chosen for different scores.

In the end, I would hope for maybe two or three composites.
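One way to sketch that item-combination step (hypothetical simulated items; a real analysis would group items on substantive grounds): z-score each item so no item dominates, then average within each group.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 43

# hypothetical: 4 items driven by one latent "authority" dimension,
# 3 items driven by another, each with item-level noise
latent_a, latent_b = rng.normal(size=(2, n))
items = np.column_stack(
    [latent_a + rng.normal(scale=0.5, size=n) for _ in range(4)]
    + [latent_b + rng.normal(scale=0.5, size=n) for _ in range(3)]
)

# z-score every item, then average within each a-priori group
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
composite_a = z[:, :4].mean(axis=1)
composite_b = z[:, 4:].mean(axis=1)

# each composite should track its latent dimension closely
print(round(np.corrcoef(composite_a, latent_a)[0, 1], 2))
```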

I have almost never seen 37 items that are all of
equal importance, so they shouldn't be tested as if they were.
Among all the variables, I would use "expert judgement"
(based on literature, logic, etc.) to decide which of these items,
with their ranges of endorsement as observed in this sample,
are apt to be most salient. That would give me two to five items.
These might already have been included in the composite scores.

Then I would test my original hypotheses by carrying out two OLS
regressions on the average graduation rate: one on the composite
scores, and one on the salient items.

I suspect that there are regional disparities in graduation
rates, so I might include some control for those as
nuisance parameters, if they don't "confound" the original IVs.

--
Rich Ulrich



Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
Thanks to you all. I reduced the 37 indicators down to 5 based on literature plus my judgment and ran the regressions; I still get all negative Adjusted R Squares, both vs. composite scores and vs. salient items. My methodologist has told me that the committee will likely rescind its prior approval of the proposal and that the only reason for getting a negative adjusted R square is data error (read: I screwed up). A ProQuest search of all dissertations finds only about a dozen that had negative adjusted R squares. I have to start over.

Re: Negative Adjusted R Square is a "good" thing?

David Marso
Administrator
Maybe you should review your data and the method(s) you used to form composites?
Inspect the correlation matrix and go from there...
Maybe there is a data error.  Maybe one that can be corrected?
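A minimal numpy sketch of that inspection (toy data; the 0.8 cutoff is arbitrary):

```python
import numpy as np

def high_corr_pairs(data: np.ndarray, cutoff: float = 0.8):
    """Return (i, j, r) for column pairs whose |correlation| exceeds cutoff."""
    r = np.corrcoef(data, rowvar=False)
    pairs = []
    for i in range(r.shape[0]):
        for j in range(i + 1, r.shape[1]):
            if abs(r[i, j]) > cutoff:
                pairs.append((i, j, round(r[i, j], 3)))
    return pairs

# toy check: columns 0 and 1 nearly duplicate each other
rng = np.random.default_rng(3)
a = rng.normal(size=100)
data = np.column_stack([a, a + rng.normal(scale=0.1, size=100),
                        rng.normal(size=100)])
print(high_corr_pairs(data))   # only the (0, 1) pair should be flagged
```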
---
cynicalflyer wrote
Thanks to you all. I reduced the 37 indicators down to 5 based on literature plus my judgment and ran the regressions; I still get all negative Adjusted R Squares, both vs. composite scores and vs. salient items. My methodologist has told me that the committee will likely rescind its prior approval of the proposal and that the only reason for getting a negative adjusted R square is data error (read: I screwed up). A ProQuest search of all dissertations finds only about a dozen that had negative adjusted R squares. I have to start over.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Negative Adjusted R Square is a "good" thing?

cynicalflyer
OK, so I ran the correlation matrix of all 37 IVs against each other, combined those significant at the .001 level, and kept crunching until I had 5 IVs left, then ran that. My R Square = .272 and Adjusted R Square = .168, but none of my combined IVs is significant (the best of the bunch is .029). Think that will be good enough?