http://spssx-discussion.165.s1.nabble.com/Fixed-Effects-Model-LSDV-and-Overfitting-tp5717179p5717183.html
How often have bans been imposed?
To start with a simple model, before you think of "bans" that have been
imposed, you are looking at "policy outcomes" by State; across 17 years.
I would expect that there would be some fairly strong and consistent
differences between States -- and this *might* be considered
as "nuisance" variance, something to be accounted for but not very
interesting at all. But that is part of what is making up your "large R^2".
Depending on what the "policy outcomes" consist of, there might
also be a sizable component of linear trend, since these are time-series.
The time trend *also* could be "nuisance". Time series are notorious
for showing large and largely irrelevant R^2. Is a time trend one of your
other variables?
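To make the point about "nuisance" variance concrete, here is a minimal sketch in plain Python. It simulates a balanced 47 x 17 panel in which the outcome is nothing but a state level, a linear trend, and noise (no ban effect and no predictors at all), and then computes the R^2 that the state and year dummies would earn on their own. The data-generating numbers (state spread 2.0, trend 0.1 per year, noise 1.0) are invented for illustration and are not taken from the thesis data. The last line also checks the residual degrees of freedom, assuming a balanced panel with an intercept.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_STATES, N_YEARS = 47, 17   # panel dimensions from the quoted question
N_PREDICTORS = 4             # substantive predictors from the quoted question

# Hypothetical outcome: state level + linear trend + noise, and nothing else.
state_level = [random.gauss(0.0, 2.0) for _ in range(N_STATES)]
y = [[state_level[s] + 0.1 * t + random.gauss(0.0, 1.0)
      for t in range(N_YEARS)]
     for s in range(N_STATES)]

n_obs = N_STATES * N_YEARS
cells = [(s, t) for s in range(N_STATES) for t in range(N_YEARS)]

grand_mean = sum(y[s][t] for s, t in cells) / n_obs
state_mean = [sum(y[s]) / N_YEARS for s in range(N_STATES)]
year_mean = [sum(y[s][t] for s in range(N_STATES)) / N_STATES
             for t in range(N_YEARS)]

ss_total = sum((y[s][t] - grand_mean) ** 2 for s, t in cells)

# In a balanced panel, the fit from state dummies alone is the state mean.
ss_resid_state = sum((y[s][t] - state_mean[s]) ** 2 for s, t in cells)

# With state and year dummies, the fit is state mean + year mean - grand mean.
ss_resid_two_way = sum(
    (y[s][t] - state_mean[s] - year_mean[t] + grand_mean) ** 2
    for s, t in cells)

r2_state = 1.0 - ss_resid_state / ss_total
r2_two_way = 1.0 - ss_resid_two_way / ss_total

# Residual df: observations minus intercept, dummies, and predictors.
residual_df = n_obs - 1 - (N_STATES - 1) - (N_YEARS - 1) - N_PREDICTORS

print("R^2 from state dummies alone:   ", round(r2_state, 3))
print("R^2 from state and year dummies:", round(r2_two_way, 3))
print("Residual degrees of freedom:    ", residual_df)
```

Under these made-up settings the dummies alone should account for most of the variance, which is the sense in which a large R^2 from an LSDV model says little about the predictors of interest.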
The interesting terms would seem to be the effects, on parallel
time series, of imposing interventions (bans). This seems more and
more like one of those complicated models in econometrics, since
(for instance) the previous change in "policy outcomes" in a particular
state (or neighboring states, or any states) might have helped create
opinions that led to the bans.
In any case -- the large R^2 is not surprising to me, but it is also
(probably) not interesting until you look at terms that model the
possible effects of bans. That could be more complicated than Yes/No.
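As one possible reading of "more complicated than Yes/No", the sketch below codes a ban in two ways for a single state: a step term that switches on when the ban takes effect, and a ramp term that counts years elapsed since then. The function name `ban_regressors`, the years, and the ban year are all hypothetical, chosen only to illustrate the coding. Which coding (if either) suits the thesis data is a modelling decision this sketch does not settle.

```python
def ban_regressors(years, ban_year):
    """Return (step, ramp) codings of a ban for one state.

    step: 0 before the ban takes effect, 1 from ban_year onward.
    ramp: full years elapsed since ban_year (0 before and in the ban year).
    ban_year=None means the state never imposed a ban.
    """
    if ban_year is None:
        zeros = [0] * len(years)
        return zeros, list(zeros)
    step = [1 if year >= ban_year else 0 for year in years]
    ramp = [max(0, year - ban_year) for year in years]
    return step, ramp


years = list(range(1990, 1995))   # made-up observation years
step, ramp = ban_regressors(years, ban_year=1992)
print(step)
print(ramp)
```

Entering both terms alongside the state and year dummies would let an effect appear as an immediate shift, a gradual change, or both.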
--
Rich Ulrich
> Date: Mon, 31 Dec 2012 07:11:11 -0800
> From: [hidden email]
> Subject: Fixed Effects Model (LSDV) and Overfitting??
> To: [hidden email]
>
> Hi,
>
> I am currently in the final phase of writing my bachelor's thesis in political
> science where I use a panel data set to investigate the effects of corporate
> contribution bans on policy outcomes on state level. Since I don't have much
> statistical background to draw upon I've been forced to read up on panel
> data, how to model it and different problems as I've gone along.
>
> Now I've encountered one problem which I don't know how to tackle and
> hopefully someone here will be able to offer some advice. My question is
> whether my model is overfitted (a concept I just discovered). I first got
> suspicious due to the very high value of my model's R-squared (0.766) despite
> the fact that I'm only using 4 predictor variables. The 732 degrees of freedom
> also appear very high to my untrained eye. Due to my use of LSDV I also have
> 62 dummy variables (the study includes 47 states and 17 time periods,
> resulting in 46+16 dummy variables). Could this have made my model too
> complex, and should I apply a bootstrap or another similar test?
>
> I greatly appreciate all help and wish you all a happy new year.
>
...