Weighting in Multilevel regression

Weighting in Multilevel regression

jmdpulido
Dear all,

I am running a multilevel regression (hierarchical linear model) on a regionally representative survey covering the 17 regions of Spain.

The sampling is non-proportional. Thus, I have a vector of weights that assigns a similar weight to all individuals living in each region. Of course, this weight is greater than 1 for big regions and smaller than 1 for small regions.

I have a question: should I weight the data when I run a multilevel regression?

Weighting gives me accurate point estimates for the fixed effects (effects common to all of Spain). However, weighting makes SPSS treat the samples in small regions as very small (e.g. a sample of 300 interviews in a small region becomes n(j)=18 after weighting), so the "Bayesian level-2 residual" for these small regions shrinks a lot.
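
As a rough illustration of the arithmetic behind that n(j)=18 (a sketch only: it assumes every respondent in the region carries the same weight, the value 0.06 is simply what 300 -> 18 implies, and the function name is made up for this example):

```python
def weighted_n(weights):
    """Weighted case count for one region: the sum of its case weights."""
    return sum(weights)

# Assumed: 300 interviews in a small region, each with weight 18/300 = 0.06.
small_region = [18 / 300] * 300

# The weighted count is what the software sees as the region's sample size.
print(round(weighted_n(small_region), 6))
```

The 300 real interviews are still in the file; only the count that enters the estimation changes.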

Weighting also changes my standard errors, reducing them for big regions and increasing them (a lot) for small regions.

On the other hand, if I do not weight, I give people from small regions more relative importance than they actually have in the total Spanish population, and that matters for my fixed-effects point estimates.

What do you recommend?

Re: Weighting in Multilevel regression

Hector Maletta
Weights can be inflationary (i.e. expanding sample frequencies to population
level) or merely proportional (i.e. modifying the relative weight of each
individual case without altering the sample total). Ordinarily, official
sample surveys come with weights that are inflationary, so that they add up
to N (an estimate of the population total) instead of n (sample size). To
transform inflationary weights into merely proportional weights, the
procedure is as follows:
1. Suppose you have a variable INFLAWEIGHT which contains inflationary
weights. If you produce any statistical result, say a frequency
distribution, you would obtain a total of N weighted cases. Do it, and note
the number N obtained.
2. Compute a new variable, say PROPWEIGHT= INFLAWEIGHT * n/N.
3. Use the new variable for weighting.
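
The three steps above can be sketched in a few lines of Python (an illustration only, not SPSS syntax; the function name and the five example weights are hypothetical):

```python
def to_proportional(infla_weights):
    """Rescale inflationary weights so that they sum to the sample size n."""
    n = len(infla_weights)       # n: unweighted number of cases
    big_n = sum(infla_weights)   # N: weighted total (population estimate)
    return [w * n / big_n for w in infla_weights]

# Hypothetical inflationary weights for five sampled cases.
infla = [1200.0, 1200.0, 300.0, 300.0, 300.0]
prop = to_proportional(infla)

print(prop)       # relative sizes of the weights are preserved
print(sum(prop))  # equals n = 5 up to floating-point rounding
```

Each case keeps its weight relative to the others; only the total is brought back from N to n.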

Hector




-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of
jmdpulido
Sent: Sunday, June 12, 2011 16:22
To: [hidden email]
Subject: Weighting in Multilevel regression


--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Weighting-in-Multilevel-regression-tp4482307p4482307.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Weighting in Multilevel regression

jmdpulido
Dear Hector,

Thanks a lot for your answer. My weights are not inflationary. The total N for Spain is not changed by weighting (i.e. the average weight is 1).

So my estimates of the standard errors of the fixed-effects regression coefficients for Spain are not changed by weighting (as the total N for Spain does not change).

The problem is that for some regions (small regions) the weighting vector is much smaller than 1 and for other regions (big regions) the weighting vector is much bigger than 1.

Thus, in the multilevel regression, the regional (level-2) Bayesian residuals and their standard errors are in fact strongly affected by weighting: the model shrinks the level-2 residual estimates for small regions quite a lot.
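
To make the shrinkage concrete, here is a sketch of the textbook shrinkage factor for a random-intercept model, lambda_j = tau2 / (tau2 + sigma2 / n_j). The two variance values are invented for illustration and are not taken from the survey, the function name is hypothetical, and it is an assumption that the software applies the weighted count in exactly this way:

```python
def shrinkage_factor(tau2, sigma2, n_j):
    """Share of a region's raw mean residual kept by the empirical Bayes estimate."""
    return tau2 / (tau2 + sigma2 / n_j)

tau2 = 0.05    # assumed between-region (level-2) variance
sigma2 = 1.0   # assumed within-region (level-1) variance

# 300 is the raw count and 18 the weighted count from the first post.
for n_j in (300, 18):
    print(n_j, round(shrinkage_factor(tau2, sigma2, n_j), 3))
```

With these made-up variances the factor drops from roughly 0.94 to roughly 0.47, so the estimated residual for the small region is pulled about halfway toward zero.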

However, if I don't weight, the point estimates for the regression coefficients of the fixed effects for Spain change, as they will be calculated based more on individuals from small regions.

What do you recommend?


Re: Weighting in Multilevel regression

Hector Maletta
If for any region (small or large) you have a relatively small sample (so
that its weights are on average >1), this simply reflects the fact that
the sample size is effectively small in that region (and relatively large in
other regions where the average weight is <1). The results reflect this
fact. It is not related to the region being large or small, but to the
absolute size of the (weighted) sample for each region. If the sample for a
region j is relatively small, its standard error (SDj / square root of
weighted Nj) will tend to be larger, and so will its confidence intervals,
even if the region itself is large.
As a more thorough solution I would recommend using SPSS Complex Samples to
estimate standard errors.
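
The standard error formula quoted above (SDj / square root of weighted Nj) can be tried out with a short sketch. The standard deviation of 1.0 is an assumed value, the counts 300 and 18 are the ones from the first post, and the function name is hypothetical:

```python
import math

def standard_error(sd_j, weighted_n_j):
    """SE of a regional mean: SD_j divided by the square root of the weighted N_j."""
    return sd_j / math.sqrt(weighted_n_j)

sd = 1.0  # assumed standard deviation within the region

print(standard_error(sd, 300))  # SE if the 300 raw interviews are counted
print(standard_error(sd, 18))   # SE with a weighted count of 18
```

The second value is about four times the first, since sqrt(300 / 18) is roughly 4.1. This is only the simple-random-sampling formula; it does not reproduce what Complex Samples computes from a full design specification.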


Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of
jmdpulido
Sent: Monday, June 13, 2011 04:04
To: [hidden email]
Subject: Re: Weighting in Multilevel regression


Re: Weighting in Multilevel regression

jmdpulido
Dear Hector,

You hit the spot. What I need is the "Complex Samples" module. Do you have any suggestions for good material to read about complex samples?

PS: My point about big and small regions is that in small regions I have much bigger samples, relative to population, than in big regions, so in small regions my weighting factor is much smaller than 1. This is because Spain has regions with many inhabitants and regions with only a few, so a proportional sample is impossible. That is why the sample is relatively bigger in the smaller regions. I guess the same is true in Canada and other federal states.

Re: Weighting in Multilevel regression

Hector Maletta
Just read the SPSS help/reference materials. They include the Help system,
the Syntax reference, and the Complex Samples algorithm, all available in
the SPSS installation and also at the SPSS web site. You will have to
provide details on sample design (e.g. number of regions, size of the
region, clustering or stratification of the sample within each region,
etc.).

HM

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of
jmdpulido
Sent: Monday, June 13, 2011 09:40
To: [hidden email]
Subject: Re: Weighting in Multilevel regression


Re: Weighting in Multilevel regression

jmdpulido
Dear Rich Ulrich,
 
Thanks a lot for your help. Indeed, there is no easy answer. You hit the spot: one of my research goals is to estimate the "variance components" (level 1 and level 2), and weighting affects the results. I will proceed as you suggest and run both models (weighted and unweighted). I also very much appreciate your idea of borrowing the standard errors of the unweighted general estimates in order to explain regional differences.
 
Kind Regards and thanks a lot.


From: Rich Ulrich <[hidden email]>
To: [hidden email]; [hidden email]
Sent: Mon, 13 June 2011 20:24
Subject: RE: Weighting in Multilevel regression

I believe you have run up against a problem that has no ideal solution.

Weighting, I think, gives you something similar to the problem arising in
non-orthogonal two-way designs -- You can get a precise and exact
test of what is technically the *wrong* variance components by using
the Ns that exist in the ANOVA; or you can get an imprecise and non-exact
test of the right variance components by using weights to create orthogonality.

My own experience has been with the two-way designs, and I can't give firm
advice, because I've never been stuck with defending an analysis that changes
the Ns.  (Be wary when you search literature:  The early, traditional labels of
"weighted" and "unweighted" ANOVA  are reversed from the intuitive
sense.  Done by hand, an "unweighted" ANOVA assumed equal Ns by group.)

The more severe the weighting is (further from 1.0 per case), 
the more distorted the testing must be.

In situations that are somewhat analogous, I have advocated the proposition,
"Run it both ways" and work from there.  In presentation, tell your audience
what you have done, while you do what you can to integrate the results.  In
your data, I think I would probably try to "borrow" the size of the confidence
intervals from an analysis using the raw means. 

But your best guide to what has been acceptable in your area would be
to look for similar analyses presented in the past.
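
As a toy version of "run it both ways", the sketch below computes the same overall mean with and without weights. It uses a plain mean rather than a multilevel model, and the four data values, the weights, and the function name are all hypothetical:

```python
def mean(values, weights=None):
    """Mean of values; weighted when weights are supplied."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical data: three cases from an oversampled small region (weight 0.5)
# and one case from an undersampled big region (weight 2.5); weights average 1.
y = [10.0, 12.0, 11.0, 20.0]
w = [0.5, 0.5, 0.5, 2.5]

print("unweighted:", mean(y))
print("weighted:  ", mean(y, w))
```

Reporting both numbers side by side, and saying which one each conclusion rests on, is the spirit of the suggestion above.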

--
Rich Ulrich
