Dear all,
I am running a multilevel regression (hierarchical linear model) on a regionally representative survey covering Spain's 17 regions. The sampling is non-proportional, so I have a vector of weights that assigns a similar weight to all individuals living in each region. Of course, this weight is greater than 1 for big regions and smaller than 1 for small regions. My question: should I weight the data when I run the multilevel regression?

Weighting gives me accurate point estimates for the fixed effects (similar effects for all of Spain). Nevertheless, weighting makes SPSS believe that I have very small samples in small regions (e.g. a sample of 300 interviews in a small region gets n(j) = 18 after weighting), so the "Bayesian level-2 residual" for these small regions shrinks a lot. Weighting also changes my standard errors, reducing them for big regions and increasing them (a lot) for small regions. On the other hand, if I do not weight, I give people from small regions more relative importance than they really have in the total Spanish population, which matters for my fixed-effects point estimates. What do you recommend?
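To make the shrinkage concern concrete: in the simplest random-intercept model, the shrunken (empirical Bayes) level-2 residual is the raw regional residual multiplied by a reliability factor tau2 / (tau2 + sigma2 / n_j). The sketch below is Python, not SPSS syntax, and the variance components `TAU2` and `SIGMA2` are invented for illustration, not taken from this survey. It only shows how the factor reacts when the software sees n(j) = 18 instead of 300.

```python
def shrinkage_factor(tau2: float, sigma2: float, n_j: float) -> float:
    """Reliability of a region's raw mean residual in a random-intercept model.

    tau2   -- between-region (level-2) variance
    sigma2 -- within-region (level-1) variance
    n_j    -- number of cases the software believes region j has
    """
    return tau2 / (tau2 + sigma2 / n_j)


# Hypothetical variance components, chosen only for illustration.
TAU2 = 0.05
SIGMA2 = 1.0

unweighted = shrinkage_factor(TAU2, SIGMA2, n_j=300)  # actual interviews
weighted = shrinkage_factor(TAU2, SIGMA2, n_j=18)     # weighted case count

print(f"n_j = 300: residual keeps {unweighted:.2f} of its raw size")
print(f"n_j =  18: residual keeps {weighted:.2f} of its raw size")
```

With these made-up values the factor drops from about 0.94 to about 0.47, so the same raw residual is pulled roughly halfway toward zero once the region is treated as having 18 cases.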
Weights can be inflationary (i.e. expanding sample frequencies to population level) or merely proportional (i.e. modifying the relative weight of each individual case without altering the sample total). Ordinarily, official sample surveys come with weights that are inflationary, so that they add up to N (an estimate of the population total) instead of n (the sample size). To transform inflationary weights into merely proportional weights, the procedure is as follows:

1. Suppose you have a variable INFLAWEIGHT which contains inflationary weights. If you produce any statistical result, say a frequency distribution, you will obtain a total of N weighted cases. Do it, and note the number N obtained.
2. Compute a new variable, say PROPWEIGHT = INFLAWEIGHT * n/N.
3. Use the new variable for weighting.

Hector

(In reply to jmdpulido, Sunday, June 12, 2011 16:22, subject: Weighting in Multilevel regression)

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Weighting-in-Multilevel-regression-tp4482307p4482307.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
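Hector's three-step rescaling (PROPWEIGHT = INFLAWEIGHT * n/N) can be sketched as follows. This is Python rather than SPSS syntax, and the five weights are hypothetical; only the variable names INFLAWEIGHT and PROPWEIGHT come from the thread.

```python
def to_proportional(inflation_weights: list[float]) -> list[float]:
    """Rescale inflationary weights so that they sum to the sample size n."""
    n = len(inflation_weights)      # sample size (number of cases)
    big_n = sum(inflation_weights)  # weighted total N (population estimate)
    return [w * n / big_n for w in inflation_weights]


# Hypothetical data: 3 cases from a large region, 2 from a small one.
INFLAWEIGHT = [4000.0, 4000.0, 4000.0, 500.0, 500.0]
PROPWEIGHT = to_proportional(INFLAWEIGHT)

print(PROPWEIGHT)
print(sum(PROPWEIGHT))  # equals n, up to floating-point rounding
```

The rescaling changes only the total: the ratio between any two weights (here 8 to 1) is the same before and after, which is why it cannot by itself remove the large between-region differences in weight.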
Dear Hector,
Thanks a lot for your answer. My weights are not inflationary: the total N for Spain is not changed by weighting (i.e. the average weight is 1). So my estimates of the standard errors of the fixed-effects regression coefficients for Spain are not changed by weighting (as the total N for Spain does not change). The problem is that for some regions (small regions) the weight is much smaller than 1 and for other regions (big regions) it is much bigger than 1. Thus, in multilevel regression, the regional (level-2) Bayesian residuals and their standard errors are in fact strongly affected by weighting: the multilevel regression shrinks the level-2 residual estimates for small regions quite a lot. However, if I don't weight, the point estimates of the fixed-effects regression coefficients for Spain change, as they are then based more heavily on individuals from small regions. What do you recommend?
If for any region (small or large) you have a relatively small sample (so that its weights are on average >1), this simply reflects the fact that the sample size is effectively small in that region (and relatively large in other regions where the average weight is <1). The results reflect this fact. It is not related to the region being large or small, but to the absolute size of the (weighted) sample for each region. If the sample for a region j is relatively small, its standard error (SDj / square root of weighted Nj) will tend to be larger, and likewise its confidence intervals, even if the region itself is large.

As a more thorough solution I would recommend using SPSS Complex Samples to estimate standard errors.

Hector

(In reply to jmdpulido, Monday, June 13, 2011 04:04, subject: Re: Weighting in Multilevel regression)
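As a small worked example of the standard-error formula in Hector's reply (SDj divided by the square root of the weighted Nj), the Python sketch below plugs in the 300 and 18 case counts mentioned earlier in the thread. The standard deviation `SD_J` is a hypothetical value; it cancels out of the ratio.

```python
import math


def standard_error(sd_j: float, n_j: float) -> float:
    """Standard error of a regional mean: SD_j / sqrt(N_j)."""
    return sd_j / math.sqrt(n_j)


SD_J = 1.0  # hypothetical within-region standard deviation

se_actual = standard_error(SD_J, 300)   # 300 interviews actually conducted
se_weighted = standard_error(SD_J, 18)  # 18 weighted cases seen by the software

print(f"SE with n_j = 300: {se_actual:.3f}")
print(f"SE with n_j =  18: {se_weighted:.3f}")
print(f"ratio: {se_weighted / se_actual:.2f}")
```

Whatever the standard deviation, the ratio is the square root of 300/18, roughly 4.1, which matches the poster's observation that weighting inflates the standard errors of small regions considerably.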
Dear Hector,
You hit the spot. What I need is to use the "Complex Samples" module. Do you have any suggestions for good reading material on complex samples?

PS: My point about big and small regions is that in small regions my samples are much bigger, relatively speaking, than in big regions, so in small regions my weighting factor is much smaller than 1. This is because Spain has regions with many inhabitants and regions with only a few, so a proportional sample is impossible. That is why the sample is relatively bigger in smaller regions. I guess this is also the case in Canada and other federal states.
Just read the SPSS help/reference materials. They include the Help system, the Syntax Reference, and the Complex Samples algorithms, all available in the SPSS installation and also at the SPSS web site. You will have to provide details on the sample design (e.g. number of regions, size of each region, clustering or stratification of the sample within each region, etc.).

HM

(In reply to jmdpulido, Monday, June 13, 2011 09:40, subject: Re: Weighting in Multilevel regression)
In reply to this post by jmdpulido
Dear Rich Ulrich,
Thanks a lot for your help. Indeed, there is no easy answer. You hit the spot, as one of my research goals is to estimate the "variance components" (level 1 and level 2), and weighting affects the results. I will proceed as you suggest and run both models (weighted and unweighted). I also very much appreciate your idea of borrowing the standard errors of the unweighted general estimates in order to explain regional differences.
Kind Regards and thanks a lot.
From: Rich Ulrich <[hidden email]>
To: [hidden email]; [hidden email]
Sent: Mon, 13 June 2011 20:24
Subject: RE: Weighting in Multilevel regression

I believe you have run up against a problem that has no ideal solution.

Weighting, I think, gives you something similar to the problem arising in non-orthogonal two-way designs -- You can get a precise and exact test of what is technically the *wrong* variance components by using the Ns that exist in the ANOVA; or you can get an imprecise and non-exact test of the right variance components by using weights to create orthogonality.

My own experience has been with the two-way designs, and I can't give firm advice, because I've never been stuck with defending an analysis that changes the Ns.

(Be wary when you search the literature: the early, traditional labels of "weighted" and "unweighted" ANOVA are reversed from the intuitive sense. Done by hand, an "unweighted" ANOVA assumed equal Ns by group.)

The more severe the weighting is (further from 1.0 per case), the more distorted the testing must be.

In situations that are somewhat analogous, I have advocated the proposition, "Run it both ways" and work from there. In presentation, tell your audience what you have done, while you do what you can to integrate the results.

In your data, I think I would probably try to "borrow" the size of the confidence intervals from an analysis using the raw means. But your best guide to what has been acceptable in your area would be to look for similar analyses presented in the past.

--
Rich Ulrich
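A toy illustration of "run it both ways", in Python rather than SPSS syntax. It compares a pooled mean, not a multilevel model, so it is only a stand-in for the real analysis, and every number in it is hypothetical. It shows how the same cases give different point estimates depending on whether an oversampled small region is weighted down.

```python
def pooled_mean(values, weights=None):
    """Mean of values; if weights are given, the weighted mean."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical micro-data: (outcome, proportional weight) per respondent.
# The small region is oversampled, so its cases carry weights below 1.
small_region = [(2.0, 0.5)] * 8  # 8 interviews, weight 0.5 each
big_region = [(5.0, 2.0)] * 4    # 4 interviews, weight 2.0 each

cases = small_region + big_region
y = [outcome for outcome, _ in cases]
w = [weight for _, weight in cases]  # weights sum to n = 12 (average 1)

unweighted = pooled_mean(y)   # treats every interview alike
weighted = pooled_mean(y, w)  # restores population proportions

print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

Here the unweighted estimate (3.0) leans toward the oversampled small region, while the weighted one (4.0) reflects the population shares. Reporting both, together with how each was obtained, is the kind of side-by-side presentation suggested above.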