rake weighting question


Sebastián Daza
List,
We carried out a survey at the university, stratified by faculty (24) and year (1 to 4). We want to post-stratify by sex while preserving the weights for faculty and for year within each faculty. We used the rake module to create the weights.

We built two weighting variables: "estr", identifying each sample stratum (the combination of faculty and year, 94 categories in total); and "us", identifying sex within each faculty (48 categories in total; we do not weight by sex within each year because some faculties did not have enough cases). We ran the module's rake function with the population data (expressed as proportions):

begin program.
import spss, spssaux, rake
rake.rake(['estr','us'],
[{11:.0217666413566186,
12:.0194043702016367,
13:.0162827976039821,
14:.0149329283725639,
21:.00911161731207289,
22:.0080148485615456,
23:.0100396524086729,
24:.00894288365814562,
31:.00776174808065469,
32:.00539947692567283,
33:.00590567788745465,
34:.00514637644478191,
41:.0345903990550915,
42:.0286003543406732,
51:.0102083860626002,
52:.00860541635029107,
53:.00717118029190922,
54:.00658061250316376,
61:.0153547625073821,
62:.0132455918332912,
63:.0130768581793639,
64:.0126550240445457,
71:.0196574706825276,
72:.0187294355859276,
73:.0209229730869822,
74:.0201636716443095,
81:.00641187884923648,
82:.00615877836834557,
83:.00632751202227284,
84:.00497764279085464,
91:.0205855057791276,
92:.0190669028937822,
93:.0226946764532186,
94:.0222728423184004,
101:.0238758120307095,
102:.0180545009702185,
103:.0184763351050367,
104:.0192356365477094,
111:.00877415000421834,
112:.00793048173458196,
113:.00759301442672741,
114:.00607441154138193,
121:.00413397452121826,
122:.0031215725976546,
123:.00261537163587278,
124:.00295283894372733,
131:.00413397452121826,
132:.00329030625158188,
133:.00354340673247279,
134:.00371214038640007,
141:.00582131106049101,
142:.00539947692567283,
143:.0062431451953092,
144:.00531511009870919,
151:.0344216654011643,
152:.032143761073146,
153:.0350122331899097,
154:.0350966000168734,
161:.0102083860626002,
162:.00700244663798195,
163:.00759301442672741,
164:.0055682105796001,
171:.00548384375263646,
172:.0042183413481819,
173:.00362777355943643,
174:.00303720577069096,
181:.00809921538850924,
182:.00818358221547288,
183:.00928035096600017,
184:.00911161731207289,
191:.00987091875474563,
192:.00877415000421834,
193:.0073399139458365,
194:.00826794904243651,
201:.00826794904243651,
202:.00818358221547288,
203:.0113051548131275,
204:.0106302201974184,
211:.0100396524086729,
212:.00911161731207289,
213:.00809921538850924,
214:.00607441154138193,
221:.00480890913692736,
222:.00438707500210917,
223:.00531511009870919,
224:.00472454230996372,
231:.00362777355943643,
232:.00320593942461824,
233:.00269973846283641,
234:.00278410528980005,
241:.00371214038640007,
242:.00329030625158188,
243:.00278410528980005,
244:.00354340673247279},
 

{11:.032143761073146,
12:.0402429764616553,
21:.0172951995275458,
22:.0188138024128912,
31:.0187294355859276,
32:.00548384375263646,
41:.0307095250147642,
42:.0324812283810006,
51:.0161140639500548,
52:.0164515312579094,
61:.011642622120982,
62:.0426896144436008,
71:.0338310976124188,
72:.0456424533873281,
81:.0189825360668185,
82:.004893275963891,
91:.0388087404032734,
92:.0458111870412554,
101:.0725554711887286,
102:.00708681346494558,
111:.0278410528980005,
112:.00253100480890914,
121:.00354340673247279,
122:.00928035096600017,
131:.00691807981101831,
132:.00776174808065469,
141:.0133299586602548,
142:.00944908461992744,
151:.020501138952164,
152:.116173120728929,
161:.0200793048173458,
162:.0102927528895638,
171:.00649624567620012,
172:.00987091875474563,
181:.0151016620264912,
182:.019573103855564,
191:.0258162490508732,
192:.00843668269636379,
201:.0280941533788914,
202:.0102927528895638,
211:.0175483000084367,
212:.0157765966422003,
221:.0105458533704547,
222:.0086897831772547,
231:.00818358221547288,
232:.00413397452121826,
241:.0109676875052729,
242:.00236227115498186}]
, finalweight='wt', visible=True, delta=1, poptotal=2405)
end program.

comp wt2=wt*(11853/2405).
exe.
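
Because both control sets are nested within faculty, the year proportions and the sex proportions each imply a faculty total, and the two must agree for the targets to be jointly reachable. A minimal sketch of that check follows, using only the values for faculties 1 and 4 copied from the call above. The helper name `faculty_totals` is hypothetical, and the decoding rule (faculty = code // 10) is an assumption read off the way the codes appear to be built.

```python
import math
from collections import defaultdict

def faculty_totals(marginals):
    """Sum control proportions by faculty, assuming code = faculty * 10 + level."""
    totals = defaultdict(float)
    for code, share in marginals.items():
        totals[code // 10] += share
    return dict(totals)

# Subset of the control proportions from the rake call (faculties 1 and 4 only).
estr = {11: .0217666413566186, 12: .0194043702016367,
        13: .0162827976039821, 14: .0149329283725639,
        41: .0345903990550915, 42: .0286003543406732}
us = {11: .032143761073146, 12: .0402429764616553,
      41: .0307095250147642, 42: .0324812283810006}

by_year = faculty_totals(estr)
by_sex = faculty_totals(us)

# Each faculty total implied by the year strata should match the one implied by sex.
for faculty in sorted(by_year):
    agree = math.isclose(by_year[faculty], by_sex[faculty], abs_tol=1e-9)
    print(faculty, round(by_year[faculty], 6), round(by_sex[faculty], 6), agree)
```

Running the same check over the full dictionaries would flag any faculty whose two implied totals disagree.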

The module reproduces the sex distribution within each faculty and the distribution of faculties perfectly, but not the distribution of years within each faculty. Comparing the population with the weighted sample gives the following differences:


                             year
faculty                      1    2    3    4
Agronomía e Ing. Forestal    0    0    0    0
Arquitectura                 0    0    0    0
Arte                         0    0    0    0
Bachillerato                 0    0    0    0
Ciencias Biológicas          0    0    0    0
Construcción Civil           0    0    0    0
Derecho                      9    0   -9    0
Diseño                       0    0    0    0
Cs. Económicas y Admin.      0    8   -8    0
Educación                    0    0    0    0
Enfermería                   0    0    0    0
Física                       2    0    0   -2
Geografía                    0    0    0    0
Historia                     0    0    0    0
Ingeniería                   0    0    0    0
Letras                       0    2    0   -2
Matemáticas                  2    0   -2    0
Medicina                     0    0    0    0
Periodismo                   0    0    0    0
Psicología                   3   -3    0    0
Química                      0    0    0    0
Sociología                   0    0    0    0
Teatro                       0    0    0    0
Trabajo Social               0    0    0    0

Why do these differences exist? What is your opinion of this kind of sample adjustment?
The mean of the weight variable is 4.9, the minimum is 1.4 and the maximum is 13.3.
Thanks and regards.



--
Sebastián Daza Aranzaes
Instituto de Sociología UC
8-471 53 87 / 686 57 20 / Fax 5521834
[hidden email]

Re: rake weighting question

Peck, Jon
It sounds like you really have three dimensions of raking but have to drop to two because the data are insufficient for that.  That suggests that your sample may be very difficult to adjust in all these dimensions.  Not all specifications of control totals can be reached with a given sample.

But are these differences large relative to the population counts?
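
One rough way to answer that is to convert the control proportions back to population counts and divide each reported difference by its cell count. The sketch below assumes the population total is the 11853 used in the wt2 rescaling, and that faculty codes follow the row order of the difference table (Derecho = 7, Cs. Económicas y Admin. = 9, Física = 12, Psicología = 20). The thread does not confirm that coding, so treat the labels as a guess; `relative_difference` is a hypothetical helper.

```python
POPULATION = 11853  # assumed population total, from the wt2 rescaling

# (label, control proportion for the cell, reported difference).
# ASSUMPTION: faculty codes follow the row order of the difference table.
cells = [
    ("Derecho, year 1 (code 71)", .0196574706825276, 9),
    ("Cs. Económicas y Admin., year 2 (code 92)", .0190669028937822, 8),
    ("Física, year 1 (code 121)", .00413397452121826, 2),
    ("Psicología, year 1 (code 201)", .00826794904243651, 3),
]

def relative_difference(share, difference, population=POPULATION):
    """Return the implied population count and the difference as a fraction of it."""
    count = share * population
    return count, difference / count

for label, share, difference in cells:
    count, rel = relative_difference(share, difference)
    print(f"{label}: count ~ {count:.0f}, difference = {rel:.1%}")
```

Under those assumptions every affected cell is off by a few percent of its population count.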

Also, you can adjust the tolerance parameters in the rake model to force more iterations, which will then come closer to the control totals if that is your problem.  You might want to specify that the genlog output be visible in the rake call to see exactly what it is doing.
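
To illustrate how the tolerance affects how closely the control totals are met, here is a minimal sketch of classic raking (iterative proportional fitting) on a toy 2 x 2 table. It is not the rake module's implementation, which works through genlog. The names `rake_table`, `tol` and `max_iter` and the toy numbers are made up for the illustration and are not the module's parameters.

```python
def rake_table(table, row_targets, col_targets, tol=1e-6, max_iter=100):
    """Iterative proportional fitting on a 2-D table of weighted counts."""
    cells = [row[:] for row in table]  # work on a copy
    for iteration in range(1, max_iter + 1):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(cells[i])
            if total > 0:
                cells[i] = [v * target / total for v in cells[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in cells)
            if total > 0:
                for row in cells:
                    row[j] *= target / total
        # Columns now match exactly; measure how far the rows have drifted.
        worst = max(abs(sum(cells[i]) - t) for i, t in enumerate(row_targets))
        if worst < tol:
            break
    return cells, iteration, worst

sample = [[30.0, 10.0], [20.0, 40.0]]
rows, cols = [50.0, 50.0], [60.0, 40.0]

_, loose_iters, loose_gap = rake_table(sample, rows, cols, tol=1.0)
_, tight_iters, tight_gap = rake_table(sample, rows, cols, tol=1e-9)
print("loose:", loose_iters, "iterations, worst row gap", loose_gap)
print("tight:", tight_iters, "iterations, worst row gap", tight_gap)
```

With the loose tolerance the loop stops while one margin is still visibly off, whereas the tight tolerance runs more iterations and closes the gap. That is the pattern to look for if convergence is the cause of the residual differences.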

HTH,
Jon Peck


-----Original Message-----
From: SPSSX(r) Discussion on behalf of Sebastián Daza
Sent: Sat 4/21/2007 11:26 AM
To: [hidden email]
Subject:      [SPSSX-L] rake weighting question
 