|
Hi list,
I have a question about the raking procedure. I have a national-level contingency table of occupations by industries, and from it I want to impute metropolitan-area-level cell values of job categories per industry, given that I have the marginal totals of the contingency table. A statistician told me that I can use this method to impute cell counts for the metropolitan area. Can I run this or a similar procedure in SPSS 16? Is there another method in SPSS to do that?

Thank you and enjoy the leap day!
Melike

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
Hi Melike,
This can be done very easily with Quantum (rim weighting procedure) or with SPSS, starting with version 14 and the Python programmability extension. Search the forum for "rim" or "raking" weighting and you will find some very useful explanations from Jon Peck regarding the implementation of the raking procedure in SPSS through Python.

Hth,
Vlad

On Sat, Mar 1, 2008 at 12:43 AM, Melike Findikoglu <[hidden email]> wrote:
> Hi list,
> I have a question on raking procedure. [...] |
|
Hi all,
I just found an interesting free article about this topic. The hyperlink and abstract are below.

Cheers!!
Albert-Jan

http://www.jos.nu/Articles/abstract.asp?article=192081

Graham Kalton and Ismael Flores-Cervantes (2003). Weighting Methods. Journal of Official Statistics, Vol. 19, No. 2, pp. 81-97.

Abstract: Weighting adjustments are commonly applied in surveys to compensate for nonresponse and noncoverage, and to make weighted sample estimates conform to external values. Recent years have seen theoretical developments and increased use of methods that take account of substantial amounts of auxiliary information in making these adjustments. The article uses a simple example to describe such methods as cell weighting, raking, generalised regression estimation, logistic regression weighting, mixtures of methods, and methods for restricting the range of the resultant adjustments. It also discusses how auxiliary variables may be chosen for use in the adjustments and describes some applications.

Keywords: Calibration; generalised regression estimation; poststratification; raking; trimming weights.

--- vlad simion <[hidden email]> wrote:
> Hi Melike,
> This can be done very easy with Quantum (rim weighting procedure) [...] |
|
In reply to this post by Melike Findikoglu
Hi Vlad, thank you for the note. I read Jon Peck's comment on loglinear models. I tried that but couldn't get the estimation to work. I don't have the whole dataset, just the aggregated count data for occupation types by industries (a 14 x 20 crosstabulation). Could the reason be that I don't have the whole dataset? Do I need to use population totals (US level) or sample-level totals (Chicago level) for weighting?

Thank you,
Melike |
|
Raking is typically used to adjust cell weights to one or more marginals. If you don’t have any cell weights to start with, all you could do in practice would be to assume independence and multiply the marginal probabilities. I'm not clear on what data you are starting with.
It is also possible for raking to fail, mainly due to empty cells: if you have no plumbers in your dataset, raking won't give you anyone to call to fix a leak. :-(

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Melike Findikoglu
Sent: Saturday, March 08, 2008 9:28 AM
To: [hidden email]
Subject: Re: [SPSSX-L] Raking or similar sample weighting

Hi vlad, thank you for the note. [...] |
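The "assume independence and multiply the marginal probabilities" fallback mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the function name `independence_table` and the margins used below are hypothetical, not taken from the BLS data discussed in the thread.

```python
def independence_table(row_totals, col_totals):
    """Expected cell counts when rows and columns are assumed independent."""
    grand_total = sum(row_totals)
    if abs(grand_total - sum(col_totals)) > 1e-9 * max(grand_total, 1.0):
        raise ValueError("row and column totals must sum to the same grand total")
    # Cell (i, j) = N * (r_i / N) * (c_j / N) = r_i * c_j / N
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical margins: 3 occupations (rows) by 2 industries (columns).
occupation_totals = [50.0, 30.0, 20.0]
industry_totals = [60.0, 40.0]
estimate = independence_table(occupation_totals, industry_totals)
```

The resulting table reproduces both sets of margins exactly, but it contains no information about how occupations and industries are associated, which is why it is only a last resort.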
|
In reply to this post by Melike Findikoglu
Hi Jon,
I read that raking, or iterative proportional fitting (IPF), can also be used to estimate cell counts. My dataset is the aggregated table from BLS of IT jobs by industry at the national level. I also have the group totals of job occupations for the Chicago area (marginal totals) and the group totals of the IT workforce by industry (marginal totals). I don't have the cell counts for each occupation by industry at the Chicago level, and my aim is to estimate the count for each IT job in a particular area (the cell numbers). So, starting from national-level marginal totals and Chicago-level marginal totals, I would like to estimate the cell counts for this contingency table. (I don't have any other variables because I don't have the dataset; I could add some variables at the industry level, but that will not help.)

My problems: it is hard to claim independence between industry and occupation type. The only argument here is that the IT job profile (the proportions for each job) is the same at the national and regional levels for a given industry. Yes, there are some missing cells; there I impute the value 0.0001 to bypass the problem with logs. Am I overstretching raking / IPF / post-stratification or sample weighting? Should I use national-level or regional-level marginal totals to weight? Simply put, the problem is that I don't have the whole dataset (unit = company); instead I have the aggregated table (unit = industry).

I appreciate your thoughts.
Melike |
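The setup described in this post (a national occupation-by-industry table plus regional marginal totals) can be sketched as iterative proportional fitting with the national table as the starting point. This is a minimal sketch in plain Python, not the SPSS implementation; the function name `ipf` and all numbers are hypothetical. It rests on the assumption stated in the post, namely that the regional association between occupation and industry resembles the national one. If that assumption does not hold, the fitted cells will still match the margins but will not be meaningful estimates.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Iterative proportional fitting of a 2-D seed table to target margins."""
    table = [[float(x) for x in row] for row in seed]
    for _ in range(max_iter):
        # Scale each row so that it sums to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [x * target / row_sum for x in table[i]]
        # Scale each column so that it sums to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match; stop once the rows also match.
        row_error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_error < tol:
            break
    return table


# Hypothetical national table: 3 occupations (rows) by 2 industries (columns).
national = [[400, 100], [200, 300], [50, 150]]
# Hypothetical regional margins; both sets must add up to the same total.
regional_occupation_totals = [30.0, 20.0, 10.0]
regional_industry_totals = [35.0, 25.0]

regional = ipf(national, regional_occupation_totals, regional_industry_totals)
```

Because each step only multiplies whole rows or whole columns by a constant, the fitted table keeps the odds ratios of the national table while taking its margins from the regional totals. The regional totals are therefore the control totals, and the national table supplies the interior structure.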
|
I'm still not clear on what information you have about the joint distribution of industries and occupations, but I think the answer is none. You are certainly right not to claim that these are independent, but raking cannot come up with the joint distribution in the absence of something to start with.
Typically, at least in my experience, you start with some estimate of the joint distribution in the form of a contingency/crosstab table. Raking can then adjust the marginals of that table to match given control totals and distribute the adjustments back across the interior cells. But if you don't have a joint distribution to start with, I don't see how raking can help. Usually raking is used to adjust for nonresponse or other factors that make the table totals nonrepresentative, but in situations where you have a joint distribution to start with.

If you do have that information, BTW, the raking procedure as implemented in the SPSS Developer Central rake module will take care of the empty cells problem. It is not necessary to introduce artificial nonzero counts.

Perhaps someone knows of a way to handle this in a raking procedure, but I don't.

HTH,
Jon Peck

-----Original Message-----
From: Melike Findikoglu [mailto:[hidden email]]
Sent: Sunday, March 09, 2008 4:59 PM
To: [hidden email]; Peck, Jon
Subject: Re: Raking or similar sample weighting

hi Jon, I read that raking or iterative proportional fitting can also be used for estimation of the cell counts. [...] |
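Two points from this reply can be checked with a small plain-Python sketch: raking a seed that carries no joint information simply returns the independence table, and a zero cell in the seed stays zero without any artificial 0.0001 counts. The function name `rake`, the fixed number of sweeps, and all numbers are hypothetical; this is not the code of the SPSS rake module.

```python
def rake(seed, row_targets, col_targets, sweeps=200):
    """Alternately rescale rows and columns of `seed` toward the targets."""
    table = [[float(x) for x in row] for row in seed]
    for _ in range(sweeps):
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:  # an all-zero row cannot be rescaled
                table[i] = [x * target / total for x in table[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:  # an all-zero column cannot be rescaled
                for row in table:
                    row[j] *= target / total
    return table


rows, cols = [50.0, 30.0, 20.0], [60.0, 40.0]

# 1) A flat seed has no joint information, so raking yields r_i * c_j / N.
flat = rake([[1, 1], [1, 1], [1, 1]], rows, cols)

# 2) A seed with one empty cell: the zero is preserved by the rescaling.
with_zero = rake([[40, 10], [20, 30], [15, 0]], rows, cols)
```

In the second case the margins are still matched because the other cells absorb the adjustment. If the zeros made the target margins unreachable (for example a positive target for an all-zero row), no amount of rescaling would converge, which is the "no plumbers" situation described earlier in the thread.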
