Dear listers,
We have a dataset of about 4,500 respondents, weighted by a weighting variable that is standardized to the number of cases, so that the weighted and the unweighted dataset contain the same number of cases.

Nonparametric tests such as Kruskal-Wallis, however, report a "wrong" (too large) number of cases for the weighted data.

Does this mean that nonparametric tests cannot or should not be used with weighted datasets?

Thanks in advance

Andreas

--
Andreas H. Schneider
Dipl.-Sozialwirt

Institut für empirische Soziologie
an der Friedrich-Alexander-Universität Erlangen-Nürnberg

Marienstr. 2
90402 Nürnberg

Tel.: 0911 23565 -41
Fax: 0911 23565 -50
Nonparametric tests require "whole" cases. When non-integer weights are used with any of the tests generated by the NPAR TESTS command, they are randomly rounded up or down to create integer weights. If you have lots of cases you might not even notice, except that re-running the test can give slightly different results unless you set the seed on the SET command.
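The random rounding described above can be illustrated with a short Python sketch. This is not the NPAR TESTS implementation: the function name `randomly_round_weights`, the rule "round up with probability equal to the fractional part", and the simulated weights are all assumptions made for illustration only.

```python
import math
import random


def randomly_round_weights(weights, seed=None):
    """Round each fractional weight up or down at random.

    Assumption (not taken from SPSS documentation): a weight is rounded
    up with probability equal to its fractional part, so the expected
    rounded weight equals the original weight.
    """
    rng = random.Random(seed)
    rounded = []
    for w in weights:
        base = math.floor(w)
        frac = w - base
        rounded.append(base + (1 if rng.random() < frac else 0))
    return rounded


# Hypothetical weights, rescaled so that they sum to the sample size.
source = random.Random(0)
raw = [source.uniform(0.4, 1.6) for _ in range(4500)]
scale = len(raw) / sum(raw)
weights = [w * scale for w in raw]

# Different seeds give different integer weights and a different weighted N.
for seed in (1, 2, 3):
    total = sum(randomly_round_weights(weights, seed=seed))
    print(f"seed={seed}: weighted N after rounding = {total}")
```

Passing the same seed twice reproduces the same rounded weights, which is the point of fixing the seed before re-running a test.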
I can't speak to the statistical question of weights and non-parametric tests.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andreas Schneider
Sent: Monday, August 07, 2006 7:19 AM
To: [hidden email]
Subject: Weighting and Nonparametric Tests
I think ViAnn is correct. This is not the only instance in which fractional weights cause a similar problem. However, this should not create a large difference in the number of cases: because the rounding is random, cases rounded up should be (approximately) offset by cases rounded down, and the final difference should be small or nil, even with relatively small samples. Perhaps Andreas could explain his case in somewhat fuller terms.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Beadle, ViAnn
Sent: Monday, August 07, 2006 9:55 AM
To: [hidden email]
Subject: Re: Weighting and Nonparametric Tests
Dear ViAnn, Hector and others,
Thank you for your first comments.

We are working with a dataset of 4,500 novice drivers. Running the K-W test on the weighted data inflates the number of cases by about 100. Our main problem is the question of whether the significance test using weighted data or the one using unweighted data is the correct one. The p-values differ, and sometimes a result is significant with the weighted data but not with the unweighted data.

Thanks in advance for your help.

Greetings
Andreas

Hector Maletta wrote:
> Perhaps Andreas may explain his case in somewhat fuller terms.
In reply to this post by Andreas Schneider-5
At 07:42 AM 8/7/2006, you wrote:
>Using K-W-test with weighted data expands the number of cases by about 100.

Perhaps a stupid question, but did you "norm" or "scale" the weights first? Some software, e.g., SAS, does this automatically; SPSS does not, at least as far as I know. That is, the weights should sum to N (4,500 in this case). If the weights were designed to project frequencies to some population, they would not have been normed or scaled. If they have not been, you need to convert them.

Jeff
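The norming step Jeff describes amounts to multiplying every weight by N divided by the sum of the weights. A minimal Python sketch follows; the function name `normalize_weights` and the example projection weights are hypothetical and not taken from the thread.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of cases."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    factor = n / total
    return [w * factor for w in weights]


# Hypothetical projection weights: each respondent stands for
# several hundred people in the population.
projection_weights = [250.0, 400.0, 150.0, 200.0]
normed = normalize_weights(projection_weights)

print(normed)       # relative sizes of the weights are preserved
print(sum(normed))  # approximately 4, the number of cases
```

Normed weights keep the ratios between cases but no longer inflate the case count, so the weighted N stays at the sample size apart from any rounding applied later.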
Jeff,
The original query explained that the weights were non-inflationary, i.e. standardized to the sample size.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jeff
Sent: Monday, August 07, 2006 12:57 PM
To: [hidden email]
Subject: Re: Weighting and Nonparametric Tests
In reply to this post by Andreas Schneider-5
A difference of 100 cases in 4,500, i.e. about 2%, looks like the likely effect of rounding, so you should not worry too much about it. The results, i.e. the decision based on the K-W test, would most probably have been the same if the weighted number of cases had been 4,500 instead of 4,600 (unless you are almost exactly on the edge of non-significance, in which case you would still end up near the edge, but probably on the other side of it).

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andreas Schneider
Sent: Monday, August 07, 2006 10:42 AM
To: [hidden email]
Subject: Re: Weighting and Nonparametric Tests