Dear All
May I ask a question: if the normality test (Kolmogorov-Smirnov test) indicates that normality is violated, and the sample size is 200, is it appropriate to use parametric tests?
Thanks!
Best regards,
Lily Wan
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Almost certainly, I should think. I gave a presentation recently that you might find interesting. You can download the slides (in PDF format, so no animation) here:
http://www.nosm.ca/uploadedFiles/Research/Northern_Health_Research_Conference/Weaver,%20Bruce_Silly%20or%20Pointless%20Things.pdf
Notice the example on slide 26, and the section beginning on slide 34. Slides 61-64 (in the "cutting room floor" section) may also be of interest. HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Hi Bruce,
Good, interesting content but what an artful set of slides! That had to take quite a bit of work to develop the concept, find the images and then get them into what I assume is powerpoint.
Thanks, Gene Maguin
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bruce Weaver
Sent: Friday, July 22, 2011 7:14 AM
To: [hidden email]
Subject: Re: (no subject) |
Administrator
|
Thanks Gene. It was indeed a PowerPoint presentation. I put it together for a health research conference that has one session only (i.e., no dividing things up by topic or discipline), so the qualitative folks, the survey researchers, the docs who are doing some research, the epidemiologists, the medical scientists, and anyone else you can think of are all in together listening to the same talks. It's always a challenge to come up with something that will be accessible to a reasonable number of them, and that a few of them might actually put into practice. But it's a lot of fun too. ;-)
Cheers, Bruce
--
Bruce Weaver |
In reply to this post by Hong Wan
Well, that is not the first question.

The first question is, "HOW and WHY was the normality test violated?" Or, "How big is the violation, and can it readily be corrected?"

If the data "ought" to be transformed by log, square root, etc. (which is not always simple to decide), then transform the data. If there are a few outliers when an overall transformation is not needed, what do those outliers mean? - bad data? - the only important data? If extremes exist because of relatively huge errors at the scale extremes, would it be sufficient to Winsorize the data (pull in the extremes)?

Then there is the consideration that ordinary t-tests and other ANOVAs are generally pretty robust against non-normality, especially when Ns are not too different. What tests are you considering? Equal Ns for two groups?

One guideline is that if you look at the means, and decide that they do a poor job of reflecting the central tendency or main parameter of the distributions - poor enough that you would not want to report them - then you probably ought to transform. The rank-transformation, which gives you what is most often called the "nonparametric test", is *one* of the choices that is available. If a "natural" transformation is available, like using square-root for counts, or log for concentrations, then the natural transformation is a better choice if it apparently "normalizes".

What non-parametric tests are you considering? For short rating scales, the existence of ties can be a worse problem for the rank-test than for the ordinary t-test, and the absence of outliers says there is no real concern for the t-test; outliers screw up ANOVA tests by putting a huge weight on just a few cases. -- A single outlier pair can make a correlation huge; a single, huge outlier score in one group can make a t-test non-significant.

If it seems nice to be able to report the means as a *good* description of the groups, then the parametric test is apt to be appropriate. But it is never wrong to check your assumptions and decisions by confirming that a nonparametric test does not give a different result. The experience of comparing results is what will help give you a deeper recognition of what degree or what sorts of non-normality need to be a concern when you see it in your data.

--
Rich Ulrich

> Date: Fri, 22 Jul 2011 05:13:11 +0100
> From: [hidden email]
> To: [hidden email]
>
> Dear All
>
> May I ask for a question: if the normality test (Kolmogorov-Smirnov test) was violated, and the sample size is 200, is it appropriate to use parametric tests?
>
> Thanks!
> |
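For anyone who wants to try the comparison Rich describes, here is a minimal sketch in plain Python (standard library only). Everything in it is an assumption made for illustration: the two small groups are made-up numbers, the function names are hypothetical, Welch's form of the t statistic is just one reasonable choice, and no p-values are computed. It shows one huge outlier shrinking the t statistic even though the mean difference grows, Winsorizing pulling that extreme back in, and the rank-based Mann-Whitney U statistic being unaffected.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (statistic only, no p-value)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def winsorize(xs, k=1):
    """Pull the k smallest and k largest values in to the nearest retained value."""
    ordered = sorted(xs)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(x, low), high) for x in xs]


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a-value above b-value, ties counted as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Made-up scores for two groups of 10 (illustrative only).
group_a = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
group_b = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]

# Hypothetical data-entry slip: the 21 in group A was keyed as 210.
group_a_outlier = group_a[:-1] + [210]
group_a_winsorized = winsorize(group_a_outlier, k=1)

print("t, clean data:        ", round(welch_t(group_a, group_b), 3))
print("t, with outlier:      ", round(welch_t(group_a_outlier, group_b), 3))
print("t, after Winsorizing: ", round(welch_t(group_a_winsorized, group_b), 3))
print("U, clean data:        ", mann_whitney_u(group_a, group_b))
print("U, with outlier:      ", mann_whitney_u(group_a_outlier, group_b))
```

The outlier inflates the variance of group A far more than it moves the mean, which is why the t statistic drops. The U statistic only looks at order, so 210 counts the same as 21.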
Administrator
|
Rich's reply reminded me of one other problem that sometimes occurs when people use tests of normality as precursors to t-tests or ANOVA: I.e., they test for normality with both (or all) groups combined. (IIRC, this was the case for the example I gave in the presentation I posted a link for earlier--but I didn't have time to discuss that in the short time allowed for the talk.) But of course, the assumption is that there should be normality *within* groups, which is another way of saying normality of the residuals.
HTH.
--
Bruce Weaver |
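The point about testing all groups combined can be illustrated with a small sketch, again in plain Python. The setup is entirely hypothetical: two groups of 100 (200 in total, as in the original question), each with an idealized normal shape built from evenly spaced normal quantiles, and group means of 0 and 6. The helper `ks_distance` is my own name; it computes only the Kolmogorov-Smirnov distance D against a normal curve with mean and SD estimated from the data, and does not produce a p-value.

```python
from statistics import NormalDist, mean, stdev


def ks_distance(xs):
    """Largest gap between the empirical CDF and a normal CDF fitted to xs."""
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    ordered = sorted(xs)
    return max(
        max((i + 1) / n - fitted.cdf(x), fitted.cdf(x) - i / n)
        for i, x in enumerate(ordered)
    )


# Idealized normal "shape": 100 evenly spaced quantiles of the standard normal.
n_per_group = 100
standard_normal = NormalDist()
shape = [standard_normal.inv_cdf((i + 0.5) / n_per_group) for i in range(n_per_group)]

# Two hypothetical groups with the same shape but very different means.
group_1 = [0.0 + z for z in shape]
group_2 = [6.0 + z for z in shape]

# Wrong check: all 200 scores pooled, ignoring group membership.
pooled = group_1 + group_2

# Intended check: residuals, i.e. each score minus its own group mean.
mean_1, mean_2 = mean(group_1), mean(group_2)
residuals = [x - mean_1 for x in group_1] + [x - mean_2 for x in group_2]

d_pooled = ks_distance(pooled)
d_residuals = ks_distance(residuals)
print("KS distance, pooled scores:", round(d_pooled, 3))
print("KS distance, residuals:    ", round(d_residuals, 3))
```

The pooled scores are bimodal simply because the group means differ, so they sit far from any single normal curve, while the residuals follow the fitted normal closely. A normality test on the pooled scores would therefore flag a "violation" that says nothing about the assumption the t-test or ANOVA actually makes.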
Really the assumption is that the residuals are normally distributed, which would take into account the groups.
Paul
Dr. Paul R. Swank, Professor
Children's Learning Institute
University of Texas Health Science Center-Houston
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bruce Weaver
Sent: Friday, July 22, 2011 4:21 PM
To: [hidden email]
Subject: Re: (no subject) |
Administrator
|
Hi Paul. I thought that's what I said (in the last sentence). But perhaps I was unclear. ;-)
At the risk of being FAR too pedantic this late on a Friday afternoon, we're both wrong. It is actually the *errors* that are assumed to be normally distributed. But the errors are unobservable, so we use the *residuals*, which are observable estimates of the errors. The Wikipedia page on this distinction is pretty good, I think. http://en.wikipedia.org/wiki/Errors_and_residuals_in_statistics Right...time to pack it in and go home! Cheers, Bruce
--
Bruce Weaver |
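The errors-versus-residuals distinction can be made concrete with a tiny simulation, since in a simulation (unlike real data) the true mean and the errors are known. This is only a sketch under stated assumptions: the true mean of 50, the error SD of 10, the sample size of 20 and the seed are all arbitrary choices of mine.

```python
import random
from statistics import mean

rng = random.Random(2011)  # arbitrary seed so the run is repeatable

true_mean = 50.0  # known only because this is a simulation
errors = [rng.gauss(0.0, 10.0) for _ in range(20)]  # unobservable in real data
scores = [true_mean + e for e in errors]            # what we actually observe

# Residuals use the sample mean, which is all we have in practice.
sample_mean = mean(scores)
residuals = [y - sample_mean for y in scores]

print("sum of errors:   ", round(sum(errors), 6))
print("sum of residuals:", round(sum(residuals), 6))

# Each residual equals the matching error minus the mean of the errors.
mean_error = mean(errors)
largest_gap = max(abs(r - (e - mean_error)) for r, e in zip(residuals, errors))
print("largest |residual - (error - mean error)|:", largest_gap)
```

The residuals sum to zero by construction (up to floating-point rounding), while the errors generally do not. That is one way to see that residuals are estimates of the errors rather than the errors themselves.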