SPSSX Discussion

(no subject)

Hong Wan

(no subject)

Dear All

May I ask a question: if the normality test (Kolmogorov-Smirnov test) was violated, and the sample size is 200, is it appropriate to use parametric tests?

Thanks!

Best regards,

Lily Wan

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Bruce Weaver

Re: (no subject)

Administrator

Almost certainly, I should think. I gave a presentation recently that you might find interesting. You can download the slides (in PDF format, so no animation) here:

http://www.nosm.ca/uploadedFiles/Research/Northern_Health_Research_Conference/Weaver,%20Bruce_Silly%20or%20Pointless%20Things.pdf

Notice the example on slide 26, and the section beginning on slide 34. Slides 61-64 (in the "cutting room floor" section) may also be of interest.

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Maguin, Eugene

Re: (no subject)

Hi Bruce,

Good, interesting content but what an artful set of slides! That had to take
quite a bit of work to develop the concept, find the images and then get
them into what I assume is PowerPoint.

Thanks, Gene Maguin

Bruce Weaver

Re: (no subject)

Thanks Gene. It was indeed a PowerPoint presentation. I put it together for a health research conference that has one session only (i.e., no dividing things up by topic or discipline), so the qualitative folks, the survey researchers, the docs who are doing some research, the epidemiologists, the medical scientists, and anyone else you can think of are all in together listening to the same talks. It's always a challenge to come up with something that will be accessible to a reasonable number of them, and that a few of them might actually put into practice. But it's a lot of fun too. ;-)

Cheers,
Bruce

Rich Ulrich

(no subject)

In reply to this post by Hong Wan

Well, that is not the first question.

The first question is, "HOW and WHY was the normality test violated?"
Or, "How big is the violation? and can it readily be corrected?"

If the data "ought" to be transformed by log, square root, etc. (which is
not always simple to decide), then transform the data.
If there are a few outliers when an overall transformation is not needed,
what do those outliers mean? - bad data? - the only important data?
If extremes exist because of relatively huge errors at the scale extremes,
would it be sufficient to Winsorize the data (pull in the extremes)?

Then there is the consideration that ordinary t-tests and other ANOVAs are
generally pretty robust against non-normality, especially when Ns are
not too different. What tests are you considering? Equal Ns for two groups?

One guideline is that if you look at the means, and decide that they do
a poor job of reflecting the central tendency or main parameter of the
distributions - poor enough that you would not want to report them - then
you probably ought to transform. The rank-transformation, which gives
you what is most often called the "nonparametric test", is *one* of
the choices that is available. If a "natural" transformation is available,
like using square-root for counts, or log for concentrations, then the
natural transformation is a better choice if it apparently "normalizes".

What non-parametric tests are you considering? For short rating scales,
the existence of ties can be a worse problem for the rank test than
for the ordinary t-test, and the absence of outliers means there is no
real concern about the t-test; outliers distort ANOVA tests by putting a
huge weight on just a few cases. A single outlying pair can make a
correlation huge; a single, huge outlier score in one group can make a
t-test non-significant.

If it seems nice to be able to report the means as a *good* description
of the groups, then the parametric test is apt to be appropriate. But it is
never wrong to check your assumptions and decisions by confirming that
a nonparametric test does not give a different result. The experience of
comparing results is what will help give you a deeper recognition of
what degree or what sorts of non-normality need to be a concern when
you see it in your data.
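
To make the outlier point concrete, here is a minimal Python sketch. It is not from the thread: the two groups are invented, the helper names (welch_t, midranks, rank_t) are hypothetical, and running a t statistic on ranks is used only as a rough stand-in for a rank-based test. It compares the statistic on raw scores and on rank-transformed scores, with and without one extreme value in the higher-scoring group.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two independent samples (mean of y minus mean of x)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


def midranks(values):
    """Rank values 1..n, giving tied values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_t(x, y):
    """t statistic computed on the ranks of the combined sample."""
    r = midranks(x + y)
    return welch_t(r[:len(x)], r[len(x):])


# Invented illustration data, not data from the original poster.
group_a = [10, 11, 12, 13, 14, 15, 16, 17]
group_b = [15, 16, 17, 18, 19, 20, 21, 22]
group_b_outlier = group_b[:-1] + [120]  # replace the largest score with an extreme one

t_clean = welch_t(group_a, group_b)
t_outlier = welch_t(group_a, group_b_outlier)
rank_clean = rank_t(group_a, group_b)
rank_outlier = rank_t(group_a, group_b_outlier)

print(f"raw scores:  t = {t_clean:.2f} (clean), {t_outlier:.2f} (with outlier)")
print(f"rank scores: t = {rank_clean:.2f} (clean), {rank_outlier:.2f} (with outlier)")
```

With these made-up numbers the raw-score t drops from roughly 4.1 to roughly 1.3 once the outlier is added, even though the mean difference grows, because the outlier inflates the variance of its group. The rank-based statistic does not change at all, since the extreme value was already the largest score. Comparing the two results in this way is the kind of check described above.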

--
Rich Ulrich

> Date: Fri, 22 Jul 2011 05:13:11 +0100
> From: [hidden email]
> To: [hidden email]
>
> Dear All
>
> May I ask a question: if the normality test (Kolmogorov-Smirnov test) was violated, and the sample size is 200, is it appropriate to use parametric tests?
>
> Thanks!
>

Bruce Weaver

Re: (no subject)

Rich's reply reminded me of one other problem that sometimes occurs when people use tests of normality as precursors to t-tests or ANOVA: they test for normality with both (or all) groups combined. (IIRC, this was the case for the example I gave in the presentation I posted a link for earlier, but I didn't have time to discuss that in the short time allowed for the talk.) But of course, the assumption is that there should be normality *within* groups, which is another way of saying normality of the residuals.
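
A small Python sketch of why pooling the groups is misleading. Everything here is an assumption made for illustration: two groups of 100 idealized normal scores (evenly spaced normal quantiles rather than real data), group means of 0 and 6, and excess kurtosis used as a crude descriptive stand-in for a formal normality test. The function names are hypothetical.

```python
from statistics import NormalDist, mean


def excess_kurtosis(values):
    """Population excess kurtosis: about 0 for normal data, negative for flat or bimodal data."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def normal_scores(mu, sigma, n):
    """n evenly spaced quantiles of Normal(mu, sigma): an idealized 'normal' sample."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Two groups that are each normal but have very different means (invented values).
group_1 = normal_scores(0.0, 1.0, 100)
group_2 = normal_scores(6.0, 1.0, 100)

# Wrong check: all 200 scores combined, ignoring group membership.
pooled = group_1 + group_2

# Better check: residuals, i.e. each score minus its own group mean.
residuals = ([v - mean(group_1) for v in group_1]
             + [v - mean(group_2) for v in group_2])

pooled_kurtosis = excess_kurtosis(pooled)
residual_kurtosis = excess_kurtosis(residuals)

print(f"excess kurtosis, groups combined: {pooled_kurtosis:.2f}")
print(f"excess kurtosis, residuals:       {residual_kurtosis:.2f}")
```

The combined scores are bimodal, so their excess kurtosis is strongly negative and a normality test on them would likely reject, even though each group is as normal as a sample of 100 can be. The residuals, which remove the group means, sit close to zero.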

HTH.

Swank, Paul R

Re: (no subject)

Really the assumption is that the residuals are normally distributed, which would take into account the groups.

Paul

Dr. Paul R. Swank,
Professor
Children's Learning Institute
University of Texas Health Science Center-Houston

Bruce Weaver

Re: (no subject)

Hi Paul. I thought that's what I said (in the last sentence). But perhaps I was unclear. ;-)

At the risk of being FAR too pedantic this late on a Friday afternoon, we're both wrong. It is actually the *errors* that are assumed to be normally distributed. But the errors are unobservable, so we use the *residuals*, which are observable estimates of the errors. The Wikipedia page on this distinction is pretty good, I think.

http://en.wikipedia.org/wiki/Errors_and_residuals_in_statistics
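
A toy Python illustration of the distinction, under one loud assumption: the true population mean (true_mean below) is treated as known, which it never is with real data. The three scores are invented.

```python
from statistics import mean

# Hypothetical setup: pretend the population mean is known to be 5.
true_mean = 5.0
sample = [4.0, 7.0, 10.0]

sample_mean = mean(sample)

# Errors: deviations from the (normally unobservable) population mean.
errors = [x - true_mean for x in sample]

# Residuals: deviations from the sample mean, which is what we can actually compute.
residuals = [x - sample_mean for x in sample]

print("errors:   ", errors, "sum =", sum(errors))
print("residuals:", residuals, "sum =", sum(residuals))
```

The residuals are forced to sum to zero because they are measured from the sample's own mean; the errors carry no such constraint. That is one visible sign that residuals are estimates of the errors rather than the errors themselves.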

Right...time to pack it in and go home!

Cheers,
Bruce
