With regard to tests of normality such as Kolmogorov-Smirnov, Shapiro-Wilk, Anderson-Darling, etc., is anyone out there using the results to determine whether to perform a parametric (e.g., t-test) or non-parametric (e.g., Mann-Whitney U) test on a dataset? If the data are interval, but are found not normal according to one of these tests, is that enough to discard a parametric statistic? Thanks.
Brian
Brian Dates, M.A. |
NO.
Rarely does the distribution of the raw data matter. In most instances what matters is the distribution of the residuals (errors). What questions are you using statistical methods to answer?
Art Kendall
Social Research Consultants |
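A minimal sketch of the point about residuals, in Python with the standard library only. The group means, SD, sample size and seed are made-up assumptions for illustration, not anything from Brian's data. Two groups have perfectly normal errors but different means, so the pooled raw scores look bimodal (strongly negative excess kurtosis) while the residuals around each group mean look normal (excess kurtosis near 0).

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; roughly 0 for normally distributed data."""
    m = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(2015)  # arbitrary seed, only for reproducibility

# Hypothetical groups: same normal error distribution, different means.
group_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
group_b = [rng.gauss(4.0, 1.0) for _ in range(2000)]

# Raw scores pooled across groups, as a normality test on the raw variable would see them.
raw = group_a + group_b

# Residuals: each score minus its own group mean.
mean_a = statistics.fmean(group_a)
mean_b = statistics.fmean(group_b)
residuals = [v - mean_a for v in group_a] + [v - mean_b for v in group_b]

kurt_raw = excess_kurtosis(raw)
kurt_resid = excess_kurtosis(residuals)
print(f"excess kurtosis, raw scores: {kurt_raw:.2f}")
print(f"excess kurtosis, residuals:  {kurt_resid:.2f}")
```

A normality test run on the pooled raw scores would reject here even though the t-test's assumption about the errors is satisfied.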
In reply to this post by bdates
Hi Brian. My thoughts on testing for normality as a precursor to a parametric test are summarized in a conference presentation I gave a few years ago. A PDF of the slides can be found here:
https://www.nosm.ca/uploadedFiles/Research/Northern_Health_Research_Conference/Weaver,%20Bruce_Silly%20or%20Pointless%20Things.pdf Cheers & Happy New Year. Bruce
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Thanks Art and Bruce. This has been my position all along, especially given the robust nature of the t-test with regard to non-normality. My question was a result of a discussion/disagreement with a colleague with whom I teach stats (me in the Fall, him in the Winter). Bruce...thanks for the presentation. It's great! The listserv ought to download and save it. To all, a Happy New Year.
B |
In reply to this post by Bruce Weaver
nice presentation.
WRT slide 45: of course one can do a simulation with a "pop" proportion of .5 (etc.) and an n of 11, but a sample can never actually hit a .5 proportion when n is odd.
Art Kendall
Social Research Consultants |
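The arithmetic behind that remark can be checked in a few lines of Python. This is only a sketch; n = 11 is the sample size mentioned above, and everything else is illustrative. It lists every proportion a sample of 11 can produce and confirms that .5 is not among them.

```python
from fractions import Fraction

n = 11  # odd sample size from the slide example
half = Fraction(1, 2)

# Every sample proportion that k successes out of n can produce.
possible = [Fraction(k, n) for k in range(n + 1)]

hits_half = half in possible
# The two attainable proportions nearest to .5 (sorted is stable, so ties keep k order).
closest = sorted(possible, key=lambda p: abs(p - half))[:2]

print(hits_half)
print([str(p) for p in closest])
```

The nearest a sample of 11 can get is 5/11 or 6/11, each 1/22 away from .5.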
In reply to this post by Art Kendall
Further, the residuals (differences, or distances, from the parameter) are what the standard errors are made up of.
Art Kendall
Social Research Consultants |
In reply to this post by Bruce Weaver
Bruce,
That is very nice. But you never even mentioned the assumptions of the relevant non-parametric tests that are based on the rank-transformation: continuous data, distributions of similar shape in both samples, and few ties. Some of your examples ("Normal versus skewed") would not be appropriate for testing by ranks.
Likert-type items deserve normal-theory testing for various reasons, including the occasional weird scoring that you can observe as resulting from rank-transforms.
Continuous items with similar skew, etc., usually should be transformed by taking logs or reciprocals (whatever is appropriate) to "normalize". That improves both the metric and the test. I regard rank-testing as a sloppy, time-saving expedient compared to doing a transformation that is apparent. If there is no transformation available, then there is big doubt about whether these data fit the non-parametric assumptions.
-- Rich Ulrich |
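A small sketch of the transformation point, in Python with the standard library only. The lognormal data, sample size and seed are assumptions chosen so that the log is exactly the right transform; real data will rarely be this tidy. It compares sample skewness before and after taking logs.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness (third standardized moment); roughly 0 for symmetric data."""
    m = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)  # arbitrary seed, only for reproducibility

# Hypothetical positively skewed measurements (lognormal by construction).
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_logged:.2f}")
```

When an obvious transform like this exists, the transformed scores can go straight into an ordinary t-test or ANOVA, which is the alternative to rank-testing argued for above.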
In reply to this post by Bruce Weaver
Wonderful presentation Bruce..........thanks for sharing !
|
In reply to this post by Art Kendall
further again.
In my experience, severe departures from normality, e.g., a lot of zeros, are apparent in visualizations of the raw data and of the residuals. YMMV. Of course nothing is perfectly normally distributed. The question might be better phrased as: is the discrepancy from a normal distribution large enough to influence the substantive conclusions? What reasons are there to expect that the population of measurements has some other distribution? E.g., we know beforehand that incomes are skewed. Rich has made some very useful contributions on the idea of transforms to this list over the years. You should check the archives of this list.
Art Kendall
Social Research Consultants |
In reply to this post by Rich Ulrich
Contemplating an inferential t-test for a Likert ITEM suggests that you are using the MEAN of an ordinal variable as a descriptive measure.
This is the statistical equivalent of believing in a flat earth. Move on.
Likert items are ORDINAL. Consequently the only appropriate descriptive measures are the raw probabilities of responses 1 to n, or the cumulative probabilities [<=1, <=2, …, <=n-1].
If you are using raw probabilities, then a chi-square contingency test [Pearson or log likelihood] is good for a single predictor variable, and multinomial for > 1 predictor.
If you are using cumulative probabilities, then ordinal regression [either logit or probit] is good.
SPSS REGRESSION [ordinal or binary logit] is good for between-group designs.
SPSS MIXED is good for designs that include repeated measures, or GENERALIZED linear models with generalised estimating equations.
It's pointless to ask whether ordinal measures are normally distributed. The answer is ALWAYS NO: a higher mean implies negative skew, a lower mean positive skew.
Both normal-based [t-tests] and rank-based [Mann-Whitney] inferential tests on means or ranks are meaningless, as they assume metric data, i.e. that the difference between agree and strongly agree is the same as between neutral and agree. This is a nonsensically improbable hypothesis.
There is no excuse for any SPSS user, or anyone else as there is always R, doing inappropriate t-tests on Likert items. The right tests are easily available.
A user who is unfamiliar with these kinds of tests should consult someone with statistical expertise and ask the RIGHT question, which is ‘how do I analyse this Likert item data’, not the wrong question, ‘how do I test if this data is normally distributed’.
When consulting, it is always a good idea to give the expert the FULL picture of the problem to be solved and the data collected [or, better, to be collected]. Assuming you know the right test and asking how to do that test is a recipe for disaster. If you are unlucky, the expert will tell you how to do the test without probing whether it is the right test.
End of RANT, Happy New Year
NB if you have a Likert SCALE rather than a Likert item, then the assumption of metric properties may be appropriate.
best
Diana
__________________________ Diana Kornbrot 19 Elmhurst Avenue London N2 0LT, UK +44 [0] 208 444 2081 home +44 [0] 7403 18 16 12 mobile skype: kornbrotme |
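For readers unsure what the raw and cumulative probabilities of a Likert item look like in practice, here is a short Python sketch. The response counts are hypothetical, invented only for illustration. The cumulative logits at the end are the quantities a cumulative (ordinal) logit model works with; this is a descriptive sketch, not a fitted model.

```python
import math
from itertools import accumulate

# Hypothetical counts for a 5-point Likert item (1 = strongly disagree ... 5 = strongly agree).
counts = {1: 12, 2: 18, 3: 30, 4: 25, 5: 15}
total = sum(counts.values())

# Raw probability of each response category.
raw_probs = {k: c / total for k, c in counts.items()}

# Cumulative probabilities P(response <= k); accumulate the counts first to avoid float drift.
cum_probs = {k: c / total for k, c in zip(counts, accumulate(counts.values()))}

# Cumulative logits for the n-1 cut points (the last category always has P = 1).
cum_logits = {k: math.log(p / (1 - p)) for k, p in cum_probs.items() if p < 1}

for k in counts:
    print(k, round(raw_probs[k], 2), round(cum_probs[k], 2))
print({k: round(v, 2) for k, v in cum_logits.items()})
```

With five categories there are four cut points, which is why the list of cumulative probabilities above stops at <= n-1.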
Diana,
Haven't we covered this before? There's a good point or two behind your "rant", and I will point out those, plus where I disagree.
1. Single Likert items are generally not a good choice for major analyses. Use composite scores. But if the score is not arguably "interval", it does not qualify for the description "Likert". The same goes for "skewed"; but, even there, it is hard to discern major disadvantages of using ANOVA instead of a logistic approach. On the other hand, it is *easy* to point to means where the meaning is inherent.
2. When reporting on sets of Likert items, you might get away with showing logistic results, which I would generally consider as technically superior.
3. If you do focus on a single Likert item, I expect that you will strongly disappoint your reviewers, editors and readers if you do not provide them with the simple MEANs, despite your animadversion. (We flat-earthers live in small cities where it is safe to ignore the curvature owing to advantages of scale.)
4. I am pleased that you do not like the use of rank-transformed scores for Likert; my main purpose was to deprecate those, which was an over-interpreted recommendation from the 1950s which keeps recurring. I thought Brian was addressing that.
5. And then there is this confusing statement, which I will address with an example:
"Both normal based [t-tests] or rank based [mann-whitney] inferential tests on means or rank are meaningless, as they assume metric data, i.e. difference between agree and strongly agree is same as between neutral and agree. This is a nonsensically improbable hypothesis."
Every 1-d.f. test will use "a metric". The problem with the rank-transform is that, in the cases where it is apt to make a difference, it is liable to use a palpably BAD choice of metric. You can confirm this when you look at the transformed values. (Plus, as it happens, the tests tend to be inaccurate when they are based on "variance estimates adjusted for ties." Conover showed that doing the rank-transform followed by ordinary ANOVA seemed to be generally more accurate.)
The logistic uses a metric which is based on the assumption that distances should be (or can usefully be) defined by the Ns of the sample subsets, using cumulative ranks, and that the "logistic" ought to describe the outcome. (By the way, "probit" starts the same way with ranks, but uses "normal" instead of "logistic" as the basis for its distances.)
One reason that the ANOVA and the logistic agree better is that the logistic makes rather less change to the metric, where the rank-transform (an intermediate step on the way to the logistic) gives results that, well, *I* do not like. But the logistic *definitely* does use a varied metric; that is why it gives (a slight) variation in results.
Here is what the rank-transform does to unequal N, for one skewed example; skewing is where the differences mainly appear. The implicit "metric" is in the final column. The distance between the 0 and 1 responses is used to rescale the responses for scores 2 and 3. This gives a direct comparison to the original 0-3 metric.
Score    N    Ranks     AveRank   Implied metric (0, 1, what?)
0       55    1-55      28        0
1       31    56-86     71        1
2        9    87-95     91        1.47
3        5    96-100    98        1.63
Instead of analyzing (0, 1, 2, 3), the rank-transform uses a metric which works exactly like (0, 1, 1.47, 1.63), since linear transforms are transparent.
On the other hand, even for this amount of skew, the logistic gives the implicit metric as (0, 1, 1.75, 2.5). There is still a decreased gap between the higher numbers, but the logistic "undoes" most of the compression of the range. The logistic would be computed from the percentile of the average rank, log(P/(1-P)), which is too messy to go into.
-- Rich Ulrich
|
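The implied metrics in the rank-transform example can be recomputed with a short Python sketch. The Ns (55, 31, 9, 5) are taken from the post; the percentile convention P = (average rank - 0.5) / N is an assumption on my part, since the post does not say which one was used. Other conventions shift the logit figures slightly.

```python
import math

counts = [55, 31, 9, 5]  # Ns for scores 0, 1, 2, 3 in the example
total = sum(counts)

# Midrank of each score: the average of the ranks occupied by its block of tied cases.
midranks = []
start = 1
for n in counts:
    end = start + n - 1
    midranks.append((start + end) / 2)
    start = end + 1


def rescale(values):
    """Linear rescale so that score 0 maps to 0 and score 1 maps to 1."""
    unit = values[1] - values[0]
    return [(v - values[0]) / unit for v in values]


# Metric implied by analyzing the ranks instead of the scores.
rank_metric = rescale(midranks)

# Metric implied by the logit of the midrank percentile (convention assumed, see above).
percentiles = [(r - 0.5) / total for r in midranks]
logits = [math.log(p / (1 - p)) for p in percentiles]
logit_metric = rescale(logits)

print(midranks)
print([round(v, 2) for v in rank_metric])
print([round(v, 2) for v in logit_metric])
```

Computed this way, the midranks are 28, 71, 91 and 98, the rank metric is about (0, 1, 1.47, 1.63) and the logit metric about (0, 1, 1.75, 2.52), so hand-rounded figures should be read as approximate. Either way the pattern is the same: ranks squeeze scores 2 and 3 together far more than the logit does.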
As a general rule, I try to have the operationalization of a construct (the variable that represents it) be no more coarse than is necessary. When you pre-test an instrument, see whether respondents like those you are going to study can deal with more points on a response scale. I try to use pre-existing summative scales so that there are more legitimate values that the resulting variable may take. [I abhor committing the nefarious median split, which is a totally unnecessary coarsening of measurement.]
If one is concerned about whether to treat a single five-point response scale item as ordinal vs interval, try running CATREG. Use the built-in options to compare the fit under both sets of assumptions. Does any observed difference in results pass the "SO WHAT?" test?
The syntax below simulates an underlying continuous construct that is operationalized as a 5-point scale. The intervals between values are then tweaked. Paste the syntax into the syntax window of a new instance of SPSS and run it. In the output editor, fit regression and loess curves to the graphs. Try scatter plotting different modifications of the variables and, in the output editor, fit regression and loess curves to those as well. The simulation uses only a fairly small set of cases, i.e., 100. Change the loop for ID to try other sample sizes.
* demo that item intervals may not make much practical import.
INPUT PROGRAM.
LOOP id=1 TO 100.
COMPUTE x = rnd(rv.uniform(.5,5.5)).
compute x_sq = x**2.
compute x_cubed = x**3.
compute x_sqrt = sqrt(x).
compute x_spread1.10 = (x-1)+((x-1)*1.10).
compute x_spread1.25 = (x-1)+((x-1)*1.25).
compute x_spread1.50 = (x-1)+((x-1)*1.50).
compute x_spread2 = (x-1)+((x-1)*2.00).
compute x_spread3 = (x-1)+((x-1)*3.00).
compute x_spread4 = (x-1)+((x-1)*4.00).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
do repeat fuzz = .01, .02, .05, .10, .25, .50, 1.00
  / xfuzzed = xfuzzed.01, xfuzzed.02, xfuzzed.05, xfuzzed.10, xfuzzed.25, xfuzzed.50, xfuzzed1.
compute xfuzzed = x + rv.uniform(0,fuzz).
*compute xfuzzed = x + rv.uniform(0,fuzz*x).
end repeat.
FORMATS id (F3.0) x to x_cubed x_spread2 to x_spread4 (F3) x_sqrt x_spread1.10 to x_spread1.50 (f6.2).
FREQUENCIES VARS= x to xfuzzed1.
correlations vars = x_sq to xfuzzed1 with x.
crosstabs x_sq to x_spread4 by x.
*edit graph in output to put in a linear fit line and loess curve.
* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=x x_cubed MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: x=col(source(s), name("x"))
  DATA: x_cubed=col(source(s), name("x_cubed"))
  GUIDE: axis(dim(1), label("x"))
  GUIDE: axis(dim(2), label("x_cubed"))
  ELEMENT: point(position(x*x_cubed))
END GPL.
* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=x xfuzzed1 MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: x=col(source(s), name("x"))
  DATA: xfuzzed1=col(source(s), name("xfuzzed1"))
  GUIDE: axis(dim(1), label("x"))
  GUIDE: axis(dim(2), label("xfuzzed1"))
  ELEMENT: point(position(x*xfuzzed1))
END GPL.
Art Kendall
Social Research Consultants |
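For anyone without SPSS to hand, the core of that demonstration can be approximated in Python. This is a loose sketch under my own assumptions (a balanced random 1-5 item, 100 cases, an arbitrary seed, and only three of the re-spacings), not a port of the syntax above. It re-spaces the five scale points and checks how strongly each re-spaced version still correlates linearly with the original scoring.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(100)  # arbitrary seed, only for reproducibility

# Simulated 5-point item for 100 cases.
x = [rng.randint(1, 5) for _ in range(100)]

# The same responses with the intervals between scale points stretched or squeezed.
rescored = {
    "x_sq": [v ** 2 for v in x],
    "x_cubed": [v ** 3 for v in x],
    "x_sqrt": [v ** 0.5 for v in x],
}

correlations = {name: pearson_r(x, ys) for name, ys in rescored.items()}
for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

Even fairly drastic changes to the spacing leave the linear association with the original 1-5 scoring high, which is the "SO WHAT?" test in miniature. Whether that holds for a particular analysis is still something to check on the actual data.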
BTW do list members prefer the term "nefarious median split" or "invidious median split"?
Art Kendall
Social Research Consultants |