
Re: Boxplot (seemingly) does not show outlier

Posted by Spousta Jan on Apr 05, 2007; 4:35pm
URL: http://spssx-discussion.165.s1.nabble.com/Boxplot-seemingly-does-not-show-outlier-tp1074926p1074933.html

Hi,

If you have only four points, both methods (boxplot or mean +- 2 sd) are
worthless because they never show outliers regardless of the positions of
the points. (Boxplots need at least 5 points, mean +- 2 sd needs at least 6
points to be able to detect one single outlier in some cases).
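
The mean +- 2 sd limit can be checked numerically. The sketch below is only an illustration: it assumes the sample standard deviation (n - 1 denominator), and the function names are made up for this example. Under that assumption no point can lie more than (n - 1)/sqrt(n) standard deviations from the mean, and that bound first exceeds 2 at n = 6.

```python
import math
import statistics


def z_bound(n):
    """Largest possible |z| for any point in a sample of size n,
    assuming z uses the sample sd (n - 1 denominator)."""
    return (n - 1) / math.sqrt(n)


def max_abs_z(values):
    """Largest absolute z-score actually observed in the sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample sd, n - 1 denominator
    return max(abs(v - mean) / sd for v in values)


if __name__ == "__main__":
    # The bound stays below 2 until the sample has six points.
    for n in range(3, 8):
        print(n, round(z_bound(n), 3))

    # The four judge scores from the original question.
    print(round(max_abs_z([62, 61, 59, 33]), 3))
```

For the four scores 62, 61, 59 and 33 the largest z-score stays below the n = 4 bound of 1.5, so the 2 sd rule cannot flag 33 however far it is moved.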

And this is in fact OK: the sample of four is too small to estimate the
"normal" behavior of the population correctly - therefore we are not
able to tell the regular points from outliers.

Of course, if you have specific prior information about the
distribution (Bayesian approach), you can sometimes detect an outlier
even in a sample of one.

Regards

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Ornelas, Fermin
Sent: Thursday, April 05, 2007 5:15 PM
To: [hidden email]
Subject: Re: Boxplot (seemingly) does not show outlier

What I would recommend is just a simple plot of the residuals vs. fitted
values or residuals against the predictors. If you have an outlier it
will show in the plot. If a residual is more than two standard deviations
from zero then it may be an outlier.

A normal probability plot will also show if a residual is an outlier.


Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Tom Werner
Sent: Thursday, April 05, 2007 7:38 AM
To: [hidden email]
Subject: Boxplot (seemingly) does not show outlier

In SPSS (12.0 for Windows, Student version) when I attempt to produce a
boxplot of four data points (62, 61, 59, and 33), SPSS generates the
boxplot...

...but does NOT show 33 as an outlier (even though 33 would seem to be
an outlier relative to 62, 61, and 59 to the casual observer).

(I'm analyzing the scores of sets of four judges and would like to use
SPSS to produce boxplots to indicate 'outlier' judge scores.)

Even if I change the value of 33 to 13, it still does not show in a
boxplot as an outlier.

If I add a fifth data point (with a value as low as 50), 33 shows in a
boxplot as an outlier.
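
The behavior described above can be reproduced with a short sketch. It assumes the box is drawn from Tukey's hinges (the median of each half of the sorted data, with the overall median included in both halves when n is odd) and that outliers are points more than 1.5 box lengths beyond a hinge. Whether SPSS 12.0 computes its boxplots exactly this way is an assumption here, and the function names are invented for illustration.

```python
import statistics


def tukey_hinges(values):
    """Lower and upper hinge: medians of the lower and upper halves
    of the sorted data (the median is shared when n is odd)."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2
    return statistics.median(xs[:half]), statistics.median(xs[n - half:])


def boxplot_outliers(values, k=1.5):
    """Points lying more than k box lengths beyond either hinge."""
    lower, upper = tukey_hinges(values)
    spread = upper - lower
    low_fence = lower - k * spread
    high_fence = upper + k * spread
    return [v for v in values if v < low_fence or v > high_fence]


if __name__ == "__main__":
    print(boxplot_outliers([62, 61, 59, 33]))      # four points
    print(boxplot_outliers([62, 61, 59, 13]))      # 33 changed to 13
    print(boxplot_outliers([62, 61, 59, 33, 50]))  # fifth point added
```

With four points the lowest value is itself half of the lower hinge, so moving it further out drags the hinge and the fence along with it. With five points the lower hinge is the second-smallest value, so the lowest point no longer affects the box.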

Can anyone explain this?

1.  Is it because of the even number of data points (four), thus
requiring that the median be a calculated value?

2.  Are five data points that much more powerful than four data points
at producing a tighter interquartile range (i.e., a tighter box in the
boxplot), and thus generating an outlier?

3.  Is this perhaps a quirk of SPSS?

Much thanks for any help!
