Re: Boxplot (seemingly) does not show outlier

Posted by Tom Werner on Apr 05, 2007; 6:23pm
URL: http://spssx-discussion.165.s1.nabble.com/Boxplot-seemingly-does-not-show-outlier-tp1074926p1074932.html

Thank you very much for your reply.

Yes, it's true that 4 points is a very small sample.

Unfortunately, it is in the nature of the real-world situation.

I have an awards program in which each entry is judged by 4 judges. (Each
set of 4 judges is randomly selected from a large pool of judges.)


Right now, for each entry, I review the judges' scores 'by eyeball' and
subjectively identify outliers.

(For example, if four scores were 62, 62, 59, and 33 (on a scale of 7-70), I
would have subjectively said that the '33' is from an overly strict,
'outlier' judge.)


I was wondering whether an SPSS boxplot could be produced for each entry
showing the 4 judges' scores, and thus use the boxplot's 1.5 IQR rule (a
score lying more than 1.5 times the interquartile range beyond the edge of
the box) as a statistical definition of an outlier.
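
To make that rule concrete, here is a minimal Python sketch of the fences a boxplot would draw for the example scores above. It assumes the box edges are Tukey's hinges, which as far as I know is what SPSS uses for boxplots; the function name boxplot_fences is made up for illustration and is not an SPSS feature.

```python
from statistics import median

def boxplot_fences(scores, k=1.5):
    """Return (lower_fence, upper_fence), assuming Tukey's hinges as box edges."""
    xs = sorted(scores)
    half = (len(xs) + 1) // 2           # each half includes the median when n is odd
    lower_hinge = median(xs[:half])
    upper_hinge = median(xs[-half:])
    spread = upper_hinge - lower_hinge  # the IQR as drawn by the box
    return lower_hinge - k * spread, upper_hinge + k * spread

scores = [62, 62, 59, 33]
low, high = boxplot_fences(scores)
print(low, high)                                   # 22.0 86.0
print([s for s in scores if s < low or s > high])  # []
```

Under that assumption the lower fence sits at 22, so the 33 falls inside it and would not be marked as an outlier.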


Note: If a conclusion here is that 5 data points (5 judges) is a better
approach, that is a possibility. That obviously involves more effort (more
judges), but if it produces more rigor it may be worth it.


Regards,

Tom


Tom Werner
Brandon Hall Research
734-433-1299
[hidden email]


-----Original Message-----
From: Ornelas, Fermin [mailto:[hidden email]]
Sent: Thursday, April 05, 2007 12:05 PM
To: Tom Werner; [hidden email]
Subject: RE: Boxplot (seemingly) does not show outlier

Let me take another shot at this. It is not clear what you are trying to do
in your analysis. Having only 4 data points is not a very meaningful way to
conduct statistical research. In most applied statistics classes you will
be reminded that results from a small sample are questionable. None
of the properties usually referred to in regression can be verified (normality,
constant variance, outliers, independence).
That is what I was referring to indirectly when I said "why bother if you only
have 4 observations".

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Tom Werner
Sent: Thursday, April 05, 2007 7:38 AM
To: [hidden email]
Subject: Boxplot (seemingly) does not show outlier

In SPSS (12.0 for Windows, Student version) when I attempt to produce a
boxplot of four data points (62, 61, 59, and 33), SPSS generates the
boxplot...

...but does NOT show 33 as an outlier (even though 33 would seem to be an
outlier relative to 62, 61, and 59 to the casual observer).

(I'm analyzing the scores of sets of four judges and would like to use SPSS
to produce boxplots to indicate 'outlier' judge scores.)

Even if I change the value of 33 to 13, it still does not show in a boxplot
as an outlier.

If I add a fifth data point (with a value as low as 50), 33 shows in a
boxplot as an outlier.

Can anyone explain this?

1.  Is it because of the even number of data points (four), thus requiring
that the median be a calculated value?

2.  Are five data points that much more powerful than four data points at
producing a tighter interquartile range (i.e., a tighter box in the
boxplot), and thus generating an outlier?

3.  Is this perhaps a quirk of SPSS?
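
The cases described above can be checked with a short Python sketch. It is an illustration, not SPSS's own code: it assumes the box edges are Tukey's hinges (the median of each half of the sorted data, with the middle value counted in both halves when n is odd), and the helper name flagged is hypothetical.

```python
from statistics import median

def flagged(scores, k=1.5):
    """Scores lying more than k box-lengths beyond the hinges (assumed Tukey's hinges)."""
    xs = sorted(scores)
    half = (len(xs) + 1) // 2
    lower, upper = median(xs[:half]), median(xs[-half:])
    step = k * (upper - lower)
    return [x for x in xs if x < lower - step or x > upper + step]

print(flagged([62, 61, 59, 33]))      # []   four scores: 33 is not flagged
print(flagged([62, 61, 59, 13]))      # []   still not flagged at 13
print(flagged([62, 61, 59, 33, 50]))  # [33] a fifth score of 50 flags the 33
print(flagged([62, 61, 59, 33, 49]))  # []   a fifth score of 49 does not

# With four scores, no value for the lowest score gets flagged in this sweep.
print(any(flagged([62, 61, 59, low]) for low in range(-1000, 59)))  # False
```

Under this assumption the behaviour comes from how the box is built rather than from a quirk. With four points the lower hinge is the average of the two lowest scores, so pulling the lowest score down also pulls the box and the fence down. With five points the lower hinge is the second-lowest score itself, so the lowest score no longer widens the box.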

Much thanks for any help!
