Just to piggy-back on Rich Ulrich’s comments: since your data are at least
ordinal, and more likely interval in nature, I would recommend that you
consider the ICC. If you use Fleiss’ kappa at all, it would be to look at
the level of agreement, not the ‘reliability’. One potential advantage is
that Fleiss’ kappa provides information on agreement for each category, so
you could identify the ratings that raters found more difficult. But overall,
Fleiss’ kappa would serve to elucidate the results of the ICC; it should not
be the primary source of information about reliability.
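In case it helps with the SPSS side of the question, here is a rough sketch of
the RELIABILITY syntax for the ICC for one element, assuming the data are laid
out with one row per person being rated and one column per rater. The variable
names (rater1 to rater3), the scale label, and the two-way mixed /
absolute-agreement choices are only illustrative; pick the ICC model that
actually matches your design.

* ICC for one element; one row per ratee, one column per rater (names are illustrative).
RELIABILITY
  /VARIABLES=rater1 rater2 rater3
  /SCALE('Element1') ALL
  /MODEL=ALPHA
  /ICC=MODEL(MIXED) TYPE(ABSOLUTE) CIN=95 TESTVAL=0.

You would repeat this for each element (or wrap it in a macro). As far as I
know, Fleiss’ kappa is not part of the base RELIABILITY procedure, so if you
do want it you would need an extension command or a user-written macro.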
Brian
> Date: Wed, 26 Sep 2012 10:11:53 -0700
> From: [hidden email]
> Subject: Inter rater reliability - Fleiss's Kappa?
> To: [hidden email]
>
> Hi
>
> I am a 4th year medical student, writing my first research paper and have
> very minimal (or zero) experience with statistics!
>
> My project involves the creation of a behavior marking system which has 5
> categories (eg leadership, communication etc). These categories are further
> broken down into subcategories to define the behavior (called 'elements'),
> the number of which is variable from 3-6 per category. The individuals are
> awarded a score of 1-5 for each element, with 5 representing excellent
> performance of the behavioral element. There are 3 separate raters.
>
> I am hoping to assess the inter rater reliability for each element. What is
> an appropriate measurement of this? I have done a little bit of research and
> it would suggest that Fleiss's kappa would be the best as there are 3
> raters. Is this correct, and if so can I use the SPSSX / PASW software to do
> this?
>
> Any help would be very much appreciated!
>...