Re: Inter-rater agreement for multiple raters or something else?
Posted by MJury on Nov 04, 2014; 11:02pm
URL: http://spssx-discussion.165.s1.nabble.com/Inter-rater-agreement-for-multiple-raters-or-something-else-tp5727786p5727790.html
Dear Art and David, thanks so much for your interest!
A, B, C, and D are the four possible diagnostic categories, and each rater was asked to choose exactly one of them (the categories are mutually exclusive and exhaustive).
Patients are arranged in rows and raters in columns. I calculated the overall Fleiss kappa across all patients and raters, but I would also be interested in identifying the patients who are the most debatable/controversial from a diagnostic point of view. I am not sure whether Fleiss kappa is the best tool here, since as I understand it its main goal is to assess the reliability of the raters, whereas I am more interested in recognizing the clinical phenotypes that cause disagreement between raters.

Maybe Fleiss's per-subject agreement P_i (the extent to which the raters agree on the i-th subject) would be appropriate? However, the Fleiss kappa Excel spreadsheet I downloaded from Jason E. King's website does not calculate it. Brian, thanks a lot for your kindness; does your macro report P_i for Fleiss kappa?
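In case it is useful while I wait for an answer about the macro: P_i from Fleiss (1971) is easy to compute directly. For subject i rated by n raters, with n_ij raters assigning the subject to category j, P_i = (sum_j n_ij^2 - n) / (n * (n - 1)). Below is a minimal sketch in plain Python (not SPSS syntax); the function name and the example data are invented for illustration only, but it shows how one could rank patients by P_i to find the most controversial ones.

from collections import Counter

def per_subject_agreement(ratings_row):
    """Return Fleiss's P_i for one subject, given that subject's list of ratings."""
    n = len(ratings_row)            # number of raters
    counts = Counter(ratings_row)   # n_ij for each category j actually used
    return (sum(c * c for c in counts.values()) - n) / (n * (n - 1))

# Hypothetical data: rows = patients, columns = raters, values = diagnoses A/B/C/D.
ratings = [
    ["A", "A", "A", "A"],   # full agreement -> P_i = 1.00
    ["A", "B", "A", "B"],   # 2 vs 2 split   -> P_i = 0.33
    ["A", "B", "C", "D"],   # no agreement   -> P_i = 0.00
]

for i, row in enumerate(ratings, start=1):
    print("patient %d: P_i = %.2f" % (i, per_subject_agreement(row)))

The patients with the lowest P_i would then be the ones on which the raters disagree most.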
I would appreciate any comments.
With best regards,
Mack