
Re: inter-rater reliability with multiple raters

Posted by bdates on Jun 16, 2014; 2:49pm
URL: http://spssx-discussion.165.s1.nabble.com/inter-rater-reliability-with-multiple-raters-tp5726465p5726484.html

Fleiss’ kappa was designed for nominal data.  If your data are ordinal, interval, or ratio, use the ICC or a related procedure for continuous data.  The ICC provides analyses that have been shown to be analogous to Fleiss’ weighted kappa (Fleiss & Cohen, 1973).  The syntax Max refers to looks like the most promising alternative, as long as you know which model you have.  If you need Fleiss’ kappa syntax because you have nominal data, I can send it to you offline.
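(For anyone following along outside SPSS: Fleiss' kappa is straightforward to compute by hand. Below is a minimal pure-Python sketch, not the SPSS syntax mentioned above; the 3-subject, 4-rater, 2-category table is hypothetical.)

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a subjects-by-categories count table.

    table[i][j] = number of raters who assigned subject i to category j.
    Assumes every subject is rated by the same number of raters.
    """
    n_subjects = len(table)
    n_raters = sum(table[0])  # raters per subject (assumed constant)

    # P_i: observed agreement among raters for subject i
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_i) / n_subjects

    # p_j: overall proportion of all ratings falling in category j
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in table) / total for j in range(len(table[0]))]
    p_e = sum(p * p for p in p_j)  # expected chance agreement

    return (p_bar - p_e) / (1 - p_e)

# Hypothetical data: 3 patients, 4 raters, 2 categories
ratings = [[4, 0], [2, 2], [0, 4]]
kappa = fleiss_kappa(ratings)  # ≈ 0.556
```

(If you do move to Python, statsmodels ships an equivalent in statsmodels.stats.inter_rater.fleiss_kappa.)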


Brian



From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Max Jasper
Sent: Friday, June 13, 2014 11:26 PM
To: [hidden email]
Subject: Re: inter-rater reliability with multiple raters


Check this out; it may help:

ftp://ftp.boulder.ibm.com/software/analytics/spss/support/Stats/Docs/Statistics/Macros/Iccsf.htm
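(The macro above computes ICCs within SPSS. As a language-neutral illustration of what the ICC is doing under the hood, here is a minimal pure-Python sketch of ICC(2,1), the Shrout-Fleiss two-way random-effects, absolute-agreement, single-rater form; the data are hypothetical.)

```python
def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    data[i][j] = rating of subject i by rater j (complete n-by-k grid,
    i.e. every rater scores every subject).
    """
    n = len(data)      # subjects
    k = len(data[0])   # raters
    grand = sum(sum(row) for row in data) / (n * k)

    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical data: 4 patients, each scored by the same 2 raters
scores = [[9, 10], [6, 7], [8, 8], [2, 3]]
icc = icc_2_1(scores)  # ≈ 0.96
```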


Hi everyone! I need help with a research assignment. I'm new to IBM SPSS Statistics, and actually to statistics in general, so I'm pretty overwhelmed.

My coworkers and I created a new observation scale to improve the concise transfer of information between nurses and other psychiatric staff. This scale is designed to facilitate clinical care and outcomes-related research.

Nurses and other staff members on our particular inpatient unit will use standard clinical observations to rate patient behaviors in eight categories:

1. abnormal motor activity,
2. activities of daily living,
3. bizarre/disorganized behavior,
4. medication adherence,
5. aggression,
6. observation status,
7. participation in assessment, and
8. quality of social interactions.

Each category will be given a score of 0-4, and those ratings will be summed to create a total rating. At least two nurses will rate each patient during each shift, morning and evening (so one patient should theoretically have at least four ratings per day).

My assignment is to examine the reliability and validity of this new scale, and determine its utility for transfer of information.

Right now I'm trying to figure out how to examine inter-rater reliability. IBM SPSS doesn't have a built-in procedure to calculate Fleiss' kappa (that I know of), and I'm not sure that's what I should be calculating anyway. I'm confused because there are multiple raters, multiple patients, and multiple dates/times/shifts. The raters differ from day to day, even on the same patient's chart, so there is a real lack of consistency in the data. Sometimes only one rating is done on a shift, and sometimes the nurses skip a shift of rating altogether. Also, lengths of stay differ, so the amount of data collected for each patient varies dramatically.

I've attached a screenshot of part of our de-identified data. Can anyone please help me figure out how to determine inter-rater reliability? (Or if anyone has any insight into how to determine validity, that'd be great too!)

Thanks so much!