Inter-rater reliability statistic - which method?

Inter-rater reliability statistic - which method?

poloboyden
Hello

I wondered if you could help answer a quick (and hopefully easy) question.

I am currently trying to work out which inter-rater reliability method is appropriate for assessing agreement between two raters who each counted the number of times they saw a reference to the self in a set of transcripts.

Thus, the data are counts (treated as continuous). There are two raters, so I am thinking Cohen's kappa should be used, or should it be Pearson's correlation?

Re: Inter-rater reliability statistic - which method?

bdates
In reply to this post by poloboyden
Cohen's kappa is not appropriate here unless all you want is a measure of how much actual agreement there is. There are a number of articles, starting with Krippendorff (1970) and followed by Fleiss and Cohen (1973), on the equivalence of the ICC to agreement statistics when the data are ordinal or interval in nature. I'd recommend using the ICC for your work; it's more widely accepted in the literature than a simple correlation. Others on the list may have alternative views.

Brian
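
For anyone who wants to try the ICC directly on this kind of data, here is a minimal sketch of ICC(2,1), the two-way random effects, absolute agreement, single-rater form, written in plain Python with NumPy. The per-transcript counts are invented purely for illustration; SPSS users should be able to get the same family of coefficients from the ICC option of the RELIABILITY procedure.

import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_transcripts, n_raters) array; here each column is
    one rater's count of self-references per transcript.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-transcript means
    col_means = x.mean(axis=0)   # per-rater means

    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_error = ((x - grand) ** 2).sum() - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-transcript mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical counts: rows are transcripts, columns are the two raters.
counts = [[12, 14], [7, 7], [22, 19], [3, 5], [15, 15]]
print(round(icc_2_1(counts), 3))

With these made-up numbers the coefficient comes out around 0.96. Because this is the absolute-agreement form, a rater who systematically counts higher than the other would pull it down, which is exactly what a consistency-only correlation ignores.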


Re: Inter-rater reliability statistic - which method?

SR Millis-3
I'd suggest you consider using the root mean square differences and concordance correlation  coefficients.  See:

1. Barchard KA (University of Nevada, Las Vegas). Examining the reliability of interval level data using root mean square differences and concordance correlation coefficients. Psychol Methods. 2012 Jun;17(2):294-308. Epub 2011 May 16.

This article introduces new statistics for evaluating score consistency. Psychologists usually use correlations to measure the degree of linear relationship between 2 sets of scores, ignoring differences in means and standard deviations. In medicine, biology, chemistry, and physics, a more stringent criterion is often used: the extent to which scores are identically equal. For each test taker (or other unit of measurement), the difference between the 2 scores is calculated. The root mean square difference (RMSD) represents the average change from 1 set of scores to the other, and the concordance correlation coefficient (CCC) rescales this coefficient to have a maximum value of 1. This article shows the relationship of the RMSD and CCC to the intraclass correlation coefficients, product-moment correlation, and standard error of measurement. Finally, this article adapts the RMSD and the CCC for linear, consistency, and absolute definitions of agreement. (PsycINFO Database Record (c) 2012 APA, all rights reserved.)
 
~~~~~~~~~~~
Scott R Millis, PhD, ABPP, CStat, PStat®
Board Certified in Clinical Neuropsychology, Clinical Psychology, & Rehabilitation Psychology 
Professor
Wayne State University School of Medicine
Email: [hidden email]
Email: [hidden email]
Tel: 313-993-8085
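
The RMSD and CCC described in the abstract are straightforward to compute by hand. Here is a minimal sketch in Python with NumPy, again on invented counts; it implements the basic absolute-agreement forms, not the linear and consistency variants the article also develops.

import numpy as np

def rmsd(x, y):
    """Root mean square difference between two raters' scores."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sqrt(np.mean((x - y) ** 2))

def ccc(x, y):
    """Lin's concordance correlation coefficient (absolute agreement)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.cov(x, y, bias=True)[0, 1]   # population covariance
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

# Hypothetical per-transcript self-reference counts from the two raters.
r1 = [12, 7, 22, 3, 15]
r2 = [14, 7, 19, 5, 15]
print(rmsd(r1, r2), ccc(r1, r2))

Unlike the Pearson correlation, the CCC falls when one rater runs systematically higher or lower than the other, so it rewards scores that are identically equal rather than merely linearly related.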


From: "Dates, Brian" <[hidden email]>
To: [hidden email]
Sent: Tuesday, July 3, 2012 10:16 AM
Subject: Re: Inter-rater reliability statistic - which method?

Cohen's kappa is not appropriate for this, unless you want a measure of
how much actual agreement there is.  There are a number of articles
starting with Krippendorf (1970) and followed by Fleiss and Cohen (1973)
about the equivalence of the ICC to agreement statistics when the data
are ordinal or interval in nature.  I'd recommend using the ICC for your
work.  It's more accepted in the literature generally than simple
correlation. Others on the list may have alternative views.

Brian

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
poloboyden
Sent: Tuesday, July 03, 2012 10:07 AM
To: [hidden email]
Subject: Inter-rater reliability statistic - which method?

Hello

I wondered if you could help answer a quick (and hopefully easy)
question.

I am currently trying to suss out which method of inter-rater
reliability is
the appropriate one for working out the reliability between two raters
who
numbered the amount of time's they saw a reference to the self in a set
of
transcripts.

Thus, it is continuous data. There are two raters - so I am thinking
Cohen's
Kappa should be used, or should it be Pearson's correlation?

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Inter-rater-reliability-st
atistic-which-method-tp5713982.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: Inter-rater reliability statistic - which method?

Rich Ulrich
In reply to this post by poloboyden
Cohen's kappa is mainly useful for 2x2 tables, or as a way of bragging about near-perfect concordance.

To "work out the relationship", the best general approach is to look at the Pearson correlation for the similarity and the paired t-test for systematic difference. For the 2x2 case, use McNemar's test to check the difference.

The ICC assumes a common mean for the raters, so it is less suited for this kind of examination. It is sometimes preferred for the summaries when publishing results of multiple tests.

--
Rich Ulrich
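
As a minimal sketch of that approach in Python with SciPy (the counts below are invented, not the original poster's data): Pearson r for how closely the raters track each other, plus a paired t-test for a systematic difference in level. For a genuine 2x2 case, the mcnemar function in statsmodels would cover the McNemar check.

import numpy as np
from scipy import stats

# Hypothetical per-transcript self-reference counts from the two raters.
r1 = np.array([12, 7, 22, 3, 15])
r2 = np.array([14, 7, 19, 5, 15])

r, r_p = stats.pearsonr(r1, r2)   # similarity: do the raters order transcripts alike?
t, t_p = stats.ttest_rel(r1, r2)  # systematic difference: does one rater count higher overall?

print(f"Pearson r = {r:.3f} (p = {r_p:.4f})")
print(f"Paired t  = {t:.3f} (p = {t_p:.4f})")

A high Pearson r together with a non-significant paired t is the pattern described above: the raters agree in ordering and show no consistent offset.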
