SPSSX Discussion - Re: SPSS Python Extension for Fleiss Kappa

Posted by bdates on Oct 10, 2017; 7:51pm
URL: http://spssx-discussion.165.s1.nabble.com/SPSS-Python-Extension-for-Fleiss-Kappa-tp5734963p5734992.html

Thank you, Bruce, and please thank Daniel Klein as well. First, let me clarify. The overall kappas, standard errors, etc. produced by each solution are the same. The category kappas produced by each solution are also the same. What differs are the category standard errors. My syntax and the SAS syntax written by Gwet (INTER_RATER.MAC) produce the same category standard errors, which differ from those produced by the SPSS and Stata solutions. Fleiss, Nee, and Landis (1979) updated the formula for the standard error of the overall kappa. It seems that all the solutions have adopted that. There was also an update to the category standard error, which produces one standard error for all categories rather than a separate error for each category. The SPSS Python and Stata solutions have incorporated this. I need to update my syntax and also contact the authors of the other syntax about this issue.

Again, thank you very much for your facilitation in this situation. It has truly made things go much more quickly than they otherwise would have.

Brian

________________________________________
From: SPSSX(r) Discussion [[hidden email]] on behalf of Bruce Weaver [[hidden email]]
Sent: Tuesday, October 10, 2017 3:02 PM
To: [hidden email]
Subject: Re: SPSS Python Extension for Fleiss Kappa

Hi Brian. I also sent Daniel Klein an e-mail alerting him to this thread.
He does not use SPSS, and is probably not going to join this list himself.
But he did ask me to post the following comments he sent me.

Cheers,
Bruce
-------------------------------------

First, note that Gwet (2008, 2014) uses an approach to statistical inference
that is quite different from anything that has been proposed so far in the
area of inter-rater agreement. He calls what is usually done the
"model-based" approach. Here, a hypothetical distribution of the kappa
coefficient under H0 is used to derive test statistics for testing against
0. Such approaches are not necessarily valid for confidence interval
construction (also see Reichenheim 2004). So I suppose there might be a
reason why StataCorp decided not to return a standard error and confidence
interval in their -kap- command. What Gwet (2008, 2014) suggests instead is
a "design-based" approach that is based
on finite sample theory and that is implemented in -kappaetc-. This explains
the difference between the standard error that Brian obtains for combined
kappa (0.05770; which is what you confirm using the -kap- command) and the
SE reported by Gwet's SAS macro (0.08132) and -kappaetc- (0.08223). To say
more about the (small) difference between the latter two, I would need to
know which macro was used to produce the results. Brian states that he used
INTER_RATER.MAC but gives neither the software version nor its source.
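
To make the "design-based" idea more concrete, here is a minimal Python sketch of a linearization (delta-method) standard error for Fleiss' kappa with a constant number of raters per subject. It is an illustration only: it is not a port of -kappaetc- or of INTER_RATER.MAC, it has not been checked against either program, and it should be compared with Gwet (2014) before being relied on. The function name, the `sampling_fraction` argument, and the toy ratings are all assumptions.

```python
import math

def fleiss_design_based_se(counts, sampling_fraction=0.0):
    """Fleiss' kappa with a linearization-based standard error.

    counts[i][k] = number of raters who put subject i in category k.
    Assumes every subject was rated by the same number of raters (m >= 2).
    """
    n = len(counts)
    m = sum(counts[0])
    if any(sum(row) != m for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    q = len(counts[0])

    # Overall proportion of ratings falling in each category.
    pi = [sum(row[k] for row in counts) / (n * m) for k in range(q)]

    # Per-subject observed agreement and per-subject chance agreement.
    pa_i = [sum(x * (x - 1) for x in row) / (m * (m - 1)) for row in counts]
    pe_i = [sum(pi[k] * row[k] for k in range(q)) / m for row in counts]

    pa = sum(pa_i) / n
    pe = sum(p * p for p in pi)
    kappa = (pa - pe) / (1 - pe)

    # Linearized per-subject values; their mean equals kappa.
    kappa_star = [
        (a - pe) / (1 - pe) - 2 * (1 - kappa) * (e - pe) / (1 - pe)
        for a, e in zip(pa_i, pe_i)
    ]
    variance = ((1 - sampling_fraction) / (n * (n - 1))
                * sum((ks - kappa) ** 2 for ks in kappa_star))
    return kappa, math.sqrt(variance)

# Hypothetical toy data: 4 subjects, 3 raters, 2 categories.
ratings = [[3, 0], [0, 3], [2, 1], [3, 0]]
kappa, se = fleiss_design_based_se(ratings)
print(f"kappa = {kappa:.4f}, design-based SE = {se:.4f}")
```

The variance here comes from the spread of per-subject contributions across the sample, whereas the model-based SE is derived from a hypothetical distribution under H0, so the two need not agree.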

[BW: I believe the loop Daniel mentions in the next paragraph is referring
to the Stata output I uploaded here:
http://spssx-discussion.1045642.n5.nabble.com/file/t7186/Fleiss_kappa_problem.txt.]

When you program the loop to get kappa for each category, note that
-kappaetc- does not(!) report the same standard error every time. It differs
for each category. It is the SE you obtain from -kap- that is constant
across categories. As I have said, I do not know whether this SE is
trustworthy for confidence interval construction but it seems the Python
extension uses the very same approach. Note that it is documented in [R]
kappa. However, the SEs reported by -kappaetc- do not match Brian's or those
from the SAS macro. I have no explanation for that. All I can tell you is
that -kappaetc- does nothing special here, i.e., it retains Gwet's
"design-based" approach. It seems the SAS macro and Brian have implemented
another formula to get the SE for each category separately (and perhaps
there is a reason for this). It seems a bit odd to me that the SEs are so
much higher than the combined one, given that they are all based on the same
number of subjects. Also, I find it somewhat counterintuitive to clearly
reject the null for the combined coefficient when only one of the
category-wise coefficients is statistically significantly different from
zero.
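
A minimal Python sketch of the model-based (H0) formulas from Fleiss, Nee, and Landis (1979), as documented in [R] kappa, shows why the -kap- category SE is constant: it depends only on the number of subjects n and the number of raters m. The function name and the toy ratings are hypothetical, and this is not the code used by -kap- or by the Python extension.

```python
import math

def fleiss_model_based(counts):
    """Fleiss' kappa (overall and per category) with SEs under H0.

    counts[i][j] = number of raters who put subject i in category j.
    Assumes every subject was rated by the same number of raters (m >= 2).
    """
    n = len(counts)
    m = sum(counts[0])
    if any(sum(row) != m for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    q = len(counts[0])

    p = [sum(row[j] for row in counts) / (n * m) for j in range(q)]
    pq = [pj * (1 - pj) for pj in p]

    # Category-wise kappas; undefined when a category is never (or always) used.
    kappa_j = []
    for j in range(q):
        if pq[j] == 0:
            kappa_j.append(float("nan"))
            continue
        disagree = sum(row[j] * (m - row[j]) for row in counts)
        kappa_j.append(1 - disagree / (n * m * (m - 1) * pq[j]))

    # Overall kappa is a p_j*q_j-weighted average of the category kappas.
    total_pq = sum(pq)
    kappa = sum(w * k for w, k in zip(pq, kappa_j) if w > 0) / total_pq

    # SE for any single category under H0: no category-specific term at all.
    se_category = math.sqrt(2 / (n * m * (m - 1)))

    # SE for the overall kappa under H0 (q_j - p_j written as 1 - 2*p_j).
    spread = sum(w * (1 - 2 * pj) for w, pj in zip(pq, p))
    se_overall = (math.sqrt(2) / (total_pq * math.sqrt(n * m * (m - 1)))
                  * math.sqrt(total_pq ** 2 - spread))
    return kappa, se_overall, kappa_j, se_category

# Hypothetical toy data: 4 subjects, 3 raters, 2 categories.
ratings = [[3, 0], [0, 3], [2, 1], [3, 0]]
kappa, se_overall, kappa_j, se_category = fleiss_model_based(ratings)
print(f"overall kappa = {kappa:.4f}, SE = {se_overall:.4f}")
print("category kappas:", [round(k, 4) for k in kappa_j])
print(f"SE shared by every category = {se_category:.4f}")
```

Because `se_category` contains no p_j, looping over categories with this approach returns the same SE each time, which matches the identical values in the Standard Error column of the Python extension output.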

I notice that the help file for -kappaetc- might be misleading or incomplete
concerning differences to Stata's -kap- and -kappa-. Here is the already
revised paragraph that will be included in an updated version of -kappaetc-
(hopefully towards the end of the year).

--- Start of excerpt from help for kappaetc ----

Relation to official Stata's kap and kappa commands

The percent agreement that is reported by kappaetc is the same as the
observed agreement returned by Stata's kap command. The latter, however,
does not provide a standard error or confidence intervals for the
coefficient.

For two unique raters and no missing ratings, both Stata's kap and the
kappaetc command estimate the same Cohen's kappa coefficient. The standard
error and p-value will differ between the two results, though. Stata's kap
command implements a model-based (theoretical) approximate formula for the
standard error and reports a one-sided test using the standard normal
distribution. kappaetc implements a design-based formula to obtain the
standard error and reports a two-sided t test with n-1 degrees of freedom.
kappaetc will, additionally, provide a confidence interval.

For more than two (nonunique) raters and no missing ratings (i.e. a constant
number of raters), both Stata's kap (or kappa) command and kappaetc
(possibly with the frequency option) estimate the same Fleiss' kappa
coefficient. kappaetc will, however, not report a kappa value for each
rating category. Standard errors and p-values will differ between the
commands for the same reasons explained in the two unique raters case above.

--- End of excerpt from help for kappaetc ----

A last minor comment on category 4 not being used by either rater. In such a
case it is useful to specify all possible rating categories with the
-categories()- option of -kappaetc-. Failing to do so results in wrong
estimates for Gwet's AC and the Brennan and Prediger coefficient, both of
which depend on the number of rating categories. Note that weighted kappa
might also be off.
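
The dependence on the number of categories can be illustrated with a short Python sketch. It uses the standard chance-agreement terms (1/q for Brennan and Prediger, and the sum of p_k(1 - p_k) divided by q - 1 for Gwet's AC1). The observed agreement, the category proportions, and the function names are made-up values for illustration, not results from the data in this thread.

```python
def brennan_prediger(p_obs, n_categories):
    """Brennan-Prediger coefficient: chance agreement is 1/q."""
    p_chance = 1 / n_categories
    return (p_obs - p_chance) / (1 - p_chance)

def gwet_ac1(p_obs, proportions, n_categories):
    """Gwet's AC1: chance agreement is sum(p_k * (1 - p_k)) / (q - 1).

    An unused category adds 0 to the sum but still counts towards q.
    """
    p_chance = sum(p * (1 - p) for p in proportions) / (n_categories - 1)
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical inputs: 70% observed agreement; a 4th category exists on the
# rating scale but no rater ever chose it.
p_obs = 0.70
used = [0.5, 0.3, 0.2]

bp_3 = brennan_prediger(p_obs, 3)   # q inferred from the observed ratings
bp_4 = brennan_prediger(p_obs, 4)   # q as defined by the rating scale
ac_3 = gwet_ac1(p_obs, used, 3)
ac_4 = gwet_ac1(p_obs, used, 4)

print(f"Brennan-Prediger: q=3 -> {bp_3:.4f}, q=4 -> {bp_4:.4f}")
print(f"Gwet's AC1:       q=3 -> {ac_3:.4f}, q=4 -> {ac_4:.4f}")
```

In both cases, counting only the categories that actually appear in the data understates q and therefore changes the coefficient, which is why declaring the full set of categories matters.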

Best
Daniel

Gwet, K. L. (2014). Handbook of Inter-Rater Reliability. Gaithersburg,
MD: Advanced Analytics, LLC.

Gwet, K. L. (2008). Computing inter-rater reliability and its variance
in the presence of high agreement. British Journal of Mathematical and
Statistical Psychology, 61, 29-48.

Reichenheim, M. E. (2004). Confidence intervals for the kappa
statistic. The Stata Journal, 4, 421-428.

bdates wrote
> Thanks, Bruce. I've sent an email to him. He mentions Gwet in his
> acknowledgments, so it'll be interesting to see what Gwet says about all
> of this.
>
> Brian
>
> ________________________________________
> From: SPSSX(r) Discussion [SPSSX-L@.UGA] on behalf of Bruce Weaver [bruce.weaver@]
> Sent: Monday, October 09, 2017 4:45 PM
> To: SPSSX-L@.UGA
> Subject: Re: SPSS Python Extension for Fleiss Kappa
>
> Hi Brian. The author of kappaetc can be reached via the e-mail address at
> the bottom of that text file I uploaded. You could always ask him directly
> what method(s) he used.
>
> Cheers,
> Bruce
>
>
>
> bdates wrote
>> Bruce,
>>
>> Thanks for this. I'm wondering what formulae they're using for category
>> SEs. I'm going to check with other sources and get back to you. This is
>> certainly a conundrum. We have multiple results from multiple sources. I
>> didn't send this along, but I also used the application from
>> RealStatistics in Excel and got the same results as my syntax and Gwet's
>> SAS solution. I'm using the updated 1979 computation for the kappas and
>> standard errors. Maybe they're using the 1971 version.
>>
>> Brian
>>
>>
>> ________________________________________
>> From: SPSSX(r) Discussion [SPSSX-L@.UGA] on behalf of Bruce Weaver [bruce.weaver@]
>> Sent: Monday, October 09, 2017 2:23 PM
>> To: SPSSX-L@.UGA
>> Subject: Re: SPSS Python Extension for Fleiss Kappa
>>
>> Thanks Brian. I don't know if this will be helpful to you or not, but I've
>> uploaded (in Nabble) a text file containing results from some analyses
>> carried out using kappaetc, a user-written program for Stata.
>> Unfortunately, kappaetc does not report a kappa for each category
>> separately. But with a little programming, I was able to obtain those.
>> However, the way I did it yielded the same SE for every category, and it
>> matched the value shown by the Python extension command for SPSS.
>>
>> Below my output, I appended the help file info for kappaetc (including a
>> reference list), in case it is helpful to you.
>>
>> Cheers,
>> Bruce
>>
>> Fleiss_kappa_problem.txt
>> <http://spssx-discussion.1045642.n5.nabble.com/file/t7186/Fleiss_kappa_problem.txt>
>>
>>
>> bdates wrote
>>> I'm uploading a Word document with the results of my syntax, Kilem Gwet's
>>> SAS macro, and the SPSS Python extension, in that order. I'm also
>>> uploading an SPSS data file with the raw data. Notice the Standard Error
>>> column in the Python extension solution. All values are the same.
>>>
>>> Brian
>>>
>>> Fleiss_kappa_Discrepancies.docx
>>> <http://spssx-discussion.1045642.n5.nabble.com/file/t47633/Fleiss_kappa_Discrepancies.docx>
>>> irr_test_data.sav
>>> <http://spssx-discussion.1045642.n5.nabble.com/file/t47633/irr_test_data.sav>
>>>
>>>
>>>
>>> --
>>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
>>
>> -----
>> --
>> Bruce Weaver
>> bweaver@
>> http://sites.google.com/a/lakeheadu.ca/bweaver/
>>
>> "When all else fails, RTFM."
>>
>> NOTE: My Hotmail account is not monitored regularly.
>> To send me an e-mail, please use the address shown above.
>>
