Rich,
I agree with you wholeheartedly. A lot of information gets lost with global statistics like the ICC or measures of agreement. We had a thread a couple of weeks ago about generating two-rater Fleiss' statistics, and some syntax was developed and shared. We could very well do that for all of these statistics, for raters as well as for categories, as in the sketch below. That would let us identify raters who need further training and categories that could benefit from better definition, or even collapsing. I've always been struck that the IRT folks use this kind of information to expand, collapse, or amend their measures, while that kind of attention generally doesn't happen with other statistics.
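For the rater-level side, a minimal sketch of what the pairwise breakdown could look like, assuming three hypothetical rater variables r1, r2, and r3 (the category-level breakdown would still need the Fleiss syntax from that thread):

* Cohen's kappa for each pair of hypothetical raters r1, r2, r3.
CROSSTABS /TABLES=r1 BY r2 /STATISTICS=KAPPA.
CROSSTABS /TABLES=r1 BY r3 /STATISTICS=KAPPA.
CROSSTABS /TABLES=r2 BY r3 /STATISTICS=KAPPA.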
Art, thanks for remembering Proximities. It's a great resource and includes most of the statistics that Podani's article covers.
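For anyone who hasn't used it, a minimal sketch of the procedure, again assuming hypothetical rater variables r1-r3; CORRELATION tracks covariation between raters, while EUCLID reflects differences in absolute level:

* Proximities computed between rater variables rather than cases.
PROXIMITIES r1 r2 r3
  /VIEW=VARIABLE
  /MEASURE=CORRELATION
  /PRINT=PROXIMITIES.
PROXIMITIES r1 r2 r3
  /VIEW=VARIABLE
  /MEASURE=EUCLID
  /PRINT=PROXIMITIES.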
Brian Dates
From: SPSSX(r) Discussion <[hidden email]> on behalf of Rich Ulrich <[hidden email]>
Sent: Monday, February 5, 2018 4:59:19 PM
To: [hidden email]
Subject: Re: Kappa in SPSS

For two raters, I've long preferred looking at (not the ICC but) the ordinary r (because people are used to its size), along with a paired t-test to check for any difference.
[For 2x2 data, that could be kappa and Kendall's test for changes.]
For the data I usually dealt with ... For 3 or more raters, I looked at them in pairs.
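In syntax, that two-rater approach might look like this, with hypothetical rater variables r1 and r2:

* Ordinary Pearson r between the two raters.
CORRELATIONS /VARIABLES=r1 r2 /PRINT=TWOTAIL NOSIG.
* Paired t-test for a difference in level between the raters.
T-TEST PAIRS=r1 WITH r2 (PAIRED).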
My emphasis is right, I think, whenever you are developing your own ad-hoc scales.
However, for the summaries that get published, or for people using scales developed by others, what is required (for publication) or sufficient (as a cross-check) is an overall number like the ICC.
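For reference, one minimal way to get that overall number, again with hypothetical raters r1-r3 (two-way random effects with absolute agreement is one common choice; the right model depends on the design):

* Overall ICC across raters from RELIABILITY.
RELIABILITY
  /VARIABLES=r1 r2 r3
  /SCALE('raters') ALL
  /MODEL=ALPHA
  /ICC=MODEL(RANDOM) TYPE(ABSOLUTE) CIN=95 TESTVAL=0.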
I think you are right, if you are suggesting that much of the literature on reliability makes it easy for people to overlook or forget the possible complications of differences in level.
-- Rich Ulrich

From: SPSSX(r) Discussion <[hidden email]> on behalf of Nina Lasek <[hidden email]>
Sent: Monday, February 5, 2018 4:57:46 AM
To: [hidden email]
Subject: Re: Kappa in SPSS

Hi Brian,
Can you suggest any references that discuss the difference(s) between "similarity" and "correlation"? Of course, it makes intuitive sense that a strong covariation/correlation between the ratings of two raters might be based on data that are far from similar in their absolute levels. But my impression (I may be wrong here) is that this point is rarely discussed in the literature.

Best,
Nina
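A minimal sketch of that point with made-up data, where rater2 is always exactly 2 points above rater1: the Pearson r (and a consistency ICC) comes out a perfect 1.0, while an absolute-agreement ICC is pulled down by the level difference.

* Hypothetical data: rater2 = rater1 + 2 on every case.
DATA LIST FREE / rater1 rater2.
BEGIN DATA
1 3
2 4
3 5
4 6
5 7
END DATA.
* Perfect correlation despite the constant level difference.
CORRELATIONS /VARIABLES=rater1 rater2.
* Consistency ICC ignores the shift; absolute-agreement ICC penalizes it.
RELIABILITY
  /VARIABLES=rater1 rater2
  /SCALE('pair') ALL
  /MODEL=ALPHA
  /ICC=MODEL(MIXED) TYPE(CONSISTENCY).
RELIABILITY
  /VARIABLES=rater1 rater2
  /SCALE('pair') ALL
  /MODEL=ALPHA
  /ICC=MODEL(MIXED) TYPE(ABSOLUTE).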