SPSSX Discussion - Re: test-retest reliability and Spearman-Brown split half coefficient

Posted by statisticsdoc on Jan 25, 2007; 4:24pm
URL: http://spssx-discussion.165.s1.nabble.com/test-retest-reliability-and-Spearman-Brown-split-half-coefficient-tp1073459p1073467.html

Paul,

I think that you are right to consider Yes No Don't Know as nominal. Don't
Know is not an intermediate category between Yes and No, the way that
Neutral is an intermediate between Agree and Disagree.

I would suggest that you start with computing the raw percent agreement
between time points for each item.
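
As a rough illustration of the raw percent agreement for a single item, here is a minimal Python sketch. The response lists are made-up values for 9 respondents (not the data discussed in this thread), and the function name percent_agreement is only an illustrative choice.

```python
# Hypothetical answers to one item from 9 respondents (illustrative only).
time1 = ["Yes", "No", "Yes", "Don't Know", "No", "Yes", "Yes", "No", "Don't Know"]
time2 = ["Yes", "No", "No", "Don't Know", "No", "Yes", "Yes", "Yes", "Don't Know"]


def percent_agreement(first, second):
    """Percentage of respondents who gave the same answer on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    # Count respondents whose Time 1 and Time 2 answers match exactly.
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


agreement = percent_agreement(time1, time2)
print(f"{agreement:.1f}% agreement")
```

Running this once per item gives one agreement figure per question, which is easy to report alongside any index of association.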

You want to do more than test whether the ratings are significantly related
(which would need a larger sample, since 9 will not give you much power); you
want to assess the strength of the association between ratings at different
time points.
In Crosstabs, if you tabulate Time 1 versus Time 2 and request PHI as a
statistic, you will get phi together with Cramer's V (an index of the
association between nominal items). Cramer's V varies between 0 and 1. There
are other indices
of association for these data, but I think that V is the most readily
interpretable. V can be viewed as the association between variables as a
proportion of their maximum possible association.
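
To make the definition of V concrete, here is a small Python sketch that computes it from paired nominal responses using the usual formula V = sqrt(chi-square / (n * (k - 1))), where k is the smaller of the number of row and column categories. It is an illustration under stated assumptions, not a reproduction of the Crosstabs internals: the function name cramers_v and the example data are hypothetical, and only categories that actually occur in the data are counted.

```python
from collections import Counter
from math import sqrt


def cramers_v(first, second):
    """Cramer's V for two paired lists of nominal responses."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    n = len(first)
    cells = Counter(zip(first, second))  # observed Time 1 x Time 2 counts
    rows = Counter(first)                # Time 1 marginal counts
    cols = Counter(second)               # Time 2 marginal counts
    k = min(len(rows), len(cols))
    if k < 2:
        raise ValueError("each occasion needs at least two observed categories")
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = cells.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical answers to one item from 9 respondents (illustrative only).
time1 = ["Yes", "No", "Yes", "Don't Know", "No", "Yes", "Yes", "No", "Don't Know"]
time2 = ["Yes", "No", "No", "Don't Know", "No", "Yes", "Yes", "Yes", "Don't Know"]
print(round(cramers_v(time1, time2), 3))
```

Note that V measures association rather than agreement: a respondent group that consistently swapped Yes and No would still produce V = 1, which is one reason to look at it alongside the raw percent agreement.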

HTH,

Stephen Brand

For personalized and professional consultation in statistics and research
design, visit
www.statisticsdoc.com

-----Original Message-----
From: Paul Mcgeoghan [mailto:[hidden email]]
Sent: Thursday, January 25, 2007 10:51 AM
To: Statisticsdoc; [hidden email]
Subject: RE: test-retest reliability and Spearman-Brown split half coefficient

Stephen,

Thanks, I was also thinking along the lines of Spearman rank-order as a
possibility. She is only interested in comparing the single Likert items.

However, she also has the problem of the Yes, No and Don't Know questions,
which to me would be considered nominal rather than Likert scale, so what
does she do for these types of questions?

Paul

==================
Paul McGeoghan,
Application support specialist (Statistics and Databases),
University Infrastructure Group (UIG),
Information Services,
Cardiff University.
Tel. 02920 (875035).

>>> "Statisticsdoc" <[hidden email]> 25/01/2007 15:45:25 >>>
Paul,

I imagine that you have suggested that more data be collected (a sample of
nine is not enough to accurately estimate test-retest reliability).

Having said that, for the Likert items, you want to compute the correlation
between two time points, not the split-half reliability (which applies to
ratings that are summed into scale scores at the same point in time). Is
your customer interested in knowing about the test-retest reliability of
single items, or the test-retest reliability of a score that has been
computed from these items? For single Likert items, a Spearman rank-order
correlation would be appropriate, given the ordinal nature of the data. For
computing the test-retest reliability of the scale total, you could use
Pearson's correlation if you are willing to assume that the sums behave like
interval level data (this is a topic that is frequently discussed on this
list, as you know), otherwise stick with Spearman's coefficient.
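
For a single Likert item, the Spearman coefficient is just Pearson's correlation applied to the ranks of the Time 1 and Time 2 ratings, with tied ratings sharing the average of the ranks they occupy. The Python sketch below illustrates that idea; the function names and the example ratings are hypothetical and are not taken from this thread.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical 1-5 Likert ratings on one item from 9 respondents (illustrative only).
item_time1 = [4, 2, 5, 3, 3, 1, 4, 2, 5]
item_time2 = [4, 3, 5, 3, 2, 1, 5, 2, 4]
print(round(spearman(item_time1, item_time2), 3))
```

Calling pearson directly on the summed scale totals at the two time points would give the Pearson version mentioned above, under the interval-level assumption.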

HTH,

Stephen Brand

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Paul Mcgeoghan
Sent: Thursday, January 25, 2007 10:03 AM
To: [hidden email]
Subject: test-retest reliability and Spearman-Brown split half coefficient

Hi,

I have been reading a lot of websites relating to reliability and agreement
this week and have a customer who has carried out the following:

She has 9 people who answered nominal questions (Yes, No, Don't Know) on 2
separate occasions. She is interested in knowing whether the respondents
reliably answer the questions the same over the 2 time periods.

She also has a number of Likert questions where the same 9 respondents
answered the questions at time 1 and time 2, and wants to see if they have
reliably answered each question the same over the 2 time periods.

From what I have read, this seems to be referred to as test-retest
reliability, and a correlation between each pair of questions seems to be the
way to approach it.

So for the Likert scale questions, could I just use Spearman's correlation
coefficient?
For the nominal questions, what can I use?

I have also seen reference to the Spearman-Brown split-half coefficient
(Analyse > Scale > Reliability Analysis, with the Split-half model).
Can this be used to compare each pair of questions at the 2 time intervals,
and does the data have to be Likert scale in this instance?

Hope I have explained the problem clearly.

Thanks,
Paul