
Re: Pretest to Posttest: A question of reliability

Posted by Ryan on Dec 29, 2012; 5:12am
URL: http://spssx-discussion.165.s1.nabble.com/Pretest-to-Posttest-A-question-of-reliability-tp5717078p5717165.html

Art,
 
This post is filled with commentary in between code, so please let me know if any part of my response requires clarification. (For those not interested in this discussion of estimating reliability [defined as true score variance / observed score variance] via structural equation modeling, I suggest you stop reading now.)
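
In symbols, the quantity at issue throughout this post is

\rho_{xx} = \frac{\sigma^2_T}{\sigma^2_X} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}

where \sigma^2_T, \sigma^2_E, and \sigma^2_X denote true score, error score, and observed score variance, respectively.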
 
Let's begin by creating data (means, SDs, and rhos) for 8 items (x1, x2, ..., x8) based on an artificial sample of N=500 using the following SPSS code:
 
MATRIX DATA VARIABLES=ROWTYPE_ x1 x2 x3 x4 x5 x6 x7 x8.
BEGIN DATA
N 500 500 500 500 500 500 500 500
MEAN .017 .040 .022 .056 -.019 -.023 .033 .020
SD 1.032 1.032 1.038 1.028 .946 .985 .993 1.070
CORR 1.000
CORR .276 1.000
CORR .255 .332 1.000
CORR .222 .213 .552 1.000
CORR .255 .211 .232 .213 1.000
CORR .192 .235 .240 .182 .269 1.000
CORR .221 .262 .277 .217 .202 .531 1.000
CORR .555 .275 .275 .231 .169 .184 .169 1.000
END DATA.
 
Next, let's estimate reliability by calculating Cronbach's alpha using SPSS code:
 
RELIABILITY
  /VARIABLES=X1 X2 X3 X4 X5 X6 X7 X8
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA
  /STATISTICS=CORR
  /SUMMARY=TOTAL
  /MATRIX IN(*).
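
For reference, the coefficient the RELIABILITY procedure reports is Cronbach's alpha,

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_i}{\sigma^2_X}\right)

where k is the number of items, \sigma^2_i is the variance of item i, and \sigma^2_X is the variance of the composite (total) score.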
 
After running the RELIABILITY procedure above, you should obtain a Cronbach's alpha coefficient of .7441720125. As I mentioned in a previous post, Cronbach's alpha is based on the "essentially tau-equivalent" model. As a result, it can be reproduced by a single-factor confirmatory factor analysis in which the factor loadings are constrained to be equal and the error variances are freely estimated and assumed to be independent, fit using the unweighted least squares (ULS) estimation method (which is analogous to OLS), via the following AMOS code:
 
#Region "Header"
Imports System
Imports System.Diagnostics
Imports Microsoft.VisualBasic
Imports AmosEngineLib
Imports AmosGraphics
Imports AmosEngineLib.AmosEngine.TMatrixID
Imports PBayes
#End Region
Module MainModule
 Public Sub Main()
  Dim Sem As AmosEngine
  Sem = New AmosEngine
  Sem.TextOutput
  AnalysisProperties(Sem)
  ModelSpecification(Sem)
  Sem.FitAllModels()
  Sem.Dispose()
 End Sub
 
 Sub ModelSpecification(Sem As AmosEngine)
  Sem.GenerateDefaultCovariances(False)
 
  Sem.BeginGroup("C:\<specify path>\reliability_example.sav", "reliability_example")
   Sem.GroupName("Group number 1")
   Sem.AStructure("x4 = (Loading) Factor + (1) err4")
   Sem.AStructure("x3 = (Loading) Factor + (1) err3")
   Sem.AStructure("x2 = (Loading) Factor + (1) err2")
   Sem.AStructure("x1 = (Loading) Factor + (1) err1")
   Sem.AStructure("x5 = (Loading) Factor + (1) err5")
   Sem.AStructure("x6 = (Loading) Factor + (1) err6")
   Sem.AStructure("x7 = (Loading) Factor + (1) err7")
   Sem.AStructure("x8 = (Loading) Factor + (1) err8")
 
   Sem.AStructure("Factor (1)")
  Sem.Model("Default model", "")
 End Sub
 
 Sub AnalysisProperties(Sem As AmosEngine)
  Sem.Uls
  Sem.Iterations(50)
  Sem.InputUnbiasedMoments
  Sem.FitMLMoments
  Sem.Standardized
  Sem.Mods(10)
  Sem.Seed(1)
 End Sub
End Module
 
We can take the constrained factor loadings and error variances to calculate Cronbach's alpha in SPSS as follows:
 
* Compute Cronbach's alpha from the constrained loading and the freely estimated error variances; the 2*(0) term reflects the zero error covariances.
compute Rxx_ess_tau_equiv_model = (0.524194608262145*8)**2 / ((0.524194608262145*8)**2 + (.788113964675750 + .788113964675750 + .800509124675750 + .779890444675750 + .618346180675750 + .693504562675750 + .709296914675750 + .867830212675749) + 2*(0)).
execute.
 
After running the code above, you will obtain an estimate of reliability that is equal to Cronbach's alpha coefficient of .7441720125. (It should be noted that one could have estimated reliability directly within AMOS by employing a user-defined estimand.)
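
In general terms, the COMPUTE statement above is an instance of the one-factor composite reliability formula (the same formula restated in my earlier reply, quoted at the bottom of this post):

\rho_{xx} = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^2}{\left(\sum_{i=1}^{k}\lambda_i\right)^2 + \sum_{i=1}^{k}\theta_{ii} + 2\sum_{i<j}\theta_{ij}}

Under the essentially tau-equivalent model every loading equals a common value \lambda, so with k = 8 the numerator reduces to (8*\lambda)^2, which is exactly the (0.524194608262145*8)**2 term above; the error covariances \theta_{ij} are all fixed at zero, hence the 2*(0) term.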
 
The next question, however, is whether there is a way to obtain a more accurate estimate of reliability. Answering it requires that we fit the same CFA model using the maximum likelihood estimation method (in the AMOS code further below, this simply means omitting Sem.Uls from AnalysisProperties, since ML is the AMOS default) to obtain a Chi-Square statistic and other fit indices, and then modify the model for both statistical and substantive reasons. First, let's tackle the statistical side by re-parameterizing the model successively until we achieve a superior fit:
 
1. Baseline Model (equal factor loadings and independent error variances): Chi-Square (df=27)=276.555, p<.001, GFI=.880, CFI=.689, RMSEA=.136, Rxx = .743
 
Again, we calculate reliability using the same equation as before:
 
compute Rxx_baseline_MLE = (.523268189314732*8)**2 / ((.523268189314732*8)**2 + (.762278689905565 + .796470532118462 + .723442742446598 + .780165299728391 + .713850584087045 + .709682946567755 +.711514761829753 + .850962087548053 ) + 2*(0)).
execute.
 
/*Intermediary Post-Hoc Models*/
2. Model with unequal factor loadings: Chi-Square (df=20)=260.078, p<.001, GFI=.885, CFI=.701, RMSEA=.155, Rxx = .746
 
Again, we calculate reliability using the same equation:
 
compute true_score_variance_unequal_loadings_MLE = (0.556735618112637 + 0.513993209453236 + 0.648557304729973 + 0.555545309689387 + 0.390226949410027 + 0.475149789170106 + 0.499079633791371 + 0.556659600195888)**2.
compute error_score_variance_unequal_loadings_MLE = .752939403578407 + .798704932657766 + .654662520799237 + .746039841051326 + .740849095967666 +.742517227880893 +.734996421160121 + .832740289549891 + 2*(0).
compute observed_score_variance_unequal_loadings_MLE = true_score_variance_unequal_loadings_MLE + error_score_variance_unequal_loadings_MLE.
compute Rxx_unequal_loadings_MLE = true_score_variance_unequal_loadings_MLE / observed_score_variance_unequal_loadings_MLE.
execute.
 
3. Model with unequal factor loadings and error cov(1,8): Chi-Square (df=19)=159.068, p<.001, GFI=.923, CFI=.825, RMSEA=.122
 
4. Model with unequal factor loadings and error cov(1,8),cov(6,7): Chi-Square (df=18)=62.859, p<.001, GFI=.966, CFI=.944, RMSEA=.071
/*Final Model*/
5. Model with unequal factor loadings and error cov(1,8),cov(6,7),cov(3,4): Chi-Square (df=17)=18.196, p=.377, GFI=.991, CFI=.999, RMSEA=.012, Rxx =.648
 
Finally, we calculate reliability from the best fitting model as follows:

compute true_score_variance_final_model_MLE=(0.504038022382608 + 0.567074781382506 + 0.590454496251624 + 0.453948802985055 + 0.417199949916064 + 0.437615415101698 + 0.459268052742018 + 0.484263617778839)**2.
compute error_score_variance_final_model_MLE=.808839623994243 + .741320142649511 + .726652599858375 + .848600916270107 + .719070369781090 + .776777298466618 + .773149757731824 + .908098948497306 + 2*(.367540217384463 + .317352227947740 + .319805177554936).
compute observed_score_variance_final_model_MLE = true_score_variance_final_model_MLE + error_score_variance_final_model_MLE.
compute Rxx_final_model_MLE = true_score_variance_final_model_MLE / observed_score_variance_final_model_MLE.
execute.

The models stated above are nested, which allows us to test whether each modification yields a significant improvement in fit via a likelihood ratio test. But as most data analysts would agree, such modifications are post hoc and should not be solely statistically driven; they should also be substantively driven. With that in mind, suppose the unidimensional construct is depression, and that: (a) items 1 and 8 are self-dislike and self-criticism, respectively; (b) items 6 and 7 are loss of pleasure and loss of interest, respectively; and (c) items 3 and 4 are fatigue and loss of energy, respectively. Would we be surprised to observe that these pairs of items covary above and beyond what would be expected given the unidimensional construct they are intended to measure? Certainly not! Therefore, given the improvement in model-to-data fit as well as a rationale for allowing covarying errors (shared item content), a test developer may very well be comfortable using the reliability estimate from the final model.
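
For completeness, here is a minimal sketch of that chi-square difference test in SPSS syntax, comparing Models 4 and 5 above (the variable names are mine; SIG.CHISQ is SPSS's built-in upper-tail chi-square probability function):

* Likelihood ratio (chi-square difference) test of Model 4 vs. Model 5.
compute delta_chisq = 62.859 - 18.196.
compute delta_df = 18 - 17.
* Upper-tail p-value for the chi-square difference on delta_df degrees of freedom.
compute p_lrt = sig.chisq(delta_chisq, delta_df).
execute.

The difference of 44.663 on 1 df is significant at p < .001, supporting the addition of the error covariance between items 3 and 4.
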
The AMOS code for the final model is presented below:
 
#Region "Header"
Imports System
Imports System.Diagnostics
Imports Microsoft.VisualBasic
Imports AmosEngineLib
Imports AmosGraphics
Imports AmosEngineLib.AmosEngine.TMatrixID
Imports PBayes
#End Region
Module MainModule
 Public Sub Main()
  Dim Sem As AmosEngine
  Sem = New AmosEngine
  Sem.TextOutput
  AnalysisProperties(Sem)
  ModelSpecification(Sem)
  Sem.FitAllModels()
  Sem.Dispose()
 End Sub
 
 Sub ModelSpecification(Sem As AmosEngine)
  Sem.GenerateDefaultCovariances(False)
 
  Sem.BeginGroup("C:\<specify path>\reliability_example.sav", "reliability_example")
   Sem.GroupName("Group number 1")
   Sem.AStructure("x4 = Factor + (1) err4")
   Sem.AStructure("x3 = Factor + (1) err3")
   Sem.AStructure("x2 = Factor + (1) err2")
   Sem.AStructure("x1 = Factor + (1) err1")
   Sem.AStructure("x5 = Factor + (1) err5")
   Sem.AStructure("x6 = Factor + (1) err6")
   Sem.AStructure("x7 = Factor + (1) err7")
   Sem.AStructure("x8 = Factor + (1) err8")
 
   Sem.AStructure("err1 <--> err8")
   Sem.AStructure("err6 <--> err7")
   Sem.AStructure("err3 <--> err4")
 
   Sem.AStructure("Factor (1)")
  Sem.Model("Default model", "")
 End Sub
 
 Sub AnalysisProperties(Sem As AmosEngine)
  Sem.Iterations(50)
  Sem.InputUnbiasedMoments
  Sem.FitMLMoments
  Sem.Standardized
  Sem.Mods(10)
  Sem.Seed(1)
 End Sub
End Module
 
Ryan
 
On Sat, Dec 22, 2012 at 6:45 AM, Art Kendall <[hidden email]> wrote:
>
> Ryan
> Do you have an example set of syntax to do this?  Did you use OMS?
>
> Art Kendall
> Social Research Consultants
>
> On 12/21/2012 11:38 PM, R B wrote:
>
> Eins,
> Reliability = true score variance / observed score variance
> where
> observed score variance = true score variance + error score variance
> Within a one-factor confirmatory factor analytic modeling framework, you can estimate true score variance and error score variance as follows:
> estimated true score variance = [sum(factor loadings)]^2
> estimated error score variance = sum(error variances) + 2*[sum(error covariances)]
> estimated reliability = estimated true score variance / (estimated true score variance + estimated error score variance)
> The formula above employed on data from a single testing occasion will yield a more accurate estimate of composite score reliability than Cronbach's alpha.
>  
> Reference: Brown, T. A. (2006). Confirmatory factor analysis for applied research (D. A. Kenny, Series Ed.). New York: The Guilford Press.
>  
> Estimating test-retest reliability using a structural equation model is another matter for another time.
> Ryan
> On Fri, Dec 21, 2012 at 9:35 AM, E. Bernardo <[hidden email]> wrote:
>>
>> Dear RB, Ulrich, et al.
>>
>> Sorry for the poor English.
>>
>> Let me rephrase the scenario. Our variable is latent, with 10 items. The ten-item, five-point Likert-type questionnaire was administered to a sample on two different occasions, F1 then F2. So we have two correlated factors, F1 and F2, and we want to treat them as latent variables using AMOS 20. We expect the 10 items to load significantly (p<.05) on F1 and then on F2. However, the actual data showed that some items have nonsignificant (p>.05) factor loadings on F1 and F2; thus, the set of items that loaded on F1 is different from the set of items that loaded on F2. Our question was: can we proceed to correlate F1 and F2? Is this not a measurement problem?
>>
>> Eins
>>
>>
>> B <[hidden email]>
>> To: [hidden email]
>> Sent: Thursday, December 20, 2012 9:16 PM
>> Subject: Re: Pretest to Posttest: A question of reliability
>>
>> My responses are interspersed below.
>>
>> On Thu, Dec 20, 2012 at 10:12 PM, E. Bernardo <[hidden email]> wrote:
>>
>> Dear Everyone,
>>
>> We ran a pretest-posttest analysis for a unidimensional scale with 10 items (say, Q1, Q2, ..., Q10). Using the pretest data, only four items (Q2, Q3, Q4, Q5) were significant, while using the posttest data six items (Q1, Q2, Q6, Q8, Q9, Q10) were significant.
>>
>>  
>> You are providing conflicting information above. You state that you ran a pretest-posttest analysis for a "unidimensional scale with 10 items," which suggests to me that you derived a composite score to use as the dependent variable (e.g., you computed a sum or mean across all items for each subject). However, you go on to suggest that you performed pre-post analyses per item. Please clarify.
>>  
>>
>> Noticed that the scale has a different set of items in the pretest and posttest scenario.
>>
>>  
>>  What do you mean that the scale has a different set of items during both measurement periods? Why?
>>  
>>
>> Our question is, is the scale not reliable?
>>  
>>
>> The reliability of composite scores on a measuring instrument and/or reliability on change scores are unrelated to what you've been discussing thus far, in my estimation.
>>
>>  
>>
>>  
>>
>> Suppose we extend the scenario above to more than two measures (say, Pretest, Posttest1, Posttest2, Posttest3, Posttest4), so that we will use Latent Growth Modeling (LGM) to model the latent change. Is it a requirement in LGM that all the scale indicators/items are significant across tests?
>>  
>>
>> Technically, just because you have more than two measurement points does not make the analysis an LGC (latent growth curve) model. You can have an LGC with only two points at which each subject was measured. Anyway, the answer to your final question is no.
>>
>>  
>>
>> Thank you for your inputs.
>>
>> Eins
>>
>>
>>
>>
>
>