ROC curve analysis for clustered data

drmikegr
Hi,

I would like to run an ROC curve analysis, including a comparison of AUCs for different predictors, on clustered data (multiple regions of interest within each patient).
I have used Complex Samples to fit a logistic regression that accounts for the clustering, but Complex Samples has no ROC curve option.
Is there a script I could use for such an analysis?

Thanks !
M.

PS. SPSS 17

Re: ROC curve analysis for clustered data

Ryan
I suppose one option would be to construct the ROC curve using
predicted probabilities obtained from a logistic regression model
which takes into account the natural hierarchy of your data. Take a
look at the example below.

HTH,

Ryan
--

*Random Effects Logistic Regression Simulation.
set seed 65923454.

new file.
inp pro.

comp ID_level1 = -99.
comp b0 = -99.
comp b1 = -99.
comp x = -99.
comp rand_eff = -99.
comp ID_level2 = -99.

leave ID_level1 to ID_level2.

loop ID_level2= 1 to 100.
 comp b0 = -1.20.
 comp b1 = 1.75.
 comp rand_eff = sqrt(0.50)*rv.normal(0,1).

 loop ID_level1 = 1 to 5.
 comp x = rv.normal(0,1).
 comp eta = b0 + b1*x + rand_eff.
 comp p = exp(eta) / (1+exp(eta)).
 comp y = rv.bernoulli(p).

 end case.
 end loop.
end loop.
end file.
end inp pro.
exe.

delete variables b0 rand_eff eta p.

*Fit Random Effects Logistic Regression Model (REs LRM).
GENLINMIXED
 /FIELDS TARGET=y
 /TARGET_OPTIONS DISTRIBUTION=BINOMIAL LINK=LOGIT
 /FIXED EFFECTS=x USE_INTERCEPT=TRUE
 /BUILD_OPTIONS TARGET_CATEGORY_ORDER=DESCENDING
 /RANDOM USE_INTERCEPT=TRUE SUBJECTS=ID_level2 COVARIANCE_TYPE=VARIANCE_COMPONENTS
 /SAVE PREDICTED_PROBABILITY(PredictedProbability) MAX_CATEGORIES(25).

*Run ROC using predicted probabilities from REs LRM.
ROC PredictedProbability_01 BY y (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=SE COORDINATES
  /CRITERIA=CUTOFF(INCLUDE) TESTPOS(LARGE) DISTRIBUTION(FREE) CI(95)
  /MISSING=EXCLUDE.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

E-mail address for SPSS support?

King Douglas
In reply to this post by drmikegr

Folks,

SPSS (IBM) is being hard to get.  In the old days, I used to get very prompt responses (for licenses, for instance) from [hidden email].

However, I'm recently getting no response from what appears to be their new e-mail address, [hidden email].

Any suggestions?

King Douglas
American Airlines Customer Research


Re: ROC curve analysis for clustered data

drmikegr
In reply to this post by Ryan
Thanks for the response.

Unfortunately I have SPSS 17, which does not include GENLINMIXED.

Saving the predicted probabilities from Complex Samples is feasible, but they are the same as those from ordinary logistic regression, since Complex Samples only adjusts the standard errors of the coefficients (robust estimation). The resulting ROC curve would therefore not really adjust for the clustering of observations.

Would you recommend GENLINMIXED over the Complex Samples procedure for clustering adjustment?

Re: ROC curve analysis for clustered data

Ryan
I'm not terribly familiar with Complex Samples, so I cannot speak as
to which approach would be preferable.  I will say that the predicted
probability values obtained from the random intercept logistic
regression model (GENLINMIXED code) on the simulated data specified in
the previous post will be different from the predicted probability
values obtained from a standard logistic regression model. One model
takes into account nesting of observations while the other does not.
Now, if you decided to use GENLINMIXED, I would advise you to consider
how you'd like the predicted probability values to be calculated. That
is, I'm not sure if you'd prefer to calculate the predicted
probability values from the fixed effects portion of the model only or
if you'd want the random effects to be included in the predicted
probability equation. Since you do not have GENLINMIXED, I'll refrain
from commenting further on this point.

Ryan


Re: ROC curve analysis for clustered data

Kornbrot, Diana
This is a very useful discussion.
It would be very helpful if you could supply the syntax for both
models, for those of us fortunate enough to have GENLINMIXED.
Best
Diana

________________________________________
Professor Diana Kornbrot
email:  [hidden email]
web:    http://web.me.com/kornbrot/KornbrotHome.html
Work
School of Psychology
 University of Hertfordshire
 College Lane, Hatfield, Hertfordshire AL10 9AB, UK
   voice:   +44 (0) 170 728 4626
   fax:     +44 (0) 170 728 5073
Home
 19 Elmhurst Avenue
 London N2 0LT, UK
    voice:   +44 (0) 208 883  3657
    mobile: +44 (0) 796 890 2102
    fax:      +44 (0) 870 706 4997

Re: ROC curve analysis for clustered data

Ryan
Hi Diana,

I posted the simulation, GENLINMIXED, and ROC code a few days ago.
Here's the link to that post:

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1107&L=spssx-l&P=R58732

To complete this illustration, all that remains is to run a standard
logistic regression model on the simulated data and compare the
predicted values across models. It might also be of interest to
examine how well the predicted values from each model perform in the
ROC analysis.
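
A minimal sketch of that comparison, assuming the simulated data from the earlier post are still the active dataset and that GENLINMIXED has already saved PredictedProbability_01. The variable name PredStd is made up for this illustration.

```spss
* Assumes variables y and x exist and GENLINMIXED has saved PredictedProbability_01.
* PredStd is a hypothetical name for the standard-model predictions.

* Standard logistic regression that ignores the clustering.
LOGISTIC REGRESSION VARIABLES y
  /METHOD=ENTER x
  /SAVE=PRED(PredStd).

* Check how closely the two sets of predicted probabilities agree.
CORRELATIONS /VARIABLES=PredStd PredictedProbability_01.

* ROC curves for both sets of predicted probabilities on one chart.
ROC PredStd PredictedProbability_01 BY y (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=SE
  /CRITERIA=CUTOFF(INCLUDE) TESTPOS(LARGE) DISTRIBUTION(FREE) CI(95)
  /MISSING=EXCLUDE.
```

Keep in mind that the ROC procedure treats every row as an independent observation, so the standard errors and confidence intervals it prints are not adjusted for the clustering.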

HTH,

Ryan


Re: ROC curve analysis for clustered data

drmikegr
In reply to this post by Ryan
I am trying out the mixed logistic regression model in SPSS. In your last message you mentioned that the predicted probabilities can be calculated either from the fixed effects portion of the model only or with the random effects included. I would be very much obliged if you could give me some details on both calculations.
How would you change the code you provided so that one version uses only the fixed effects and the other includes the random effects as well?

Re: ROC curve analysis for clustered data

Ryan
From what I can tell, the predicted probability values produced by the SAVE subcommand in my example incorporate the solutions for the random effects. To obtain predicted probabilities from the fixed effects only, plug the fixed effects estimates from your output into the linear predictor ("eta"); e.g.,

eta = b0 + b1*x1 + b2*x2 + ... + bk*xk 

and then apply the inverse logit link function:

predicted probability = 1 / [1 + exp(-eta)]

All of this can be done in a single COMPUTE statement. 
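
For the one-predictor example in this thread, the calculation might look like the sketch below. The coefficient values -1.20 and 1.75 are placeholders (they are the true parameters used in the simulation, not estimates), and PredFixed is a made-up variable name; substitute the intercept and slope reported in your own GENLINMIXED output.

```spss
* Fixed-effects-only predicted probability for the one-predictor example.
* Placeholders: -1.20 (intercept) and 1.75 (slope) are the simulation parameters.
* Replace them with the estimates from the GENLINMIXED fixed effects output.
COMPUTE PredFixed = 1 / (1 + EXP(-(-1.20 + 1.75*x))).
EXECUTE.

* ROC for the fixed-effects-only probabilities alongside the raw predictor.
ROC PredFixed x BY y (1)
  /PLOT=CURVE(REFERENCE)
  /PRINT=SE
  /CRITERIA=CUTOFF(INCLUDE) TESTPOS(LARGE) DISTRIBUTION(FREE) CI(95)
  /MISSING=EXCLUDE.
```

With a single predictor and a positive slope, PredFixed is a monotone increasing function of x, so the two ROC curves coincide; the fixed-effects-only probabilities rank the cases exactly as x does.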

Ryan



Re: ROC curve analysis for clustered data

drmikegr
In reply to this post by Ryan
Thanks for the equations. It seems that the predicted probability from the fixed effects only of the GENLINMIXED approach is the same as the predicted probability from an ordinary logistic regression without random effects.
Would that be expected?

For ROC analysis and for finding the optimal cut-off of a continuous variable, should one use the predicted probability from the fixed effects only or the predicted probability that incorporates the random effects?