Dear All,
My null model for the probability of being satisfied (y=1) in the j regions has the following form:

log(p/(1-p)) = gamma(00) + u(0j)

I estimate the null model, so I get estimates of gamma(00) and of the variance of u(0j).

I have read in Goldstein 2010 a method for estimating the ICC through simulation in a multilevel logistic regression:

1) I simulate a large number m of random draws from the normal distribution N(0, var(u(0j))).
2) For each of these random numbers, I calculate a p*(ij), using p* = exp(gamma(00) + u*(0j)) / (1 + exp(gamma(00) + u*(0j))).
3) Then I calculate the Bernoulli variance p*(1-p*) for each row.
4) The expectation of all these Bernoulli variances should be the variance component for level 1, as Goldstein says v(1) = E(v(1)ij).
5) Now he says that the estimator for the level 2 variance v(2) is v(2) = var(p*(ij)), which I don't know how to calculate, as it is a variance of many (i x j) probabilities.

This is my question: how do I calculate v(2)?

I paste my SPSS syntax below. The results I get from my model are gamma(00) = -0.205464 and var(u(0j)) = 0.14829.

So my syntax for the simulation is:

--------------------
* RV.NORMAL takes a mean and a standard deviation, so the variance 0.14829 is square-rooted below.
new file.
inp pro.
loop ID = 1 to 5000.
comp simul = rv.normal(0, sqrt(0.14829)).
end case.
end loop.
end file.
end inp pro.
exe.

COMPUTE pstar = (EXP(simul-0.205464)/(1+ EXP(simul-0.205464))).
EXECUTE.
compute varl1 = pstar*(1-pstar).
exe.

FREQUENCIES VARIABLES=varl1 pstar
  /FORMAT=NOTABLE
  /STATISTICS=STDDEV VARIANCE MEAN MEDIAN
  /ORDER=ANALYSIS.
--------------------

Thanks in advance to all.
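The five steps above can be sketched outside SPSS as follows. This is a minimal Python illustration, not Goldstein's own code: the function name `simulate_variance_partition`, the seed, and the choice of the population variance are assumptions made for the example. It uses the two estimates quoted in the post. The point it illustrates is that, in a null model, p* depends only on the simulated u*(0j), so v(2) is simply the variance of the single column of simulated p* values.

```python
import math
import random
import statistics

GAMMA00 = -0.205464  # fixed intercept quoted in the post
VAR_U0 = 0.14829     # level-2 variance (logit scale) quoted in the post


def simulate_variance_partition(gamma00, var_u0, m=5000, seed=12345):
    """Return (v1, v2, icc) on the probability scale for a null model.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    sd_u0 = math.sqrt(var_u0)  # gauss() expects a standard deviation
    # Steps 1 and 2: draw u* and convert each draw to a probability p*.
    p_star = []
    for _ in range(m):
        u_star = rng.gauss(0.0, sd_u0)
        p_star.append(1.0 / (1.0 + math.exp(-(gamma00 + u_star))))
    # Steps 3 and 4: level-1 variance is the mean Bernoulli variance.
    v1 = statistics.fmean(p * (1.0 - p) for p in p_star)
    # Step 5: level-2 variance is the variance of the p* column itself.
    v2 = statistics.pvariance(p_star)
    return v1, v2, v2 / (v1 + v2)


if __name__ == "__main__":
    v1, v2, icc = simulate_variance_partition(GAMMA00, VAR_U0)
    print(f"v1 = {v1:.5f}, v2 = {v2:.5f}, ICC = {icc:.4f}")
```

In terms of the SPSS syntax in the post, v(1) corresponds to the MEAN reported for varl1 and v(2) to the VARIANCE reported for pstar, so the ICC would be VARIANCE(pstar) / (VARIANCE(pstar) + MEAN(varl1)).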
Dear all,
In point 4, where I wrote v(1)=exp(v(1)ij), I meant the expectation: v(1) = Expectation(v(1)ij) = E(v(1)ij).
In reply to this post by jmdpulido
I'm not familiar with the Goldstein 2011 reference, but perhaps I can still help you. First off, I'd probably write the simulation code as:

--------------------
*Random Effects Logistic Regression Simulation.
set seed 65923454.

new file.
inp pro.

comp ID_level1 = -99.
comp b0 = -99.
comp rand_eff = -99.
comp ID_level2 = -99.

leave ID_level1 to ID_level2.

loop ID_level2 = 1 to 200.
comp b0 = -0.40.
comp rand_eff = sqrt(0.50)*rv.normal(0,1).

loop ID_level1 = 1 to 50.
comp eta = b0 + rand_eff.
comp p = exp(eta) / (1+exp(eta)).
comp y = rv.bernoulli(p).
end case.
end loop.
end loop.
end file.
end inp pro.
exe.

delete variables b0 rand_eff eta p.
---------------------

As you can see in the simulation code above, the variance component [on the logit scale] has been specified to be 0.50. You could estimate the variance component by fitting the model via the GENLINMIXED procedure on the simulated data as follows:

---------------------
*Fit Random Effects Logistic Regression Model.
GENLINMIXED
  /FIELDS TARGET=y
  /TARGET_OPTIONS DISTRIBUTION=BINOMIAL LINK=LOGIT
  /FIXED USE_INTERCEPT=TRUE
  /BUILD_OPTIONS TARGET_CATEGORY_ORDER=DESCENDING
  /RANDOM USE_INTERCEPT=TRUE SUBJECTS=ID_level2 COVARIANCE_TYPE=VARIANCE_COMPONENTS.
---------------------

Regarding the ICC, I do recall Snijders and Bosker (1999) providing the following ICC formula for logit models:

ICC = V / ( V + 3.29 )

Employing this formula using the variance component estimate derived from GENLINMIXED, we obtain an ICC of:

ICC = 0.49 / (0.49 + 3.29) = 0.130

HTH,

Ryan

On Fri, Mar 25, 2011 at 6:53 AM, jmdpulido <[hidden email]> wrote:
> This is my question: how do I calculate the v(2)?

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
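For readers without SPSS, the data-generating process and the latent-scale ICC arithmetic in the reply above can be sketched in Python. This is an illustrative translation only: the function names `simulate_clusters` and `latent_icc` are hypothetical, and reusing the seed value does not reproduce the SPSS random stream. It does not fit the mixed model; it only generates data with the same structure (200 level-2 units, 50 level-1 units each, intercept -0.40, logit-scale variance 0.50) and evaluates ICC = V / (V + pi^2/3).

```python
import math
import random


def latent_icc(var_u0):
    """Latent-variable ICC: V / (V + pi^2 / 3), with pi^2 / 3 about 3.29."""
    return var_u0 / (var_u0 + math.pi ** 2 / 3.0)


def simulate_clusters(b0=-0.40, var_u0=0.50, n_level2=200, n_level1=50,
                      seed=65923454):
    """Generate (ID_level2, ID_level1, y) rows from a random-intercept logit model."""
    rng = random.Random(seed)
    rows = []
    for id_level2 in range(1, n_level2 + 1):
        # One random intercept per level-2 unit, shared by its level-1 rows.
        rand_eff = math.sqrt(var_u0) * rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(b0 + rand_eff)))
        for id_level1 in range(1, n_level1 + 1):
            y = 1 if rng.random() < p else 0  # Bernoulli draw with probability p
            rows.append((id_level2, id_level1, y))
    return rows


if __name__ == "__main__":
    data = simulate_clusters()
    share = sum(y for _, _, y in data) / len(data)
    print(f"rows = {len(data)}, observed share of y=1 = {share:.3f}")
    print(f"latent ICC at V = 0.49: {latent_icc(0.49):.3f}")
```

The simulated rows could then be passed to any mixed-model routine to recover an estimate of V, which is the role GENLINMIXED plays in the reply.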
Dear Ryan,
Thanks for your answer. I always learn from your syntax.

I've been reading much more about variance decomposition in multilevel logistic models. I will post the main conclusions, in case they are useful to anyone. There are several methods.

1) The latent variable approach (Snijders & Bosker, 1999): you assume that your observed outcome Y = 0 or Y = 1 is really a dichotomization of a continuous latent variable, and that this latent variable follows a standard logistic distribution. The variance of a standard logistic distribution is pi squared divided by three (about 3.29). This is a reasonable approach in the absence of overdispersion in the level-1 variance. The level-2 variance is estimated by the model by maximum likelihood. This is the most popular method, as it is very easy to compute:

ICC = var(u(0j)) / (var(u(0j)) + (pi^2)/3)

where var(u(0j)) is the variance of the random level-2 intercepts.

Nevertheless, this method assumes that the level-1 variance is fixed in the absence of overdispersion, which is not always reasonable (see Larsen 2006: New measures for understanding the multilevel regression model). Moreover, the level-2 variance var(u(0j)) is measured on the (squared) latent variable scale, so it is not easy to interpret in terms of p. In fact, for p near 0.5 the logistic function is very steep, so small changes on the latent scale correspond to large changes in p, which sometimes leads to wrong interpretations.

2) The linear probability function approach (Goldstein, 2004, and Goldstein, 2010): it consists of estimating a multilevel linear probability model with random effects. This way the multilevel model estimates both a level-1 variance and a level-2 variance, measured in (squared) probabilities. This method is also straightforward; nevertheless, it has the drawback of assuming a linear probability model. When running a null (fully unconditional) model, which depends only on the level-2 unit each individual belongs to, it usually gives plausible results. Of course, when more level-1 or level-2 predictors are added to the model, some problems arise, especially regarding the range of the right-hand side of the estimated equation, which is not constrained to lie between 0 and 1.

3) The Taylor linearization approach (Goldstein, 2010): it consists of a first-order Taylor expansion of the model. This linearization allows the researcher to compute estimates of the level-1 and level-2 variances, both measured on the (squared) probability scale, based on the estimates of the common fixed intercept gamma(00) for all regions and the variance of the random level-2 disturbances var(u(0j)). This approach gives results very similar to those obtained using technique 2.

4) Simulation approach (Goldstein, 2010): it consists of simulating a large number of values of u(0j) drawn from the normal distribution with expected value 0 and variance var(u(0j)). If the distribution of your u(0j) is indeed normal, this gives you a bigger sample for the variance decomposition. Once you have the simulated u*(0j), the level-1 variance associated with each simulated value is just the Bernoulli variance var(sim(j)) = p*(1-p*). To compute p*, you simply use the usual formula to get probabilities from log odds:

p* = exp(gamma(00) + u*(0j)) / (1 + exp(gamma(00) + u*(0j)))

The expected value of all these Bernoulli variances is the level-1 variance, as in the usual ANOVA decomposition. So the level-1 variance is the simple average of all these var(sim(j)). This result is usually very similar to the one obtained by methods 2) and 3).

The level-2 variance is (as in the regular ANOVA decomposition) var(p*), the variance of the simulated probabilities. Here it is usual to obtain a much smaller variance than with the other methods, as you are "forcing" the simulated values to come from a normal distribution with a small variance. Nevertheless, for big samples where u(0j) is indeed normal, the results are similar. Problems arise when the number of level-2 units is small and they are not normally distributed.

5) As an alternative to variance decomposition, there are other measures to judge the strength of the level-2 variability. I recommend taking a look at the Interval Odds Ratios by Larsen.

I hope this has been useful.

Please feel free to ask for more clarification, any of you interested in this topic of variance decomposition.
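As an illustration of method 3), here is a minimal Python sketch of a first-order Taylor linearization for the null model. It assumes the expansion is taken around u(0j) = 0, so that the derivative of the inverse logit at gamma(00) is p(1-p); this is one common reading of the approach and may differ in detail from the presentation in Goldstein (2010). The function name `taylor_variance_partition` is hypothetical, and the inputs are the two estimates quoted earlier in the thread.

```python
import math


def taylor_variance_partition(gamma00, var_u0):
    """First-order linearization around u = 0 for a null random-intercept logit model.

    p_ij is approximated by p + u_0j * p * (1 - p), so that
      level-2 variance = var_u0 * (p * (1 - p)) ** 2
      level-1 variance = p * (1 - p)
    Returns (v1, v2, icc), all on the probability scale.
    """
    p = 1.0 / (1.0 + math.exp(-gamma00))  # probability at the mean random effect
    slope = p * (1.0 - p)                 # derivative of the inverse logit at gamma00
    v1 = p * (1.0 - p)
    v2 = var_u0 * slope ** 2
    return v1, v2, v2 / (v1 + v2)


if __name__ == "__main__":
    v1, v2, icc = taylor_variance_partition(-0.205464, 0.14829)
    print(f"v1 = {v1:.5f}, v2 = {v2:.5f}, ICC = {icc:.4f}")
```

Because no random draws are involved, this calculation is deterministic and can serve as a quick plausibility check on the simulation-based figures from method 4).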
Very informative post!
Glad to hear the data simulation was helpful. I take little credit. Most of what I know about simulation has come directly from one of the brilliant posters on SAS-L. For those interested in learning about data simulation using SAS, I would strongly urge you to search the SAS-L archives.

Data simulation has given me a much deeper appreciation of the fundamental mechanics of various statistical models. Of course, I still have much to learn...

Ryan

On Tue, Apr 5, 2011 at 4:22 AM, jmdpulido <[hidden email]> wrote:
> Dear Ryan,
>
> Thanks for your answer. I always learn from your syntax.