|
Hello all, I have finished my research project, a cluster randomized trial. I want to compare the change from pretest to post-test score between the intervention group and the control group. I planned to use an SPSS mixed model. Unfortunately, my data are not normally distributed. If I use the Friedman test, I cannot control for the pretest and other covariates. Do you have any suggestions? Thank you very much for your help. Sincerely, Wienta
|
Wienta,
Please describe in what way the data are not normally distributed and to what extent. I am assuming that you do not have dichotomous DVs, but I just want to ask. Are the DVs Likert scale items? If so, do you want to analyze them as ordinal variables? Are the DV distributions J-shaped, i.e., high and rapidly declining percentages for low values but with a (very) long tail?

Within SPSS, I think you have no alternative but SPSS Mixed, because you have time nested within persons and persons nested within clusters (assuming I have understood your design correctly). It may be possible to transform your data to get better distributions, but you'll need to consider how transformed results will play in your publishing venue. SPSS has increased its capacity for categorical data through the Genlin procedure, but I'm not sure it can handle a problem such as yours. I may be wrong; perhaps others will correct me if so.

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
Wienta,
If you need to model binary or ordinal outcomes in multilevel modeling, you may want to look at HLM software. Best, Steve Brand www.StatisticsDoc.com -----Original Message----- From: Gene Maguin <[hidden email]> Date: Mon, 11 Jan 2010 09:38:15 To: <[hidden email]> Subject: Re: Mixed model for data that are not normally distributed.
|
In reply to this post by Maguin, Eugene
Wienta,
Ok. Negative skew. A power transformation (squaring, cubing, etc.) stretches the upper end of the distribution and reduces negative skew. I'd suggest that you try out different transformations as well as the untransformed data. It may be that the coefficient estimates and standard errors are little affected by the skew.

You have a three-level model. There was a reply today to an earlier message by a different poster that bears on your planned analysis; you may have seen it. 16 level 3 units (schools) would, I think, be regarded as too few to give good estimates of coefficients and standard errors.

Gene Maguin

>>My research was carried out among grade 11 students in 16 senior high schools in 2 provinces in Indonesia. 8 schools are in the intervention group & the other 8 schools are in the control group. 1079 students participated in my research. I gave the pretest in January 2009 & the post-test in March 2009. Yes, my dependent variables are originally on a Likert scale (knowledge, attitude & behaviour intent tests). I took the total score of each test and tested it for normality. The data are negatively skewed. Wienta
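To illustrate Gene's point about power transformations, here is a minimal Python sketch. The scores are made up for illustration (they are not Wienta's data), and the `skewness` helper is a hypothetical name that computes the simple moment-based coefficient m3 / m2^1.5; SPSS reports a slightly different, bias-adjusted statistic.

```python
def skewness(values):
    """Moment-based skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical total scores bunched near the top of the scale,
# with a tail toward low values (negative skew).
raw = [2, 5, 7, 8, 8, 9, 9, 9, 10, 10]
squared = [v ** 2 for v in raw]
cubed = [v ** 3 for v in raw]

for label, data in (("raw", raw), ("squared", squared), ("cubed", cubed)):
    print(f"{label:8s} skewness = {skewness(data):.2f}")
```

For these invented scores the skewness is negative for the raw values and moves toward zero under squaring and then cubing. With real data a higher power can overshoot into positive skew, which is one reason to compare several transformations against the untransformed scores.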
|
Gene,
You and another poster yesterday raised an interesting point about the required sample size at the higher levels (i.e. communities) with the higher level subjects variable treated as a random effects variable. I have seen references that advise at least 20 or even 30 subjects at each of the higher levels. In my own work, I've gone as low as ~10 higher level units with the understanding that I am below the typically recommended sample size. Determining whether a sample size is appropriate is, I believe, in part study specific. It is certainly possible to obtain reliable estimates at the smaller sample sizes, but one should proceed with caution. I welcome further discussion on this topic.

For the specific analysis in question, I'd consider running a linear mixed model [if assumptions are met], and I'd probably leave out Province at least initially and include a REPEATED or RANDOM statement to account for covariation across both time points, and include a random intercept for School. This, of course, is not the "full" model but I won't go beyond this recommendation until I know more about the data.

Ryan
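One way to write down the model Ryan describes is sketched below. This is only an illustration under stated assumptions, not a specification taken from the thread: the symbols are hypothetical, with t indexing time (Post = 0 at pretest, 1 at post-test), i indexing students, and j indexing schools (Int = 1 for intervention schools). Province and other covariates are left out, as Ryan suggests for a first pass, and the student-level random intercept stands in for the RANDOM-statement way of handling the covariation between the two time points.

```latex
y_{tij} = \beta_0
        + \beta_1\,\mathrm{Post}_t
        + \beta_2\,\mathrm{Int}_j
        + \beta_3\,(\mathrm{Post}_t \times \mathrm{Int}_j)
        + u_j + v_{ij} + e_{tij},
\qquad
u_j \sim N(0, \sigma_u^2),\quad
v_{ij} \sim N(0, \sigma_v^2),\quad
e_{tij} \sim N(0, \sigma_e^2)
```

Here u_j is the random intercept for School and v_{ij} the random intercept for student within school. Under this parameterization, \beta_3 is the difference in pretest-to-post-test change between the intervention and control groups, which is the comparison Wienta asked about.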
|
|
Wienta,
Sample size in HLM is a complex issue. With 16 schools, your estimates of between school effects would be biased (standard errors would be too small) but your interests probably lie in the lower level effects. Would I be correct in assuming that you are testing the hypothesis that level 1 change in the dependent variable differs between the experimental and comparison schools? For some guidelines and possible solutions you might want to start with Maas and Hox (2005) in the journal Methodology. HTH Steve Brand www.StatisticsDoc.com -----Original Message----- From: rblack <[hidden email]> Date: Tue, 12 Jan 2010 05:49:42 To: <[hidden email]> Subject: Re: Mixed model for data that are not normally distributed.
|
In reply to this post by Ryan
I posted somewhere recently that Snijders & Bosker (1999) use 10 groups as a dividing line in their rule of thumb for choosing between fixed vs. random effects. Here is the excerpt:

"In order to choose between regarding the group-dependent intercepts U_0j as fixed statistical parameters and regarding them as random variables, a rule of thumb that often works well in educational and social research is the following. This rule mainly depends on N, the number of groups in the data. If N is small, say N < 10, then use the analysis of covariance approach: the problem with viewing the groups as a sample from a population is in this case, that the data will contain only scanty information about this population. If N is not small, say N >= 10, while n_j is small or intermediate, say n_j < 100, then use the random coefficient approach: 10 or more groups is usually too large a number to be regarded as unique entities. If the group sizes n_j are large, say n_j >= 100, then it does not matter much which view we take. However, this rule of thumb should be take [sic] with a large grain of salt and serves only to give a first hunch, not to determine the choice between fixed and random effects." (Snijders & Bosker, 1999, p. 44)

Gene commented that 16 schools would be considered too few higher level units, and I think another poster suggested that anything less than 20 or 30 higher level units is too few. So is the advice from Snijders & Bosker out of step with current thinking in the multilevel community?
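The thresholds in the excerpt can be written out as a small Python sketch. The function name `snijders_bosker_hint` is hypothetical, checking group size before the number of groups is an assumption about how the quoted cases combine, and using the mean school size (1079 students over 16 schools) is also an assumption, since the thread does not give the individual school sizes. As the authors themselves stress, this gives only a first hunch.

```python
def snijders_bosker_hint(n_groups, group_size):
    """First-hunch choice between fixed and random group effects.

    Encodes the thresholds quoted from Snijders & Bosker (1999, p. 44).
    Checking group size first is an assumption of this sketch.
    """
    if group_size >= 100:
        return "either"   # large groups: the choice matters little
    if n_groups < 10:
        return "fixed"    # few groups: analysis of covariance approach
    return "random"       # 10+ groups of small or intermediate size


# Figures from Wienta's description of the study.
n_schools = 16
n_students = 1079
mean_school_size = n_students / n_schools  # roughly 67 students per school

print(snijders_bosker_hint(n_schools, mean_school_size))  # prints "random"
```

By this rule of thumb the study falls on the random-effects side, even though 16 schools is below the 20 to 30 (or more) higher level units mentioned elsewhere in the thread.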
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
|
Bruce,
Maas and Hox (2005) suggest, based on simulation studies, that 50 level 2 units are needed for accurate estimation of level 2 effects. However, they also suggest that good estimates of lower level parameters can be obtained with fewer higher level units. Best Steve www.StatisticsDoc.com -----Original Message----- From: Bruce Weaver <[hidden email]> Date: Tue, 12 Jan 2010 10:43:18 To: <[hidden email]> Subject: Re: Mixed model for data that are not normally distributed.
|
Thanks Steve. I'll have to take a look at that sometime.
Cheers, Bruce
