Dear SPSS listserv,
I am trying to determine the best way to examine patterns of missing data on a depression measure by gender. I want to see if any significant differences in missing data occur between men and women. Does anyone know if SPSS MVA can do this, or should I take a different approach? Thank you.

Dan Van Dussen
Doctoral Candidate in Gerontology
UMBC
Dan,
When I have done this, I have recoded the original variables into a set of new dichotomous variables, then crosstabbed them against gender.

Gene Maguin
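Gene's recode-and-crosstab approach can be sketched in SPSS syntax; the depression-item names (dep1 to dep3) and the gender variable below are hypothetical placeholders for your own:

```spss
* Flag missingness on each depression item as 1 = missing, 0 = present.
RECODE dep1 dep2 dep3 (MISSING=1) (ELSE=0) INTO miss1 miss2 miss3.
EXECUTE.

* Crosstab each missingness flag against gender, with a chi-square test.
CROSSTABS
  /TABLES=miss1 miss2 miss3 BY gender
  /STATISTICS=CHISQ.
```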
Dear All,
I am an ecologist/physiologist and thought I had sufficient understanding of statistics for my purposes. However, I am being troubled by something simple. When using something like ANOVA, if, for example, the assumptions of normality or equality of variances are not met, is this less important if the output is highly significant (e.g., P less than 0.00001 rather than P only slightly less than 0.05)? I am aware that this is a stats question rather than an SPSS question, so you may simply want to suggest another listserv I could use. I am not finding the answer in textbooks. Thanks in advance,

Deborah
Deborah, I'll be interested to see responses from others, as I don't think there will be an ironclad truism, but here is my two-pence:
(1) Even though we often hear that ANOVA is "robust" to moderate violations of the assumptions, there are some papers showing this is not necessarily the case when sample size varies markedly across groups (e.g., if the larger variance is associated with the larger group, tests of significance tend to be conservative); so, to the extent that your design is relatively balanced, that will be of less concern than if it is not.

(2) Though there are myriad opinions about transformations (e.g., log, reciprocal, etc.), if the normality assumption is not tenable, attempt a transformation (ideally one that can be justified) and see if your general results/conclusions hold.

(3) You can always resort to a nonparametric analogue (e.g., Kruskal-Wallis) and again check whether the general results obtain.

(4) And for the kicker: a p-value, regardless of its value, does not necessarily mitigate violations of assumptions or relieve the researcher of concern about them. Null hypothesis significance testing has a long and storied history of misapplication and misinterpretation (see the edited text by Harlow et al. titled "What If There Were No Significance Tests?"), and a 'smaller' p-value, though some would say it is prima facie evidence of "more" power, does not tell us anything more than the probability of the result given a true null hypothesis. It can't tell us about replicability, and as many authors from Cohen on have emphasized, it doesn't yield information about magnitude (i.e., effect size). So I would be very wary about abdicating concern about violations based on a p-value; it should be subordinated to design and measurement issues.

Again, just my opinion!

Dale Glaser
Dale Glaser, Ph.D.
Principal--Glaser Consulting
Lecturer--SDSU/USD/CSUSM/AIU
4003 Goldfinch St, Suite G
San Diego, CA 92103
phone: 619-220-0602
fax: 619-220-0412
email: [hidden email]
website: www.glaserconsult.com
Hi
DG> Deborah, I'll be interested to see responses from others, as
DG> I don't think there will be an ironclad truism, but here is my
DG> two-pence

Here's mine.

DG> (1) even though often we hear that ANOVA is "robust" to
DG> moderate violations of the assumptions, there are some papers that
DG> show this is not necessarily the case when sample size markedly
DG> varies across groups...

Normality is in general considered less important than homogeneity of variances (HOV). As you point out, unbalanced designs are more affected by lack of HOV. I read (but I don't have the reference here right now) that lack of HOV will severely affect the ANOVA p-value if the smallest sample is below 10 cases and the biggest sample is more than four times the smallest. If the biggest samples have the smallest variances, the true significance level increases, and if the biggest samples have the biggest variances, the true significance level diminishes (I hope my explanation is clear; even in Spanish it was a bit difficult to understand, and the translation hasn't improved it).

For one-way ANOVA, SPSS has incorporated (since version 9, I believe) robust tests: Brown-Forsythe and Welch (the latter is better suited for heavily unbalanced designs).

Even if lack of HOV doesn't have much impact on the overall p-value (in balanced or moderately unbalanced designs), it can have a lot of effect on multiple comparison methods. Replace Tukey, or any post-hoc method you use, with the Tamhane test. Also, contrasts (orthogonal or not) are adjusted for lack of HOV (I'm talking about SPSS's ONEWAY procedure).
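In SPSS syntax, the robust tests and the Tamhane post hoc mentioned above can be requested in a single ONEWAY call; `outcome` and `group` are hypothetical variable names:

```spss
* Welch and Brown-Forsythe robust tests, plus Tamhane's T2 post hoc,
* which does not assume equal variances.
ONEWAY outcome BY group
  /STATISTICS DESCRIPTIVES HOMOGENEITY BROWNFORSYTHE WELCH
  /POSTHOC=T2 ALPHA(0.05).
```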
DG> (2) Though there are myriad opinions about transformations
DG> (e.g., log, reciprocal, etc.), if the normality assumption is not
DG> tenable, attempt a transformation...

As I mentioned before, ANOVA is quite robust to departures from normality (as a matter of fact, the Levene test is an ANOVA on the absolute values of the residuals, which have a highly skewed distribution). Besides, transformations that fix lack of HOV usually improve normality, so I recommend you focus on transformations that stabilize variances and then check the effect on normality.

A list of the most popular transformations:

- If the SD is proportional to the mean, a log transformation will improve both HOV and normality (distributions typically log-normal, positively skewed). This transformation has the advantage of being "reversible": you can back-transform the data and obtain geometric means, or ratios of geometric means (when you back-transform differences of log means). Use x' = log(1+x) if there are zeroes (problems back-transforming the data can arise in this particular case).

- If the variance is proportional to the mean, you have distributions that follow a Poisson distribution (or an overdispersed Poisson distribution: negative binomial), and a square root transformation can help. Again, add 1 before taking the square root if zeroes are present. This transformation can't be back-transformed for mean differences.

- For binomial proportions with constant denominators, you can use the angular transformation: x' = arcsin(sqrt(p)). Again, it can't be back-transformed for mean differences.

- Reciprocal transformation: x' = 1/x. I can't remember right now when it is indicated.

See http://bmj.bmjjournals.com/cgi/content/full/312/7039/1153 for an interesting Statistics Note on the problems of back-transforming CIs for mean differences.
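These transformations translate directly into COMPUTE statements; `x` (a raw measurement) and `p` (a proportion) are hypothetical variable names:

```spss
* Log transform (add 1 to handle zeroes).
COMPUTE logx = LG10(x + 1).
* Square root transform (add 1 to handle zeroes).
COMPUTE sqrtx = SQRT(x + 1).
* Angular (arcsine) transform for proportions.
COMPUTE angp = ARSIN(SQRT(p)).
* Reciprocal transform.
COMPUTE recx = 1/x.
EXECUTE.
```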
Also, this note, http://bmj.bmjjournals.com/cgi/content/full/312/7038/1079, focuses on the problem of trying to back-transform the SD after a log transform.

DG> (3) You can always resort to a nonparametric analogue
DG> (e.g., Kruskal-Wallis) and again check if the general results
DG> obtain...

This could be OK if the only problem is lack of normality, but if you also have lack of HOV you should not use the Kruskal-Wallis test. Citing a previous message of mine (from the tutorial on non-parametrics series I started in April):

"Data requirements for the Kruskal-Wallis test: distributions similar in shape (this means that dispersion is something to be considered too; see "Statistical Significance Levels of Nonparametric Tests Biased by Heterogeneous Variances of Treatment Groups", Journal of General Psychology, Oct. 2000, by Donald W. Zimmerman. Available at: http://www.findarticles.com/p/articles/mi_m2405/is_4_127/ai_68025177)"

HTH,
Marta
In reply to this post by Dan-92
Hi Dan,
I split my file by gender first. Go to "Data" on the toolbar > "Split File", select "Organize output by groups", move "gender" across, then click "OK". All analyses you do from this point onwards using this data file will be by gender. Remember to turn Split File off before running your normal analyses, though (just select "Analyze all cases, do not create groups").

Hope this helps. Good luck.

Kathryn
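The dialog steps Kathryn describes correspond to this syntax; `gender` is assumed to be the grouping variable:

```spss
* Split the file so all subsequent output is organized by gender.
SORT CASES BY gender.
SPLIT FILE SEPARATE BY gender.

* ... run your analyses here; output appears separately per gender ...

* Turn splitting off again before your normal analyses.
SPLIT FILE OFF.
```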
In reply to this post by Dan-92
Hi Dan,
I forgot to say... I "think" I remember using separate-variance t tests to examine missing data patterns (I say "think" because it was a while ago and I can't remember for certain whether this is what I used to examine missing patterns by group). In MVA, select "Descriptives" then "t tests...". I followed the instructions in the SPSS MVA manual for version 7.5 of SPSS. As far as I know there are no newer manuals available for MVA. You can find the manual at this website: http://www.siue.edu/IUR/SPSS/SPSS%20Missing%20Value%20Analysis%207.5.pdf Try reading around pages 21-26.

Kathryn
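The separate-variance t tests Kathryn recalls can also be requested directly via the MVA command's /TTEST subcommand (variable names hypothetical):

```spss
* t tests comparing, for each quantitative variable, cases that are
* missing vs. present on the other variables; PERCENT=5 suppresses
* tests for variables missing in fewer than 5% of cases.
MVA VARIABLES=gender dep1 dep2 dep3
  /TTEST PROB PERCENT=5.
```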
In reply to this post by Marta García-Granero
Excellent summary from Marta.
Another thing to recall is that the assumption for the tests in any GLM (regression, ANOVA, etc.) is that the RESIDUALS are not overly discrepant from normal, etc.
Art Kendall
Social Research Consultants
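Art's point, that the normality assumption applies to the residuals rather than the raw outcome, can be checked by saving the residuals from the GLM and testing them; this sketch assumes a two-way design with hypothetical variables `outcome`, `site` and `depth`:

```spss
* Fit the two-way ANOVA and save unstandardized residuals (saved as RES_1).
UNIANOVA outcome BY site depth
  /SAVE=RESID
  /DESIGN=site depth site*depth.

* Normality tests and a normal Q-Q plot for the saved residuals.
EXAMINE VARIABLES=RES_1
  /PLOT NPPLOT.
```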
In reply to this post by Marta García-Granero
Dear All,
Thank you, Marta and Dale, for your detailed responses in relation to my question about lack of HOV and the effect on P values. This is a follow-up question.

My designs are not one-way ANOVAs but balanced two-way ANOVAs with replication (e.g., 3 sites, 4 depths and 4 replicates). P was generally less than 0.00001 for the interaction, the site effect and the depth effect. I followed your advice and transformed my data (following your recommended criteria/justification); it did improve the HOV for many of my experiments, and the residuals followed a normal distribution. Thanks.

For other experiments no transformation gave equality of variances. In these cases, if the sites are looked at separately, i.e., just looking at the data for the change of depth within each site, then I do get equality of variances. For these experiments, is it better to just look at each site separately using a one-way ANOVA? (The sites are actually so different for the variable I am looking at.)

When is Tukey better to use than the Tamhane test? I mean, what if P = 0.06 for the Levene HOV test?

For ANOVA, does one need to check that the data as well as the residuals follow a normal distribution, or are just the residuals sufficient?

I will be very grateful for any help.

Best wishes,
Deborah
Hi Deborah,
DP> My designs are not Oneway ANOVAs but balanced two-way ANOVAs with
DP> replication (e.g. 3 sites, 4 depths and 4 replicates). P was generally
DP> less than 0.00001 for the interaction, the site effect and the depth
DP> effect.

The presence of a significant interaction modifies the scenario quite a lot: you can't test main effects and post hocs for main effects if interaction is present. You must continue your analysis with a series of partial tests (you split your dataset by one factor and analyze the other factor with one-way ANOVA, then you split by the second factor and analyze the first). If you need extra information on how to proceed, tell me (but on the list, please).

DP> For other experiments no transformations gave equality of variances. ...
DP> For these experiments is it better to just look at each site
DP> separately using a Oneway ANOVA?

For balanced designs, the effect of the lack of HOV is minimal. Perform an extra check comparing the smallest SD with the biggest SD: if the ratio biggest/smallest is below 2, then the lack of HOV (whatever the Levene test says) is not really important.

DP> When is Tukey better to use than the Tamhane test? I mean what if P=0.06
DP> for Leven HOV test?

If you suspect lack of HOV (if the Levene test is non-significant, 0.06 for instance, but the ratio of SDs, bigger/smaller, is over 2), then the Tamhane test will control it better. But remember that you can't use post-hoc tests for main effects when interaction is present.

DP> For ANOVA does one need to check that the data as well as the residuals
DP> follow a normal distribution of is just the residuals sufficient?

Only the residuals.

Regards,
Marta
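The partial tests described above, splitting by one factor and running a one-way ANOVA on the other, can be sketched like this (hypothetical variables `outcome`, `site`, `depth`):

```spss
* Analyze the depth effect separately within each site.
SORT CASES BY site.
SPLIT FILE SEPARATE BY site.
ONEWAY outcome BY depth
  /STATISTICS DESCRIPTIVES HOMOGENEITY.
SPLIT FILE OFF.

* Then repeat the other way round: split by depth, analyze site.
```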