|
Dear all,
First, let me say hello to the listserv members, as I have recently joined the group.
I am trying to establish whether average test scores for students from 70 districts differ significantly. When I run an ANOVA in SPSS, I receive a warning that post hoc tests cannot be performed because I have more than 50 groups. Is there any other option you can suggest?
Thanks!
Maia |
|
Administrator
|
Just out of curiosity, what kind of multiple comparison technique were you planning on using? With that many groups, the per contrast alpha for any procedure will be vanishingly small, I should think. Have you taken a look at using a multilevel model rather than ANOVA? One problem with ANOVA is that the group variable leaves nothing to be explained by higher level explanatory variables (e.g., district level variables). If you're not familiar with multilevel models, I recommend Jos Twisk's "Applied Multilevel Analysis"--I found it very accessible and helpful. You can preview it on Google Books.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
|
Thanks, Bruce & Nath.
I am planning to use hierarchical modelling. However, before doing that, I am interested in observing the relationships between all pairs of outcome-predictor variables. That is the reason I am using ANOVA, t-tests, correlations and chi-square.
Also, I do not really want to group my districts, as they are different geographic units. Grouping results in 10 categories (regions), for which I can easily use ANOVA.
Thus, is there no chance of establishing the significance of differences in the mean test scores for 70 districts?
On Mon, Oct 4, 2010 at 5:01 PM, Bruce Weaver <[hidden email]> wrote:
|
|
Administrator
|
C(70,2) = 2415. Carrying out that many contrasts seems ill-advised to me.
--
Bruce Weaver |
|
Maia Chankseliani wrote:
> Thus, is there no chance of establishing the significance of differences in the mean test scores for 70 districts?
To which Bruce Weaver replied:
> C(70,2) = 2415. Carrying out that many contrasts seems ill-advised to me.
I add: Bruce is correct, of course. Imagine using pairwise t-tests: assuming no real differences, you would expect about .05 * 2415 ≈ 121 false positives. Correcting for the large number of t-tests with a Bonferroni correction gives an adjusted alpha of about .05 / 2415 ≈ 0.0000207, which might protect against false alarms, but would you actually be able to detect any real differences? Either way, I don't see anything meaningful resulting from such an analysis.
Michael
****************************************************
Michael Granaas [hidden email] Assoc. Prof. Phone: 605 677 5295 Dept. of Psychology FAX: 605 677 3195 University of South Dakota 414 E. Clark St. Vermillion, SD 57069
*****************************************************
________________________________________
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Bruce Weaver [[hidden email]]
Sent: Monday, October 04, 2010 11:19 AM
To: [hidden email]
Subject: Re: ANOVA with more than 50 groups
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/ANOVA-with-more-than-50-groups-tp3173003p3183923.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
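The arithmetic in this exchange can be checked with a few lines of Python. This is only a sketch of the calculation described above; the function name `pairwise_summary` and its dictionary keys are made up for illustration and are not part of SPSS.

```python
from math import comb


def pairwise_summary(n_groups: int, alpha: float = 0.05) -> dict:
    """Summarise what testing every pair of groups would involve."""
    n_pairs = comb(n_groups, 2)  # C(n, 2): number of unordered pairs
    return {
        "n_pairs": n_pairs,
        # Expected count of false positives if every null hypothesis is
        # true and each test is run at the unadjusted alpha.
        "expected_false_positives": alpha * n_pairs,
        # Bonferroni: split the family-wise alpha evenly across the tests.
        "bonferroni_alpha": alpha / n_pairs,
    }


summary = pairwise_summary(70)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Changing the argument to 10 (the number of regions mentioned earlier in the thread) shows how much smaller the family of pairwise contrasts becomes.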
|
In reply to this post by Maia Chankseliani
On 10/4/2010 3:27 AM, Maia Chankseliani wrote:
> I am trying to establish if average test scores for students from 70 districts differ significantly. When I run ANOVA on SPSS, I receive a warning that Post hoc tests cannot be performed as I have more than 50 groups. Is there any other option you can suggest?
In addition to the many good comments made already, let me suggest that comparing each district to the overall mean of all districts might be an interesting alternative. This is the approach taken by ANOM (Analysis of Means). I don't think you can do this directly in SPSS, but perhaps someone will prove me wrong.
--
Steve Simon, Standard Disclaimer
Sign up for The Monthly Mean, the newsletter that dares to call itself "average" at www.pmean.com/news |
|
Administrator
|
Hi Steve. You can get something pretty close to that by using a "deviation" contrast. Here's the Help file description of it:
DEVIATION. Deviations from the grand mean. This is the default for factors. Each level of the factor except one is compared to the grand mean. One category (by default, the last) must be omitted so that the effects will be independent of one another. To omit a category other than the last, specify the number of the omitted category (which is not necessarily the same as its value) in parentheses after the keyword DEVIATION. For example,
UNIANOVA Y BY B /CONTRAST(B)=DEVIATION(1).
To obtain what you're describing, you could run it a second time, changing which category is omitted. This still seems like an awful lot of contrasts to me, though. And I wonder if the OP will run into the same limit of 50 that they described earlier.
Cheers,
Bruce
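To make the idea of "deviations from the grand mean" concrete, here is a small Python sketch. It is not SPSS output: the district names and scores are invented, `deviation_effects` is a hypothetical helper, and it assumes the grand mean is the unweighted mean of the group means. It only computes the point estimates, not the significance tests that UNIANOVA reports.

```python
from statistics import mean

# Hypothetical test scores for four districts (made-up numbers).
scores = {
    "A": [52.0, 55.0, 61.0],
    "B": [48.0, 50.0, 49.0, 53.0],
    "C": [60.0, 66.0],
    "D": [57.0, 54.0, 51.0],
}


def deviation_effects(groups):
    """Return each group's mean minus the unweighted mean of the group means."""
    group_means = {name: mean(values) for name, values in groups.items()}
    grand_mean = mean(list(group_means.values()))
    return {name: m - grand_mean for name, m in group_means.items()}


effects = deviation_effects(scores)
for name, effect in effects.items():
    print(f"{name}: {effect:+.2f}")

# The deviations sum to zero, so any one of them is determined by the rest.
print(f"sum of deviations: {sum(effects.values()):.10f}")
```

Because the deviations sum to zero, one category carries no independent information, which is why the Help text says one category must be omitted and why a second run with a different omitted category is needed to see all of them.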
--
Bruce Weaver |
|
Thanks for the emails.
I could not figure out how to do the ANOM or the deviation-related procedure in SPSS. What I tried is ANALYZE - COMPARE MEANS - MEANS. This procedure displayed mean test scores for all 70 districts. I also performed an ANOVA, which showed that the districts do differ significantly.
What I would love to be able to do is a post hoc test, to find out which districts differ significantly from one another and/or from the nationwide mean score. At this stage, however, I only needed to know that the districts do differ. Later, when I include the district together with other variables in the hierarchical model, I hope to learn more.
Thanks again.
Best,
Maia
On Tue, Oct 5, 2010 at 10:19 PM, Bruce Weaver <[hidden email]> wrote:
|
|
Administrator
|
You are using the MEANS procedure to perform the ANOVA. It does not include contrasts or other multiple comparison procedures. To perform the deviation contrasts I described, use UNIANOVA. In the menus: Analyze - General Linear Model - Univariate. The independent variable goes in the Fixed Factor box. Then click on the Contrasts button to open a dialog that lets you select Deviation contrasts. Run it twice: the first time with the last category as the reference category, and the second time with the first category as the reference category.
But...as I mentioned in my earlier post, this still seems like an awful lot of contrasts to me (70 of them). If you do a Bonferroni adjustment, the per contrast alpha is .05/70 ≈ .0007. (I.e., you would only declare a contrast statistically significant if p is less than or equal to .0007.)
I don't remember you saying anything about where your research falls on the exploratory to confirmatory spectrum. If it is purely exploratory, you can be a bit less careful about adjusting the per contrast alpha. E.g., if you set your per contrast alpha at .001, the family-wise alpha would be in the vicinity of .07, which is a bit higher than the usual .05, but not ridiculously so. I doubt there would be any serious objections.
And finally, it would not surprise me if you run into the same limit you mentioned before, i.e., that contrasts can be done for only 50 levels or fewer.
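The two alpha figures above can be reproduced with a short Python sketch. The function names are made up for illustration. The family-wise formula 1 - (1 - alpha)^k assumes the 70 tests are independent, which deviation contrasts are not, so treat it as an approximation; the Bonferroni inequality gives an upper bound that holds regardless.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-contrast alpha that keeps the family-wise alpha at family_alpha."""
    return family_alpha / n_tests


def familywise_alpha(per_test_alpha: float, n_tests: int) -> float:
    """Family-wise error rate, assuming independent tests (an approximation)."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


n_contrasts = 70  # one deviation contrast per district

per_contrast = bonferroni_alpha(0.05, n_contrasts)
fw_independent = familywise_alpha(0.001, n_contrasts)
# Bonferroni inequality: the family-wise rate cannot exceed k * alpha.
fw_upper_bound = min(1.0, 0.001 * n_contrasts)

print(f"Bonferroni per-contrast alpha: {per_contrast:.6f}")
print(f"Family-wise alpha at .001 per contrast (independence): {fw_independent:.4f}")
print(f"Family-wise alpha upper bound: {fw_upper_bound:.4f}")
```

Both routes land near the ".07" quoted above, so the conclusion does not hinge on the independence assumption.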
--
Bruce Weaver |
|
In reply to this post by Maia Chankseliani
Hi Maia,
Is there any reason to expect that average test scores across 70 districts won’t differ? With that number, and assuming varying degrees of social disadvantage, language skills, etc., I would find it unusual if there weren’t any differences. Would you be more interested in trying to find a model of why the test scores differ, rather than looking to see if they do?
Cheers
Michelle
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Maia Chankseliani
|
|
In reply to this post by Bruce Weaver
Maia,
Why is it so important to test for mean differences between 70 districts? Do you view these districts as a [random] sample from which you would like to make inferences about the larger population of districts? What is/are your dependent variable(s)? What are your predictors? At what level are your variables measured (e.g. student, school, district)? How were these data collected? Do you have repeated measures? What is the approximate distribution of your dependent variable? What are your research questions? What are the sample sizes at each level (e.g., number of students, number of schools, number of districts, etc.)?
These are some of the questions I would need answered before providing a recommendation. Based on what I have read, I am not convinced that a general linear model (e.g. ANOVA) is the optimal approach.
Ryan On Tue, Oct 5, 2010 at 4:23 PM, Bruce Weaver <[hidden email]> wrote:
|
|
In reply to this post by Bruce Weaver
|
|
Administrator
|
Hi Dale. I think that what I have done in the past is use the information in the table of estimated marginal means (i.e., the means and confidence intervals) to create a clustered hi-lo plot. You may need to add lines via the chart editor. Or, if you use the new-fangled GGRAPH methods, you can probably do it via syntax. ViAnn or someone else better versed in GGRAPH than me may jump in with advice on that.
--
Bruce Weaver |
