Hi,
Not strictly an SPSS question but a stats question instead; I'm hoping it may be of interest to others and elicit a response from one of the list members.
Suppose I have multiple records per student, for simplicity say 3 records per student, each record representing each of their A-Levels undertaken in secondary school. Suppose each A-Level is taught by a single teacher and data is collected on the age of the teacher (or years of experience of teacher – essentially an interval measure) and also a dichotomous variable indicating pass or failure of the A-level.
Would it be valid to use an ANOVA to test whether the mean age (or years of experience) of the teacher differs between the 2 groups (Pass vs. Failure)?
One of the three assumptions of ANOVA is that the groups must be independent. What I describe above instinctively feels like it violates this assumption. So if ANOVA cannot be used, how should I be analyzing this data to answer the same question?
Many thanks in advance. Jignesh |
I would state your assumption of independence in a different way.
What is assumed for the sake of testing is that the errors are independent. That is seldom achieved perfectly. What we do insist on is that the major "sources of variance" be accounted for, by including factors or covariates in the design.

You describe a Student-by-Success design, where Success would be a within-student effect. If, as seems likely, the same teachers appear more than once, then there would be a Teacher effect to be taken into account. If there aren't a large number of teachers, you hardly have the data to generalize across "Teachers" as a between-Teacher effect, to compare "older" versus "younger" (or by experience).

If there are a lot of years of data, then it is possible that there could also be a Year-of-test effect. (For instance, in the US: it is known that the scoring on SAT college-entrance tests varies a bit from year to year, and the standardization was changed somewhat dramatically at a specific year in the past. Recently, high scores are easier to get.)

I can imagine, for a huge amount of historical data, that one could examine "Success" as a within-Teacher effect, to see whether it changes systematically with Years-of-Experience. If there is anything there, I think I would expect it to be some improvement in the first few years. I expect competition between experience and enthusiasm.

-- Rich Ulrich

Date: Fri, 19 Apr 2013 09:13:34 +0100
From: [hidden email]
Subject: ANOVA Assumption of Independence
To: [hidden email] |
In reply to this post by Jignesh Sutar
Hi Jignesh. Good observation. While I am not an expert in random effects modeling, it sounds like your data structure is consistent with such an analysis. Essentially you do not have "independent" observations (or errors), since you have observations nested within students nested within teachers. In other words, your data have an intraclass correlation that should not be ignored (ignoring it will ultimately bias, sometimes dramatically, your obtained significance level).

In psychology, the analytic approach used in such data scenarios is often referred to as multilevel modeling (known in various disciplines, with mostly minor differences, as random effects models, random regression models, random and/or mixed effects models, linear mixed modeling, generalized estimating equations (GEE), etc.). SPSS has the linear mixed models module (and, I believe, GEE) that I recommend you look into (the linear mixed model or GEE may be appropriate depending on whether you use the full exam score, a continuously distributed variable, or the dichotomous pass/fail as your DV).
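To give a rough feel for why the intraclass correlation matters, here is a minimal Python sketch (not SPSS syntax). It uses simulated exam scores, not data from this thread: the 200 students, 3 exams each, and the equal student-level and exam-level variances are all assumptions made purely for illustration. It estimates the ICC from a one-way ANOVA decomposition and converts it into the usual design effect, 1 + (k - 1) * ICC, for clusters of size k.

```python
import random
import statistics


def icc_oneway(groups):
    """One-way ANOVA estimate of the intraclass correlation, ICC(1).

    `groups` is a list of equal-sized lists, one inner list per cluster
    (here: one per student, holding that student's exam scores).
    """
    n = len(groups)          # number of clusters (students)
    k = len(groups[0])       # records per cluster (exams per student)
    grand = statistics.fmean([x for g in groups for x in g])
    means = [statistics.fmean(g) for g in groups]
    # Between-cluster and within-cluster mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def design_effect(icc, k):
    """Design effect for equal-sized clusters of k records each."""
    return 1 + (k - 1) * icc


def simulate_scores(n_students=200, k=3, seed=1):
    """Hypothetical data: score = student effect + exam-level noise, both SD 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_students):
        student = rng.gauss(0, 1)  # shared by all of this student's exams
        data.append([student + rng.gauss(0, 1) for _ in range(k)])
    return data


scores = simulate_scores()
icc = icc_oneway(scores)
deff = design_effect(icc, k=3)
n_records = sum(len(g) for g in scores)
print(f"ICC estimate: {icc:.2f}")
print(f"Design effect: {deff:.2f}")
print(f"{n_records} records carry roughly the information of "
      f"{n_records / deff:.0f} independent ones")
```

With a true ICC of 0.5 (as assumed in the simulation), the design effect for 3 records per student is 2, so a test that treats every record as independent behaves as though it had about twice as much information as it really does. This sketch only covers the continuous-score case with one level of clustering. For the dichotomous pass/fail outcome, or with teachers as a second source of clustering, a mixed or GEE model along the lines described above would be the place to look.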
Others on the List will hopefully shout out my errors in this recommendation? :) All the best. |