SPSS Question
I have been struggling lately with the statistical analysis of a complex experiment. The experiment was conducted in 3 environments, under 2 scenarios, where each participant had to execute 3 tasks. Here's where it gets tricky: the tasks were self-paced, so each participant executed each task a different number of times. Furthermore, only part of the experiment is within-subjects; i.e. the same 12 participants took part in Environments 1 and 2 and a different 11 participants took part in Environment 3 (feel free to ask me for any clarifications if my description does not give you a clear idea of the design).

I am interested in investigating main-effect and interaction effect sizes, as well as statistical significance, for all factors (task, environment, scenario and participant), but am not sure what would be the right way to go about it. Is there a single ANOVA or other type of analysis (regression, Bayesian, etc.) that could be used in this case? This design is factorial, repeated measures and unbalanced at the same time.

One approach would be to attempt pair-wise ANOVAs (one for the within-subject part, Environments 1 and 2, and two more comparing each of those environments with Environment 3). In that case, however, I can't see how I could combine effect sizes to interpret overall interactions. I have tried using the mixed ANOVA model command but am not sure if that is the correct approach.

I attach a small sample of my data so you can have a look or play around with it if you want.

dataSample.csv <http://spssx-discussion.1045642.n5.nabble.com/file/t341315/dataSample.csv>

The data are presented as follows:

Column 1: metric (in this case task completion times in sec.)
Column 2: participant ID (ranging from 1 to 23 to denote the individuals that took part in the study - NOTE that the same 12 people were in environments 1 and 2 and a different set of 11 in environment 3)
Column 3: task ID (ranging from 1 to 3 to denote the different tasks the participants performed during the experiment)
Column 4: environment ID (ranging from 1 to 3 to denote the different environments the participants experienced during the experiment)
Column 5: scenario ID (ranging from 1 to 2 to denote the different scenarios the participants experienced during the experiment)

Any stats wiz out there who could help me have a good night's sleep? Any advice at this point would be highly appreciated and valuable!

Thanks in advance!! :)

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
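
To make the "unbalanced" part of the question concrete, here is a minimal Python sketch that counts how many times each participant executed each task in each environment and scenario. It assumes the five-column layout described in the post and a file with no header row; the sample rows and the function name count_repetitions are made up for illustration and are not taken from dataSample.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the five-column layout described in the post:
# time_sec, participant, task, environment, scenario (no header row assumed).
SAMPLE = """\
12.4,1,1,1,1
10.9,1,1,1,1
15.2,1,2,1,1
11.8,1,1,2,1
9.7,13,1,3,1
10.1,13,1,3,1
10.5,13,1,3,1
"""


def count_repetitions(lines):
    """Count rows per (participant, task, environment, scenario) cell."""
    counts = Counter()
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        _time, participant, task, env, scenario = row
        counts[(int(participant), int(task), int(env), int(scenario))] += 1
    return counts


counts = count_repetitions(io.StringIO(SAMPLE))
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Unequal counts across cells are what make the design unbalanced; with the real file, io.StringIO(SAMPLE) would be replaced by an open file handle.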
I find your description clear and confusing at the same time. For my benefit, and maybe other people's, I'd like to work toward a clearer explanation of the design and a main analysis question. All that said, I don't know that I'll be of any help.

Ok. So: there is a between effect, group 1 (the 12 participants in environments 1 and 2) versus group 2 (the 11 participants in environment 3).
Environment is a within effect for group 1 but also a between effect for group 1 and group 2.
Scenario is a within effect for both group 1 and group 2.
Task is a within effect for both group 1 and group 2.

Now let's suppose that you had just one DV, which was time to complete task(i).
In group 1 you have a three within factor analysis (environment by scenario by task).
In group 2 you have a two within factor analysis (scenario by task).
If you used both groups, you could analyze environment 1 = group 1 against environment 3 = group 2 in a one between (environment), two within (scenario by task) design.

You could say that group 2 has a missing environment level (level 4). GLM has a Type IV sum of squares for the situation where there are missing cells (see page 835 in IBM SPSS Statistics 24 Command Syntax Reference). I have never had this situation, so I can't help with it.

But then you say, "The tasks were self-paced, so each participant executed each task a different number of times." That says to me that there's another within variable, call it "rep-time", which I don't think is represented in your data, and which has missing data because some people did task 1 once within scenario 1 and others did it three times. So maybe something is missing or not explained.

Lastly, what is the primary analysis question to answer?

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of tsps
Sent: Wednesday, September 6, 2017 12:47 PM
To: [hidden email]
Subject: ANOVA / Regression in complex experiment
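
The two variables discussed in the reply, the between-subjects group and the unrecorded "rep-time" index, could be derived from the five columns along these lines. This is only a sketch: it assumes that rows appear in the file in the order the repetitions were performed, which the thread does not confirm, and the sample rows and the function name add_group_and_rep are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (time_sec, participant, task, environment, scenario).
ROWS = [
    (12.4, 1, 1, 1, 1),
    (10.9, 1, 1, 1, 1),
    (11.8, 1, 1, 2, 1),
    (9.7, 13, 1, 3, 1),
    (10.1, 13, 1, 3, 1),
]


def add_group_and_rep(rows):
    """Append a group label and a within-cell repetition index to each row."""
    # First pass: record which environments each participant appears in.
    envs_by_participant = defaultdict(set)
    for _t, pid, _task, env, _scen in rows:
        envs_by_participant[pid].add(env)

    seen = defaultdict(int)
    out = []
    for t, pid, task, env, scen in rows:
        # Group 2 = participants seen in environment 3; group 1 = everyone else.
        group = 2 if 3 in envs_by_participant[pid] else 1
        # Repetition index: ASSUMES file order equals execution order.
        cell = (pid, task, env, scen)
        seen[cell] += 1
        out.append((t, pid, task, env, scen, group, seen[cell]))
    return out


result = add_group_and_rep(ROWS)
for row in result:
    print(row)
```

The last two fields of each output row are the group (1 or 2) and the repetition number within that participant, task, environment and scenario.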