Restructure Aggregated Data for t-test

Ben Cohen
All:

I have pre-post data from students without individual identifiers. I have
school, grade and year identifiers. I’m comparing levels of a bullying score
over time, using an age-cohort analysis design. My data are in this format
(following an aggregation of the raw data by school-grade-year):

school grade year cohort score
A 3 9 2 2.12
A 3 10 2 2.62
A 4 9 2 1.75
A 4 10 2 1.84
A 5 9 2 1.96
A 5 10 2 2.0
A 6 9 2 2.1
A 6 10 2 1.91
B 3 8 1 2.19
B 3 9 1 1.63
B 4 8 1 2.66
B 4 9 1 2.55

To perform the age-cohort analysis, I need to compare, for school A, grade 5
in year 9 to grade 6 in year 10, or, for school B, grade 3 in year 8 to
grade 4 in year 9, and so on for several schools in the dataset.

To proceed, I’d like to get my data in this format to perform a statistical
test (t-test; anova):

School Score Year Grade Time
A 2.12 9 3 1
A 1.84 10 4 2
A 1.75 9 4 1
A 2.0 10 5 2
B 2.19 8 3 1
B 2.55 9 4 2


What SPSS steps might be used here to restructure the data? It's not simply
a recode on year, because for one cohort of students time 1 was year 8,
while for another cohort time 1 was year 9.

Thanks for any ideas!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Restructure Aggregated Data for t-test

David Marso
Administrator
Hi Ben,
There might be more elegant ways to achieve this, but I believe this does what you wish.
HTH, David
---
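* Read the aggregated school-grade-year means.
data list free / school (A) grade year cohort score.
begin data
A 3 9 2 2.12
A 3 10 2 2.62
A 4 9 2 1.75
A 4 10 2 1.84
A 5 9 2 1.96
A 5 10 2 2.0
A 6 9 2 2.1
A 6 10 2 1.91
B 3 8 1 2.19
B 3 9 1 1.63
B 4 8 1 2.66
B 4 9 1 2.55
end data.

FORMATS grade year cohort (F2.0).
compute ID=$CASENUM.
* Save a copy and stack it under the original, with time=1 flagging the copied rows.
save outfile="copy.sav".
add files / file * / file "copy.sav" /IN=time.
* Shift the copies back one grade and one year so each lines up with its Time 1 partner.
do if time.
compute grade=grade-1.
compute year=year-1.
end if.
sort cases by school grade year time.
* Flag the first and last case in each school-grade-year group.
match files / file * / by school grade year / first=top / last=bot.
* Drop groups of one case, which have no partner at the other time point.
select if NOT (top AND bot).
* Restore the true grade and year on the shifted copies.
do if time.
compute grade=grade+1.
compute year=year+1.
end if.
* Recode time from 0/1 to 1/2.
compute time=time+1.
execute.
list variables=school score year grade time.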


SCHOOL    SCORE YEAR GRADE TIME

A          2.12   9     3    1
A          1.84  10     4    2
A          1.75   9     4    1
A          2.00  10     5    2
A          1.96   9     5    1
A          1.91  10     6    2
B          2.19   8     3    1
B          2.55   9     4    2


Number of cases read:  8    Number of cases listed:  8

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Restructure Aggregated Data for t-test

Bruce Weaver
Administrator
In reply to this post by Ben Cohen
Given the lack of individual identifiers, I suppose the unit of analysis is "class", is it?  (Beware the ecological fallacy.)

I assume grade 5 in year 9 and grade 6 in year 10 (both from School A) consist of mostly the same students, right?  So it would be a paired t-test (or repeated measures ANOVA), would it not?  If so, you'll need to follow up David's solution with restructuring via CASESTOVARS to get the paired means for each group onto the same row.  
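
A minimal sketch of that follow-up, assuming David's syntax has just been run, so the active dataset holds school, grade, year, time and score plus his helper variables ID, top and bot. The variable startgrade is a hypothetical name introduced here to tag each pair by its grade at Time 1; it is not from the thread.

```spss
* Sketch only, where startgrade is a hypothetical helper identifying each pair.
compute startgrade = grade - (time - 1).
sort cases by school startgrade time.
* Spread the two time points into columns, one score variable per time point.
casestovars
  /id = school startgrade
  /index = time
  /drop = grade year cohort ID top bot.
* Paired comparison of the class means at Time 1 and Time 2.
t-test pairs = score.1 with score.2 (paired).
```

With the default separator, CASESTOVARS should name the two columns score.1 and score.2, leaving one row per school and starting grade. For the sample data that is only four pairs, so the test has very little power; the sketch shows the mechanics rather than a recommended analysis.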


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Restructure Aggregated Data for t-test

Rich Ulrich
In reply to this post by Ben Cohen
I'm pretty sure that what you *say* you want is not
nearly as useful as something that identifies the "initial grade"
and then gives the year number.  You want something that says
explicitly which lines go together.  To get that:

 - Aggregate, to add variables for the minimum of Grade and Year
(for the school) to each line.

Then you can subtract the minimum Year from Year (and add 1) to get Time = 1 or 2.

Then subtract (Time-1) from the Grade, to get the InitialGrade.
(A similar computation can give you InitialAge, if your
eventual design wants to look at that.)

 - The cases you want to drop will then be missing data for
either Time 1 or Time 2.

 - You don't have data for "t-tests" which you mentioned unless
you are ignoring the data for age and grade.  If the data are
available to estimate the "Within class" standard deviation,
and the correlation between years for individuals, *then*  you
would have data for paired t-tests for individual classes.

The analyses of repeated measures could use "only complete data,"
as in the file that you are trying to create. That lets you look
at changes.  If you also are interested in Grade as an effect
(which may or may not be wise), you probably could use the data
that you are currently intending to drop.
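
The aggregate-and-compute steps described above might look roughly like this. It is a sketch under assumptions: it starts from the original aggregated file (one row per school-grade-year), assumes each school was measured in exactly two consecutive years, and uses hypothetical variable names (minyear, initgrade, npoints) that are not from the thread. Only the minimum Year is needed to derive Time, so the minimum Grade is left out here.

```spss
* Sketch only, where minyear, initgrade and npoints are hypothetical names.
* Attach each school's earliest year to every one of its rows.
aggregate
  /outfile = * mode = addvariables
  /break = school
  /minyear = min(year).
compute time = year - minyear + 1.
compute initgrade = grade - (time - 1).
* Count the time points available for each school-by-initial-grade group.
aggregate
  /outfile = * mode = addvariables
  /break = school initgrade
  /npoints = n.
* Keep only the groups observed at both Time 1 and Time 2.
select if npoints = 2.
sort cases by school initgrade time.
list variables = school initgrade time grade year score.
```

On the sample data this should keep the same eight rows as David's approach, with initgrade making explicit which two lines belong together. Skipping the SELECT IF keeps the unpaired rows available for an analysis that treats Grade as an effect.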


--
Rich Ulrich
----------------------------------------

> Date: Thu, 19 May 2011 12:55:19 -0400
> From: [hidden email]
> Subject: Restructure Aggregated Data for t-test
> To: [hidden email]