SPSSX Discussion

Data restructuring problem


Jennifer Thompson

Data restructuring problem

Dear SPSSers,

I have what I think is a fairly complicated (for me at least) data
restructuring problem and would be very grateful for any help or
suggestions.

I have 30+ files (one per experiment participant) with the same basic
structure, consisting of 5 variables:

1. Condition - which is one of 3 basic tasks: tapping in
synch with a sound (Synch), tapping without a sound (Unpaced) and tapping in
synch with a sound in preparation for Unpaced (Paced), each of which is done
separately with the right hand (R), left hand (L) and with both together
(B), and repeated 3 times (so 27 different 'conditions').

2. RT - which is a timestamp for a response press or release

3. Response - which is 8 for a press and 8.5 for a release with the right
hand and 7 for a press / 7.5 for a release with the left hand

4. soundon - the timestamp for the onset of the pacing sound (not present
in the unpaced conditions)

5. soundoff - the timestamp for the offset of the pacing sound (not present
in the unpaced conditions)

The difficulty is that the timestamps occur consecutively in the file,
regardless of whether they are a press or release, or produced with the
right or left hand.

To begin to analyse the data I think I need to get it in a format where the
variables are something like this:

1. Condition

2. Press_R (press with right hand)

3. Release_R (release with right hand)

4. Press_L (press with left hand)

5. Release_L (release with left hand)

6. Sound onset

7. Sound offset

I've done this manually (!) for the first file, by sorting the data by
condition, response and RT and 'simply' copying / pasting the data as
appropriate, which took quite some time... There has to be an easier way to
do this, surely?

There is another problem. Some of the responses (releases only, from what I
can see so far) are missing, so the press / release RT columns don't always
match up. In this case, the 'release' RT will be a higher number than the
next 'press' RT, so I've been flagging this with the following syntax:

IF (Press_L < LAG(Release_L)) Error_L = 1 .

IF (Press_R < LAG(Release_R)) Error_R = 1 .

And then manually (copy/paste) lining up the data. But again, is there a
more 'automated' way I could do this?
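
One possible way to automate both steps is to pair each press with the next release from the same hand directly in the original long file, leaving the release empty whenever a second press arrives first. The sketch below is only an illustration, written in Python rather than SPSS syntax. It assumes each record is a (condition, RT, response) triple using the response codes described above; the function name pair_taps and the condition label "Synch_R_1" are hypothetical.

```python
# Response codes as described in the post:
# 8 / 8.5 = right-hand press / release, 7 / 7.5 = left-hand press / release.
PRESS = {8.0: "R", 7.0: "L"}
RELEASE = {8.5: "R", 7.5: "L"}


def pair_taps(events):
    """Turn (condition, rt, response) records into one row per tap.

    Each row is (condition, hand, press_rt, release_rt). release_rt is
    None when no release was logged before that hand's next press.
    """
    rows = []
    open_press = {}  # (condition, hand) -> index in rows of the unmatched press
    for condition, rt, response in sorted(events, key=lambda e: (e[0], e[1])):
        if response in PRESS:
            hand = PRESS[response]
            rows.append([condition, hand, rt, None])
            # An earlier unmatched press for this hand keeps release_rt = None.
            open_press[(condition, hand)] = len(rows) - 1
        elif response in RELEASE:
            hand = RELEASE[response]
            index = open_press.pop((condition, hand), None)
            if index is not None:
                rows[index][3] = rt
            # A release with no open press is ignored in this sketch.
    return [tuple(row) for row in rows]


# Hypothetical data: the release for the second press (RT 700) is missing.
events = [
    ("Synch_R_1", 100, 8), ("Synch_R_1", 250, 8.5),
    ("Synch_R_1", 700, 8),
    ("Synch_R_1", 1300, 8), ("Synch_R_1", 1450, 8.5),
]
taps = pair_taps(events)
```

Rows whose release is None are the ones that the Error_L / Error_R flags above are meant to catch, so no manual re-alignment is needed afterwards.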

Any thoughts / suggestions? I've struggled to explain the data but will
gladly send an example datafile if anyone would like to take a look.

Thanks for reading,

Jennifer

Beadle, ViAnn

Re: Data restructuring problem

There is information not specified here--what hypotheses are you testing, or in less formal terms, what are you trying to find out? This, in turn, will determine the types of analyses to perform. And, those in turn, will determine the appropriate data structure(s).

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jennifer Thompson
Sent: Monday, March 05, 2007 6:48 AM
To: [hidden email]
Subject: Data restructuring problem


Jennifer Thompson

Re: Data restructuring problem

Thanks for the comments, and I'm sorry that my initial post was vague. Here
is some more information:

The study is a comparison of neurological patients and healthy controls.
The task involves unimanual or bimanual finger tapping, which is either
paced with a tone (every 600 milliseconds) [the 'synch' condition] or
unpaced [the 'unpaced' condition], in which 10 pacing tones are initially
presented and the participant is required to continue tapping at the same
rate [performance on the initial pacing practice is also recorded as the
'paced' condition - although it isn't actually a condition as such].

I will be calculating the mean inter-tap interval and variability for paced
vs unpaced tapping, and unimanual vs bimanual tapping for each participant.
I was planning to do this using the LAG function, using something like:

COMPUTE Press_ITI_R = Press_R - LAG(Press_R) .
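
One thing to watch with a plain LAG difference is that LAG takes its value from the previous case whatever condition that case belongs to, so the first press of each condition would be differenced against the last press of the preceding one. The sketch below, in Python rather than SPSS syntax, shows the same calculation done separately within each condition. The function names and the condition label "Unpaced_R_1" are hypothetical, and the timestamps are made up.

```python
from statistics import mean, stdev


def inter_tap_intervals(press_times):
    """Differences between consecutive press timestamps of one condition."""
    ordered = sorted(press_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def iti_summary(presses_by_condition):
    """Map each condition to (mean ITI, SD of ITI).

    Conditions with fewer than two intervals are skipped, because the
    sample standard deviation is undefined for them.
    """
    summary = {}
    for condition, times in presses_by_condition.items():
        itis = inter_tap_intervals(times)
        if len(itis) >= 2:
            summary[condition] = (mean(itis), stdev(itis))
    return summary


# Made-up press timestamps (msec) for one hypothetical condition.
summary = iti_summary({"Unpaced_R_1": [0, 600, 1210, 1800]})
```

Because each condition is handled on its own, no interval ever spans two conditions.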

I'll also be calculating the degree to which responses with either hand in
the bimanual condition are synchronised, by looking at the lag between
them.
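
As a rough illustration of that between-hand lag, assuming the data have already been arranged with one row per tap and columns named Press_R and Press_L (the target layout from the first post), the asynchrony is the difference of the two press times in each row. This is a Python sketch rather than SPSS syntax; the function name and the numbers are hypothetical.

```python
def hand_asynchronies(rows):
    """Left-minus-right press time for every row where both hands responded."""
    return [
        row["Press_L"] - row["Press_R"]
        for row in rows
        if row["Press_L"] is not None and row["Press_R"] is not None
    ]


# Made-up bimanual taps; the third tap has no left-hand press recorded.
rows = [
    {"Press_R": 0, "Press_L": 20},
    {"Press_R": 630, "Press_L": 610},
    {"Press_R": 1200, "Press_L": None},
]
lags = hand_asynchronies(rows)  # [20, -20]
```

Positive values mean the left hand pressed after the right. Rows with a missing press are left out rather than treated as zero lag.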

The hypothesis is that the neurological patients will have a more variable
inter-tap interval, and that they will show less synchronisation of the two
hands when tapping bimanually. It is also hypothesised that the inter-tap
interval variability of neurological patients will be disproportionately
affected in the unpaced condition. We may also look at how the two groups
compare in the ability to actually synchronise their response with the
tones. The analysis will generally focus on the tap onset (press) but we'll
also be calculating the average response duration (hence the need for both
press and release reaction time data).

I hope this makes my problem more clear.

Many thanks,

Jennifer


Zdaniuk, Bozena

disappearing data

In reply to this post by Jennifer Thompson

Hello, I opened a data file in SPSS 14 that was created in an earlier
version of SPSS, and about half of the cases did not appear in the
'data' window. The analyses still ran on all the cases even though I
could not see half of them. However, once I made a small change in the
file, the analyses started showing only the number of cases seen in the
file. The rest mysteriously disappeared... Any idea what happened and
how I can fix it? Thanks.
bozena

Bozena Zdaniuk, Ph.D.

University of Pittsburgh

UCSUR, 6th Fl.

121 University Place

Pittsburgh, PA 15260

Ph.: 412-624-5736

Fax: 412-624-4810

email: [hidden email]


Anton Balabanov

Stat question: means ratio test

Dear Listers,

I am stuck on the following question.
Suppose I have two samples from independent populations, and I want to
test the hypothesis that one mean is twice the other mean, that is:
H0: m1 = 2*m2.

It seems that doubling the values of the second sample (before using a
standard t-test) is not correct practice, because it will affect the variance.
Is there a simple way to do such a test and, if so, how can I do it
in SPSS?

Many thanks in advance,

Anton

Ornelas, Fermin

Re: Stat question: means ratio test

Correct me if my interpretation of this is not right.
But you do not go and alter the means of the second sample. What you
should do is calculate the means as they are and do the test to check
whether the means meet the condition that you just mentioned.

That is, your H0: m1_sample1 = 2*(m2_sample2);
Ha: m1_sample1 <> 2*(m2_sample2);
Then use the two-tailed t-table values to see whether your results reject or
fail to reject H0.
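
One way to write the test statistic for this hypothesis, assuming independent samples and not assuming equal variances, is given below. The notation (sample sizes n_1 and n_2, sample means \bar{x}_1 and \bar{x}_2, sample variances s_1^2 and s_2^2) is introduced here for illustration and does not appear in the thread.

```latex
t \;=\; \frac{\bar{x}_1 - 2\,\bar{x}_2}
             {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{4\,s_2^2}{n_2}}}
```

The factor 4 in the denominator arises because Var(2\bar{x}_2) = 4 Var(\bar{x}_2). So the standard error has to be rescaled along with the mean. The degrees of freedom can be approximated with the usual Welch-Satterthwaite formula, using s_1^2/n_1 and 4 s_2^2/n_2 as the two variance terms.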

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Anton Balabanov
Sent: Monday, March 05, 2007 8:17 AM
To: [hidden email]
Subject: Stat question: means ratio test



Ornelas, Fermin

Re: Stat question: means ratio test

In reply to this post by Anton Balabanov

Note also that your example is just multiplying the second mean by a
constant. For examples of how to do this type of testing, see the ANOVA
chapters of Neter et al., Applied Linear Regression, which cover tests on
means similar to the one you are doing.

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]


Anton Balabanov

Re: Stat question: means ratio test

In reply to this post by Ornelas, Fermin

Ah... Did you mean that I should calculate all the statistics (means and
standard deviations) and then compute the t-statistic "manually", substituting
2*m2 for the sample mean m2?

Indeed, that seems enough for me. I need to do it only once and do not need
a more general solution. Thank you very much!

> -----Original Message-----
> From: Ornelas, Fermin [mailto:[hidden email]]
> Sent: Monday, March 05, 2007 6:31 PM
> To: Anton Balabanov; [hidden email]
> Subject: RE: Stat question: means ratio test

Ornelas, Fermin

Re: Stat question: means ratio test

In reply to this post by Anton Balabanov

Yes, that is the way to proceed. Once you have your means, multiply the
second mean by the constant and calculate your t statistic, then look up the
t-table value for a two-tailed test (assuming that you are only
interested in whether the means differ from the hypothesized relationship)
with the corresponding degrees of freedom and significance level alpha.
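
A minimal sketch of that hand calculation follows, in Python rather than SPSS syntax, working from summary statistics only. It assumes independent samples and uses the unequal-variance (Welch) form, so the variance of the second mean is scaled by the square of the constant. The function name and the example numbers are hypothetical.

```python
from math import sqrt


def ratio_t_statistic(mean1, sd1, n1, mean2, sd2, n2, k=2.0):
    """t statistic and approximate df for H0: mu1 = k * mu2.

    Independent samples, unequal-variance (Welch) form.
    Var(k * mean2) = k**2 * sd2**2 / n2, hence the k**2 factor below.
    """
    v1 = sd1 ** 2 / n1
    v2 = (k ** 2) * sd2 ** 2 / n2
    t = (mean1 - k * mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Made-up summary statistics, for illustration only.
t, df = ratio_t_statistic(mean1=10.5, sd1=3.0, n1=36,
                          mean2=5.0, sd2=2.0, n2=64)
# Here t is about 0.707 with about 90 degrees of freedom.
```

The resulting t is then compared with the two-tailed table value at the chosen alpha for those degrees of freedom.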

Anytime, ...

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]

-----Original Message-----
From: Anton Balabanov [mailto:[hidden email]]
Sent: Monday, March 05, 2007 8:53 AM
To: Ornelas, Fermin; [hidden email]
Subject: RE: Stat question: means ratio test


Ornelas, Fermin

Re: Stat question: means ratio test

In reply to this post by Anton Balabanov

I see the point of your question. For example, in ANOVA with multiple
tests in SAS, one can work out manually which hypotheses about the means
one is interested in testing and put them as statements in the program.
Once the output comes out, the results are tested against the table
values.

Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
Tel: (602) 542-5639
E-mail: [hidden email]


Richard Ristow

Re: Data restructuring problem

In reply to this post by Jennifer Thompson

At 07:47 AM 3/5/2007, Jennifer Thompson wrote:

>I have files consisting of 5 variables:
>
>1. Condition - which is one of 3 basic tasks: tapping
>in synch with a sound (Synch), tapping without a sound (Unpaced) and
>tapping in synch with a sound in preparation for Unpaced (Paced), each
>of which is done separately with the right hand (R), left hand (L) and
>with both together (B), and repeated 3 times (so 27 different
>'conditions').
>
>2. RT - which is a timestamp for a response press or release
>
>3. Response - which is 8 for a press and 8.5 for a release with the
>right hand and 7 for a press / 7.5 for a release with the left hand
>
>4. soundon - the timestamp for the onset of the pacing sound (not
>present in the unpaced conditions)
>
>5. soundoff - the timestamp for the offset of the pacing sound (not
>present in the unpaced conditions)
>
>The timestamps occur consecutively in the file, regardless of
>whether they are a press or release, or produced with the right or
>left hand.
>
>I need to get it in a format where the variables are something like
>this:
>
>1. Condition
>2. Press_R (press with right hand)
>3. Release_R (release with right hand)
>4. Press_L (press with left hand)
>5. Release_L (release with left hand)
>6. Sound onset
>7. Sound offset

This is a bit of a bear, and I don't understand all your needs.

It looks like an experimental session is a run of what I'll call
'ticks', not being able to think of a better word. The 'ticks' are
about 600 msec apart (from your reply to ViAnn Beadle). A tick includes
various of the 'events' you've listed as 2-7, above. Some
events may be missing inadvertently (you wrote that a release may be
missing), some by the experimental condition (the pacing sound may not
be in use).

Presently, you have separate records for all presses and releases,
giving their times; the times of sound onset and offset are in the
press-release records.

I'd like to have some test data to play with, covering at least several
'ticks'. But it sounds like the approach is to:
a. Assign each record to some 'tick'. Perhaps any event more than
400 msec after the preceding one is part of a new 'tick'.
b. Probably use VARSTOCASES or XSAVE to create separate records for
the sound onset and offset times.
c. Code records by type, regardless of the order they appear in within
a tick.
d. Sort records by tick number and type.
e. Use CASESTOVARS to get the records you want.
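
The tick-assignment and reshaping steps (a, c, d and e) can be sketched roughly as follows. This is Python rather than SPSS syntax, so it only mirrors what a sort followed by CASESTOVARS would do; it leaves out the sound onset and offset times of step b. Two assumptions are added here and are not part of the outline above: the 400 msec threshold is a tunable parameter named gap, and a new tick is also started whenever the same event type turns up twice, so that a slow release followed quickly by the next press does not overwrite data. The function name to_wide and the sample data are hypothetical.

```python
# Step c: code each record by type, using the response codes from the post.
CODES = {8.0: "Press_R", 8.5: "Release_R", 7.0: "Press_L", 7.5: "Release_L"}


def to_wide(events, gap=400):
    """Reshape (condition, rt, response) records to one dict per tick."""
    rows = []
    prev_condition = prev_rt = None
    tick = 0
    # Step d: process records in condition and time order.
    for condition, rt, response in sorted(events, key=lambda e: (e[0], e[1])):
        column = CODES[response]
        if condition != prev_condition:
            tick, start_row = 1, True
        elif rt - prev_rt > gap or rows[-1][column] is not None:
            # Step a, plus the extra "column already filled" assumption.
            tick, start_row = tick + 1, True
        else:
            start_row = False
        if start_row:
            rows.append({"Condition": condition, "Tick": tick,
                         "Press_R": None, "Release_R": None,
                         "Press_L": None, "Release_L": None})
        # Step e: place the timestamp in the column for its event type.
        rows[-1][column] = rt
        prev_condition, prev_rt = condition, rt
    return rows


# Made-up bimanual data; the right-hand release of the second tick is missing.
events = [
    ("Synch_B_1", 0, 8), ("Synch_B_1", 20, 7),
    ("Synch_B_1", 150, 8.5), ("Synch_B_1", 170, 7.5),
    ("Synch_B_1", 610, 7), ("Synch_B_1", 630, 8), ("Synch_B_1", 760, 7.5),
]
wide = to_wide(events)
```

With this sample the result has two rows, and Release_R is None in the second one, which marks the missing release without any manual re-alignment.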

Sound good?

-Good luck,
Richard