SPSSX Discussion

creating a fixed variable per subject ID in longitudinal dataset

Bonnie Solomon

creating a fixed variable per subject ID in longitudinal dataset

I found a way to do this, but suspect there is a more straightforward way.
I have a large data file where each subject ID appears in numerous rows (i.e.,
it's longitudinal data), and for each ID, a single row contains an important
value that I want to copy into all other rows for the same ID. The
value of interest does not always appear in the same row position per ID: sometimes
it's the first row, sometimes the last, or anywhere in between. So the file
looks something like this (but with many more variables):

SUBJECT 1 - DESIRED VALUE
SUBJECT 1
SUBJECT 1
SUBJECT 2
SUBJECT 2
SUBJECT 2 - DESIRED VALUE
SUBJECT 2
SUBJECT 3
SUBJECT 3 - DESIRED VALUE

I ended up saving a copy of the file, deleting cases with missing values on
this particular variable, and then merging the two files, matching on
subject ID. That worked fine, but I think there must be a way to tell the
program that all rows with the same subject ID are related, and therefore to
copy a value into all related rows. Does anyone know if there is?

JKRockStomper

Re: creating a fixed variable per subject ID in longitudinal dataset

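* Sort so that, within each ID, the row holding the value comes first.
* System-missing values sort lowest, so descending order puts them last.
SORT CASES BY subject_id (A) desired_variable (D).
* Copy the value down to the remaining rows of the same ID.
DO IF subject_id = LAG(subject_id).
COMPUTE desired_variable = LAG(desired_variable).
END IF.
EXECUTE.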

Rich Ulrich

Re: creating a fixed variable per subject ID in longitudinal dataset

In reply to this post by Bonnie Solomon

AGGREGATE will let you save one value to all the records of an ID.

If you know this one by the fact that it is non-missing, and there
is only one value that is non-missing across the ID, then you can use
any of a bunch of functions. Make sure of that one-value assumption....
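
One way to check that one-value assumption before copying anything is to count the non-missing values per ID. This is only a sketch: it borrows the variable names id and dv from the example data further down the thread, and n_valid is a hypothetical name for the new count variable.

```spss
* Count the non-missing dv values within each ID.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /n_valid=NU(dv).
* List the rows of any ID that does not have exactly one value.
TEMPORARY.
SELECT IF (n_valid NE 1).
LIST id dv n_valid.
```

If the LIST shows no cases, every ID has exactly one non-missing value, and FIRST, LAST, MAX or MIN should all return the same result for dv.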

--
Rich Ulrich

> Date: Thu, 21 Jul 2011 14:22:22 -0400

> From: [hidden email]
> Subject: creating a fixed variable per subject ID in longitudinal dataset
> To: [hidden email]
[snip]

Bruce Weaver

Re: creating a fixed variable per subject ID in longitudinal dataset

I like Rich's suggestion. Here's an example:

data list list / id dv (2f5.0).
begin data
1 5
1
1
2
2
2 8
2
3
3 2
end data.

AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=id
/dv_first=FIRST(dv).

list.

If you want to overwrite the desired value variable (DV), then you have to change the AGGREGATE a bit, like this:

AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES
/BREAK=id
/dv=FIRST(dv).

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

David Marso

Re: creating a fixed variable per subject ID in longitudinal dataset

MAX might be better than FIRST if dv is a string variable.
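
The reasoning, as I understand it: a blank string is a valid value rather than system-missing (unless it has been declared user-missing), so FIRST can return the blank from the first row of an ID. A blank sorts below ordinary text, so MAX picks up the one non-blank value instead. A minimal sketch, assuming a hypothetical string variable dv_str that is blank wherever no value was recorded (dv_str_max is likewise a made-up name):

```spss
* dv_str is a hypothetical string version of dv, blank where no value was recorded.
* MAX returns the highest string value within each ID, which is the non-blank one.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /dv_str_max=MAX(dv_str).
```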

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"