How to add records with missing data for missing observations

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

How to add records with missing data for missing observations

Steve West
Hope someone can help me with this.

I have a data set made up of observations of monthly patient visits to a
clinic over a five-year period of time. Patients came in about once every
four months to have their blood pressure checked, but this varies and some
patients have only a few visits while others have several. The data looks
like this, where V1-V3 are background variables, Month is number of months
since firs visit, and BP is blood pressure reading:

ID V1 V2 V3 Month BP
01 1  3  2  1     120
01 1  3  2  4     119
01 1  3  2  10    110
02 2  1  1  1     90
02 2  1  1  7     100
03 1  2  2  1     110
03 1  2  2  5     111
03 1  2  2  7     110
03 1  2  2  9     115


I need to restructure the data set so that each case has 60 records, one for
each month over the five-year period, by inserting records that has the
patient ID number,their background variables, the number for the month,  but
only has missing values for the Blood pressure reading for those months
where the patient did not come in for a visit.

Any suggestions for how to program this? The data file has about 600
patients, each with an average of 7 monthly observations.

Thanks.

Steve

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: How to add records with missing data for missing observations

Bruce Weaver
Administrator
Steve West wrote
Hope someone can help me with this.

I have a data set made up of observations of monthly patient visits to a
clinic over a five-year period of time. Patients came in about once every
four months to have their blood pressure checked, but this varies and some
patients have only a few visits while others have several. The data looks
like this, where V1-V3 are background variables, Month is number of months
since firs visit, and BP is blood pressure reading:

ID V1 V2 V3 Month BP
01 1  3  2  1     120
01 1  3  2  4     119
01 1  3  2  10    110
02 2  1  1  1     90
02 2  1  1  7     100
03 1  2  2  1     110
03 1  2  2  5     111
03 1  2  2  7     110
03 1  2  2  9     115


I need to restructure the data set so that each case has 60 records, one for
each month over the five-year period, by inserting records that has the
patient ID number,their background variables, the number for the month,  but
only has missing values for the Blood pressure reading for those months
where the patient did not come in for a visit.

Any suggestions for how to program this? The data file has about 600
patients, each with an average of 7 monthly observations.

Thanks.

Steve
The same issue came up in the "how to restructure data" thread earlier this month.  Richard Ristow gave a nice solution using XSAVE inside a LOOP.  For your situation, it would look something like this.

* Open your data file, keeping just the ID variable .

get file = "C:\temp\the_data.sav" / keep = id.

* Keep only the first record for each ID .
* The following assumes records are already sorted by ID .

select if ($casenum EQ 1) or (ID NE lag(ID)).
exe.

* Use LOOP and XSAVE to write out a file that
* has 60 rows per ID.  Note that XSAVE can only
* write to a disk-file, it cannot write to a data set.

loop Month = 1 to 60.
- xsave outfile = "C:\Temp\All_Months.sav" .
end loop.
exe.

* Open the file with 60 records per ID, and merge with the original file.

get file = "C:\Temp\All_Months.sav" .
match files
 file = * /
 file = "C:\temp\the_data.sav" /
 by ID month.
exe.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).