(no subject)

(no subject)

Suh-Ing Amy Hsieh
Hi listers:



I am sorry for my previous post. I will do my best to explain the problem.



The original large claims data sets (hospitalization) are "dd2001, dd2002,
dd2003, dd2004, dd2005, and dd2006." They are monthly claims data and have
the same variables. If a patient was hospitalized past the monthly
reporting date, the claims data have >= 1 record for that patient with the
same admission and discharge dates. I saw one patient (identified by id,
birthday, in_date, and out_date) who was hospitalized for >= 1 year; the
claims data had around 12 records (lines, or rows) with the same admission
date (e.g., 20010101) and discharge date (e.g., 20020202). In_date is the
admission date and out_date is the discharge date.







My target population is adults (>= 18 years) with hematological cancers
who received a bone marrow transplant (BMT) from 2001 to 2005. First, I
selected hematological cancers from dd2001 to dd2006 using ICD-9-CM
diagnostic codes (icd9cd to icd9cd4) and appended the annual data sets as
DATA1. Second, I limited the target population to patients undergoing BMT
using 10 ICD-9-CM procedure codes (icdopcd to icdopcd4); the 10 procedure
codes for BMT run from 4100 to 4109. Third, I converted the birthday and
admission dates and calculated ages. Fourth, I recoded age into 2 groups
and selected age >= 18 years. Fifth, I created an index dd2001_2006 by
aggregating (selecting the first and last records and summing the
different fees) and merging (adding cases again). Thus, DATA1 is an index
dd2001_2006 with only 1 record per patient. If patients received a 2nd,
3rd, 4th, or subsequent BMT, those variables will be added to DATA1 under
different variable names. It is occasionally hard to judge the admission
date for BMT alone because of coding problems, so I need the pre-BMT
chemotherapy records to check and to decide whether to exclude patients.
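
The second and fifth steps described above might look roughly like this in SPSS syntax. This is an untested sketch, not the syntax actually used for DATA1: it assumes icdopcd to icdopcd4 are adjacent string variables in the file, and the names bmt, n_rows, dx_sum, room_sum, drug_sum and med_sum are invented for the illustration.

```spss
* Step 2, sketched: flag records with any BMT procedure code 4100 to 4109.
COMPUTE bmt = 0.
DO REPEAT p = icdopcd TO icdopcd4.
  IF (p >= '4100' AND p <= '4109') bmt = 1.
END DO REPEAT.
SELECT IF (bmt = 1).

* Step 5, sketched: collapse repeated monthly rows to one row per admission.
* Only the break variables and the new sums remain in the result.
AGGREGATE
  /OUTFILE=*
  /BREAK=id birthday in_date out_date
  /n_rows = N
  /dx_sum room_sum drug_sum med_sum = SUM(dx_am room_am drug_am med_am).
```

Whether the fees should really be summed across the monthly rows, and which diagnosis or procedure codes should be carried over, depends on how the claims were billed, so treat the AGGREGATE functions here as placeholders.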







The 2 outcomes are overall survival (from Jan 1, 2001 to Dec 31, 2005) and
readmission within 30 days of discharge. The death and date-of-death
variables already exist in DATA1 for several patients because those
patients died during BMT. The overall survival variables for the remaining
patients, who survived BMT, will be obtained from dd2001 to dd2006. The
readmission variable (with or without readmission) will also be obtained
from dd2001 to dd2006. Hence, I created syntax that selects those adult
BMT patients by their unique ID (length 32) and saved the result as
"DATA2." However, DATA2 includes all records (rows): pre-, during-, and
post-BMT. I am trying to work out how to write syntax that keeps the
pre-BMT chemotherapy records as one dataset and the post-BMT records as
another, or that drops the BMT records from DATA2.
The key variables for identifying pre-, during-, or post-BMT records are
the admission and discharge dates of each record from dd2001 to dd2006,
since a patient's records all share the same id and birthday. The in_date
and out_date of pre-BMT records fall before the in_date and out_date of
the BMT procedure, whereas those of post-BMT records fall after them.
Please see the examples below:







DATA1 (Index dd2001_2006 -> only BMT records):

id            id_sex  birthday  in_date   out_date  e_bedd  tran_cd  icd9cd  icd9cd1  icdopcd  icdopcd1  dx_am  room_am  drug_am  med_am…
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4103     9925      11664  44160    315227   473461
1134ac34c6..  M       19751122  20050719  20051130  134     4        20153   99685    4105     8607      69120  904218   722973   2579172
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
ab2457cdg3..  M       19501030  20050413  20050720  98      3        20500   2880     9925     4105      40099  358053   831632   1482244







DATA2 (including pre-BMT, during BMT, and post-BMT records):

id            id_sex  birthday  in_date   out_date  e_bedd  tran_cd  icd9cd  icd9cd1  icdopcd  icdopcd1  dx_am  room_am  drug_am  med_am…
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
2456b578ef..  F       19690516  20031025  20031204  40      2        20400   2880     Blank    Blank     9963   34155    59627    177133
2456b578ef..  F       19690516  20031025  20031204  40      4        20400   486      0392     9925      2184   7245     55737    88606
1122ab33c5..  F       19580210  20030805  20031001  57      2        20500   03482    9925     8607      15942  61320    237694   462431
1122ab33c5..  F       19580210  20030805  20031001  1       3        20500   1975     9925     8607      546    1095     0        2005
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4101     8607      69120  904218   722973   2579172
ab2457cdg3..  M       19501030  20050413  20050720  49      2        20500   2880     9925     3893      16107  55125    364826   633212
ab2457cdg3..  M       19501030  20050413  20050720  30      2        20500   2880     9925     3893      15075  196530   119210   471444
ab2457cdg3..  M       19501030  20050413  20050720  19      3        20500   03842    3324     9925      10469  147747   80434    295218
ab2457cdg3..  M       19501030  20050817  20051011  45      2        20500   Blank    Blank    Blank     13885  50625    190254   418414
ab2457cdg3..  M       19501030  20050817  20051011  10      5        20500   2880     9925     Blank     3573   11250    95807    173013








The same color stands for the same patient.
Please show me how to create syntax that keeps the pre-BMT and post-BMT
records as two separate files. Thank you so much.
Amy Hsieh

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Syntax for keeping or dropping records

Richard Ristow
(It is helpful always to use a subject line, and to keep the subject
line the same for follow-ups in the same thread.)

At 01:55 PM 12/19/2008, SUH-ING (AMY) HSIEH wrote:

>I am so sorry for the prior posted information. I do my best to explain it.

Here, I'm editing what you wrote, for readability (and greetings to
umaryland, since I'm briefly in the Washington area):

>The original big claims data (hospitalization) are "dd2001, dd2002,
>dd2003, dd2004, dd2005, and dd2006" They are monthly claims data and
>have same variables. If patients were hospitalized longer than the
>monthly reporting date, the claims data had >= 1 record for the
>patients at the same admission and discharge dates. I saw one
>patient (identified by id, birthday, in_date, and out_date) who was
>hospitalized for >= 1 year, the claims data had around 12 records
>(or lines or rows) at the same date of admission (e.g., 20010101)
>and discharge (e.g., 20020202). In_date is the admission date and
>the out_date is discharge date.
>
>My target population is adults (>= 18 years) with hematological
>cancers receiving bone marrow transplant (BMT) from 2001 to 2005.
>[Details of selection logic omitted.] DATA1 is an index dd2001_2006
>and only 1 record per patient.
>
>2 outcomes are overall survival (from Jan 1, 2001 to Dec 31, 2005)
>and 30 day readmission of discharge. The variables of death and date
>of death have existed in the DATA1 for several patients because
>patients have died during BMT. Thus, the variables of overall
>survival for remaining patients, who survive during BMT, will be
>obtained from dd2001 to dd2006 [of other records?]. Also, the
>variable of with readmission or without readmission will be obtained
>from dd2001 to dd2006 [of other admission records?] again. I have
>created syntax for selecting those adult patients undergoing BMT
>using their unique ID (32 length) and saved as "DATA2" However,
>data2 include all records (rows) with respect to pre-, during, and
>post-BMT records.  I am thinking how to create syntax for keeping
>pre-BMT chemotherapy records as one dataset and post-BMT records as
>one dataset or dropping BMT records from DATA2.
>
>The key variables for identifying pre-, during, or post-BMT are each
>admission date and discharge date from dd2001 to dd2006, although
>patients have same id and birthday. The in_date and out_date of
>pre-BMT records occur before in_date and out_date of BMT procedures,
>whereas the in_date and out_date of post-BMT occur after in_date and
>out_date of BMT procedures. Please see below examples:
========
The test data came through very, very badly unwrapped, not only with
every column head and datum on a separate line, but many additional
line breaks. (I wonder why that happens so often?) See if this is
easier to understand:

>DATA1 (Index dd2001_2006 [with?] only BMT records):
>
>  id           Id_sex  birthday    In_date     Out_date
>
>  1122ab33c5..    F    19580210    20011215    20020208
>  1134ac34c6..    M    19751122    20050719    20051130
>  2456b578ef..    F    19690516    20030113    20030204
>  ab2457cdg3..    M    19501030    20050413    20050720
>
>  E_bedd    Tran_cd Icd9cd    Icd9cd1 icdopcd Icdopcd1
>  48        1       20500      6822   4103    9925
>  134       4       20153      99685  4105    8607
>  22        3       20021      2880   8607    4101
>  98        3       20500      2880   9925    4105
>
>DATA2 (including pre-BMT, during BMT, and post-BMT records):
>
>  id           Id_sex  birthday In_date  Out_date
>  1122ab33c5.. F       19580210 20030805 20031001
>  1122ab33c5.. F       19580210 20030805 20031001
>  1122ab33c5.. F       19580210 20011215 20020208
>  1134ac34c6.. M       19751122 20050719 20051130
>  2456b578ef.. F       19690516 20030113 20030204
>  2456b578ef.. F       19690516 20031025 20031204
>  2456b578ef.. F       19690516 20031025 20031204
>  ab2457cdg3.. M       19501030 20050413 20050720
>  ab2457cdg3.. M       19501030 20050413 20050720
>  ab2457cdg3.. M       19501030 20050413 20050720
>  ab2457cdg3.. M       19501030 20050817 20051011
>  ab2457cdg3.. M       19501030 20050817 20051011
>
>  E_bedd    Tran_cd Icd9cd    Icd9cd1 icdopcd Icdopcd1
>  57        2       20500      03482   9925    8607
>  1         3       20500      1975    9925    8607
>  48        1       20500      6822    4103    9925
>  134       4       20153      99685   4105    8607
>  22        3       20021      2880    8607    4101
>  40        2       20400      2880    Blank   Blank
>  40        4       20400      486     0392    9925
>  49        2       20500      2880    9925    4105
>  30        2       20500      2880    9925    3893
>  19        3       20500      03842   3324    9925
>  45        2       20500      Blank   Blank   Blank
>  10        5       20500      2880    9925    Blank
>
>  Dx_am       Room_am       Drug_am    Med_am
>  15942       61320         237694    462431
>  546         1095          0         2005
>  69120       904218        722973    2579172
>  69120       904218        722973    2579172
>  11897       137262        138717    378661
>  9963        34155         59627     177133
>  2184        7245          55737     88608
>  16107       55125         364826    633212
>  15075       196530        119210    471444
>  10469       147747        80434     295218
>  13885       50625         190254    418414
>  3573        11250         95807     173013

It sounds like you want to attach data regarding the bone-marrow
transplant (BMT) from DATA1, to every record in DATA2, for selection
and comparison.

See how far this gets you. It's not tested, and I don't think I
understand everything:

  id           Id_sex  birthday    In_date     Out_date

  1122ab33c5..    F    19580210    20011215    20020208
  1134ac34c6..    M    19751122    20050719    20051130
  2456b578ef..    F    19690516    20030113    20030204
  ab2457cdg3..    M    19501030    20050413    20050720

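* Read the one-record-per-patient BMT index, keeping only the id and the
  BMT admission and discharge dates under new names.
GET FILE=DATA1
   /RENAME= (In_date  Out_date =
             BMT_InDt BMT_OutDt)
   /KEEP= id BMT_InDt BMT_OutDt.
SORT CASES BY id.

* Attach the BMT dates to every DATA2 record with the same id.
* DATA2 must also be sorted by id before this step.
MATCH FILES
   /TABLE=*
   /FILE =DATA2
   /BY   id.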

Then, you have the BMT dates on all the admission records, and you
can compare for 'before' and 'after', etc.  I assume that you have
all your dates stored as SPSS date-format variables; if you don't, you should.

-Good luck, and onward,
  Richard
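
If the dates are in fact stored as numbers like 20030113 rather than as SPSS dates, a conversion along the following lines could be run on the matched file. It is an untested sketch that assumes In_date, Out_date, BMT_InDt and BMT_OutDt are all numeric yyyymmdd values; the names d_in, d_out, d_bmt_in, d_bmt_out and days_after are made up for the example.

```spss
* Assumes the four source variables hold numbers such as 20030113.
DO REPEAT raw = In_date Out_date BMT_InDt BMT_OutDt
         /dt  = d_in d_out d_bmt_in d_bmt_out.
  COMPUTE dt = DATE.MDY(TRUNC(MOD(raw, 10000) / 100),
                        MOD(raw, 100), TRUNC(raw / 10000)).
END DO REPEAT.
FORMATS d_in d_out d_bmt_in d_bmt_out (DATE11).

* Days from BMT discharge to this admission, negative if admitted earlier.
COMPUTE days_after = DATEDIFF(d_in, d_bmt_out, "days").
```

Plain yyyymmdd numbers still sort in calendar order, so a simple before/after comparison would work without converting, but an interval such as a 30-day readmission window needs real date values.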

Re: Syntax for keeping or dropping records

Maguin, Eugene
In reply to this post by Suh-Ing Amy Hsieh
Amy,

You should know that color coding does not come through to the list.

OK. I've rearranged what you posted into a far more usable structure (see
below). To summarize how I now understand things: you have two files, Data1
and Data2, each made as you describe. Data1 is a file of patients meeting
your selection criteria and having one record per patient. That record is
for the bone marrow transplant (BMT) treatment. Data2 has multiple records
per patient, each record being an incident of chemotherapy. You want to
separate the chemotherapy incidents in Data2 into two groups based on the
BMT incident date in Data1.

I'm now going to assume that you are very skilled with SPSS. I think you can
do a match files using the table subcommand to match Data1 as the table file
to Data2 using ID. I think you need only a subset of the variables in Data1,
probably just ID and the in and out date variables. This little operation
explicitly assumes that you have exactly one record per patient in Data1 and
exactly one record in Data2 for each combination of ID and in and out date.
If you don't, then you have more trouble. Not insurmountable trouble, but
definitely more.
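
One way to check the one-record-per-patient assumption before matching might be the following. It is only a sketch, meant to be run with Data1 as the active dataset, and n_per_id is a made-up variable name.

```spss
* Count the records that share each id and add the count to every record.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /n_per_id = N.

* Any value above 1 in this table means an id occurs more than once.
FREQUENCIES VARIABLES=n_per_id.
```

The same idea, with id and the in and out date variables together on the BREAK subcommand, would show whether Data2 has repeated records for a single admission.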

Once the match files is complete, you can compare in and out dates from the
Data2 records  against those from the Data1 records to identify pre and post
BMT incidents.

Does this help you?

Gene Maguin


****************************************
The examples of data are messy. So, I repost it again. The original big
claims data (hospitalization) are “dd2001, dd2002, dd2003, dd2004,
dd2005, and dd2006.” They are monthly claims data and have same
variables. If patients were hospitalized longer than the monthly reporting
date, the claims data had > 1 record for the patients at the same admission
and discharge dates. I saw one patient (identified by id, birthday, in_date,
and out_date) who was hospitalized for > 1 year, the claims data had around
12 records (or lines or rows) at the same date of admission (e.g., 20010101)
and discharge (e.g., 20020202). In_date is the admission date and the
out_date is discharge date.

My target population is adults (>= 18 years) with hematological cancers
receiving bone marrow transplant (BMT) from 2001 to 2005. First, I have
selected hematological cancers from dd2001 to dd2006 using ICD-9-CM
diagnostic codes (from icd9cd to icd9cd4) and added annual data set as
DATA1. Second, I have limited the target population to patients undergoing
BMT using 10 ICD-9-CM procedure codes (from icdopcd to icdopcd4). 10
ICD-9-CM procedure codes for BMT are from 4100 to 4109. Third, I converted
birthday and admission dates and calculated ages. Fourth, I recoded age into
2 groups and selected age >= 18 years old.  Fifth, I have created an index
dd2001_2006 using aggregating (selecting the first record and last record
and summing different fees) and merging functions (adding cases again).
Thus, DATA1 is an index dd2001_2006 and only 1 record per patient. If
patients had received 2nd, 3rd, 4th, or subsequent BMT, those variables will
be added to the DATA1 using different names of variables.  It is
occasionally hard to judge the admission date only for BMT due to coding
problems so that I need pre-BMT chemotherapy records for checking and making
decisions (exclude or not exclude patients).

2 outcomes are overall survival (from Jan 1, 2001 to Dec 31, 2005) and 30
day readmission of discharge. The variables of death and date of death have
existed in the DATA1 for several patients because patients have died during
BMT. Thus, the variables of overall survival for remaining patients, who
survive during BMT, will be obtained from dd2001 to dd2006. Also, the
variable of with readmission or without readmission will be obtained from
dd2001 to dd2006 again. Hence, I have created syntax for selecting those
adult patients undergoing BMT using their unique ID (32 length) and saved as
“DATA2.” However, data2 include all records (rows) with respect
to pre-, during, and post-BMT records.  I am thinking how to create syntax
for keeping pre-BMT chemotherapy records as one dataset and post-BMT records
as one dataset or dropping BMT records from DATA2.  The key variables for
identifying pre-, during, or post-BMT are each admission date and discharge
date from dd2001 to dd2006, although patients have same id and birthday. The
in_date and out_date of pre-BMT records occur before in_date and out_date of
BMT procedures, whereas the in_date and out_date of post-BMT occur after
in_date and out_date of BMT procedures. Please see below examples:


DATA1 (Index dd2001_2006 -> only BMT records):

id            Id_sex  birthday  In_date   Out_date  E_bedd  Tran_cd  Icd9cd  Icd9cd1  icdopcd  Icdopcd1  Dx_am  Room_am  Drug_am  Med_am
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4103     9925      11664  44160    315227   473461
1134ac34c6..  M       19751122  20050719  20051130  134     4        20153   99685    4105     8607      69120  904218   722973   2579172
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
ab2457cdg3..  M       19501030  20050413  20050720  98      3        20500   2880     9925     4105      40099  358053   831632   1482244


DATA2 (including pre-BMT, during BMT, and post-BMT records):

id            Id_sex  birthday  In_date   Out_date  E_bedd  Tran_cd  Icd9cd  Icd9cd1  icdopcd  Icdopcd1  Dx_am  Room_am  Drug_am  Med_am
1122ab33c5..  F       19580210  20030805  20031001  57      2        20500   03482    9925     8607      15942  61320    237694   462431
1122ab33c5..  F       19580210  20030805  20031001  1       3        20500   1975     9925     8607      546    1095     0        2005
1122ab33c5..  F       19580210  20011215  20020208  48      1        20500   6822     4103     9925      69120  904218   722973   2579172
1134ac34c6..  M       19751122  20050719  20051130  134     4        20153   99685    4105     8607      69120  904218   722973   2579172
2456b578ef..  F       19690516  20030113  20030204  22      3        20021   2880     8607     4101      11897  137262   138717   378661
2456b578ef..  F       19690516  20031025  20031204  40      2        20400   2880     Blank    Blank     9963   34155    59627    177133
2456b578ef..  F       19690516  20031025  20031204  40      4        20400   486      0392     9925      2184   7245     55737    88608
ab2457cdg3..  M       19501030  20050413  20050720  49      2        20500   2880     9925     4105      16107  55125    364826   633212
ab2457cdg3..  M       19501030  20050413  20050720  30      2        20500   2880     9925     3893      15075  196530   119210   471444
ab2457cdg3..  M       19501030  20050413  20050720  19      3        20500   03842    3324     9925      10469  147747   80434    295218
ab2457cdg3..  M       19501030  20050817  20051011  45      2        20500   Blank    Blank    Blank     13885  50625    190254   418414
ab2457cdg3..  M       19501030  20050817  20051011  10      5        20500   2880     9925     Blank     3573   11250    95807    173013

Please show me how to create syntax for keeping pre-BMT and post-BMT records
as two separated files. Thank you so much.

Re: Syntax for keeping or dropping records

Clive Downs
Hi Gene and Amy,

Assuming Gene's interpretation of the problem is correct, I have suggested
some syntax to do what I think is needed. I have used a highly simplified
version of the two datasets that I hope captures the essential part of the
problem. The second dataset has records of chemo only for patient 001.

The syntax should identify each chemo record as pre, during or post. You
can then filter as needed.

I hope this helps.

--------------------------------------------------------

* set up data 1 patients and BMT in and out dates.
*--------------------------------------------------------------------.

DATA LIST FREE/  id(A3)  BMin(DATE11) BMout(DATE11).
BEGIN DATA
001 1/Mar/2002 30/Jun/2002
002 1/Jun/2003 31/Dec/2003
003 1/Jan/2004 30/Nov/2004
END DATA.


SAVE OUTFILE='H:\SPSS-listserve\BMTdata.sav'
 /COMPRESSED.

* set up data 2 - same patients but chemo in and out dates , with comment
showing pre-, during- or post BMT.
*------------------------------------------------.

DATA LIST FREE/  id(A3)  Chemo_in(DATE11) Chemo_out(DATE11) comment(A6).
BEGIN DATA
001  1/Jan/2002  15/Jan/2002   pre
001  1/Apr/2002  15/Apr/2002   during
001  1/Jul/2002  15/Jul/2002   post
END DATA.

SAVE OUTFILE='H:\SPSS-listserve\Chemodata.sav'
 /COMPRESSED.
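* re-open the chemo data as the active dataset (it is already in id order).
GET
  FILE='H:\SPSS-listserve\Chemodata.sav'.
DATASET NAME Chemo WINDOW=FRONT.

* match files to get BMT dates for each patient.
* both files must be sorted by id.
MATCH FILES /FILE=*
 /TABLE='H:\SPSS-listserve\BMTdata.sav'
 /BY id.
EXECUTE.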

* save resulting dataset with matched records.
SAVE OUTFILE='H:\SPSS-listserve\ChemoBMTmatched.sav'
 /COMPRESSED.

* work out if chemo is pre- during-  or post- BMT or exception (code 1, 2,
3, or 4).
*-----------------------------------------------------------------------.
DO IF Chemo_in < BMin AND Chemo_out < BMin.
COMPUTE time = 1.
ELSE IF Chemo_in >= BMin AND Chemo_out <= BMout.
COMPUTE time = 2.
ELSE IF Chemo_in > BMout.
COMPUTE time = 3.
ELSE.
COMPUTE time = 4.
END IF.
EXECUTE.
*Define Variable Properties.
*time.
VALUE LABELS  time
     1  'pre'
     2  'during'
     3  'post'
     4 'exception'.

Regards

Clive.
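
Since the original request was for two separate files, one way to write them out once time has been computed might be the following. It is a sketch only: it assumes the matched file is still the active dataset, and the two output file names are placeholders.

```spss
* Keep only the pre-BMT records for this one SAVE.
TEMPORARY.
SELECT IF (time = 1).
SAVE OUTFILE='pre_BMT.sav'.

* Keep only the post-BMT records for this one SAVE.
TEMPORARY.
SELECT IF (time = 3).
SAVE OUTFILE='post_BMT.sav'.
```

Because each SELECT IF follows TEMPORARY, it applies only to the SAVE right after it, so the full matched file stays available afterwards, including the records coded as during or exception.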
