|
Hi guys, I need some help here.
I have a dataset with hospitalization events (1 per row) with the date that the patient was admitted (EntryDate) and discharged (OutDate). The situation is that the hospitals "break" these dates in order to charge the health operators time by time. So, I need to correct the dates of the rows that are for the same patient and have sequential dates or are between the dates of another event. Below I paste an example: Patient ID EntryDate OutDate 0000001 14/nov/14 19/nov/14 0000001 14/nov/14 14/nov/14 0000001 15/nov/14 15/nov/14 0000001 19/nov/14 22/nov/14 These 4 rows represent the same event (you can see that every date is sequential - its starts on 14/nov/14 and goes until 22/nov/14), but it was broken by the hospital in order to charge the health operators more often. One of the solutions I thought was to compare one row with the previous row to check if the EntryDate is between the previous EntryDate and OutDate OR if the EntryDate is equal or ate least have only one day of difference of the previous OutDate. The problem is that this logical test fails in the last row, because I am only able to correct the one date variable at a time. Can someone help me with this? The database has more than 1 million rows so manually correcting is impossible. I'm using SPSS version 24. Thanks! -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Felipe It might work to use the aggregate function to create min max entry dates . These dates with then show up on each record for a given patient thus you do not have to use the lag function to compare the previous records. Maybe something like this would work. Not sure about the details but maybe this is a start. SORT CASES BY PT_ID. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /PRESORTED /BREAK=PT_ID /Entry_min=MIN(entrydate) /Entry_max=MAX(entrydate). Bill On Sat, May 5, 2018 at 9:31 PM, FelipeMS <[hidden email]> wrote: Hi guys, I need some help here. William N. Dudley, PhD 437-L Coleman BuildingProfessor - Public Health Education The School of Health and Human Sciences The University of North Carolina at Greensboro Greensboro, NC 27402-6170 See my research on ResearchGate VOICE 336.256 2475 ![]() |
|
Aggregate will work if you assume a patient ID is unique. If a single patient
can have multiple stays though AGGREGATE will not quite work. See the example below and one potential option for code to work around it -- I imagine though you can find some tricky cases that this does not work for (especially with missing data). *****************************************************. DATA LIST FREE /PatientID (N7) EntryDate OutDate (2DATE8). BEGIN DATA 0000001 14/nov/14 19/nov/14 0000001 14/nov/14 14/nov/14 0000001 15/nov/14 15/nov/14 0000001 19/nov/14 22/nov/14 0000001 25/nov/14 30/nov/14 END DATA. DATASET NAME TestDays. DATASET ACTIVATE TestDays. *Note case #5 should be a separate set of days. SORT CASES BY PatientID EntryDate OutDate. DO IF PatientID <> LAG(PatientID) OR $casenum = 1. COMPUTE Set = 1. COMPUTE End = OutDate. ELSE. *Due to sort, this rows EntryDate should be larger. *Or equal to prior rows EntryDate. DO IF EntryDate <= LAG(End). COMPUTE Set = LAG(Set). COMPUTE End = MAX(LAG(End),OutDate). ELSE. COMPUTE Set = LAG(Set) + 1. COMPUTE End = OutDate. END IF. END IF. EXECUTE. *Before this will want to make sure OutDate >= EntryDate. *And that there is no missing data. *Now we can aggregate up to the different sets of days per each patient. DATASET DECLARE Stays. AGGREGATE OUTFILE='Stays' /BREAK PatientID Set /BeginTime = MIN(EntryDate) /EndTime = MAX(OutDate). *****************************************************. ----- Andy W [hidden email] http://andrewpwheeler.wordpress.com/ -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by William Dudley-2
Hi William, Many thanks for sparing some time to try to solve my problem!Felipe 2018-05-06 15:53 GMT-03:00 William Dudley <[hidden email]>:
|
|
In reply to this post by Andy W
Hi Andy,
Yes, I have multiple events for the same patient. I'm defining separated events as those that have more than 1 day of difference. I will try your output later and I will provide here a feedback, but many thanks for trying to help too! -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Administrator
|
In reply to this post by Andy W
Here is a similar approach. Probably similar caveats ;-)
-- DATA LIST FREE /PatientID (N7) EntryDate OutDate (2DATE9). BEGIN DATA 0000001 14/nov/14 19/nov/14 0000001 14/nov/14 14/nov/14 0000001 15/nov/14 15/nov/14 0000001 19/nov/14 22/nov/14 0000001 25/nov/14 30/nov/14 0000002 14/nov/14 19/nov/14 0000002 14/nov/14 14/nov/14 0000002 15/nov/14 15/nov/14 0000002 19/nov/14 22/nov/14 0000002 25/nov/14 30/nov/14 END DATA. DATASET NAME TestDays. DATASET ACTIVATE TestDays. *Note cases #5 and 10 should be a separate sets of days. SORT CASES BY PatientID EntryDate OutDate. IF ($CASENUM EQ 1 OR PatientID NE LAG(PatientID)) Group=1. IF PatientID EQ LAG(PatientID) Group=SUM(LAG(Group),DATEDIFF(EntryDate,LAG(OutDate),"days") GT 1). AGGREGATE OUTFILE * MODE ADDVARIABLES /BREAK PatientID Group /NewEntry =MIN(EntryDate)/NewOut=MAX(OutDate). IF PatientID EQ LAG(PatientID) Group=SUM(LAG(Group),DATEDIFF(NewEntry,LAG(NewOut),"days") GT 1). AGGREGATE OUTFILE * MODE ADDVARIABLES OVERWRITE=YES /BREAK PatientID Group /NewEntry =MIN(EntryDate)/NewOut=MAX(OutDate). Andy W wrote > Aggregate will work if you assume a patient ID is unique. If a single > patient > can have multiple stays though AGGREGATE will not quite work. See the > example below and one potential option for code to work around it -- I > imagine though you can find some tricky cases that this does not work for > (especially with missing data). > > *****************************************************. > DATA LIST FREE /PatientID (N7) EntryDate OutDate (2DATE8). > BEGIN DATA > 0000001 14/nov/14 19/nov/14 > 0000001 14/nov/14 14/nov/14 > 0000001 15/nov/14 15/nov/14 > 0000001 19/nov/14 22/nov/14 > 0000001 25/nov/14 30/nov/14 > END DATA. > DATASET NAME TestDays. > DATASET ACTIVATE TestDays. > *Note case #5 should be a separate set of days. > > SORT CASES BY PatientID EntryDate OutDate. > DO IF PatientID <> LAG(PatientID) OR $casenum = 1. > COMPUTE Set = 1. > COMPUTE End = OutDate. > ELSE. > *Due to sort, this rows EntryDate should be larger. > *Or equal to prior rows EntryDate. > DO IF EntryDate <= LAG(End). > COMPUTE Set = LAG(Set). > COMPUTE End = MAX(LAG(End),OutDate). > ELSE. > COMPUTE Set = LAG(Set) + 1. > COMPUTE End = OutDate. > END IF. > END IF. > EXECUTE. > *Before this will want to make sure OutDate >= EntryDate. > *And that there is no missing data. > > *Now we can aggregate up to the different sets of days per each patient. > DATASET DECLARE Stays. > AGGREGATE OUTFILE='Stays' > /BREAK PatientID Set > /BeginTime = MIN(EntryDate) > /EndTime = MAX(OutDate). > *****************************************************. > > > > ----- > Andy W > apwheele@ > http://andrewpwheeler.wordpress.com/ > -- > Sent from: http://spssx-discussion.1045642.n5.nabble.com/ > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
| Free forum by Nabble | Edit this page |
