At 12:30 PM 1/31/2011, Ariel Barak wrote:
>I worked with juvenile detention center data too. Here is Richard
>Ristow's solution, which worked perfectly for me -

I'm delighted that solution worked, and that it worked for Eugene as well -- and that it's still around, and remembered. (I'd forgotten it myself.)

As Ariel noted, the code I wrote creates a scratch-file record for every inmate or patient, for every day in the institution. If I recall correctly, the original need included making a daily census for the institution, which required all these records.

However, if the data includes many patients and stays are typically long, a file with records for all patient-days may be inconveniently large.

'Unrolling' the data is still a good approach, but the code could write records for every patient, for every MONTH they are in the institution. Its logic would be more complicated, since it's harder to loop through months than through days and you'd have to add code to compute the patient-days each month, but it would be perfectly manageable.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Richard,

Thank you for the reply. I have the very situation of which you speak--a very large dataset (over 15,000 youth in residential juvenile settings) with admission and discharge dates for the past 18 years. I don't need (and frankly don't anticipate needing) a daily census. On the other hand, I do need the number of days per month that each youth is in residence.

You mention that writing the code for records per month would be complicated, but manageable. Do you (or anyone else) have any suggestions for how to modify your original code to do that?

Eugene

--
Eugene Wang, Ph.D.
Assistant Professor
Educational Psychology
Texas Tech University

On Tuesday, February 8, 2011 at 5:52 PM, Richard Ristow wrote:
At 07:22 PM 2/8/2011, Eugene Wang wrote:
>I have a very large dataset with admission and discharge dates for
>the past 18 years. I don't need a daily census. On the other hand,
>I need the number of days per month that each youth is in residence.
>
>You mention that writing the code for records per month would be
>complicated, but manageable.

Good. I'll admit I partly wanted an excuse to work it out. The following is tested, except that the two lines using DATESUM weren't actually run; see the Appendix for the SPSS 14 code I had to use instead.

With this data, for five residents over four months:

|-----------------------------|---------------------------|
|Output Created               |09-FEB-2011 16:53:45       |
|-----------------------------|---------------------------|
[TestData]

 ID NAME            AdmtDate    DschDate

101 Arthur Alpha    17-NOV-2010 08-DEC-2010
102 Bill Beta       28-NOV-2010 14-JAN-2011
103 George Gamma    20-DEC-2010 02-FEB-2011
104 Dick Delta      06-JAN-2011 15-JAN-2011
105 Edward Epsilon  20-JAN-2011           .

Number of cases read:  5    Number of cases listed:  5

here's how it worked out:

*  ...  Cutoff date: Should be the latest date covered by the       .
*       input dataset. It is used as the ending date where there    .
*       is no discharge date recorded.                              .
*       (Giving a data-specific value here is "data in code",       .
*       which is poor practice in general.)                         .

NUMERIC #CutOff (DATE11).
COMPUTE #CutOff = DATE.MDY(02,08,2011).

*  ...  Output variables:                                           .

NUMERIC Month  (MOYR8)
        DaysIn (F3)
        Admit  (F2)
        Dischg (F2).

VAR LABELS Month  'Calendar month'
           DaysIn 'Days in residence, this calendar month'
           Admit  'Admitted in this calendar month?'
           Dischg 'Discharged in this calendar month?'.
VAL LABELS Admit Dischg 1 'Yes' 0 'No'.

*  ...  Working variables                                           .

NUMERIC #FrstDy  /* First day of the month            */
        #LastDy  /* Last day of the month             */
        #NextMo  /* First day of the following month  */
        #EndDate /* Person's last residency day       */
       (DATE11).

*  ...  Start:                                                      .
*       The first output record is for the month of admission       .

COMPUTE Month    = DATE.MOYR(XDATE.MONTH(AdmtDate),
                             XDATE.YEAR (AdmtDate)).
*       The last day of residency: discharge date, or cutoff date   .

COMPUTE #EndDate = #CutOff.
IF NOT MISSING(DschDate) #EndDate = DschDate.

*  ...  Repeat: Calculate the length of residency within each month,.
*       and whether admission and discharge fall within that month; .
*       and write the results to the by-month output file.          .

LOOP.

*  -    The first and last days of the current month:               .
*       (The following works because a month and its first day have .
*       the same numerical value, as SPSS dates)                    .

.  COMPUTE #FrstDy = Month.

*       Find the first day of the following month, then the last    .
*       day of the current month.  (This works whether or not the   .
*       two months are in the same year.)                           .
*       (NEXT TWO LINES NOT TESTED)                                 .

.  COMPUTE #NextMo = DATESUM(#FrstDy, 1,"months").
.  COMPUTE #LastDy = DATESUM(#NextMo,-1,"days").

*  -    Length of residency within current month, and whether       .
*       admission and discharge are within the month:               .

.  COMPUTE DaysIn  = CTIME.DAYS(  MIN(#EndDate,#LastDy)
                                - MAX(AdmtDate,#FrstDy)) + 1.

.  COMPUTE Admit   = 0.
.  COMPUTE Dischg  = 0.
.  IF  RANGE(AdmtDate,#FrstDy,#LastDy) Admit  = 1.
.  IF  NOT MISSING(DschDate)
   AND RANGE(DschDate,#FrstDy,#LastDy) Dischg = 1.

*  -    Write the output record:                                    .

.  XSAVE OUTFILE= MonthLoop
      /KEEP= ID NAME Month DaysIn Admit Dischg AdmtDate DschDate.

*  -    Step to the next month, and continue if the person was      .
*       still in residence in that month:                           .

.  COMPUTE Month = #NextMo.
END LOOP IF Month GT #EndDate.

EXECUTE /* This is one of the few times EXECUTE is really needed */ .

*  ...  The records for people, separated by calendar month:        .

GET FILE=MonthLoop.
DATASET NAME MonthRecs WINDOW=FRONT.

LIST.
List
|-----------------------------|---------------------------|
|Output Created               |09-FEB-2011 16:53:50       |
|-----------------------------|---------------------------|
[MonthRecs]

 ID NAME            Month    DaysIn Admit Dischg AdmtDate    DschDate

101 Arthur Alpha    NOV 2010     14     1      0 17-NOV-2010 08-DEC-2010
101 Arthur Alpha    DEC 2010      8     0      1 17-NOV-2010 08-DEC-2010
102 Bill Beta       NOV 2010      3     1      0 28-NOV-2010 14-JAN-2011
102 Bill Beta       DEC 2010     31     0      0 28-NOV-2010 14-JAN-2011
102 Bill Beta       JAN 2011     14     0      1 28-NOV-2010 14-JAN-2011
103 George Gamma    DEC 2010     12     1      0 20-DEC-2010 02-FEB-2011
103 George Gamma    JAN 2011     31     0      0 20-DEC-2010 02-FEB-2011
103 George Gamma    FEB 2011      2     0      1 20-DEC-2010 02-FEB-2011
104 Dick Delta      JAN 2011     10     1      1 06-JAN-2011 15-JAN-2011
105 Edward Epsilon  JAN 2011     12     1      0 20-JAN-2011           .
105 Edward Epsilon  FEB 2011      8     0      0 20-JAN-2011           .

Number of cases read:  11    Number of cases listed:  11

*  ...  To illustrate possibilities, a summary by calendar month:   .

DATASET DECLARE  MonthSmry.
AGGREGATE OUTFILE= MonthSmry
   /BREAK   = Month
   /NRes    'Number of residents during month' = NU
   /ResDays 'Total number of resident days'    = SUM(DaysIn)
   /ResAdmt 'Number of residents admitted'     = SUM(Admit)
   /ResDsch 'Number of residents discharged'   = SUM(Dischg).

DATASET ACTIVATE MonthSmry WINDOW=FRONT.
FORMATS ResDays (F6) ResAdmt ResDsch (F3).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |09-FEB-2011 16:53:51       |
|-----------------------------|---------------------------|
[MonthSmry]

   Month NRes ResDays ResAdmt ResDsch

NOV 2010    2      17       2       0
DEC 2010    3      51       1       1
JAN 2011    4      67       2       2
FEB 2011    2      10       0       1

Number of cases read:  4    Number of cases listed:  4

=============================================
APPENDIX: Test data, and code as actually run
=============================================
*  C:\Documents and Settings\Richard\My Documents                   .
*    \Technical\spssx-l\Z-2011\                                     .
*    2011-02-08 Wang-Number of days in hospital by calendar month.SPS .

*  In response to posting                                           .
*  Date:    Tue, 8 Feb 2011 18:22:07 -0600                          .
*  From:    Eugene Wang <[hidden email]>                            .
*  Subject: Re: Number of days in hospital by calendar month        .
*  To:      [hidden email]                                          .

*  I'd written,                                                     .
*  "If data includes many patients and stays are typically long, a  .
*  file with records for all patient-days may be inconveniently     .
*  large.                                                           .
*                                                                   .
*  Code could write records for every patient, for every MONTH      .
*  they are in the institution."                                    .
*                                                                   .
*  to which he replied,                                             .
*  "I have the very situation of which you speak. I have over       .
*  15,000 youth in residential juvenile settings, with admission    .
*  and discharge dates for the past 18 years. I don't need a        .
*  daily census. I do need the number of days per month that each   .
*  youth is in residence.                                           .
*                                                                   .
*  You mention that writing the code for records per month would    .
*  be complicated, but manageable. Do you have any suggestions for  .
*  how to do that?"                                                 .

*  .................................................................. .
*  ..................   Test data   .....................           .

DATA LIST LIST/
   ID    NAME  AdmtDate  DschDate
  (N3,   A15,  DATE11,   DATE11).
BEGIN DATA
101 Arthur_Alpha   17-NOV-2010  8-DEC-2010
102 Bill_Beta      28-NOV-2010 14-Jan-2011
103 George_Gamma   20-DEC-2010  2-Feb-2011
104 Dick_Delta      6-Jan-2011 15-Jan-2011
105 Edward_Epsilon 20-Jan-2011
END DATA.

*  Ignore warning 1116, above. It refers to the fifth discharge date .
*  being missing, which is intended.                                 .

DATASET NAME     TestData WINDOW=FRONT.

COMPUTE NAME = REPLACE(NAME,'_',' ').
LIST.

*  .................................................................. .
*  ...  Scratch file, to receive one record per person per month. ... .
*  ...  It must be a file, not a dataset.                         ... .

FILE HANDLE MonthLoop
   /NAME='C:\Documents and Settings\Richard\My Documents'  +
           '\Temporary\SPSS\'                                +
         '2011-02-08 Wang-Number of days in hospital by calendar month' +
         ' - '                                               +
         'BY-MONTH RECORDS.SAV'.

DATASET ACTIVATE TestData WINDOW=FRONT.

*  ................................................................. .
*  ................................................................. .
*  ...  Active code   ........................................      .

*  ...  Cutoff date: Should be the latest date covered by the       .
*       input dataset. It is used as the ending date where there    .
*       is no discharge date recorded.                              .
*       (Giving a data-specific value here is "data in code",       .
*       which is poor practice in general.)                         .

NUMERIC #CutOff (DATE11).
COMPUTE #CutOff = DATE.MDY(02,08,2011).

*  ...  Output variables:                                           .

NUMERIC Month  (MOYR8)
        DaysIn (F3)
        Admit  (F2)
        Dischg (F2).

VAR LABELS Month  'Calendar month'
           DaysIn 'Days in residence, this calendar month'
           Admit  'Admitted in this calendar month?'
           Dischg 'Discharged in this calendar month?'.
VAL LABELS Admit Dischg 1 'Yes' 0 'No'.

*  ...  Working variables                                           .

NUMERIC #FrstDy  /* First day of the month            */
        #LastDy  /* Last day of the month             */
        #NextMo  /* First day of the following month  */
        #EndDate /* Person's last residency day       */
       (DATE11).

*  ...  Start:                                                      .
*       The first output record is for the month of admission       .

COMPUTE Month    = DATE.MOYR(XDATE.MONTH(AdmtDate),
                             XDATE.YEAR (AdmtDate)).

*       The last day of residency: discharge date, or cutoff date   .

COMPUTE #EndDate = #CutOff.
IF NOT MISSING(DschDate) #EndDate = DschDate.

*  ...  Repeat: Calculate the length of residency within each month,.
*       and whether admission and discharge fall within that month; .
*       and write the results to the by-month output file.          .

LOOP.

*  -    The first and last days of the current month:               .
*       (The following works because a month and its first day have .
*       the same numerical value, as SPSS dates)                    .

.  COMPUTE #FrstDy = Month.

*       Find the first day of the following month, then the last    .
*       day of the current month.  (This works whether or not the   .
*       two months are in the same year.)                           .

.  COMPUTE #NextMo = #FrstDy + TIME.DAYS(32).
.  COMPUTE #NextMo = DATE.MOYR(XDATE.MONTH(#NextMo),
                               XDATE.YEAR (#NextMo)).
.  COMPUTE #LastDy = #NextMo - TIME.DAYS(1).

*       In SPSS 15 and later, the above can be simplified:          .
*       (NOT TESTED)                                                .
*xx  COMPUTE #NextMo = DATESUM(#FrstDy, 1,"months").
*xx  COMPUTE #LastDy = DATESUM(#NextMo,-1,"days").

*  -    Length of residency within current month, and whether       .
*       admission and discharge are within the month:               .

.  COMPUTE DaysIn  = CTIME.DAYS(  MIN(#EndDate,#LastDy)
                                - MAX(AdmtDate,#FrstDy)) + 1.

.  COMPUTE Admit   = 0.
.  COMPUTE Dischg  = 0.
.  IF  RANGE(AdmtDate,#FrstDy,#LastDy) Admit  = 1.
.  IF  NOT MISSING(DschDate)
   AND RANGE(DschDate,#FrstDy,#LastDy) Dischg = 1.

*  -    Write the output record:                                    .

.  XSAVE OUTFILE= MonthLoop
      /KEEP= ID NAME Month DaysIn Admit Dischg AdmtDate DschDate.

*  -    Step to the next month, and continue if the person was      .
*       still in residence in that month:                           .

.  COMPUTE Month = #NextMo.
END LOOP IF Month GT #EndDate.

EXECUTE /* This is one of the few times EXECUTE is really needed */ .

*  ...  The records for people, separated by calendar month:        .

GET FILE=MonthLoop.
DATASET NAME MonthRecs WINDOW=FRONT.

LIST.

*  ...  To illustrate possibilities, a summary by calendar month:   .

DATASET DECLARE  MonthSmry.
AGGREGATE OUTFILE= MonthSmry
   /BREAK   = Month
   /NRes    'Number of residents during month' = NU
   /ResDays 'Total number of resident days'    = SUM(DaysIn)
   /ResAdmt 'Number of residents admitted'     = SUM(Admit)
   /ResDsch 'Number of residents discharged'   = SUM(Dischg).

DATASET ACTIVATE MonthSmry WINDOW=FRONT.
FORMATS ResDays (F6) ResAdmt ResDsch (F3).
LIST.
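The code comments above note that hard-coding the cutoff date is "data in code". One possible way around that, offered only as an untested sketch, is to take the cutoff from the data before running the active code. It assumes a release whose AGGREGATE supports MODE=ADDVARIABLES, and it assumes that the latest recorded admission or discharge date is an acceptable stand-in for "the latest date covered by the input dataset", which may not hold for every extract. The names AllCases, MaxAdmt, MaxDsch and CutOff are hypothetical and do not appear in the original code.

```spss
*  Sketch only (not tested): derive the cutoff from the data.        .
*  AllCases, MaxAdmt, MaxDsch and CutOff are hypothetical names.     .
*  Assumption: the latest recorded date stands in for the latest     .
*  date covered by the input dataset.                                .

DATASET ACTIVATE TestData.

*  A constant break variable puts every case in one group.           .
COMPUTE AllCases = 1.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK   = AllCases
   /MaxAdmt = MAX(AdmtDate)
   /MaxDsch = MAX(DschDate).

*  MAX ignores a missing argument when the other one is valid.       .
COMPUTE CutOff = MAX(MaxAdmt, MaxDsch).
FORMATS CutOff MaxAdmt MaxDsch (DATE11).

*  Then, in the active code, in place of the DATE.MDY line:          .
COMPUTE #CutOff = CutOff.
```

With the five-resident test data this would give a cutoff of 02-FEB-2011 (the latest discharge date) rather than the 08-FEB-2011 used above, so the open stay would be counted through 2 February instead of 8 February. Whether that is acceptable depends on how the extract date relates to the recorded dates. The helper variables are not in the XSAVE /KEEP list, so they would not reach the by-month file.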