LAG

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

LAG

Jason Burke
I would like to calculate the number of episodes in the last 30 days
for each subject. Each record represents a subject admission record. I
must thank Damir for his sample data which was provided for another
question, I have used it here:

The problem I am grasping with is the use of LAG within a loop.

Thanks,


Jason

========================================================

DATA LIST LIST
 /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
BEGIN DATA
3145 6.00 26-Jan-2005
3145 7.00 28-Jan-2005
5137 19.00 4-Aug-2005
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 24.00 24-Sep-2008
5137 24.00 24-Sep-2008
5722 28.00 4-Feb-2005
5722 29.00 9-Aug-2008
5722 33.00 10-Sep-2009
5722 33.00 10-Sep-2009
7509 72.00 31-Aug-2005
7509 73.00 7-Sep-2005
7509 73.00 7-Sep-2005
END DATA.
DATASET NAME Admissions .

SORT CASES BY
 ID (A)
 AdmissionDate (A) .

DO IF (($CASENUM = 1) OR (ID NE LAG(ID)) ).
+  COMPUTE Episode = 1 .
ELSE .
+  COMPUTE Episode = LAG(Episode) + 1 .
END IF .

SORT CASES BY
 ID (A)
 AdmissionDate (A) .

NUMERIC Episode (F8.0) .
NUMERIC Count (F8.0) .

DO IF (($CASENUM = 1) OR (ID NE LAG(ID)) ).
+  COMPUTE Episode = 1 .
ELSE .
+  COMPUTE Episode = LAG(Episode) + 1 .
END IF .

* Count number of episodes within thirty days of the current admission .
* 30 Days = 18144000 Seconds .

COMPUTE Count = 1 .

LOOP N = 1 TO Episode .
+    DO IF AdmissionDate - LAG(AdmissionDate, N) LE 18144000 .
+        COMPUTE Count = Count + 1 .
+    END IF .
END LOOP .

EXECUTE .

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Bruce Weaver
Administrator
Jason Burke wrote
I would like to calculate the number of episodes in the last 30 days
for each subject. Each record represents a subject admission record. I
must thank Damir for his sample data which was provided for another
question, I have used it here:

The problem I am grasping with is the use of LAG within a loop.

Thanks,


Jason

========================================================

DATA LIST LIST
 /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
BEGIN DATA
3145 6.00 26-Jan-2005
3145 7.00 28-Jan-2005
5137 19.00 4-Aug-2005
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 24.00 24-Sep-2008
5137 24.00 24-Sep-2008
5722 28.00 4-Feb-2005
5722 29.00 9-Aug-2008
5722 33.00 10-Sep-2009
5722 33.00 10-Sep-2009
7509 72.00 31-Aug-2005
7509 73.00 7-Sep-2005
7509 73.00 7-Sep-2005
END DATA.
DATASET NAME Admissions .

--- snip ---
Jason, it's not clear to me what the reference date is for "within last 30 days", but in this syntax, I assumed it was the last admission date for each ID.  Does this do what you want?


* Calculate the number of episodes in the last 30 days for each ID .
* It's not clear to me from the example what the reference date is
* for the "within last 30 days" bit. For now, I'll assume it's the last
* Admission Date for each ID.

sort cases by ID AdmissionDate.

aggregate
 break = ID /
 lastdate = last(AdmissionDate).
format lastdate(date11).

* Now flag records where AdmissionDate is within 30 days of LastDate.

compute within30 = datediff(LastDate, AdmissionDate, "days") LE 30.
format within30(f1.0).
exe.

* Finally, use AGGREGATE to get the number of admissions within last 30 days.

aggregate
 break = ID /
 NumWithin30 = sum(within30).
format NumWithin30(f3.0).

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Jason Burke
Hi Bruce,

Hopefully this provides some more clarity....

A patient will have one or more admissions (episodes) represented in
the dataset. For each episode, I wish to determine how many of the
previous admissions (if any) occured in the previous N days.

The challenge is a creating rolling window. For each record you look
back at previous records for the same patient and count the number of
cases where the admission date is within N days of the current case.

Best,


Jason

On Fri, Nov 27, 2009 at 11:24 PM, Bruce Weaver <[hidden email]> wrote:

> Jason Burke wrote:
>>
>> I would like to calculate the number of episodes in the last 30 days
>> for each subject. Each record represents a subject admission record. I
>> must thank Damir for his sample data which was provided for another
>> question, I have used it here:
>>
>> The problem I am grasping with is the use of LAG within a loop.
>>
>> Thanks,
>>
>>
>> Jason
>>
>> ========================================================
>>
>> DATA LIST LIST
>>  /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
>> BEGIN DATA
>> 3145 6.00 26-Jan-2005
>> 3145 7.00 28-Jan-2005
>> 5137 19.00 4-Aug-2005
>> 5137 20.00 24-Jun-2006
>> 5137 20.00 24-Jun-2006
>> 5137 20.00 24-Jun-2006
>> 5137 20.00 24-Jun-2006
>> 5137 24.00 24-Sep-2008
>> 5137 24.00 24-Sep-2008
>> 5722 28.00 4-Feb-2005
>> 5722 29.00 9-Aug-2008
>> 5722 33.00 10-Sep-2009
>> 5722 33.00 10-Sep-2009
>> 7509 72.00 31-Aug-2005
>> 7509 73.00 7-Sep-2005
>> 7509 73.00 7-Sep-2005
>> END DATA.
>> DATASET NAME Admissions .
>>
>> --- snip ---
>>
>
> Jason, it's not clear to me what the reference date is for "within last 30
> days", but in this syntax, I assumed it was the last admission date for each
> ID.  Does this do what you want?
>
>
> * Calculate the number of episodes in the last 30 days for each ID .
> * It's not clear to me from the example what the reference date is
> * for the "within last 30 days" bit. For now, I'll assume it's the last
> * Admission Date for each ID.
>
> sort cases by ID AdmissionDate.
>
> aggregate
>  break = ID /
>  lastdate = last(AdmissionDate).
> format lastdate(date11).
>
> * Now flag records where AdmissionDate is within 30 days of LastDate.
>
> compute within30 = datediff(LastDate, AdmissionDate, "days") LE 30.
> format within30(f1.0).
> exe.
>
> * Finally, use AGGREGATE to get the number of admissions within last 30
> days.
>
> aggregate
>  break = ID /
>  NumWithin30 = sum(within30).
> format NumWithin30(f3.0).
>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
> "When all else fails, RTFM."
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Bruce Weaver
Administrator
Jason Burke wrote
Hi Bruce,

Hopefully this provides some more clarity....

A patient will have one or more admissions (episodes) represented in
the dataset. For each episode, I wish to determine how many of the
previous admissions (if any) occured in the previous N days.

The challenge is a creating rolling window. For each record you look
back at previous records for the same patient and count the number of
cases where the admission date is within N days of the current case.

Best,


Jason
Thanks for clarifying.  There are no previous episodes when you're looking at episode 1.  So don't you need to modify your LOOP to start on episode 2?  You could do that by inserting it in a DO IF, I think.  I think you'd also have to change it to N = 1 to Episode -1.  

DO IF (episode GT 1).
LOOP N = 1 TO Episode-1 .
+    DO IF DATEDIFF(AdmissionDate, LAG(AdmissionDate, N),"days") LE 30 .
+        COMPUTE Count = Count + 1 .
+    END IF .
END LOOP .
END IF.

But you have another problem:  I don't think LAG will take a variable N.  I think it has to be an actual number.  You might be able to get around that by sticking your LOOP inside a macro though.  If I have time later, I'll play around with this a bit more.

One other thing I just noticed.  You are setting the initial value of COUNT to 1 rather than 0.  If you want the number of *previous* admissions within N days, I think you need to start with COUNT = 0.  

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Richard Ristow
At 10:24 AM 11/27/2009, Bruce Weaver wrote:

DO IF (episode GT 1).
LOOP N = 1 TO Episode-1 .
+    DO IF DATEDIFF(AdmissionDate, LAG(AdmissionDate, N),"days") LE 30 .
+        COMPUTE Count = Count + 1 .
+    END IF .
END LOOP .
END IF.

But you have another problem:  I don't think LAG will take a variable N.  I
think it has to be an actual number.  You might be able to get around that
by sticking your LOOP inside a macro though.

You're right, that the second argument to LAG must be a constant. Here's a way to get the effect of an indexed lag. Where you want to say LAG(X,N) (function call), say instead LAGX(N) (vector reference).


DO REPEAT  LAGval = 1 TO 5
          /Lagged = LAGX1 TO LAGX5.
.  COMPUTE Lagged = LAG(X,LAGval).
END REPEAT.

VECTOR LAGX = LAGX1 TO LAGX5.
LIST.

 
List
|-----------------------------|---------------------------|
|Output Created               |27-NOV-2009 12:34:51       |
|-----------------------------|---------------------------|
RecordID    X    LAGX1    LAGX2    LAGX3    LAGX4    LAGX5

   01       1      .        .        .        .        .
   02       3     1.00      .        .        .        .
   03       5     3.00     1.00      .        .        .
   04       7     5.00     3.00     1.00      .        .
   05       9     7.00     5.00     3.00     1.00      .
   06      11     9.00     7.00     5.00     3.00     1.00
   07      13    11.00     9.00     7.00     5.00     3.00
   08      15    13.00    11.00     9.00     7.00     5.00
   09      17    15.00    13.00    11.00     9.00     7.00
   10      19    17.00    15.00    13.00    11.00     9.00

Number of cases read:  10    Number of cases listed:  10
=============================
APPENDIX: Test data, and code
=============================
INPUT PROGRAM.
.  NUMERIC RecordID (N2) X(F4).
.  LOOP    RecordID = 1 TO 10.
.     COMPUTE X = 2*RecordID -1.
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.


DO REPEAT  LAGval = 1 TO 5
          /Lagged = LAGX1 TO LAGX5.
.  COMPUTE Lagged = LAG(X,LAGval).
END REPEAT.

VECTOR LAGX = LAGX1 TO LAGX5.

LIST.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Art Kendall
In reply to this post by Jason Burke
See if this example syntax does what you want.
It assumes admissions are in datetime order so that multiple admissions on the same day will have the last admission with the count of previous admissions.
There are 3 hard coded numbers. ndays the criterion number of days,  lagspan the number of lags to look back over, and isin1 to isinX.




DATA LIST LIST
 /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
BEGIN DATA
3145 6.00 26-Jan-2005
3145 7.00 28-Jan-2005
5137 19.00 4-Aug-2005
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 24.00 24-Sep-2008
5137 24.00 24-Sep-2008
5722 28.00 4-Feb-2005
5722 29.00 9-Aug-2008
5722 33.00 10-Sep-2009
5722 33.00 10-Sep-2009
7509 72.00 31-Aug-2005
7509 73.00 7-Sep-2005
7509 73.00 7-Sep-2005
END DATA.
DATASET NAME Admissions .
numeric ndays(f2).
compute ndays= 90.

*check up to 10 previous admissions.
numeric nadmits (f2).
numeric isin1 to isin10(f1).
do repeat  isin = isin1 to isin10/lagspan=1 to 10.
do if lag (id,lagspan) eq id.
compute isin = datediff(admissionDate,lag(admissionDate,lagspan), "days")  lt ndays.
end if.
end repeat.
count nadmits = isin1 to isin10(1).
LIST .

Art Kendall
Social Research Consultants

Jason Burke wrote:
Hi Bruce,

Hopefully this provides some more clarity....

A patient will have one or more admissions (episodes) represented in
the dataset. For each episode, I wish to determine how many of the
previous admissions (if any) occured in the previous N days.

The challenge is a creating rolling window. For each record you look
back at previous records for the same patient and count the number of
cases where the admission date is within N days of the current case.

Best,


Jason

On Fri, Nov 27, 2009 at 11:24 PM, Bruce Weaver [hidden email] wrote:
  
Jason Burke wrote:
    
I would like to calculate the number of episodes in the last 30 days
for each subject. Each record represents a subject admission record. I
must thank Damir for his sample data which was provided for another
question, I have used it here:

The problem I am grasping with is the use of LAG within a loop.

Thanks,


Jason

========================================================

DATA LIST LIST
 /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
BEGIN DATA
3145 6.00 26-Jan-2005
3145 7.00 28-Jan-2005
5137 19.00 4-Aug-2005
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 24.00 24-Sep-2008
5137 24.00 24-Sep-2008
5722 28.00 4-Feb-2005
5722 29.00 9-Aug-2008
5722 33.00 10-Sep-2009
5722 33.00 10-Sep-2009
7509 72.00 31-Aug-2005
7509 73.00 7-Sep-2005
7509 73.00 7-Sep-2005
END DATA.
DATASET NAME Admissions .

--- snip ---

      
Jason, it's not clear to me what the reference date is for "within last 30
days", but in this syntax, I assumed it was the last admission date for each
ID.  Does this do what you want?


* Calculate the number of episodes in the last 30 days for each ID .
* It's not clear to me from the example what the reference date is
* for the "within last 30 days" bit. For now, I'll assume it's the last
* Admission Date for each ID.

sort cases by ID AdmissionDate.

aggregate
 break = ID /
 lastdate = last(AdmissionDate).
format lastdate(date11).

* Now flag records where AdmissionDate is within 30 days of LastDate.

compute within30 = datediff(LastDate, AdmissionDate, "days") LE 30.
format within30(f1.0).
exe.

* Finally, use AGGREGATE to get the number of admissions within last 30
days.

aggregate
 break = ID /
 NumWithin30 = sum(within30).
format NumWithin30(f3.0).



-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

    

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

  
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Richard Ristow
In reply to this post by Jason Burke
At 11:03 PM 11/26/2009, Jason Burke wrote:

I would like to calculate the number of episodes in the last 30 days for each subject. Each record represents a subject admission record. I must thank Damir for his sample data which was provided for another question, I have used it here:

[Admissions]
      ID      ADID AdmissionDate

    3145      6.00  26-JAN-2005
    3145      7.00  28-JAN-2005
    5137     19.00  04-AUG-2005
    5137     20.00  24-JUN-2006
    5137     20.00  24-JUN-2006
    5137     20.00  24-JUN-2006
    5137     20.00  24-JUN-2006
    5137     24.00  24-SEP-2008
    5137     24.00  24-SEP-2008
    5722     28.00  04-FEB-2005
    5722     29.00  09-AUG-2008
    5722     33.00  10-SEP-2009
    5722     33.00  10-SEP-2009
    7509     72.00  31-AUG-2005
    7509     73.00  07-SEP-2005
    7509     73.00  07-SEP-2005

Number of cases read:  16    Number of cases listed:  16

At Sat, 28 Nov 2009 00:21:18 +1100, he clarified,
A patient will have one or more admissions (episodes) represented in the dataset. For each episode, I wish to determine how many of the previous admissions (if any) occured in the previous N days.

There's one glitch in the data: in several instances (#5137 on 24-JUN-2006 and 24-SEP-2008; #5722 on 10-SEP-2009; #7509 on 07-SEP-2005), a patient has more than one admission on the same day:

DATASET DECLARE  AdmissionDays.

AGGREGATE OUTFILE=AdmissionDays
   /BREAK=ID AdmissionDate
   /NAdm    'Number of admissions, this date' = NU
   /FrstAdm 'Lowest  ADID, this date'         = MIN(ADID)
   /LastAdm 'Highest ADID, this date'         = MAX(ADID).

DATASET ACTIVATE AdmissionDays WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |28-NOV-2009 20:36:26       |
|-----------------------------|---------------------------|
[AdmissionDays]
      ID AdmissionDate    NAdm   FrstAdm   LastAdm

    3145  26-JAN-2005        1      6.00      6.00
    3145  28-JAN-2005        1      7.00      7.00
    5137  04-AUG-2005        1     19.00     19.00
    5137  24-JUN-2006        4     20.00     20.00
    5137  24-SEP-2008        2     24.00     24.00
    5722  04-FEB-2005        1     28.00     28.00
    5722  09-AUG-2008        1     29.00     29.00
    5722  10-SEP-2009        2     33.00     33.00
    7509  31-AUG-2005        1     72.00     72.00
    7509  07-SEP-2005        2     73.00     73.00

Number of cases read:  10    Number of cases listed:  10

That's a nuisance computationally and conceptually: If an admission occurs on one date, are other admissions on the same date "in the last 30 days"?

Going forward, in a later posting, I will not count them as prior admissions; and will untangle some of the code by using the admission date, not the individual admission, as the unit for computation.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: LAG

Bruce Weaver
Administrator
In reply to this post by Art Kendall
Art Kendall wrote

See if this example syntax does what you want.
It assumes admissions are in datetime order so that multiple admissions
on the same day will have the last admission with the count of previous
admissions.
There are 3 hard coded numbers. ndays the criterion number of days, 
lagspan the number of lags to look back over, and isin1 to isinX.


DATA LIST LIST
 /ID (F8.0)  ADID (F8.2) AdmissionDate (DATE11).
BEGIN DATA
3145 6.00 26-Jan-2005
3145 7.00 28-Jan-2005
5137 19.00 4-Aug-2005
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 20.00 24-Jun-2006
5137 24.00 24-Sep-2008
5137 24.00 24-Sep-2008
5722 28.00 4-Feb-2005
5722 29.00 9-Aug-2008
5722 33.00 10-Sep-2009
5722 33.00 10-Sep-2009
7509 72.00 31-Aug-2005
7509 73.00 7-Sep-2005
7509 73.00 7-Sep-2005
END DATA.
DATASET NAME Admissions .
numeric ndays(f2).
compute ndays= 90.

*check up to 10 previous admissions.
numeric nadmits (f2).
numeric isin1 to isin10(f1).
do repeat  isin = isin1 to isin10/lagspan=1 to 10.
do if lag (id,lagspan) eq id.
compute isin = datediff(admissionDate,lag(admissionDate,lagspan), "days")  lt ndays.
end if.
end repeat.
count nadmits = isin1 to isin10(1).
LIST .


Art Kendall
Social Research Consultants
Nice one, Art!

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).