computations with dates

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

computations with dates

Tanya Temkin
I have a data set with each patient call made to our call center within a
certain time period; each call is a record. There is a unique patient ID,
call date, and call time (among many other variables). What I'm trying to
do for each patient is count up all calls that preceded each call within
the last 90 days. This is turning out to be more complicated than I
thought.

PRIOR calls were all initialized to zero, then the **first** record for
each ID was updated with data from another file.
There are no missing values. Cases are sorted by ID  CALL_DATE  CALL_TIME.
Some patients made more than one call on the same day.

At first I thought this would work:
  If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
    prior_calls=lag(prior_calls)+1.

This is what I got:

 ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          6
 1     02-APR-05     16:16          7
 1     14-MAY-05     08:19          8

Problem here. What would be accurate would be only the number of calls
made in the 90 days prior to each call, i.e.

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          5
 1     02-APR-05     16:16          6
 1     14-MAY-05     08:19          4

But I can't figure out the syntax needed to do this - at least with
current data structure. What approach could I use?

I'm wondering if this is going to entail restructuring my data long to
wide and then applying vector/loop syntax to loop through all the call
dates for each record (not in my comfort zone!)

Thanks!

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: computations with dates

Tanya Temkin
Thanks for the advice, Dennis .... I started to follow your suggestion,
and got this far - up to the summing of days from prior call date (this is
different set of test data from one in prior posting):

(days diff=time in days from previous call; r_sum=running sum of time from
1st call)

ID      call-num        call_date       days_diff       r_sum
1000xx  1       29-Dec-200y     0       0
1000xx  2       12-Jan-200x     14      14
1000xx  3       02-Feb-200x     21      35
1000xx  4       11-Feb-200x     9       44
1000xx  5       09-Mar-200x     26      70
1000xx  6       10-Mar-200x     1       71
1000xx  7       02-Apr-200x     23      94
1000xx  8       04-Apr-200x     2       96
1000xx  9       09-Apr-200x     5       101
1000xx  10      14-May-200x     35      136
1000xx  11      20-May-200x     6       142
1000xx  12      03-Jun-200x     14      156
1000xx  13      26-Jun-200x     23      179
1000xx  14      27-Jun-200x     1       180
1000xx  15      27-Jun-200x     0       180
1000xx  16      28-Jun-200x     1       181
1000xx  17      29-Jun-200x     1       182
1000xx  18      05-Jul-200x     6       188
1000xx  19      10-Jul-200x     5       193
1000xx  20      12-Aug-200x     33      226
1001xx  0       25-Aug-200x     0       0
1001xx  1       12-Nov-200x     79      79
1002xx  1       06-Feb-200x     0       0
1002xx  2       09-Feb-200x     3       3
1002xx  3       08-Aug-200x     180     183
1002xx  4       09-Aug-200x     1       184
1002xx  5       12-Aug-200x     3       187
1002xx  6       14-Aug-200x     2       189
1002xx  7       15-Aug-200x     1       190
1002xx  8       31-Aug-200x     16      206
1002xx  9       05-Sep-200x     5       211
1002xx  10      03-Nov-200x     59      270
1002xx  11      09-Dec-200x     36      306
1002xx  12      11-Dec-200x     2       308

The wheels started to come off when I got to the first 90+ day value (7th
record) ....the previous dates back to 12 Jan 05 are in the prior 90 day
period. It's not just a matter of incrementing by 1 for each 90 day period
following first call, but checking to see if each prior call is in the
90-day period preceding the current call. Are you sure I am not going to
have to resort to a vector/loop routine to loop through each call for each
ID? Or am I missing something in your instructions?

Thanks again....


Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Dennis Deck" <[hidden email]>
07/17/2008 12:42 AM

To
Tanya L TemKin/CA/KAIPERM@KAIPERM
cc

Subject
RE: computations with dates






That is tricky because the number of prior records to sum across is
unknown and
keeps changing from ID to ID.

One approach might be something like:
  number the records within ID
  track the prior comparison record number starting with 1
  sum the days from the prior comparison record and keep a running sum
from prior comparion
  if prior comparison is more than 90 days then increment and resum
until days are less than 90

Dennis

-----Original Message-----
From: Tanya Temkin [mailto:[hidden email]]
Sent: Wednesday, July 16, 2008 1:39 PM
Subject: computations with dates

I have a data set with each patient call made to our call center within
a certain time period; each call is a record. There is a unique patient
ID, call date, and call time (among many other variables). What I'm
trying to do for each patient is count up all calls that preceded each
call within the last 90 days. This is turning out to be more complicated
than I thought.

PRIOR calls were all initialized to zero, then the **first** record for
each ID was updated with data from another file.
There are no missing values. Cases are sorted by ID  CALL_DATE
CALL_TIME.
Some patients made more than one call on the same day.

At first I thought this would work:
  If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
    prior_calls=lag(prior_calls)+1.

This is what I got:

 ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          6
 1     02-APR-05     16:16          7
 1     14-MAY-05     08:19          8

Problem here. What would be accurate would be only the number of calls
made in the 90 days prior to each call, i.e.

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          5
 1     02-APR-05     16:16          6
 1     14-MAY-05     08:19          4

But I can't figure out the syntax needed to do this - at least with
current data structure. What approach could I use?

I'm wondering if this is going to entail restructuring my data long to
wide and then applying vector/loop syntax to loop through all the call
dates for each record (not in my comfort zone!)

Thanks!

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: computations with dates

Marks, Jim
Tanya:

A brute force solution, based on a solution to a similar problem from
Art Kendall:

*** SAMPLE DATA.
data list free /id (f8.0)  date (date10) time (time5).
Begin data
 1     29-DEC-04     16:12
 1     12-JAN-05     11:57
 1     01-FEB-05     14:26
 1     11-FEB-05     11:14
 1     14-FEB-05     18:39
 1     10-MAR-05     15:19
 1     02-APR-05     12:34
 1     02-APR-05     16:16
 1     14-MAY-05     08:19
 2     14-FEB-05     18:39
 2     10-MAR-05     15:19
 2     02-APR-06     12:34
 2     11-APR-06     16:16
 2     14-JUN-06     08:19
end data.

*** create variables to flag if the nth lag is within 90 days.
*** edit for the maximum number of cases within id.
numeric pr_1 to pr_10.

*** calculate the flags.
do repeat v = pr_1 to pr_10 /i = 1 to 10 .
if datediff(date,lag(date,i),"days") LE 90  and id = lag(id,i)  v = 1.
end repeat.

*** create a variable to hold the number of lags.
count priors =pr_1 to pr_10 (1).

*** cleanup-- drop the flag variables.
add files file = * /drop pr_1 to pr_10.

*** trigger data pass.
LIST.

HTH
--jim

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Tanya Temkin
Sent: Thursday, July 17, 2008 1:54 PM
To: [hidden email]
Subject: Re: computations with dates

Thanks for the advice, Dennis .... I started to follow your suggestion,
and got this far - up to the summing of days from prior call date (this
is
different set of test data from one in prior posting):

(days diff=time in days from previous call; r_sum=running sum of time
from
1st call)

ID      call-num        call_date       days_diff       r_sum
1000xx  1       29-Dec-200y     0       0
1000xx  2       12-Jan-200x     14      14
1000xx  3       02-Feb-200x     21      35
1000xx  4       11-Feb-200x     9       44
1000xx  5       09-Mar-200x     26      70
1000xx  6       10-Mar-200x     1       71
1000xx  7       02-Apr-200x     23      94
1000xx  8       04-Apr-200x     2       96
1000xx  9       09-Apr-200x     5       101
1000xx  10      14-May-200x     35      136
1000xx  11      20-May-200x     6       142
1000xx  12      03-Jun-200x     14      156
1000xx  13      26-Jun-200x     23      179
1000xx  14      27-Jun-200x     1       180
1000xx  15      27-Jun-200x     0       180
1000xx  16      28-Jun-200x     1       181
1000xx  17      29-Jun-200x     1       182
1000xx  18      05-Jul-200x     6       188
1000xx  19      10-Jul-200x     5       193
1000xx  20      12-Aug-200x     33      226
1001xx  0       25-Aug-200x     0       0
1001xx  1       12-Nov-200x     79      79
1002xx  1       06-Feb-200x     0       0
1002xx  2       09-Feb-200x     3       3
1002xx  3       08-Aug-200x     180     183
1002xx  4       09-Aug-200x     1       184
1002xx  5       12-Aug-200x     3       187
1002xx  6       14-Aug-200x     2       189
1002xx  7       15-Aug-200x     1       190
1002xx  8       31-Aug-200x     16      206
1002xx  9       05-Sep-200x     5       211
1002xx  10      03-Nov-200x     59      270
1002xx  11      09-Dec-200x     36      306
1002xx  12      11-Dec-200x     2       308

The wheels started to come off when I got to the first 90+ day value
(7th
record) ....the previous dates back to 12 Jan 05 are in the prior 90 day
period. It's not just a matter of incrementing by 1 for each 90 day
period
following first call, but checking to see if each prior call is in the
90-day period preceding the current call. Are you sure I am not going to
have to resort to a vector/loop routine to loop through each call for
each
ID? Or am I missing something in your instructions?

Thanks again....


Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Dennis Deck" <[hidden email]>
07/17/2008 12:42 AM

To
Tanya L TemKin/CA/KAIPERM@KAIPERM
cc

Subject
RE: computations with dates






That is tricky because the number of prior records to sum across is
unknown and
keeps changing from ID to ID.

One approach might be something like:
  number the records within ID
  track the prior comparison record number starting with 1
  sum the days from the prior comparison record and keep a running sum
from prior comparion
  if prior comparison is more than 90 days then increment and resum
until days are less than 90

Dennis

-----Original Message-----
From: Tanya Temkin [mailto:[hidden email]]
Sent: Wednesday, July 16, 2008 1:39 PM
Subject: computations with dates

I have a data set with each patient call made to our call center within
a certain time period; each call is a record. There is a unique patient
ID, call date, and call time (among many other variables). What I'm
trying to do for each patient is count up all calls that preceded each
call within the last 90 days. This is turning out to be more complicated
than I thought.

PRIOR calls were all initialized to zero, then the **first** record for
each ID was updated with data from another file.
There are no missing values. Cases are sorted by ID  CALL_DATE
CALL_TIME.
Some patients made more than one call on the same day.

At first I thought this would work:
  If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
    prior_calls=lag(prior_calls)+1.

This is what I got:

 ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          6
 1     02-APR-05     16:16          7
 1     14-MAY-05     08:19          8

Problem here. What would be accurate would be only the number of calls
made in the 90 days prior to each call, i.e.

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          5
 1     02-APR-05     16:16          6
 1     14-MAY-05     08:19          4

But I can't figure out the syntax needed to do this - at least with
current data structure. What approach could I use?

I'm wondering if this is going to entail restructuring my data long to
wide and then applying vector/loop syntax to loop through all the call
dates for each record (not in my comfort zone!)

Thanks!

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: computations with dates using many to many

hillel vardi
In reply to this post by Tanya Temkin
Shalom

Your task can be solved in more then one way , here is a many to many
approach  that is more common in SQ L  programing  .
This approach  may not work if you have a huge file ( like 1000000
subject with 500 call per subject ) .


data list free /ID (f5)call_num (f4) call_date (edate14) days_diff (f5)
r_sum(f6).
Begin data
1000  1   29-Dec-2006   0     0
1000  2   12-Jan-2007  14    14
1000  3   02-Feb-2007  21    35
1000  4   11-Feb-2007   9    44
1000  5   09-Mar-2007   26   70
1000  6   10-Mar-2007   1    71
1000  7   02-Apr-2007   23   94
1000  8   04-Apr-2007   2    96
1000  9   09-Apr-2007   5   101
1000  10  14-May-2007  35   136
1000  11  20-May-2007   6   142
1000  12  03-Jun-2007  14   156
1000  13  26-Jun-2007  23   179
1000  14  27-Jun-2007   1   180
1000  15  27-Jun-2007   0   180
1000  16  28-Jun-2007   1   181
1000  17  29-Jun-2007   1   182
1000  18  05-Jul-2007   6   188
1000  19  10-Jul-2007   5   193
1000  20  12-Aug-2007  33   226
1001  0   25-Aug-2007   0     0
1001  1   12-Nov-2007  79    79
1002  1   06-Feb-2007   0     0
1002  2   09-Feb-2007   3     3
1002  3   08-Aug-2007 180   183
1002  4   09-Aug-2007   1   184
1002  5   12-Aug-2007   3   187
1002  6   14-Aug-2007   2   189
1002  7   15-Aug-2007   1   190
1002  8   31-Aug-2007  16   206
1002  9   05-Sep-2007   5   211
1002  10  03-Nov-2007  59   270
1002  11  09-Dec-2007  36   306
1002  12  11-Dec-2007   2   308
end data.

save outfile='c:\temp\tmp.sav' .

SORT CASES BY ID call_num .
CASESTOVARS
 /ID = ID
 /INDEX = call_num
 /drop=days_diff  r_sum
 /GROUPBY = VARIABLE .
MATCH FILES /TABLE=*
            /FILE='C:\Temp\tmp.sav'
            /BY ID.
execute .
*  chnage call_date.20 to the max call per subject in your data .
VARSTOCASES  /MAKE dates FROM call_date.0  to call_date.20
 /INDEX = seq(21)
 /KEEP =  ID call_num call_date days_diff r_sum
 /NULL = KEEP.
compute   days=datedif(call_date,dates,'days').
select if (days gt 0) and (days le 90) .
execute .
SORT CASES BY ID call_num .

**** if you need the data per call .
CASESTOVARS
 /ID = ID call_num
 /drop=days_diff r_sum seq
 /GROUPBY = VARIABLE .


Hillel Vardi
BGU


Tanya Temkin wrote:

> Thanks for the advice, Dennis .... I started to follow your suggestion,
> and got this far - up to the summing of days from prior call date (this is
> different set of test data from one in prior posting):
>
> (days diff=time in days from previous call; r_sum=running sum of time from
> 1st call)
>
> ID      call-num        call_date       days_diff       r_sum
> 1000xx  1       29-Dec-200y     0       0
> 1000xx  2       12-Jan-200x     14      14
> 1000xx  3       02-Feb-200x     21      35
> 1000xx  4       11-Feb-200x     9       44
> 1000xx  5       09-Mar-200x     26      70
> 1000xx  6       10-Mar-200x     1       71
> 1000xx  7       02-Apr-200x     23      94
> 1000xx  8       04-Apr-200x     2       96
> 1000xx  9       09-Apr-200x     5       101
> 1000xx  10      14-May-200x     35      136
> 1000xx  11      20-May-200x     6       142
> 1000xx  12      03-Jun-200x     14      156
> 1000xx  13      26-Jun-200x     23      179
> 1000xx  14      27-Jun-200x     1       180
> 1000xx  15      27-Jun-200x     0       180
> 1000xx  16      28-Jun-200x     1       181
> 1000xx  17      29-Jun-200x     1       182
> 1000xx  18      05-Jul-200x     6       188
> 1000xx  19      10-Jul-200x     5       193
> 1000xx  20      12-Aug-200x     33      226
> 1001xx  0       25-Aug-200x     0       0
> 1001xx  1       12-Nov-200x     79      79
> 1002xx  1       06-Feb-200x     0       0
> 1002xx  2       09-Feb-200x     3       3
> 1002xx  3       08-Aug-200x     180     183
> 1002xx  4       09-Aug-200x     1       184
> 1002xx  5       12-Aug-200x     3       187
> 1002xx  6       14-Aug-200x     2       189
> 1002xx  7       15-Aug-200x     1       190
> 1002xx  8       31-Aug-200x     16      206
> 1002xx  9       05-Sep-200x     5       211
> 1002xx  10      03-Nov-200x     59      270
> 1002xx  11      09-Dec-200x     36      306
> 1002xx  12      11-Dec-200x     2       308
>
> The wheels started to come off when I got to the first 90+ day value (7th
> record) ....the previous dates back to 12 Jan 05 are in the prior 90 day
> period. It's not just a matter of incrementing by 1 for each 90 day period
> following first call, but checking to see if each prior call is in the
> 90-day period preceding the current call. Are you sure I am not going to
> have to resort to a vector/loop routine to loop through each call for each
> ID? Or am I missing something in your instructions?
>
> Thanks again....
>
>
> Tanya Temkin
> Research Associate
> AACC Reporting
> Northern California Regional Office
> The Permanente Medical Group
> (510) 625-6680
>
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
>
>
>
>
> "Dennis Deck" <[hidden email]>
> 07/17/2008 12:42 AM
>
> To
> Tanya L TemKin/CA/KAIPERM@KAIPERM
> cc
>
> Subject
> RE: computations with dates
>
>
>
>
>
>
> That is tricky because the number of prior records to sum across is
> unknown and
> keeps changing from ID to ID.
>
> One approach might be something like:
>   number the records within ID
>   track the prior comparison record number starting with 1
>   sum the days from the prior comparison record and keep a running sum
> from prior comparion
>   if prior comparison is more than 90 days then increment and resum
> until days are less than 90
>
> Dennis
>
> -----Original Message-----
> From: Tanya Temkin [mailto:[hidden email]]
> Sent: Wednesday, July 16, 2008 1:39 PM
> Subject: computations with dates
>
> I have a data set with each patient call made to our call center within
> a certain time period; each call is a record. There is a unique patient
> ID, call date, and call time (among many other variables). What I'm
> trying to do for each patient is count up all calls that preceded each
> call within the last 90 days. This is turning out to be more complicated
> than I thought.
>
> PRIOR calls were all initialized to zero, then the **first** record for
> each ID was updated with data from another file.
> There are no missing values. Cases are sorted by ID  CALL_DATE
> CALL_TIME.
> Some patients made more than one call on the same day.
>
> At first I thought this would work:
>   If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
>     prior_calls=lag(prior_calls)+1.
>
> This is what I got:
>
>  ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
>  1     29-DEC-04     16:12          0
>  1     12-JAN-05     11:57          1
>  1     01-FEB-05     14:26          2
>  1     11-FEB-05     11:14          3
>  1     14-FEB-05     18:39          4
>  1     10-MAR-05     15:19          5
>  1     02-APR-05     12:34          6
>  1     02-APR-05     16:16          7
>  1     14-MAY-05     08:19          8
>
> Problem here. What would be accurate would be only the number of calls
> made in the 90 days prior to each call, i.e.
>
> ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
>  1     29-DEC-04     16:12          0
>  1     12-JAN-05     11:57          1
>  1     01-FEB-05     14:26          2
>  1     11-FEB-05     11:14          3
>  1     14-FEB-05     18:39          4
>  1     10-MAR-05     15:19          5
>  1     02-APR-05     12:34          5
>  1     02-APR-05     16:16          6
>  1     14-MAY-05     08:19          4
>
> But I can't figure out the syntax needed to do this - at least with
> current data structure. What approach could I use?
>
> I'm wondering if this is going to entail restructuring my data long to
> wide and then applying vector/loop syntax to loop through all the call
> dates for each record (not in my comfort zone!)
>
> Thanks!
>
> Tanya Temkin
> Research Associate
> AACC Reporting
> Northern California Regional Office
> The Permanente Medical Group
> (510) 625-6680
>
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: computations with dates

Tanya Temkin
In reply to this post by Marks, Jim
Thanks Jim! I tried this on the test data and it worked. Hillel, I'll hold
on to your solution for use in the future...good to know a number of ways
to solve this problem.

But please help me understand exactly what I'm doing with some of this
syntax.... specifically

do repeat v = pr_1 to pr_10 /i = 1 to 10 .
if datediff(date,lag(date,i),"days") LE 90  and id = lag(id,i)  v = 1.
end repeat.

I understand the pr_1 to pr_10 creates a set of new variables.
But what is the function of "i"? I'm guessing it's some kind of index. But
is this somehow looping through all the previous cases for each ID? I
thought indexes could just be used in computations across variables, not
cases. Anyway, thanks for helping me understand what you helped me do.

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Marks, Jim" <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>
07/17/2008 06:03 PM
Please respond to
"Marks, Jim" <[hidden email]>


To
[hidden email]
cc

Subject
Re: computations with dates






Tanya:

A brute force solution, based on a solution to a similar problem from
Art Kendall:

*** SAMPLE DATA.
data list free /id (f8.0)  date (date10) time (time5).
Begin data
 1     29-DEC-04     16:12
 1     12-JAN-05     11:57
 1     01-FEB-05     14:26
 1     11-FEB-05     11:14
 1     14-FEB-05     18:39
 1     10-MAR-05     15:19
 1     02-APR-05     12:34
 1     02-APR-05     16:16
 1     14-MAY-05     08:19
 2     14-FEB-05     18:39
 2     10-MAR-05     15:19
 2     02-APR-06     12:34
 2     11-APR-06     16:16
 2     14-JUN-06     08:19
end data.

*** create variables to flag if the nth lag is within 90 days.
*** edit for the maximum number of cases within id.
numeric pr_1 to pr_10.

*** calculate the flags.
do repeat v = pr_1 to pr_10 /i = 1 to 10 .
if datediff(date,lag(date,i),"days") LE 90  and id = lag(id,i)  v = 1.
end repeat.

*** create a variable to hold the number of lags.
count priors =pr_1 to pr_10 (1).

*** cleanup-- drop the flag variables.
add files file = * /drop pr_1 to pr_10.

*** trigger data pass.
LIST.

HTH
--jim

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Tanya Temkin
Sent: Thursday, July 17, 2008 1:54 PM
To: [hidden email]
Subject: Re: computations with dates

Thanks for the advice, Dennis .... I started to follow your suggestion,
and got this far - up to the summing of days from prior call date (this
is
different set of test data from one in prior posting):

(days diff=time in days from previous call; r_sum=running sum of time
from
1st call)

ID      call-num        call_date       days_diff       r_sum
1000xx  1       29-Dec-200y     0       0
1000xx  2       12-Jan-200x     14      14
1000xx  3       02-Feb-200x     21      35
1000xx  4       11-Feb-200x     9       44
1000xx  5       09-Mar-200x     26      70
1000xx  6       10-Mar-200x     1       71
1000xx  7       02-Apr-200x     23      94
1000xx  8       04-Apr-200x     2       96
1000xx  9       09-Apr-200x     5       101
1000xx  10      14-May-200x     35      136
1000xx  11      20-May-200x     6       142
1000xx  12      03-Jun-200x     14      156
1000xx  13      26-Jun-200x     23      179
1000xx  14      27-Jun-200x     1       180
1000xx  15      27-Jun-200x     0       180
1000xx  16      28-Jun-200x     1       181
1000xx  17      29-Jun-200x     1       182
1000xx  18      05-Jul-200x     6       188
1000xx  19      10-Jul-200x     5       193
1000xx  20      12-Aug-200x     33      226
1001xx  0       25-Aug-200x     0       0
1001xx  1       12-Nov-200x     79      79
1002xx  1       06-Feb-200x     0       0
1002xx  2       09-Feb-200x     3       3
1002xx  3       08-Aug-200x     180     183
1002xx  4       09-Aug-200x     1       184
1002xx  5       12-Aug-200x     3       187
1002xx  6       14-Aug-200x     2       189
1002xx  7       15-Aug-200x     1       190
1002xx  8       31-Aug-200x     16      206
1002xx  9       05-Sep-200x     5       211
1002xx  10      03-Nov-200x     59      270
1002xx  11      09-Dec-200x     36      306
1002xx  12      11-Dec-200x     2       308

The wheels started to come off when I got to the first 90+ day value
(7th
record) ....the previous dates back to 12 Jan 05 are in the prior 90 day
period. It's not just a matter of incrementing by 1 for each 90 day
period
following first call, but checking to see if each prior call is in the
90-day period preceding the current call. Are you sure I am not going to
have to resort to a vector/loop routine to loop through each call for
each
ID? Or am I missing something in your instructions?

Thanks again....


Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Dennis Deck" <[hidden email]>
07/17/2008 12:42 AM

To
Tanya L TemKin/CA/KAIPERM@KAIPERM
cc

Subject
RE: computations with dates






That is tricky because the number of prior records to sum across is
unknown and
keeps changing from ID to ID.

One approach might be something like:
  number the records within ID
  track the prior comparison record number starting with 1
  sum the days from the prior comparison record and keep a running sum
from prior comparion
  if prior comparison is more than 90 days then increment and resum
until days are less than 90

Dennis

-----Original Message-----
From: Tanya Temkin [mailto:[hidden email]]
Sent: Wednesday, July 16, 2008 1:39 PM
Subject: computations with dates

I have a data set with each patient call made to our call center within
a certain time period; each call is a record. There is a unique patient
ID, call date, and call time (among many other variables). What I'm
trying to do for each patient is count up all calls that preceded each
call within the last 90 days. This is turning out to be more complicated
than I thought.

PRIOR calls were all initialized to zero, then the **first** record for
each ID was updated with data from another file.
There are no missing values. Cases are sorted by ID  CALL_DATE
CALL_TIME.
Some patients made more than one call on the same day.

At first I thought this would work:
  If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
    prior_calls=lag(prior_calls)+1.

This is what I got:

 ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          6
 1     02-APR-05     16:16          7
 1     14-MAY-05     08:19          8

Problem here. What would be accurate would be only the number of calls
made in the 90 days prior to each call, i.e.

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
 1     29-DEC-04     16:12          0
 1     12-JAN-05     11:57          1
 1     01-FEB-05     14:26          2
 1     11-FEB-05     11:14          3
 1     14-FEB-05     18:39          4
 1     10-MAR-05     15:19          5
 1     02-APR-05     12:34          5
 1     02-APR-05     16:16          6
 1     14-MAY-05     08:19          4

But I can't figure out the syntax needed to do this - at least with
current data structure. What approach could I use?

I'm wondering if this is going to entail restructuring my data long to
wide and then applying vector/loop syntax to loop through all the call
dates for each record (not in my comfort zone!)

Thanks!

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: computations with dates

Marks, Jim
DO REPEAT is a function that generates code, without having to type it
out.

 

it takes the code, replaces the indicator  and runs it with each item
from the list :

 

first pair:

if datediff(date,lag(date,1),"days") LE 90  and id = lag(id,1)  pr_1 =
1.



second pair:

if datediff(date,lag(date,2),"days") LE 90  and id = lag(id,2)  pr_2 =
1.



etc

 

HTH

 

--jim

 

 

From: [hidden email] [mailto:[hidden email]]
Sent: Friday, July 18, 2008 12:36 PM
To: [hidden email]
Cc: Marks, Jim
Subject: Re: computations with dates

 


Thanks Jim! I tried this on the test data and it worked. Hillel, I'll
hold on to your solution for use in the future...good to know a number
of ways to solve this problem.

But please help me understand exactly what I'm doing with some of this
syntax.... specifically

do repeat v = pr_1 to pr_10 /i = 1 to 10 .
if datediff(date,lag(date,i),"days") LE 90  and id = lag(id,i)  v = 1.
end repeat.

I understand the pr_1 to pr_10 creates a set of new variables.
But what is the function of "i"? I'm guessing it's some kind of index.
But is this somehow looping through all the previous cases for each ID?
I thought indexes could just be used in computations across variables,
not cases. Anyway, thanks for helping me understand what you helped me
do.

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Marks, Jim" <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>

07/17/2008 06:03 PM

Please respond to
"Marks, Jim" <[hidden email]>

To

[hidden email]

cc

       
Subject

Re: computations with dates

 

       




Tanya:

A brute force solution, based on a solution to a similar problem from
Art Kendall:

*** SAMPLE DATA.
data list free /id (f8.0)  date (date10) time (time5).
Begin data
1     29-DEC-04     16:12
1     12-JAN-05     11:57
1     01-FEB-05     14:26
1     11-FEB-05     11:14
1     14-FEB-05     18:39
1     10-MAR-05     15:19
1     02-APR-05     12:34
1     02-APR-05     16:16
1     14-MAY-05     08:19
2     14-FEB-05     18:39
2     10-MAR-05     15:19
2     02-APR-06     12:34
2     11-APR-06     16:16
2     14-JUN-06     08:19
end data.

*** create variables to flag if the nth lag is within 90 days.
*** edit for the maximum number of cases within id.
numeric pr_1 to pr_10.

*** calculate the flags.
do repeat v = pr_1 to pr_10 /i = 1 to 10 .
if datediff(date,lag(date,i),"days") LE 90  and id = lag(id,i)  v = 1.
end repeat.

*** create a variable to hold the number of lags.
count priors =pr_1 to pr_10 (1).

*** cleanup-- drop the flag variables.
add files file = * /drop pr_1 to pr_10.

*** trigger data pass.
LIST.

HTH
--jim

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Tanya Temkin
Sent: Thursday, July 17, 2008 1:54 PM
To: [hidden email]
Subject: Re: computations with dates

Thanks for the advice, Dennis .... I started to follow your suggestion,
and got this far - up to the summing of days from prior call date (this
is
different set of test data from one in prior posting):

(days diff=time in days from previous call; r_sum=running sum of time
from
1st call)

ID      call-num        call_date       days_diff       r_sum
1000xx  1       29-Dec-200y     0       0
1000xx  2       12-Jan-200x     14      14
1000xx  3       02-Feb-200x     21      35
1000xx  4       11-Feb-200x     9       44
1000xx  5       09-Mar-200x     26      70
1000xx  6       10-Mar-200x     1       71
1000xx  7       02-Apr-200x     23      94
1000xx  8       04-Apr-200x     2       96
1000xx  9       09-Apr-200x     5       101
1000xx  10      14-May-200x     35      136
1000xx  11      20-May-200x     6       142
1000xx  12      03-Jun-200x     14      156
1000xx  13      26-Jun-200x     23      179
1000xx  14      27-Jun-200x     1       180
1000xx  15      27-Jun-200x     0       180
1000xx  16      28-Jun-200x     1       181
1000xx  17      29-Jun-200x     1       182
1000xx  18      05-Jul-200x     6       188
1000xx  19      10-Jul-200x     5       193
1000xx  20      12-Aug-200x     33      226
1001xx  0       25-Aug-200x     0       0
1001xx  1       12-Nov-200x     79      79
1002xx  1       06-Feb-200x     0       0
1002xx  2       09-Feb-200x     3       3
1002xx  3       08-Aug-200x     180     183
1002xx  4       09-Aug-200x     1       184
1002xx  5       12-Aug-200x     3       187
1002xx  6       14-Aug-200x     2       189
1002xx  7       15-Aug-200x     1       190
1002xx  8       31-Aug-200x     16      206
1002xx  9       05-Sep-200x     5       211
1002xx  10      03-Nov-200x     59      270
1002xx  11      09-Dec-200x     36      306
1002xx  12      11-Dec-200x     2       308

The wheels started to come off when I got to the first 90+ day value
(7th
record) ....the previous dates back to 12 Jan 05 are in the prior 90 day
period. It's not just a matter of incrementing by 1 for each 90 day
period
following first call, but checking to see if each prior call is in the
90-day period preceding the current call. Are you sure I am not going to
have to resort to a vector/loop routine to loop through each call for
each
ID? Or am I missing something in your instructions?

Thanks again....


Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.




"Dennis Deck" <[hidden email]>
07/17/2008 12:42 AM

To
Tanya L TemKin/CA/KAIPERM@KAIPERM
cc

Subject
RE: computations with dates






That is tricky because the number of prior records to sum across is
unknown and
keeps changing from ID to ID.

One approach might be something like:
 number the records within ID
 track the prior comparison record number starting with 1
 sum the days from the prior comparison record and keep a running sum
from prior comparion
 if prior comparison is more than 90 days then increment and resum
until days are less than 90

Dennis

-----Original Message-----
From: Tanya Temkin [mailto:[hidden email]]
Sent: Wednesday, July 16, 2008 1:39 PM
Subject: computations with dates

I have a data set with each patient call made to our call center within
a certain time period; each call is a record. There is a unique patient
ID, call date, and call time (among many other variables). What I'm
trying to do for each patient is count up all calls that preceded each
call within the last 90 days. This is turning out to be more complicated
than I thought.

PRIOR calls were all initialized to zero, then the **first** record for
each ID was updated with data from another file.
There are no missing values. Cases are sorted by ID  CALL_DATE
CALL_TIME.
Some patients made more than one call on the same day.

At first I thought this would work:
 If ID=lag(ID) and CTIME.DAYS(CALL_DATE-lag(CALL_DATE)) <= 90
   prior_calls=lag(prior_calls)+1.

This is what I got:

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
1     29-DEC-04     16:12          0
1     12-JAN-05     11:57          1
1     01-FEB-05     14:26          2
1     11-FEB-05     11:14          3
1     14-FEB-05     18:39          4
1     10-MAR-05     15:19          5
1     02-APR-05     12:34          6
1     02-APR-05     16:16          7
1     14-MAY-05     08:19          8

Problem here. What would be accurate would be only the number of calls
made in the 90 days prior to each call, i.e.

ID    CALL_DATE     CALL_TIME     PRIOR_CALLS
1     29-DEC-04     16:12          0
1     12-JAN-05     11:57          1
1     01-FEB-05     14:26          2
1     11-FEB-05     11:14          3
1     14-FEB-05     18:39          4
1     10-MAR-05     15:19          5
1     02-APR-05     12:34          5
1     02-APR-05     16:16          6
1     14-MAY-05     08:19          4

But I can't figure out the syntax needed to do this - at least with
current data structure. What approach could I use?

I'm wondering if this is going to entail restructuring my data long to
wide and then applying vector/loop syntax to loop through all the call
dates for each record (not in my comfort zone!)

Thanks!

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

====================To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD