How to identify blocks of consecutive cases in SPSS?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

How to identify blocks of consecutive cases in SPSS?

Roberts, Michael-2

Good Afternoon All,

 

I’m hoping someone in here would have a solution for a data manipulation problem I am facing, and one which eludes me:

 

I have cases, some of which are consecutive in time and the only indicator I have is the month – so for a given case, I have an Identifier, and the month (plus other data which are no help!)  The problem is that for a given identifier, there are breaks between continuous periods, and I need to be able to measure the length of time any given contiguous period – somewhat like a length of stay in a hospital. I have done something like this in the past with regular dates, but cannot for the life of me remember how I did so, and would appreciate some advice.  Data may look like this:

 

ID   Month

1a  3

1a  6

1a  7

1a  8

1a  11

1a  12

2a  9

2a  11

2a  12

2a  15

2a  16

2a  17

3a  1

3a  3

3a  4

3a  7

3a  8

3a  9

 

So, 1a has two spans of 3 consecutive months and 2 months, 2a has 1 span of 2 consecutive months, etc.

 

 

TIA

Mike

Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

Marks, Jim

An example using scratch variables to “copy” the sequence identifiers across variable number of consecutive cases:

 

** sample data.

NEW FILE.

DATA LIST  FREE /id (a2) month (f8.0).

BEGIN DATA

1a  3

1a  6

1a  7

1a  8

1a  11

1a  12

2a  9

2a  11

2a  12

2a  15

2a  16

2a  17

3a  1

3a  3

3a  4

3a  7

3a  8

3a  9

END DATA.

 

DATASET NAME tst WINDOW=  FRONT.

 

COMPUTE month_seq = month.

IF  id NE LAG(id) month_seq = 0.

 

IF id EQ LAG(id) month_seq = LAG(month)+1.

 

COMPUTE same_seq = month = month_seq.

COMPUTE new_seq = same_seq =0.

FREQUENCIES new_seq.

FILTER BY new_seq.

 

** RANK requires numeric values to work for each ID values.

 

COMPUTE id_nbr = numeric(substr(id,1,1),f8.0).

 

RANK month BY id_nbr /RANK INTO seq_nbr.

 

FILTER OFF .

 

DO IF new_seq.

COMPUTE #seqhold = seq_nbr.

ELSE .

COMPUTE  seq_nbr = #seqhold.

END IF .

EXECUTE .

 

Jim Marks

Director, Market Research

x1616

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Roberts, Michael
Sent: Friday, January 14, 2011 3:55 PM
To: [hidden email]
Subject: How to identify blocks of consecutive cases in SPSS?

 

Good Afternoon All,

 

I’m hoping someone in here would have a solution for a data manipulation problem I am facing, and one which eludes me:

 

I have cases, some of which are consecutive in time and the only indicator I have is the month – so for a given case, I have an Identifier, and the month (plus other data which are no help!)  The problem is that for a given identifier, there are breaks between continuous periods, and I need to be able to measure the length of time any given contiguous period – somewhat like a length of stay in a hospital. I have done something like this in the past with regular dates, but cannot for the life of me remember how I did so, and would appreciate some advice.  Data may look like this:

 

ID   Month

1a  3

1a  6

1a  7

1a  8

1a  11

1a  12

2a  9

2a  11

2a  12

2a  15

2a  16

2a  17

3a  1

3a  3

3a  4

3a  7

3a  8

3a  9

 

So, 1a has two spans of 3 consecutive months and 2 months, 2a has 1 span of 2 consecutive months, etc.

 

 

TIA

Mike

Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

Roberts, Michael-2

Jim,

 

I just ran through the example you sent and it looks like it does what I need to get done.

 

Thank You!  May your weekend be filled with many good things!!! J

 

Best Regards

Mike

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Marks, Jim
Sent: Friday, January 14, 2011 5:51 PM
To: [hidden email]
Subject: Re: How to identify blocks of consecutive cases in SPSS?

 

An example using scratch variables to “copy” the sequence identifiers across variable number of consecutive cases:

 

** sample data.

NEW FILE.

DATA LIST  FREE /id (a2) month (f8.0).

BEGIN DATA

1a  3

1a  6

1a  7

1a  8

1a  11

1a  12

2a  9

2a  11

2a  12

2a  15

2a  16

2a  17

3a  1

3a  3

3a  4

3a  7

3a  8

3a  9

END DATA.

 

DATASET NAME tst WINDOW=  FRONT.

 

COMPUTE month_seq = month.

IF  id NE LAG(id) month_seq = 0.

 

IF id EQ LAG(id) month_seq = LAG(month)+1.

 

COMPUTE same_seq = month = month_seq.

COMPUTE new_seq = same_seq =0.

FREQUENCIES new_seq.

FILTER BY new_seq.

 

** RANK requires numeric values to work for each ID values.

 

COMPUTE id_nbr = numeric(substr(id,1,1),f8.0).

 

RANK month BY id_nbr /RANK INTO seq_nbr.

 

FILTER OFF .

 

DO IF new_seq.

COMPUTE #seqhold = seq_nbr.

ELSE .

COMPUTE  seq_nbr = #seqhold.

END IF .

EXECUTE .

 

Jim Marks

Director, Market Research

x1616

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Roberts, Michael
Sent: Friday, January 14, 2011 3:55 PM
To: [hidden email]
Subject: How to identify blocks of consecutive cases in SPSS?

 

Good Afternoon All,

 

I’m hoping someone in here would have a solution for a data manipulation problem I am facing, and one which eludes me:

 

I have cases, some of which are consecutive in time and the only indicator I have is the month – so for a given case, I have an Identifier, and the month (plus other data which are no help!)  The problem is that for a given identifier, there are breaks between continuous periods, and I need to be able to measure the length of time any given contiguous period – somewhat like a length of stay in a hospital. I have done something like this in the past with regular dates, but cannot for the life of me remember how I did so, and would appreciate some advice.  Data may look like this:

 

ID   Month

1a  3

1a  6

1a  7

1a  8

1a  11

1a  12

2a  9

2a  11

2a  12

2a  15

2a  16

2a  17

3a  1

3a  3

3a  4

3a  7

3a  8

3a  9

 

So, 1a has two spans of 3 consecutive months and 2 months, 2a has 1 span of 2 consecutive months, etc.

 

 

TIA

Mike

Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

David Marso
Administrator
In reply to this post by Roberts, Michael-2
I believe this does the same as other solution.
To get the length of runs use AGGREGATE.
IF ID<>LAG(ID) OR $CASENUM=1 #SEQ=1.
IF MONTH-LAG(MONTH) >1 #SEQ=#SEQ+1.
COMPUTE SEQ_nbr=#SEQ.
LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

David Marso
Administrator
David Marso wrote
I believe this does the same as other solution.
To get the length of runs use AGGREGATE.
IF ID<>LAG(ID) OR $CASENUM=1 #SEQ=1.
IF MONTH-LAG(MONTH) >1 #SEQ=#SEQ+1.
COMPUTE SEQ_nbr=#SEQ.
LIST.
OR A mind bending one liner ;-)
Grok this and you'll get dizzy ....
I wouldn't use this in production code because it is somewhat  cranially oppressive.
OTOH, the basic technique might be useful in some applications.
----------------------
COMPUTE SEQ_nbr3= SUM(
                       (ID<>LAG(ID) OR $CASENUM=1) ,
                       ( ID=LAG(ID)) * SUM(
                                            (MONTH-LAG(MONTH)>1 ),
                                             LAG(SEQ_nbr3)
                                           )
                      ).
LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

Art Kendall
definitely "cranially oppressive". Succinct but cranially oppressive.


Art

On 1/17/2011 5:27 AM, David Marso wrote:

> David Marso wrote:
>> I believe this does the same as other solution.
>> To get the length of runs use AGGREGATE.
>> IF ID<>LAG(ID) OR $CASENUM=1 #SEQ=1.
>> IF MONTH-LAG(MONTH)>1 #SEQ=#SEQ+1.
>> COMPUTE SEQ_nbr=#SEQ.
>> LIST.
>>
> OR A mind bending one liner ;-)
> Grok this and you'll get dizzy ....
> I wouldn't use this in production code because it is somewhat  cranially
> oppressive.
> OTOH, the basic technique might be useful in some applications.
> ----------------------
> COMPUTE SEQ_nbr3= SUM(
>                         (ID<>LAG(ID) OR $CASENUM=1) ,
>                         ( ID=LAG(ID)) * SUM(
>                                              (MONTH-LAG(MONTH)>1 ),
>                                               LAG(SEQ_nbr3)
>                                             )
>                        ).
> LIST.
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/How-to-identify-blocks-of-consecutive-cases-in-SPSS-tp3342019p3344167.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: How to identify blocks of consecutive cases in SPSS?

Roberts, Michael-2
In reply to this post by David Marso
David,

Thank you very much for both solutions to my data manipulation problem! Your second solution is, as Art says, succinct, but warps the edges of my mind, but I will get to the bottom of it, yet! :)

Again, thanks for the help.

Mike


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Monday, January 17, 2011 5:27 AM
To: [hidden email]
Subject: Re: How to identify blocks of consecutive cases in SPSS?

David Marso wrote:
>
> I believe this does the same as other solution.
> To get the length of runs use AGGREGATE.
> IF ID<>LAG(ID) OR $CASENUM=1 #SEQ=1.
> IF MONTH-LAG(MONTH) >1 #SEQ=#SEQ+1.
> COMPUTE SEQ_nbr=#SEQ.
> LIST.
>

OR A mind bending one liner ;-)
Grok this and you'll get dizzy ....
I wouldn't use this in production code because it is somewhat  cranially
oppressive.
OTOH, the basic technique might be useful in some applications.
----------------------
COMPUTE SEQ_nbr3= SUM(
                       (ID<>LAG(ID) OR $CASENUM=1) ,
                       ( ID=LAG(ID)) * SUM(
                                            (MONTH-LAG(MONTH)>1 ),
                                             LAG(SEQ_nbr3)
                                           )
                      ).
LIST.
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/How-to-identify-blocks-of-consecutive-cases-in-SPSS-tp3342019p3344167.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD