grouping and sequencing cases

Tufayel Chowdhury (Thu, 29 Jul 2010 16:58:27 -0700)
Hi all,

I am trying to create two variables (tourID and type) from two existing variables (personID and location). tourID is a counter that stays the same until 'location' 1 or 2 occurs, at which point it moves to the next number. tourID always restarts at 1 when personID changes.

The variable 'type' can take three values: 1, 2 and 3. If 'location' goes from 1 to 1, type=1 (for that tourID group); if it goes from 1 to 2 or from 2 to 1, type=2; if it goes from 2 to 2, type=3 (as in the last four cases).

personID location   tourID   type
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3

Can anyone please help me out?

Thanks in advance! 

Tufayel
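
One way to read the tourID rule in the post above, sketched in Python rather than SPSS syntax so it can be run anywhere. It assumes the counter starts at 1 on each person's first row and goes up by one every time location 1 or 2 occurs on a later row of the same person. The function name `tour_ids` and the tuple layout are made up for illustration and are not part of the original post.

```python
# Sketch of the tourID rule: restart at 1 for each person, and start a new
# tour whenever an anchor location (1 or 2) occurs after the person's first row.
# tour_ids and the (personID, location) layout are illustrative assumptions.

ROWS = [  # (personID, location) in file order, copied from the example table
    (1, 1), (1, 3), (1, 3), (1, 1), (1, 3), (1, 2), (1, 3), (1, 3), (1, 3),
    (1, 2), (1, 3), (1, 1),
    (2, 1), (2, 3), (2, 2), (2, 3), (2, 3), (2, 1),
    (3, 2), (3, 3), (3, 3), (3, 2),
]

def tour_ids(rows):
    ids = []
    previous_person = None
    current = 0
    for person, location in rows:
        if person != previous_person:
            current = 1          # first row of a new person
        elif location in (1, 2):
            current += 1         # an anchor location starts the next tour
        ids.append(current)
        previous_person = person
    return ids

print(tour_ids(ROWS))
```

For the 22 example rows this gives the same numbers as the tourID column in the table.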

Re: grouping and sequencing cases

Ruben Geert van den Berg (Fri, July 30, 2010 4:38:44 AM)
Dear Tufayel,
 
The order of cases in your example data is vital information, isn't it? The first thing I'd do if these were my raw data is store this order in the data. Otherwise, if you sorted your records differently, you'd destroy part of the information contained in the data. I created a new variable 'visit', which is the nth visit for each personID. The next block of syntax should create tourID (I called it tourID_B so you can compare it to your desired tourID).
 
Your third request, however, was somewhat unclear to me. I think in total you have 9 location sequences:
 
1-1
1-2
1-3
2-1
2-2
2-3
3-1
3-2
3-3
 
Four of these (within personID) cause the tourID to change:
 
1-2
2-1
3-1
3-2
 
So within tourID groups there are 5 possible sequences:
 
1-1
1-3
2-2
2-3
3-3
 
As I understood, you want to create type within tourID group, so for each of these 5 sequences the value of type should be specified (even if (system) missing).
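
The three lists above can be reproduced mechanically. The following Python sketch (not SPSS syntax) applies the same condition that the tourID_B syntax further down in this post uses, namely that the current location is 1 or 2 and differs from the previous one, to all nine ordered pairs. The helper name `starts_new_tour` is an assumption made for this illustration.

```python
from itertools import product

# Classify each ordered pair (previous location, current location).
# A new tour starts when the current location is 1 or 2 and differs from
# the previous location; starts_new_tour is an illustrative name.
def starts_new_tour(previous, current):
    return current in (1, 2) and current != previous

pairs = list(product((1, 2, 3), repeat=2))
changes = [f"{p}-{c}" for p, c in pairs if starts_new_tour(p, c)]
stays = [f"{p}-{c}" for p, c in pairs if not starts_new_tour(p, c)]

print(changes)  # pairs that move on to the next tourID
print(stays)    # pairs that remain within the same tourID
```

The first list comes out as 1-2, 2-1, 3-1, 3-2 and the second as 1-1, 1-3, 2-2, 2-3, 3-3, matching the enumeration in the text.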
 
Could you please help us out a bit more?
 
Best, 

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com


data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
 
*Number the visits within each person.
compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.
 
*Start a new tour whenever the same person arrives at location 1 or 2 from a different location.
compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.


 



Re: grouping and sequencing cases

Tufayel Chowdhury (Fri, 30 Jul 2010 17:28:30 -0700)
Hi Ruben,

Thank you for the help with tourID. I'm sorry for being ambiguous about the variable 'type'. The thing is, 1 and 2 are fixed locations (home and office respectively) and 3 is any kind of vehicle/walk. You'll notice in the example that within a personID the location changes from 1 to 1, 1 to 2, 2 to 1 or 2 to 2, always via 3. To rephrase, the tours are always like 1-3-1, 1-3-2, 2-3-1 or 2-3-2. For a tour 1-3-1, type=1 (tourID=1 in the example); for 1-3-2 or 2-3-1 (tourID=2 and 4), type=2; and for 2-3-2 (tourID=3), type=3. I hope this clears things up.

personID location   tourID   type
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2

Thanks
Tufayel
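
The mapping described in the post above depends only on the two ends of a tour. As a small Python illustration (not SPSS syntax), with `tour_type` being a name invented here and 1 = home, 2 = office as stated in the post:

```python
# Tour type from the two anchors of a tour (1 = home, 2 = office).
# tour_type is an illustrative name, not something from the thread.
def tour_type(origin, destination):
    anchors = (origin, destination)
    if anchors == (1, 1):
        return 1              # home-home, e.g. 1-3-1
    if anchors in ((1, 2), (2, 1)):
        return 2              # home-office or office-home
    if anchors == (2, 2):
        return 3              # office-office, e.g. 2-3-2
    raise ValueError("both ends of a tour must be location 1 or 2")

for tour in ((1, 3, 1), (1, 3, 2), (2, 3, 1), (2, 3, 2)):
    print(tour, "->", tour_type(tour[0], tour[-1]))
```

Whatever happens between the two anchors (any number of 3s) does not affect the result.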


Re: grouping and sequencing cases

Ruben Geert van den Berg (Tue, August 3, 2010 5:40:19 AM)
Dear Tufayel,
 
I'm sorry but you did not answer my question and I still don't understand the logic. However, I tried to 'extract' the logic from the data and wrote some syntax that exactly reproduces 'Type' in your data (but only for these example respondents). The syntax is rather long and clumsy, but I didn't see any better options to get it done. Perhaps the List can suggest some improvements?
 
This comes without any warranty whatsoever, and I suggest you check the actual results meticulously: I'm not overly confident it will work properly.
 
HTH,

Ruben van den Berg

*Test data.
 
data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
 
dataset name d1.
 
*Create visit.
 
compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.
 
*Compute tourID.
 
compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else  if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.
 
*********Scratch copy of data.
 
dataset copy d2.
dataset activate d2.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxvis=max(visit).
execute.
 
select if location ne 3.
 
compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.
 
sort cases personid(a)visit(d).
 
compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.

if t1 gt 0 type_b=t1.
execute.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxnewcount=max(newcount).
 
loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.
 
sort cases personid visit.
 
match files file *
/keep personid visit type_b.
execute.
 
match files file d1
/file d2
/by personid visit.
execute.
 
dataset close all.
dataset name d1.
 
if mis (type_b) type_b=lag(type_b).
execute.
 
compute check=type-type_b.

descriptives check.
 
delete variables check.
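
For comparison, here is a shorter route to the same 'type' column, written in Python and not a translation of the syntax above. It rests on one assumption read off the example data: rows from a person's last anchor onward (which have no following anchor) carry the type of the tour that just ended. The names `row_types` and `TYPE_BY_ANCHORS` are invented for this sketch.

```python
from itertools import groupby

ROWS = [  # (personID, location, expected type) from the test data
    (1, 1, 1), (1, 3, 1), (1, 3, 1), (1, 1, 2), (1, 3, 2), (1, 2, 3),
    (1, 3, 3), (1, 3, 3), (1, 3, 3), (1, 2, 2), (1, 3, 2), (1, 1, 2),
    (2, 1, 2), (2, 3, 2), (2, 2, 2), (2, 3, 2), (2, 3, 2), (2, 1, 2),
    (3, 2, 3), (3, 3, 3), (3, 3, 3), (3, 2, 3),
]

TYPE_BY_ANCHORS = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}

def row_types(rows):
    result = []
    for _, person_rows in groupby(rows, key=lambda row: row[0]):
        locations = [row[1] for row in person_rows]
        anchors = [i for i, loc in enumerate(locations) if loc in (1, 2)]
        types = [None] * len(locations)
        # each tour runs from one anchor up to (not including) the next one
        for start, end in zip(anchors, anchors[1:]):
            tour_type = TYPE_BY_ANCHORS[(locations[start], locations[end])]
            for i in range(start, end):
                types[i] = tour_type
        # assumption: rows from the last anchor onward inherit the type of
        # the tour that just ended, as in the example data
        if len(anchors) > 1:
            for i in range(anchors[-1], len(locations)):
                types[i] = types[anchors[-1] - 1]
        result.extend(types)
    return result

expected = [row[2] for row in ROWS]
print(row_types(ROWS) == expected)
```

Like the syntax above, this has only been checked against the 22 example rows; a person with fewer than two anchor rows gets no type at all in this sketch.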


 

Date: Mon, 2 Aug 2010 12:28:40 -0700
From: [hidden email]
Subject: Re: grouping and sequencing cases
To: [hidden email]

Hi Ruben,

I should have been more explicit about the context. In my dataset, each case represents one activity in a person's day (24 hours). For example, sleeping at home > taking breakfast at home > taking kid to daycare > going to office > from office to grocery > from grocery to home... etc. For my convenience, I have recoded the activity-locations as three main categories: 1=home, 2=office, 3=other places.

I define a TOUR based on two anchors: home and office. Thus, if a person travels like this: home > car > daycare > car > home (1>3>3>3>1), this is a home-home tour. A tour is complete when a person starts from either of the two anchors (home or office) and goes or returns to either of them via other places (i.e. 3).

The variable 'type' is defined by the tour origin and the tour destination. The only variable used to define 'type' should be location; personID shouldn't matter. type=1 if 1>3>3>3>1 (home-home tour); type=2 if 1>3>3>2 (home-office) or 2>3>3>3>1 (office-home); and type=3 if 2>3>3>2 (office-office).

location tourID type
1 1 2
3 1 2
3 1 2
2 2 3
3 2 3
3 2 3
2 3 3

Hope I have cleared things up. And thanks so much for your help.

-Tufayel 
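
The tour definition in the message above (from one anchor to the next, via other places) can be sketched as a single pass over a location sequence. This is a Python illustration only; `split_tours` and `TYPE_BY_ANCHORS` are names made up here, and the input is the seven-row example from the message.

```python
# Split a location sequence into tours. A tour runs from one anchor
# (1 = home, 2 = office) to the next anchor, passing through 3s.
# split_tours and TYPE_BY_ANCHORS are illustrative names.
TYPE_BY_ANCHORS = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}

def split_tours(locations):
    tours = []
    start = None                      # index of the anchor that opened the tour
    for i, location in enumerate(locations):
        if location not in (1, 2):
            continue                  # still travelling through "other places"
        if start is not None:
            path = locations[start:i + 1]
            tours.append((path, TYPE_BY_ANCHORS[(path[0], path[-1])]))
        start = i                     # this anchor also opens the next tour
    return tours

for path, tour_type in split_tours([1, 3, 3, 2, 3, 3, 2]):
    print(path, "type", tour_type)
```

Here each anchor in the middle of the sequence closes one tour and opens the next, which is why the office row at position four appears in both tours.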




From: Ruben van den Berg <[hidden email]>
To: [hidden email]
Sent: Mon, August 2, 2010 3:56:33 AM
Subject: RE: grouping and sequencing cases

Dear Tufayel,
 
This logic is pretty hard to grasp. From the data, it seems that for
 
1
3
1
3
2
 
The first 'tour' is 1-3-1 and the second 'tour' is 1-3-2. The type for the SECOND '1' is determined by the second tour (so it's type=2, not 1). Is type always 1 for the first location within a personID?
 
Best,

Ruben van den Berg



 

Re: grouping and sequencing cases

Tufayel Chowdhury (Wed, 4 Aug 2010 16:27:00 -0700)
Hi Ruben,

Thanks a lot for the syntax. I'm sorry that I couldn't put the logic clearly, but you got it right. I understood your syntax and was able to write a (probably) simpler one to create the variable 'type'. I couldn't have done it without going through yours.


*Make a little change in tourID.

compute tourID=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID=lag(tourID)+1.
else.
compute tourID=lag(tourID).
end if.
execute.

*Group the tours.

compute group=1.
if tourID = lag(tourID) group = lag(group).
if tourID <> lag(tourID) group = 1+ lag(group).
EXECUTE.

*Compute the first location of the tour.

aggregate
/outfile * overwrite=yes mode addvariables
/break group
/firstlocation = first(location).
EXECUTE.

*Compute the last location of the tour.

create leadlocation = lead(location,1).
EXECUTE.

aggregate
/outfile * overwrite=yes mode addvariables
/break group
/lastlocation = last(leadlocation).
EXECUTE.

*Compute type.

compute type_b = 0.
if firstlocation = 1 and lastlocation = 1 type_b = 1.
if firstlocation = 1 and lastlocation = 2 type_b = 2.
if firstlocation = 2 and lastlocation = 1 type_b = 2.
if firstlocation = 2 and lastlocation = 2 type_b = 3.
EXECUTE.


Thanks a lot.

-Tufayel


Re: grouping and sequencing cases

Ruben Geert van den Berg
Dear Tufayel,
 
Your syntax is lovely. I had completely forgotten about CREATE, and I didn't even know FIRST and LAST were functions in AGGREGATE!
 
However, when I ran it, the type variable you created (TYPE_B in your post, TYPE_T in the syntax below) did not correspond to the variable TYPE in your test data: for $casenum=12, TYPE=2 but your syntax rendered 1.
I'll paste the entire syntax below; I suffixed 'your' variables with _T (from Tufayel ;-)).
 
Thanks for the lovely teamwork!

Ruben van den Berg

data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
dataset name d1.
 
*Create visit.
 
compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.
 
*Compute tourID.
 
compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else  if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.
 
*********Scratch copy of data.
 
dataset copy d2.
dataset activate d2.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxvis=max(visit).
execute.
 
select if location ne 3.
 
compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.
 
sort cases personid(a)visit(d).
 
compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.
if t1 gt 0 type_b=t1.
execute.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxnewcount=max(newcount).
 
loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.
 
sort cases personid visit.
 
match files file *
/keep personid visit type_b.
execute.
 
match files file d1
/file d2
/by personid visit.
execute.
 
dataset close all.
dataset name d1.
 
if mis (type_b) type_b=lag(type_b).
execute.
 
compute check=type-type_b.
descriptives check.
 
delete variables check.
 
*********Tufayel solution.
 
*Make a little change in tourID.
compute tourID_T=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID_T=lag(tourID_T)+1.
else.
compute tourID_T=lag(tourID_T).
end if.
execute.
*Group the tours (this step uses the original tourID from the test data, not tourID_T).
compute group=1.
if tourID = lag(tourID) group = lag(group).
if tourID <> lag(tourID) group = 1+ lag(group).
EXECUTE.
*Compute the first location of the tour.
aggregate
/outfile * overwrite=yes mode addvariables
/break group
/firstlocation = first(location).
EXECUTE.
*Compute the last location of the tour.
create leadlocation = lead(location,1).
EXECUTE.
aggregate
/outfile * overwrite=yes mode addvariables
/break group
/lastlocation = last(leadlocation).
EXECUTE.
*Compute type.
compute type_T = 0.
if firstlocation = 1 and lastlocation = 1 type_T = 1.
if firstlocation = 1 and lastlocation = 2 type_T = 2.
if firstlocation = 2 and lastlocation = 1 type_T = 2.
if firstlocation = 2 and lastlocation = 2 type_T = 3.
EXECUTE.
compute check=type_b-type_T.
exe.
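
A possible explanation for the mismatch at $casenum=12, sketched in Python rather than SPSS syntax. The sketch assumes three things about the run above: groups are formed from the original tourID column, the lead value is simply the next row in the file regardless of personID, and a tour with no usable lead value keeps type 0. The function name `types_with_filewide_lead` is invented here, and the sketch is a simplification, not a line-by-line port.

```python
ROWS = [  # (personID, location, tourID, type) from the test data
    (1, 1, 1, 1), (1, 3, 1, 1), (1, 3, 1, 1), (1, 1, 2, 2), (1, 3, 2, 2),
    (1, 2, 3, 3), (1, 3, 3, 3), (1, 3, 3, 3), (1, 3, 3, 3), (1, 2, 4, 2),
    (1, 3, 4, 2), (1, 1, 5, 2),
    (2, 1, 1, 2), (2, 3, 1, 2), (2, 2, 2, 2), (2, 3, 2, 2), (2, 3, 2, 2),
    (2, 1, 3, 2),
    (3, 2, 1, 3), (3, 3, 1, 3), (3, 3, 1, 3), (3, 2, 2, 3),
]

TYPE_BY_ANCHORS = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}

def types_with_filewide_lead(rows):
    locations = [row[1] for row in rows]
    tour_ids = [row[2] for row in rows]
    # the next row's location, taken across the whole file (ignores personID)
    lead = locations[1:] + [None]
    # consecutive rows with the same tourID form one group
    groups = []
    for i, tour_id in enumerate(tour_ids):
        if i > 0 and tour_id == tour_ids[i - 1]:
            groups[-1].append(i)
        else:
            groups.append([i])
    types = [0] * len(rows)
    for members in groups:
        first_location = locations[members[0]]
        last_location = lead[members[-1]]   # location that follows the group
        tour_type = TYPE_BY_ANCHORS.get((first_location, last_location), 0)
        for i in members:
            types[i] = tour_type
    return types

computed = types_with_filewide_lead(ROWS)
mismatches = [i + 1 for i, row in enumerate(ROWS) if computed[i] != row[3]]
print(computed[11], mismatches)
```

In this sketch, case 12 is the last row of person 1, so its "destination" is read from the first row of person 2 (location 1), which yields type 1 instead of 2. The very last row of the file has no following row and stays at 0 here; the thread does not confirm whether the SPSS run behaves the same way at that row.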

 

Date: Wed, 4 Aug 2010 16:27:00 -0700
From: [hidden email]
Subject: Re: grouping and sequencing cases
To: [hidden email]

Hi Ruben,

Thanks a lot for the syntax. I'm sorry that I couldn't clearly put the logic, but you got it right. I understood your syntax and was able to write (probably) an easier one to create the variable 'type'. I couldn't do it unless I went through your one.


*Make a little change in tourID.

compute tourID=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID=lag(tourID)+1.
else.
compute tourID=lag(tourID).
end if.
execute.

*Group the tours.

compute group=1.
if tourID = lag(tourID) group = lag(group).
if tourID <> lag(tourID) group = 1+ lag(group).
EXECUTE.

*Compute the first location of the tour.

aggregate
/outfile * overwrite=yes mode addvariables
/break group
/firstlocation = first(location).
EXECUTE.

*Compute the last location of the tour.

create leadlocation = lead(location,1).
EXECUTE.

aggregate
/outfile * overwrite=yes mode addvariables
/break group
/lastlocation = last(leadlocation).
EXECUTE.

*Compute type.

compute type_b = 0.
if firstlocation = 1 and lastlocation = 1 type_b = 1.
if firstlocation = 1 and lastlocation = 2 type_b = 2.
if firstlocation = 2 and lastlocation = 1 type_b = 2.
if firstlocation = 2 and lastlocation = 2 type_b = 3.
EXECUTE.


Thanks a lot.

-Tufayel


From: Ruben van den Berg <[hidden email]>
To: [hidden email]
Sent: Tue, August 3, 2010 5:40:19 AM
Subject: Re: grouping and sequencing cases

Dear Tufayel,
 
I'm sorry but you did not answer my question and I still don't understand the logic. However, I tried to 'extract' the logic from the data and wrote some syntax that exactly reproduces 'Type' in your data (but only for these example respondents). The syntax is rather long and clumsy, but I didn't see any better options to get it done. Perhaps the List can suggest some improvements?
 
This comes without any warranty whatsoever and I suggest you check the actual results meticulously, I'm not overly confident it will work properly.
 
HTH,

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com

*Test data.
 
data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
 
dataset name d1.
 
*Create visit.
 
compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.
 
*Compute tourID.
 
compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else  if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.
 
*********Scratch copy of data.
 
dataset copy d2.
dataset activate d2.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxvis=max(visit).
execute.
 
select if location ne 3.
 
compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.
 
sort cases personid(a)visit(d).
 
compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.

if t1 gt 0 type_b=t1.
execute.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxnewcount=max(newcount).
 
loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.
 
sort cases personid visit.
 
match files file *
/keep personid visit type_b.
execute.
 
match files file d1
/file d2
/by personid visit.
execute.
 
dataset close all.
dataset name d1.
 
if mis (type_b) type_b=lag(type_b).
execute.
 
compute check=type-type_b.

descriptives check.
 
delete variables check.


 

Date: Mon, 2 Aug 2010 12:28:40 -0700
From: [hidden email]
Subject: Re: grouping and sequencing cases
To: [hidden email]

Hi Ruben,

I should have been more explicit about the context. In my dataset each case represents one of a person's activities over the whole day (24 hours). For example, sleeping at home > taking breakfast at home > taking kid to daycare > going to office > from office to grocery > from grocery to home... etc. For my convenience, I have recoded the activity-locations as three main categories: 1=home, 2=office, 3=other places.

I define a TOUR based on two anchors: home and office. Thus, if a person travels like this: home > car > daycare > car > home (1>3>3>3>1), this is a home-home tour. A tour is complete when a person starts from either of the two anchors (home or office) and goes/returns to either of them, via other places (i.e. 3).

The variable 'type' is defined by the tour origin and the tour destination. The only variable used to define 'type' should be location; personID shouldn't matter. type=1 if 1>3>3>3>1 (home-home tour), type=2 if 1>3>3>2 (home-office) or 2>3>3>3>1 (office-home), and type=3 if 2>3>3>2 (office-office).

location tourID type
1 1 2
3 1 2
3 1 2
2 2 3
3 2 3
3 2 3
2 3 3
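
The rule described above can be sketched outside SPSS. The following is a minimal Python illustration, not part of the thread's syntax: the names tour_type and label_cases are hypothetical, and it assumes that every case from one anchor up to (but not including) the next anchor takes the type of that tour, and that cases from the final anchor onward inherit the type of the last completed tour, which is how the example rows appear to be coded.

```python
HOME, OFFICE = 1, 2  # location codes from the thread; 3 means "other places"

def tour_type(origin, destination):
    """1 = home-home, 2 = home-office or office-home, 3 = office-office."""
    ends = {origin, destination}
    if not ends <= {HOME, OFFICE}:
        raise ValueError("a tour must start and end at an anchor (1 or 2)")
    if ends == {HOME}:
        return 1
    if ends == {OFFICE}:
        return 3
    return 2

def label_cases(locations):
    """Assign a tour type to every case of one person (None if no tour exists)."""
    anchor_idx = [i for i, loc in enumerate(locations) if loc in (HOME, OFFICE)]
    types = [None] * len(locations)
    # Each pair of consecutive anchors delimits one tour.
    for start, end in zip(anchor_idx, anchor_idx[1:]):
        t = tour_type(locations[start], locations[end])
        for i in range(start, end):
            types[i] = t
    # Assumption: cases from the final anchor onward reuse the last tour's type.
    if len(anchor_idx) >= 2:
        last = anchor_idx[-1]
        for i in range(last, len(locations)):
            types[i] = types[last - 1]
    return types

print(label_cases([1, 3, 3, 2, 3, 3, 2]))  # the seven example rows above
```

For the seven example rows this yields 2, 2, 2, 3, 3, 3, 3, matching the type column shown.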

Hope I have cleared things up. And thanks so much for your help.

-Tufayel 




From: Ruben van den Berg <[hidden email]>
To: [hidden email]
Sent: Mon, August 2, 2010 3:56:33 AM
Subject: RE: grouping and sequencing cases

Dear Tufayel,
 
This logic is pretty hard to grasp. From the data, it seems that for
 
1
3
1
3
2
 
The first 'tour' is 1-3-1 and the second 'tour' is 1-3-2. The type for the SECOND '1' is determined by the second tour (so it's type=2, not 1). Is type always 1 for the first location within a personID?
 
Best,

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com



 

Date: Fri, 30 Jul 2010 17:28:30 -0700
From: [hidden email]
Subject: Re: grouping and sequencing cases
To: [hidden email]

Hi Ruben,

Thank you for the help with tourID. I'm sorry for being ambiguous about the variable 'type'. Thing is, 1 and 2 are fixed locations (home and office respectively) and 3 is any kind of vehicle/walk. You'd notice in the example that within a personID the location changes from 1-1, 1-2, 2-1 or 2-2 via 3. To rephrase, the tours are always like 1-3-1, 1-3-2, 2-3-1 or 2-3-2. For a tour 1-3-1, type=1 (tourID=1 in the example), for 1-3-2 or 2-3-1 (tourID=2 and 4), type=2, and for 2-3-2 (tourID=3), type=3. I hope this clears things up.

personID location   tourID   type
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2

Thanks
Tufayel



Re: grouping and sequencing cases

hillel vardi
In reply to this post by Tufayel Chowdhury
Shalom

Here is another way to do what you asked for.
It looks to me that creating tourID is a mistake, because in many cases
the act that ends one tour is also the act that starts the next one.
In your data, line 4 (location 1) is both the end of tour 1 and the
start of tour 2. All lines of each tourID should have the same type,
and that is not the case here
(your program creates 9 tourIDs but there are only 7 tours).
I also think that after you calculate type you only need one line
per trip (run the syntax up to the comment to see it).

Hillel Vardi
BGU

dataset close all.
data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
compute  seq=$casenum.
add files file=* / keep=personID location seq .
numeric tourID(f2) .
if  any(location, 1, 2)  and   (personID eq  lag(personID,1))
tourID=tourID+1 .
leave  tourID .
SORT CASES BY personID seq .
dataset name orig .
DATASET COPY work .
DATASET ACTIVATE  work .
if  ( location eq  lag(location,1)) and (personID eq  lag(personID,1))
keep=0 .
recode   keep(0=0)(else=1) .
select if  keep=1 .
add files  file=* / by personID / first=first .
if  ( first eq 0 ) and ( lag(first,1) eq 0)     location_detail=
lag(location,2)*100+ lag(location,1)*10+   location .
recode location_detail (232 =3)(131=1)(132 231=2) into type  .
add files  file=* / keep= personID  seq location_detail type  tourID .
select if type gt 0 .
execute .

***  run this part if you want to see all lines .

match files file= orig / file=* / by personID seq .
sort cases by personID(d) seq (d) .
if type gt 0 #type=type .
if sysmis(type)    type= #type .
SORT CASES BY personID seq .
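
The idea of one line per tour, with consecutive tours sharing an anchor, can also be illustrated in a few lines of Python. This is only a sketch and not a translation of the SPSS syntax above: the function name tours and the lookup table are hypothetical, the data are the thread's sample cases, and it assumes the cases are already sorted by personID in their original order.

```python
from itertools import groupby

# Sample data from the thread: (personID, location) in original case order.
CASES = [
    (1, 1), (1, 3), (1, 3), (1, 1), (1, 3), (1, 2),
    (1, 3), (1, 3), (1, 3), (1, 2), (1, 3), (1, 1),
    (2, 1), (2, 3), (2, 2), (2, 3), (2, 3), (2, 1),
    (3, 2), (3, 3), (3, 3), (3, 2),
]

# Tour type by (origin, destination), as defined in the thread.
TYPE_BY_ENDPOINTS = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}

def tours(cases):
    """Return one (personID, origin, destination, type) tuple per tour."""
    result = []
    for person, rows in groupby(cases, key=lambda row: row[0]):
        # Keep only the anchors (home=1, office=2); location 3 is travel/other.
        anchors = [loc for _, loc in rows if loc in (1, 2)]
        # The destination of one tour is the origin of the next.
        for origin, destination in zip(anchors, anchors[1:]):
            kind = TYPE_BY_ENDPOINTS[(origin, destination)]
            result.append((person, origin, destination, kind))
    return result

for tour in tours(CASES):
    print(tour)
```

On the sample data this lists seven tours (four for person 1, two for person 2, one for person 3), in line with the count of 7 mentioned above.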




Tufayel Chowdhury wrote:

> Hi Ruben,
>
> Thanks a lot for the syntax. I'm sorry that I couldn't clearly put the
> logic, but you got it right. I understood your syntax and was able to
> write (probably) an easier one to create the variable 'type'. I
> couldn't do it unless I went through your one.
>
>
> *Make a little change in tourID.
>
> compute tourID=1.
> do if (location=1 or location=2) and (location ne lag(location)).
> compute tourID=lag(tourID)+1.
> else.
> compute tourID=lag(tourID).
> end if.
> execute.
>
> *Group the tours.
>
> compute group=1.
> if tourID = lag(tourID) group = lag(group).
> if tourID <> lag(tourID) group = 1+ lag(group).
> EXECUTE.
>
> *Compute the first location of the tour.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break group
> /firstlocation = first(location).
> EXECUTE.
>
> *Compute the last location of the tour.
>
> create leadlocation = lead(location,1).
> EXECUTE.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break group
> /lastlocation = last(leadlocation).
> EXECUTE.
>
> *Compute type.
>
> compute type_b = 0.
> if firstlocation = 1 and lastlocation = 1 type_b = 1.
> if firstlocation = 1 and lastlocation = 2 type_b = 2.
> if firstlocation = 2 and lastlocation = 1 type_b = 2.
> if firstlocation = 2 and lastlocation = 2 type_b = 3.
> EXECUTE.
>
>
> Thanks a lot.
>
> -Tufayel
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: grouping and sequencing cases

Tufayel Chowdhury
In reply to this post by Ruben Geert van den Berg
Hi Ruben,

Yes, it was wonderful; I've learned a lot. As for $casenum=12, my syntax gives the correct result (type_b = 2). I checked it a couple of times. The syntax also works fine with my dataset. But thanks a lot. I'm pretty new to syntax, and yours helped a lot.

Regards
Tufayel


From: Ruben van den Berg <[hidden email]>
To: [hidden email]
Sent: Thu, August 5, 2010 3:56:26 AM
Subject: Re: grouping and sequencing cases

Dear Tufayel,
 
Your syntax is lovely, I completely forgot about CREATE and I didn't even know FIRST and LAST were functions in AGGREGATE!
 
However, when I ran it, the variable TYPE_B you created did not correspond to the variable TYPE in your test data: for $casenum=12, TYPE=2 but your syntax rendered TYPE_B=1.
I'll paste the entire syntax below; I suffixed 'your' variables with _T (from Tufayel ;-)).
 
Thanks for the lovely teamwork!

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com

data list free/personID location   tourID   type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
dataset name d1.
 
*Create visit.
 
compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.
 
*Compute tourID.
 
compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else  if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.
 
*********Scratch copy of data.
 
dataset copy d2.
dataset activate d2.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxvis=max(visit).
execute.
 
select if location ne 3.
 
compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.
 
sort cases personid(a)visit(d).
 
compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.
if t1 gt 0 type_b=t1.
execute.
 
aggregate
/outfile * overwrite=yes mode addvariables
/break personid
/maxnewcount=max(newcount).
 
loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.
 
sort cases personid visit.
 
match files file *
/keep personid visit type_b.
execute.
 
match files file d1
/file d2
/by personid visit.
execute.
 
dataset close all.
dataset name d1.
 
if mis (type_b) type_b=lag(type_b).
execute.
 
compute check=type-type_b.
descriptives check.
 
delete variables check.
 
*********Tufayel solution.
 
*Make a little change in tourID.
compute tourID_T=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID_T=lag(tourID_T)+1.
else.
compute tourID_T=lag(tourID_T).
end if.
execute.
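*Group the tours.
*Note: group is built from tourID_T here. Using the tourID column from the test data instead (it restarts per personID) appears to be what yields type_T=1 for $casenum=12.
compute group=1.
if tourID_T = lag(tourID_T) group = lag(group).
if tourID_T <> lag(tourID_T) group = 1+ lag(group).
EXECUTE.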
*Compute the first location of the tour.
aggregate
/outfile * overwrite=yes mode addvariables
/break group
/firstlocation = first(location).
EXECUTE.
*Compute the last location of the tour.
create leadlocation = lead(location,1).
EXECUTE.
aggregate
/outfile * overwrite=yes mode addvariables
/break group
/lastlocation = last(leadlocation).
EXECUTE.
*Compute type.
compute type_T = 0.
if firstlocation = 1 and lastlocation = 1 type_T = 1.
if firstlocation = 1 and lastlocation = 2 type_T = 2.
if firstlocation = 2 and lastlocation = 1 type_T = 2.
if firstlocation = 2 and lastlocation = 2 type_T = 3.
EXECUTE.
compute check=type_b-type_T.
exe.
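
The two outcomes reported for $casenum=12 (1 versus 2) are consistent with the grouping step being run on two different tour counters: the tourID column from the test data, which restarts for each personID, versus a counter recomputed without the personID condition. The Python sketch below imitates the group / FIRST / LAST(lead) logic under that assumption. It is an illustration only: the function names are hypothetical, and it assumes AGGREGATE's LAST takes the last non-missing value within a group.

```python
LOCATION = [1, 3, 3, 1, 3, 2, 3, 3, 3, 2, 3, 1, 1, 3, 2, 3, 3, 1, 2, 3, 3, 2]
# tourID column exactly as given in the test data (restarts per personID).
TOUR_ID_DATA = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 1, 1, 2, 2, 2, 3, 1, 1, 1, 2]
TYPE_BY_ENDPOINTS = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}

def tour_id_ignoring_person(location):
    """Bump the counter when an anchor (1 or 2) differs from the previous location."""
    ids = [1]
    for prev, loc in zip(location, location[1:]):
        ids.append(ids[-1] + 1 if loc in (1, 2) and loc != prev else ids[-1])
    return ids

def type_from_runs(location, tour_id):
    """One group per run of equal tour_id; type from first location and last lead."""
    n = len(location)
    lead = location[1:] + [None]  # next case's location; missing for the last case
    types = [0] * n
    start = 0
    for i in range(1, n + 1):
        if i == n or tour_id[i] != tour_id[i - 1]:
            first = location[start]
            # Assumed behaviour of LAST: last non-missing lead value in the run.
            leads = [v for v in lead[start:i] if v is not None]
            last = leads[-1] if leads else None
            kind = TYPE_BY_ENDPOINTS.get((first, last), 0)
            for j in range(start, i):
                types[j] = kind
            start = i
    return types

with_data_ids = type_from_runs(LOCATION, TOUR_ID_DATA)
with_recomputed_ids = type_from_runs(LOCATION, tour_id_ignoring_person(LOCATION))
print(with_data_ids[11], with_recomputed_ids[11])  # case 12 under each counter
```

Under these assumptions case 12 comes out as 1 with the data's tourID and as 2 with the recomputed counter. In both variants the very last case stays 0, because it has no following case to supply a destination.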

 
