Identify the most important itinerary

Victor Tarrago
Dear list members,

We are facing a complicated SPSS problem and are looking for a clean
solution.

In a current study we are dealing with itineraries people run with a certain
frequency (e.g. going to work and back every working day).

Itineraries consist of a maximum of 12 steps (Step 1: 10km on road x, Step
2: 20km on road y,...) and are each described by the following variables:

A)      Is the step done in the investigated area (yes/no).
B)      To which zone of the investigated area does it belong (1-16)
C)      Type of road being used (1-3; e.g. highway, country road)
D)      What road is being used? (1-50; e.g. A26, EN778,...)
E)      Distance covered on that road (numeric)

We need to

1)       Identify the "Most important step" defined as largest distance with
highest frequency (frequency is equal for all 12 steps of an itinerary)

2)       Return the values of zone (Var B) and type (Var C) of the selected
step to 2 output variables (O1 and O2).

We guess this should be easy with the aid of a macro, but unfortunately I
don't know much about macros.

Thanks for your advice!

Víctor Tarragó

Re: Identify the most important itinerary

Maguin, Eugene
Victor,

I'm going to assume that you are quite familiar with syntax and can work
with a brief description of how I'd work this problem. I haven't tried any
sample data. So I may have errors in my plan, and I may not understand some
things that are necessary.

I'll assume that your data are in a 'long' format file. That is, each person's
data consists of itinerary 1, steps 1 thru n1 (12 max), itinerary 2, steps 1
thru n2, etc. And I assume that each record has a person id, an itinerary
id, and a step id. If they are not that way, you can use VARSTOCASES to
restructure the dataset.
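
To make the 'long' layout concrete, here is a minimal sketch of the restructuring in Python rather than SPSS syntax. It is not tested against real data. The variable names (person, itinerary, zone1..zone12, type1..type12, dist1..dist12) are hypothetical, and it assumes an unused step simply has no distance recorded.

```python
# Hypothetical wide record: one row per itinerary, step variables suffixed 1..12.
MAX_STEPS = 12


def wide_to_long(row, max_steps=MAX_STEPS):
    """Return one dict per step that was actually recorded in the wide row."""
    long_rows = []
    for step in range(1, max_steps + 1):
        dist = row.get(f"dist{step}")
        if dist is None:
            # Assumption: a step without a distance was not part of the itinerary.
            continue
        long_rows.append({
            "person": row["person"],
            "itinerary": row["itinerary"],
            "step": step,
            "zone": row.get(f"zone{step}"),   # variable B
            "type": row.get(f"type{step}"),   # variable C
            "dist": dist,                     # variable E
        })
    return long_rows


# Made-up example: one itinerary with two steps.
wide = {"person": 1, "itinerary": 1,
        "zone1": 3, "type1": 1, "dist1": 10.0,
        "zone2": 7, "type2": 2, "dist2": 20.0}
long_rows = wide_to_long(wide)
```

Each element of long_rows then corresponds to one record of the long file, carrying the person id, itinerary id, and step id.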

I think that if you sort the data by person and itinerary, both in ascending
order, and variable E in descending order, you will have the record you want
at the top of each itinerary set. If you then use AGGREGATE, with person id
and itinerary id as break variables, indicate that the data are presorted,
and use the FIRST function to extract the first non-missing value of
variables B and C, you will have what you want.
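
The sort-then-take-first logic can be sketched as follows. Again this is Python, not SPSS syntax, and only an illustration under assumptions: the field names (person, itinerary, zone, type, dist) are hypothetical, every record is assumed to have a non-missing distance, and the sample rows are made up.

```python
from itertools import groupby
from operator import itemgetter


def first_non_missing(records, field):
    """Return the first value of `field` that is not None, or None if all are missing."""
    return next((r[field] for r in records if r[field] is not None), None)


def most_important_steps(rows):
    """One output row per (person, itinerary) with O1 = zone and O2 = type."""
    # Person and itinerary ascending, distance (variable E) descending.
    ordered = sorted(rows, key=lambda r: (r["person"], r["itinerary"], -r["dist"]))
    result = []
    for (person, itinerary), group in groupby(ordered, key=itemgetter("person", "itinerary")):
        records = list(group)
        # Like taking the first non-missing value: if the longest step has a
        # missing zone or type, the value comes from the next record instead.
        result.append({
            "person": person,
            "itinerary": itinerary,
            "O1": first_non_missing(records, "zone"),
            "O2": first_non_missing(records, "type"),
        })
    return result


rows = [
    {"person": 1, "itinerary": 1, "step": 1, "zone": 3, "type": 1, "dist": 10.0},
    {"person": 1, "itinerary": 1, "step": 2, "zone": 7, "type": 2, "dist": 20.0},
    {"person": 1, "itinerary": 2, "step": 1, "zone": 5, "type": 3, "dist": 4.5},
]
summary = most_important_steps(rows)
```

For itinerary 1 the 20 km step sorts to the top, so O1 and O2 are taken from it. Note that if B or C can be missing on the longest step, O1 and O2 may end up coming from different records, which may or may not be what you want.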

I wasn't sure what to make of this part of 1) '... highest frequency
(frequency is equal for all 12 steps of an itinerary)' since you say that
the frequency is equal for all steps of an itinerary. So I ignored it.

Gene Maguin