SPSSX Discussion

combining multiple records with same id

Classic

List

Threaded

5 messages Options

nomi-6

combining multiple records with same id

Hi,

I have a file with 87,000 cases. Each case (line) contains data about a
car accident. the id number for each person is given. If the person had
more than one car accident then he has several entries. I'd like to
combine all these entries so that at the end I will have a line for each
person with all the different variables (and no duplicate entries).
How can I do this with spss?

Thanks!
Nomi

Maguin, Eugene

Re: combining multiple records with same id

Nomi,

I'll be very brief, perhaps too much so but I have other things to do.

You want to go from multiple cases per id to one case per id. This type of
rearrangement is going from 'long' to 'wide' or 'long-to-wide'.

Go into the syntax reference and read up on the Casestovars command. That is
all you need.

Gene Maguin

Melissa Ives

Re: combining multiple records with same id

In reply to this post by nomi-6

Or use AGGREGATE if you prefer to summarize across accidents per person.

Melissa

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Gene Maguin
Sent: Wednesday, October 11, 2006 9:47 AM
To: [hidden email]
Subject: Re: [SPSSX-L] combining multiple records with same id

Nomi,

I'll be very brief, perhaps too much so but I have other things to do.

You want to go from multiple cases per id to one case per id. This type
of rearrangement is going from 'long' to 'wide' or 'long-to-wide'.

Go into the syntax reference and read up on the Casestovars command.
That is all you need.

Gene Maguin

PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.

Bauer, Craig

Re: combining multiple records with same id

In reply to this post by nomi-6

Hi Nomi

You could try writing out the first accident to one file, everyone's
second accident to a second file, and so on and then string them all
together, like this:

* Sort by id, create a incremented count called flag, which will tell
you if it's the first, second, etc record for that id number.
sort cases by id.
compute flag=1.
if (lag(id)=id) flag=lag(flag)+1.
save outfile='master.sav'.

* Save out a file with the first record for each person only.
select if (flag=1).
save outfile='one.sav'.

* Get your sorted file, rename all variables to something new (SPSS
won't like duped var names) and save our file two with all second
accidents.
get file='master.sav'.
rename variables (var1 var2 var3 = var1b var2b var3b).
select if (flag=2).
save outfile='two.sav'.

* Repeat this process for every possible instance (so if a person can
have a max of 5 accidents, you'll do this five times and have 5 files).
* If the people in your file are particularly accident-prone and will
have a higher max, you could probably put this in a macro or something
to save time.

* Finally, pull them all together onto one row.
new file.
match files
/file='one.sav'
/file='two.sav'
/by id.
execute.

HTH

Craig

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Nomi
Sent: Wednesday, October 11, 2006 7:11 AM
To: [hidden email]
Subject: combining multiple records with same id

Hi,

I have a file with 87,000 cases. Each case (line) contains data about a
car accident. the id number for each person is given. If the person had
more than one car accident then he has several entries. I'd like to
combine all these entries so that at the end I will have a line for each
person with all the different variables (and no duplicate entries).
How can I do this with spss?

Thanks!
Nomi

Richard Ristow

Re: combining multiple records with same id

In reply to this post by nomi-6

At 07:11 AM 10/11/2006, Nomi wrote:

>I have a file with 87,000 cases. Each case (line) contains data about
>a car accident. If the person had more than one car accident then he
>has several entries. I'd like to combine all these entries so that at
>the end I will have a line for each person with all the different
>variables (and no duplicate entries).

At 10:47 AM 10/11/2006, Gene Maguin wrote:

>You want to go from multiple cases per id to one case per id. This is
>going from 'long' to 'wide' or 'long-to-wide'. In the syntax
>reference, read up on the Casestovars command.

That's correct. You may also find you can click up the command from the
menus; see menu entry "Data>Restructure".

HOWEVER, think at least twice, before you restructure long to wide. Go
over what you plan to do with your data, and how best to do it. You may
want to post, asking about that. Many, many operations are easier with
data in 'long' form, one record per incident and multiple per person,
particularly when there's an indefinite number of incidents per person.
At 11:53 AM 10/11/2006, Melissa Ives wrote:

>Use AGGREGATE if you prefer to summarize across accidents per person.

That is quite right; you'll almost certainly need to do some of it; and
it's one operation much easier with Long organization.