SPSSX Discussion

One to many merge of large dataset

jredshaw

One to many merge of large dataset

I'm a relatively new SPSS user. I am trying to add several variables from one dataset with only one entry per ID number (https://www.dropbox.com/s/1jzaxcmjswdz7w0/Veh%20Data%20Reduced.sav) to another, larger dataset that has several entries per ID number (https://www.dropbox.com/s/lguvj3n6ceezkzs/Inst%20Data.sav).

The ID number is TSTNO. I have been trying variations on the following code, thinking that this should be a one-to-many merge, but I can't seem to get it to work. I keep getting error 5132 (I think; sorry, I don't actually have it in front of me): table file does not have a unique identifier.

GET
FILE='/Users/Jeff/Desktop/Veh Data Reduced.sav'.
DATASET NAME DataSet4 WINDOW=FRONT.
SORT CASES BY TSTNO.

GET
FILE='/Users/Jeff/Desktop/Inst Data.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
SORT CASES BY TSTNO.

Match files file='/Users/Jeff/Desktop/Inst Data.sav'
/TABLE='/Users/Jeff/Desktop/Veh Data Reduced.sav'
/BY TSTNO.

Any thoughts? I have tried it both ways and get the same error. Any help would be greatly appreciated. Thanks.

David Marso

Re: One to many merge of large dataset

Administrator

"table file does not have a unique identifier"
That means that you have one or more duplicate keys in the file you designate as TABLE.
After the sort, try the following on the TABLE file; any case with @UNIQUE@ = 0 belongs to a duplicated TSTNO:

MATCH FILES / FILE * / FIRST=@TOP@/LAST=@BOT@ / BY TSTNO .
COMPUTE @UNIQUE@=@TOP@ AND @BOT@.
FREQ @UNIQUE@.
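
To see which TSTNO values are duplicated, rather than only how many cases are affected, AGGREGATE can attach a per-key row count. This is a minimal sketch, assuming the TABLE file is still open under the name DataSet4 as in the original post; n_per_id is an illustrative variable name that does not exist in the posted files.

```spss
* Sketch: count the rows per TSTNO in the table dataset.
* n_per_id is an illustrative name, not a variable in the posted files.
DATASET ACTIVATE DataSet4.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=TSTNO
  /n_per_id=N.
* List the first 20 cases whose key occurs more than once.
TEMPORARY.
SELECT IF (n_per_id > 1).
LIST VARIABLES=TSTNO n_per_id /CASES=FROM 1 TO 20.
```

If the LIST output is empty, every TSTNO occurs exactly once and the file is usable as a TABLE keyed on TSTNO alone.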

David Marso

Re: One to many merge of large dataset

Administrator

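In addition, it *MAY* be attempting to use the *UNSORTED* files residing on disk, since you are using the file path rather than the DATASET NAME. SORT CASES only sorts the open dataset; the .sav file on disk stays in its original order unless you save it.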
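
A minimal sketch of the merge written against the open, sorted datasets rather than the files on disk. DataSet1 and DataSet4 are the dataset names assigned in the original post; everything else is standard MATCH FILES syntax.

```spss
* Sketch: merge the sorted, open datasets instead of the files on disk.
* DataSet1 and DataSet4 are the dataset names from the original post.
DATASET ACTIVATE DataSet4.
SORT CASES BY TSTNO.
DATASET ACTIVATE DataSet1.
SORT CASES BY TSTNO.
MATCH FILES
  /FILE=*
  /TABLE=DataSet4
  /BY TSTNO.
EXECUTE.
```

This only addresses the sorting issue. If the TABLE dataset still contains duplicate TSTNO values, the duplicate-key problem remains.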

David Marso

Re: One to many merge of large dataset

Administrator

In reply to this post by jredshaw

Take a casual look at the files you have in your Dropbox: almost ALL of the TSTNO values in the "TABLE" file are duplicated! It looks like TSTNO in combination with VEHNO is unique, so maybe you need to use VEHNO as an additional KEY in the lookup. OTOH, I have no idea what you want the file to look like, so that is just my guess.
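
If the composite key is what is wanted, the lookup might be sketched as below. This assumes VEHNO also exists in the larger file with the same type and meaning, which has not been confirmed in this thread; DataSet1 and DataSet4 are the dataset names from the original post.

```spss
* Sketch: use TSTNO plus VEHNO as the lookup key.
* Assumes VEHNO is present in both datasets, which is unconfirmed.
DATASET ACTIVATE DataSet4.
SORT CASES BY TSTNO VEHNO.
DATASET ACTIVATE DataSet1.
SORT CASES BY TSTNO VEHNO.
MATCH FILES
  /FILE=*
  /TABLE=DataSet4
  /BY TSTNO VEHNO.
EXECUTE.
```

If VEHNO is not in the larger file, the TABLE file would instead have to be reduced to one row per TSTNO before a lookup on TSTNO alone can work.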
