One to many merge of large dataset

jredshaw
I'm a relatively new SPSS user. I am trying to add several variables from one dataset, which has only one entry per ID number (https://www.dropbox.com/s/1jzaxcmjswdz7w0/Veh%20Data%20Reduced.sav), to a larger dataset that has several entries per ID number (https://www.dropbox.com/s/lguvj3n6ceezkzs/Inst%20Data.sav).

The ID number is TSTNO. I have been trying variations on the following code, on the assumption that this should be a one-to-many merge, but I can't get it to work. I keep getting error 5132 (I think; sorry, I don't have it in front of me): table file does not have a unique identifier.

GET
  FILE='/Users/Jeff/Desktop/Veh Data Reduced.sav'.
DATASET NAME DataSet4 WINDOW=FRONT.
SORT CASES BY TSTNO.

GET
  FILE='/Users/Jeff/Desktop/Inst Data.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
SORT CASES BY TSTNO.

Match files file='/Users/Jeff/Desktop/Inst Data.sav'
/TABLE='/Users/Jeff/Desktop/Veh Data Reduced.sav'
/BY TSTNO.

Any thoughts? I have tried it both ways and get the same error. Any help would be greatly appreciated. Thanks.

Re: One to many merge of large dataset

David Marso
Administrator
"table file does not have a unique identifier "..
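That means the file you designate as TABLE contains one or more duplicate keys.
After the sort, with the TABLE file as the active dataset, try:

MATCH FILES /FILE=* /BY TSTNO /FIRST=@TOP@ /LAST=@BOT@.
COMPUTE @UNIQUE@ = @TOP@ AND @BOT@.
FREQUENCIES @UNIQUE@.

Any case with @UNIQUE@ = 0 shares its TSTNO with at least one other case.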
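
If it helps to see which TSTNO values are duplicated rather than just how many cases are affected, a minimal sketch along these lines might work. It assumes the TABLE file is the active dataset; N_PER_KEY is a hypothetical variable name introduced only for this illustration.

```spss
* Sketch: count the cases that share each TSTNO in the active dataset.
* N_PER_KEY is a hypothetical name, not something from the original files.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=TSTNO
  /N_PER_KEY=N.

* Temporarily restrict to duplicated keys and list them.
TEMPORARY.
SELECT IF (N_PER_KEY > 1).
LIST VARIABLES=TSTNO N_PER_KEY.
```

Because the selection is TEMPORARY, it applies only to the LIST command and the full file stays intact afterwards.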


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: One to many merge of large dataset

David Marso
Administrator
In addition, it *MAY* be attempting to use the *UNSORTED* files residing on disk, since your MATCH FILES refers to the file paths rather than to the DATASET NAMEs of the sorted datasets.
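
As a rough sketch of what referring to the sorted datasets could look like, assuming both files have already been opened and named DataSet1 and DataSet4 as in the original post:

```spss
* Sketch: sort each open dataset, then merge by dataset name, not file path.
DATASET ACTIVATE DataSet4.
SORT CASES BY TSTNO.

DATASET ACTIVATE DataSet1.
SORT CASES BY TSTNO.

* DataSet1 is the active many-per-ID file, DataSet4 is the lookup table.
MATCH FILES /FILE=*
  /TABLE='DataSet4'
  /BY TSTNO.
EXECUTE.
```

This only addresses the sorting issue. If TSTNO is duplicated in the TABLE dataset, the same duplicate-key error would still be expected.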

Re: One to many merge of large dataset

David Marso
Administrator
In reply to this post by jredshaw
A casual look at the files in your Dropbox shows that almost ALL of the TSTNO values in the "TABLE" file are duplicated! It looks like TSTNO in combination with VEHNO is unique, so maybe you need to use VEHNO as an additional KEY in the lookup. OTOH, I have no idea what you want the resulting file to look like, so that is just my guess.
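
A sketch of the two-key lookup, for illustration only. It assumes VEHNO is also present in the larger file, which has not been confirmed here, and it reuses the dataset names DataSet1 and DataSet4 from the original post.

```spss
* Sketch only: assumes VEHNO exists in BOTH datasets (not confirmed).
DATASET ACTIVATE DataSet4.
SORT CASES BY TSTNO VEHNO.

DATASET ACTIVATE DataSet1.
SORT CASES BY TSTNO VEHNO.

* Both keys together must identify each case in the TABLE dataset.
MATCH FILES /FILE=*
  /TABLE='DataSet4'
  /BY TSTNO VEHNO.
EXECUTE.
```

If the larger file has no VEHNO, this will not run as written, and the TABLE file would need to be reduced to one case per TSTNO in some other way before the merge.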
