Hi,
I need to combine 2 large files similar to below. If there is a record in 2006 and 2007 I need to only keep the record from 2007. However, I need to keep all unique records from 2006 and 2007. Any help on how to do this through syntax?

data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.

data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.

Thanks in advance,
Keval

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Hi,
Perhaps you could try the syntax below. Don't forget to save your work first; it's untested.

Cheers!!
Albert-Jan

data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.
dataset name a.

data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.
dataset name b.

add files / file = a / file = b.
sort cases by id year.
aggregate outfile = *
  / presorted
  / break = id
  / year = last (year)
  / status = last (status).
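For anyone who wants to check the expected result outside SPSS, the sort-then-keep-last idea can be sketched in plain Python. This is only an illustration of the logic, not SPSS syntax and not a substitute for testing the syntax above; the names `rcds_2006`, `rcds_2007` and `keep_last_per_id` are made up for this sketch, and the data are the sample records from the question.

```python
# Sample records from the question, as (id, year, status) tuples.
rcds_2006 = [(1, 2006, 1), (2, 2006, 0), (3, 2006, 1),
             (5, 2006, 0), (6, 2006, 1), (7, 2006, 1)]
rcds_2007 = [(1, 2007, 1), (2, 2007, 0), (3, 2007, 1),
             (4, 2007, 0), (5, 2007, 1), (6, 2007, 0)]


def keep_last_per_id(*files):
    """Stack the files, sort by id and year, keep the last record per id."""
    stacked = [rec for f in files for rec in f]
    stacked.sort(key=lambda rec: (rec[0], rec[1]))
    last = {}
    for rec in stacked:
        last[rec[0]] = rec  # a later year overwrites an earlier one
    return [last[i] for i in sorted(last)]


combined = keep_last_per_id(rcds_2006, rcds_2007)
for rec in combined:
    print(*rec)
```

One difference worth checking against your own data: the sketch keeps the last record as a whole, whereas AGGREGATE's LAST function works variable by variable and, by default, returns the last nonmissing value. If status can be missing in 2007, the two may not agree.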
In reply to this post by Keval Khichadia
At 04:30 PM 5/3/2008, Keval Khichadia wrote:
>I need to combine 2 large files similar to below. If there is a
>record in 2006 and 2007 I need to only keep the record from 2007.
>However, I need to keep all unique records from 2006 and 2007.

I'd try a little different approach than Albert-Jan did. See if you like the following. It's tested (WRR:not saved separately):

ADD FILES
   /FILE=Rcds2006
   /FILE=Rcds2007
   /BY id year.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |04-MAY-2008 22:32:57       |
|-----------------------------|---------------------------|
id year status

01 2006    1
01 2007    1
02 2006    0
02 2007    0
03 2006    1
03 2007    1
04 2007    0
05 2006    0
05 2007    1
06 2006    1
06 2007    0
07 2006    1

Number of cases read:  12    Number of cases listed:  12

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ID
   /Latest 'Latest year with a record for this ID' = MAX(year).
SELECT IF year = Latest.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |04-MAY-2008 22:35:08       |
|-----------------------------|---------------------------|
id year status Latest

01 2007    1    2007
02 2007    0    2007
03 2007    1    2007
04 2007    0    2007
05 2007    1    2007
06 2007    0    2007
07 2006    1    2006

Number of cases read:  7    Number of cases listed:  7

=============================
APPENDIX: Test data, and code
=============================
data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.
DATASET NAME Rcds2006 WINDOW=FRONT.
FORMATS id(N2) year (F4) status (F2).

data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.
FORMATS id(N2) year (F4) status (F2).
DATASET NAME Rcds2007 WINDOW=FRONT.

ADD FILES
   /FILE=Rcds2006
   /FILE=Rcds2007
   /BY id year.
LIST.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ID
   /Latest 'Latest year with a record for this ID' = MAX(year).
SELECT IF year = Latest.
LIST.
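The same two steps (attach the latest year to every record, then select the records whose year equals it) can be sketched in plain Python as a cross-check of the listing above. This is an illustration only, not SPSS syntax; `stacked` and `select_latest_year` are names invented for this sketch, and `stacked` simply retypes the 12 interleaved cases from the first LIST.

```python
# The 12 interleaved cases from the first LIST, as (id, year, status).
stacked = [
    (1, 2006, 1), (1, 2007, 1), (2, 2006, 0), (2, 2007, 0),
    (3, 2006, 1), (3, 2007, 1), (4, 2007, 0), (5, 2006, 0),
    (5, 2007, 1), (6, 2006, 1), (6, 2007, 0), (7, 2006, 1),
]


def select_latest_year(records):
    """Keep every record whose year is the latest year seen for its id."""
    latest = {}
    for rec_id, year, _status in records:
        # Running maximum of year within each id.
        latest[rec_id] = max(year, latest.get(rec_id, year))
    # Keep the records that match their id's latest year, in input order.
    return [rec for rec in records if rec[1] == latest[rec[0]]]


for rec in select_latest_year(stacked):
    print(*rec)
```

Note that this selection keeps every record from an id's latest year. If an id could have two records in the same year, both would survive, so a further rule would be needed to pick one.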
