combining files

combining files

Keval Khichadia
Hi,
I need to combine 2 large files similar to below. If there is a record in 2006 and 2007 I need to only keep the record from 2007. However, I need to keep all unique records from 2006 and 2007. Any help on how to do this through syntax?
data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.
data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.
Thanks in advance,
Keval



=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: combining files

Albert-Jan Roskam
Hi,

Perhaps you could try the syntax below. Don't forget
to save your work first; it's untested.

Cheers!!
Albert-Jan

data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.
dataset name a.

data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.
dataset name b.

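* Stack the two datasets, then sort so that each id's latest year comes last.
add files / file = a / file = b.
sort cases by id year.
* Collapse to one record per id, taking the last nonmissing year and status.
aggregate outfile = *
  / presorted
  / break = id
  / year = last (year)
  / status = last (status).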
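
A note on the AGGREGATE approach above: LAST returns the last nonmissing value in each break group, and the aggregated file keeps only the break and summary variables, so a file with many variables would need each one listed. A possible alternative, sketched here and not run, flags the last case per id and keeps whole records. It assumes the datasets a and b defined above; the flag name newest is made up for illustration.

```spss
* Sketch only, assuming datasets a and b exist as defined above.
ADD FILES /FILE=a /FILE=b.
SORT CASES BY id year.
* Flag the last case within each id, i.e. the latest year after sorting.
MATCH FILES /FILE=* /BY id /LAST=newest.
* Keep only the flagged cases.
SELECT IF newest = 1.
EXECUTE.
```

Because whole cases are selected rather than summarised, year and status always come from the same record.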






Re: combining files

Richard Ristow
At 04:30 PM 5/3/2008, Keval Khichadia wrote:

>I need to combine 2 large files similar to below. If there is a
>record in 2006 and 2007 I need to only keep the record from 2007.
>However, I need to keep all unique records from 2006 and 2007.


I'd try a slightly different approach than Albert-Jan did. See if you
like the following. It's tested (WRR:not saved separately):


ADD FILES
   /FILE=Rcds2006
   /FILE=Rcds2007
   /BY   id year.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |04-MAY-2008 22:32:57       |
|-----------------------------|---------------------------|
id year status

01 2006    1
01 2007    1
02 2006    0
02 2007    0
03 2006    1
03 2007    1
04 2007    0
05 2006    0
05 2007    1
06 2006    1
06 2007    0
07 2006    1

Number of cases read:  12    Number of cases listed:  12


AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ID
   /Latest 'Latest year with a record for this ID' = MAX(year).

SELECT IF year = Latest.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |04-MAY-2008 22:35:08       |
|-----------------------------|---------------------------|
id year status Latest

01 2007    1    2007
02 2007    0    2007
03 2007    1    2007
04 2007    0    2007
05 2007    1    2007
06 2007    0    2007
07 2006    1    2006

Number of cases read:  7    Number of cases listed:  7
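
The listing above still carries the helper variable Latest. If it is not wanted in the final file, one way to drop it is sketched here; this is a small addition and was not part of the tested run above.

```spss
* Sketch only: drop the helper variable after SELECT IF has been applied.
MATCH FILES /FILE=* /DROP=Latest.
EXECUTE.
```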
=============================
APPENDIX: Test data, and code
=============================
data list free/ id year status.
begin data
1 2006 1
2 2006 0
3 2006 1
5 2006 0
6 2006 1
7 2006 1
end data.
DATASET NAME     Rcds2006 WINDOW=FRONT.
FORMATS id(N2) year (F4) status (F2).
data list free / id year status.
begin data
1 2007 1
2 2007 0
3 2007 1
4 2007 0
5 2007 1
6 2007 0
end data.
FORMATS id(N2) year (F4) status (F2).
DATASET NAME     Rcds2007 WINDOW=FRONT.

ADD FILES
   /FILE=Rcds2006
   /FILE=Rcds2007
   /BY   id year.

LIST.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ID
   /Latest 'Latest year with a record for this ID' = MAX(year).

SELECT IF year = Latest.
LIST.
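
Another possibility, offered as an untested sketch: when each file has at most one record per id, MATCH FILES with the 2007 file named first should give the same result in one step, because for variables present in both files the values are taken from the first file that contains the case. This assumes the datasets Rcds2006 and Rcds2007 from the appendix, both sorted by id. It is worth checking the LIST output against the seven cases shown earlier.

```spss
* Sketch only, assuming Rcds2006 and Rcds2007 exist as in the appendix.
* Both files must be sorted by id, with at most one record per id in each.
MATCH FILES
   /FILE=Rcds2007
   /FILE=Rcds2006
   /BY   id.
LIST.
```

Unlike the AGGREGATE and SELECT IF route, this relies on file order rather than on the year values, so it only suits the case where the preferred file is known in advance.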
