|
I need to select cases that have a matching ID, term & sequence. Using
the lag function I can create a flag to denote whether the preceding case is a match on ID, term & seq. But how would I select both the flagged case & the preceding case?

sort cases by id term seq.
compute flag1=0.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.

ID, term, seq, flag1
1072 200810 1 0
1074 200910 1 0
1075 200710 1 0
1075 200720 1 0 (keep)
1075 200720 1 1 (keep)
1075 200910 1 0
1226 200720 1 0
1226 200720 2 0
1226 200720 3 0 (keep)
1226 200720 3 1 (keep)
1226 200730 1 0 (keep)
1226 200730 1 1 (keep)

After my SELECT IF I should end up with:

ID, term, seq, flag1
1075 200720 1 0
1075 200720 1 1
1226 200720 3 0
1226 200720 3 1
1226 200730 1 0
1226 200730 1 1

Thanks in advance,

David

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
How about something like

AGGREGATE
 /OUTFILE=* MODE=ADDVARIABLES
 /BREAK=ID term seq
 /Count=N.

This will create a new variable (Count) in your dataset, holding the number of cases that share each ID/term/seq combination, and then you can select cases with Count > 1.

Garry Gelade
Business Analytic Ltd

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Wright
Sent: 05 February 2010 13:24
To: [hidden email]
Subject: sel if using lag & previous case |
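Spelled out on the sample data from the original post, the AGGREGATE suggestion might look like the sketch below. The DATA LIST block and the LIST command are only there to make the example self-contained, and Count is the variable name used in the reply above. As far as I know, MODE=ADDVARIABLES does not need the file to be sorted first.

```spss
* Sketch only. Reads the sample data from the original post.
DATA LIST LIST / id (F4) term (F6) seq (F1).
BEGIN DATA
1072 200810 1
1074 200910 1
1075 200710 1
1075 200720 1
1075 200720 1
1075 200910 1
1226 200720 1
1226 200720 2
1226 200720 3
1226 200720 3
1226 200730 1
1226 200730 1
END DATA.

* Add Count, the number of cases sharing each id, term and seq combination.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id term seq
  /Count=N.

* Keep only the cases whose combination occurs more than once.
SELECT IF (Count GT 1).
LIST.
```

On this data the final LIST should show the six cases marked (keep) in the original post. Because Count is computed per group, a combination that occurs three or more times is kept in full as well, and no lag flag is needed.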
|
In reply to this post by wsu_wright
If I understand you, you want to select only those cases
that are duplicates on id, term, and seq. The syntax below works as follows.

The first two lines are copies of your syntax. The third line reverses the sort, with flag1 as the final key, so that the case with flag1=1 comes before the duplicate case with flag1=0. The fourth line tests the lagged value of flag1: if lag(flag1)=1, then flag1 for the current case is also set to 1. The fifth line selects the duplicate cases, which now all have flag1=1.

If for some reason you need flag1 to keep its 0 and 1 values, then compute a new variable flag2 that identifies the duplicate cases with flag1=0, and use
select if (flag1 eq 1 or flag2 eq 1).
to select the duplicate pairs.

**.
sort cases by id term seq.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.
sort cases by id (d) term (d) seq (d) flag1 (d).
if (lag(flag1) eq 1) flag1=1.
select if (flag1 eq 1).

-Mike Palij
New York University
[hidden email]

----- Original Message -----
From: "David Wright" <[hidden email]>
To: <[hidden email]>
Sent: Friday, February 05, 2010 8:24 AM
Subject: sel if using lag & previous case |
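The flag2 variant described above could be sketched roughly as follows. It assumes the active dataset already holds id, term and seq, that flag1 is initialized to 0 as in the original post, and that flag1 is included as the last sort key so that the flagged case reliably precedes its unflagged partner after the reverse sort. These details are assumptions of this sketch, not something the reply spells out.

```spss
* Sketch only. Assumes id, term and seq exist in the active dataset.
sort cases by id term seq.
compute flag1=0.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.

* Reverse the order so each flagged case comes directly before its partner.
sort cases by id (d) term (d) seq (d) flag1 (d).

* flag2 marks the unflagged partner and leaves flag1 untouched.
compute flag2=0.
if (lag(flag1) eq 1 and flag1 eq 0) flag2=1.

* Keep both members of each duplicate group.
select if (flag1 eq 1 or flag2 eq 1).

* Restore the original order for listing.
sort cases by id term seq flag1.
list.
```

The selection uses "or" because flag1 and flag2 are never both 1 on the same case: flag1 marks the later duplicates and flag2 marks the first case of the group.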
|
In reply to this post by wsu_wright
Try this example syntax. Open a clean instance of SPSS. Paste the
syntax into a new syntax window. Run it. In more complex situations you might try the GUI "find duplicate cases" and paste the syntax, and use the variables it creates.

data list list/ ID (f4) term (f6) seq (f1) flag1 (f1) wanted(a6).
begin data
1072 200810 1 0
1074 200910 1 0
1075 200710 1 0
1075 200720 1 0 (keep)
1075 200720 1 1 (keep)
1075 200910 1 0
1226 200720 1 0
1226 200720 2 0
1226 200720 3 0 (keep)
1226 200720 3 1 (keep)
1226 200730 1 0 (keep)
1226 200730 1 1 (keep)
end data.
list.
sort cases by id term seq (a) flag1(d).
select if flag1 eq 1 or lag(flag1) eq 1.
list.
sort cases by id term seq (a) flag1(a).
list.

Art Kendall
Social Research Consultants

On 2/5/2010 8:24 AM, David Wright wrote:
> sort cases by id term seq.
> compute flag1=0.
> if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1. |
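For the "find duplicate cases" route mentioned above, the same idea can be written by hand with MATCH FILES and its FIRST and LAST subcommands. The sketch below is not the exact syntax the dialog pastes. The variable names isFirst and isLast are made up for illustration, and the active dataset is assumed to hold id, term and seq.

```spss
* Sketch only. isFirst and isLast are hypothetical variable names.
sort cases by id term seq.

* Mark the first and the last case within each id, term and seq group.
match files /file=* /by id term seq /first=isFirst /last=isLast.

* A case that is both first and last in its group has no duplicate, so drop it.
select if not (isFirst eq 1 and isLast eq 1).
list.
```

A case that is both the first and the last of its group is the only one in that group, so everything that survives the SELECT IF belongs to a group of two or more.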
