sel if using lag & previous case

wsu_wright
I need to select cases that share the same ID, term & sequence.  Using
the lag function I can create a flag to denote whether the preceding
case is a match on ID, term & seq.  But how would I select both the
flagged case & the case that precedes it?

sort cases by id term seq.
compute flag1=0.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.

ID, term, seq, flag1
1072 200810 1 0
1074 200910 1 0
1075 200710 1 0
1075 200720 1 0 (keep)
1075 200720 1 1 (keep)
1075 200910 1 0
1226 200720 1 0
1226 200720 2 0
1226 200720 3 0 (keep)
1226 200720 3 1 (keep)
1226 200730 1 0 (keep)
1226 200730 1 1 (keep)


After my SELECT IF, I should end up with:

ID, term, seq, flag1
1075 200720 1 0
1075 200720 1 1
1226 200720 3 0
1226 200720 3 1
1226 200730 1 0
1226 200730 1 1


Thanks in advance,

David

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: sel if using lag & previous case

Garry Gelade
How about something like

AGGREGATE   /OUTFILE=*   MODE=ADDVARIABLES /BREAK=ID term seq
  /Count=N.

This will create a new variable (Count) in your dataset, and then you can select cases with count>1.
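
Put together, the suggestion might look like the sketch below. It assumes the variable names from the original post (ID, term, seq). Count is simply the name chosen for the new variable, and the trailing EXECUTE is only there to force the data pass.

```spss
* Count how many cases share each ID/term/seq combination.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID term seq
  /Count=N.
* Keep only the combinations that occur more than once.
SELECT IF (Count GT 1).
EXECUTE.
```

With the sample data from the original post this should leave the six rows marked (keep). It does not depend on the lag flag at all.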

Garry Gelade
Business Analytic Ltd


Re: sel if using lag & previous case

Mike
In reply to this post by wsu_wright
If I understand you correctly, you want to select only those cases
that belong to a set of duplicates.  Here is what the syntax at the
end of this message does:

The first two commands are taken from your syntax: they sort the file
and set flag1=1 on any case that matches the preceding case.

The third command reverses the sort so that the case with flag1=1
comes before the duplicate case that was not flagged.  Including
flag1 (d) in the sort makes that order explicit.

The fourth command tests the lagged value of flag1.  If the current
case matches the preceding case on id, term and seq, and lag(flag1)=1,
then flag1 for the current case is also set to 1.  The comparison on
id, term and seq keeps the flag from carrying over into the next,
unrelated case.

The fifth command selects the duplicate cases, which now all have
flag1=1.

If for some reason you need flag1 to keep 0 and 1, then compute
a new variable flag2 that identifies the duplicate cases with flag1=0.
Use
select if (flag1 eq 1 or flag2 eq 1).
to select the duplicate pairs.

**.

sort cases by id term seq.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.

sort cases by id (d) term (d) seq (d) flag1 (d).
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) and lag(flag1) eq 1 flag1=1.

select if (flag1 eq 1).
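
The flag2 variant described above could be sketched roughly as follows. This is only an illustration under a few assumptions: flag1 is initialised to 0 as in the original post, flag2 is just a name for the helper variable, and the final sort is there only to restore the original order. The selection uses OR because flag2 is set only on cases where flag1 is 0, so no case carries both flags.

```spss
* Flag every case that matches the preceding case, as in the original post.
sort cases by id term seq.
compute flag1=0.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.

* Reverse the order so that flagged cases precede their unflagged duplicate.
sort cases by id (d) term (d) seq (d) flag1 (d).
compute flag2=0.
if id eq lag(id) and term eq lag(term) and seq eq lag(seq) and flag1 eq 0 flag2=1.

* Keep every member of a duplicate set and restore the original order.
select if (flag1 eq 1 or flag2 eq 1).
sort cases by id term seq flag1.
```

Because flag2 is computed from the key variables rather than from lag(flag1), flag1 itself keeps its original 0 and 1 values.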

-Mike Palij
New York University
[hidden email]


----- Original Message -----
From: "David Wright" <[hidden email]>
To: <[hidden email]>
Sent: Friday, February 05, 2010 8:24 AM
Subject: sel if using lag & previous case



Re: sel if using lag & previous case

Art Kendall
In reply to this post by wsu_wright
Try this example syntax.  Open a clean instance of SPSS. Paste the
syntax into a new syntax window. Run it.
In more complex situations you might try the GUI's "Identify Duplicate
Cases" dialog (on the Data menu), paste the syntax it generates, and
use the variables it creates.

data list list/
    ID (f4) term (f6) seq (f1) flag1 (f1) wanted(a6).
begin data
1072 200810 1 0
1074 200910 1 0
1075 200710 1 0
1075 200720 1 0 (keep)
1075 200720 1 1 (keep)
1075 200910 1 0
1226 200720 1 0
1226 200720 2 0
1226 200720 3 0 (keep)
1226 200720 3 1 (keep)
1226 200730 1 0 (keep)
1226 200730 1 1 (keep)
end data.
list.
sort cases by id term seq (a) flag1(d).
select if flag1 eq 1 or lag(flag1) eq 1.
list.
sort cases by id term seq (a) flag1(a).
list.
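
For reference, the syntax pasted from the duplicate-cases dialog is, as far as I know, built around MATCH FILES with /FIRST and /LAST. A stripped-down sketch of that idea is below. It is not the exact pasted syntax, and isFirst and isLast are names made up for this example.

```spss
* Mark the first and last case within each id/term/seq group.
sort cases by id term seq.
match files
  /file=*
  /by id term seq
  /first=isFirst
  /last=isLast.
* A case that is both first and last in its group has no duplicate.
select if not (isFirst eq 1 and isLast eq 1).
execute.
```

This version does not need the flag1 variable, so it can be run on the file as it stands after the first list command.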

Art Kendall
Social Research Consultants

On 2/5/2010 8:24 AM, David Wright wrote:
> sort cases by id term seq.
> compute flag1=0.
> if id eq lag(id) and term eq lag(term) and seq eq lag(seq) flag1=1.
