using lag function to keep records based on certain conditions

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

using lag function to keep records based on certain conditions

thara vardhan-2
Dear List Members

Can anyone teach me how to use the lag function to select records in my
data file based on certain conditions please?

Situation1:

Erefnum   Personcni     Newifc  newweapon

1234          8888                 5                21
1234          8888                 5                21

I need to keep record 1 only in my data file.

Situation2:

Erefnum   Personcni     Newifc  newweapon

5678          9999                 1                7
5678          9999                 1                11

In this case I need to keep the record with newweapon code 11 in my data
file.

I am using SPSS version17.0 since last week. Using the menu I tried to use
transform/shift values function but was not able to do it.

Any help is much appreciated.

thanks
regards
Thara Vardhan
Senior Statistician
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The information contained in this email is intended for the named recipient(s)
only. It may contain private, confidential, copyright or legally privileged
information.  If you are not the intended recipient or you have received this
email by mistake, please reply to the author and delete this email immediately.
You must not copy, print, forward or distribute this email, nor place reliance
on its contents. This email and any attachment have been virus scanned. However,
you are requested to conduct a virus scan as well.  No liability is accepted
for any loss or damage resulting from a computer virus, or resulting from a delay
or defect in transmission of this email or any attached file. This email does not
constitute a representation by the NSW Police Force unless the author is legally
entitled to do so.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: using lag function to keep records based on certain conditions

Simon Palmer-4
Hi Thara,

There may be some clever new way of doing this, but these have worked
since before SPSS moved to windows:

> Situation1:
>
> Erefnum   Personcni     Newifc  newweapon
>
> 1234          8888                 5                21
> 1234          8888                 5                21
>
> I need to keep record 1 only in my data file.

I'm assuming you're identifying this by the erefnum, and that you want
to keep the first record with each erefnum only.

select if erefnum ne lag(erefnum).
execute.

If your keep criterion is more complex you might need to do something
like:

select if erefnum = lag(erefnum) and personcni = lag(personcni) and
newifc = lag(newifc) and newweapon = lag(newweapon).
execute.

An alternative is to use aggregate:

agg outfile * /break erefnum /personcni = first(personcni) /newifc =
first(newifc) / newweapon= first(newweapon).
execute.

>
>
> Situation2:
>
> Erefnum   Personcni     Newifc  newweapon
>
> 5678          9999                 1                7
> 5678          9999                 1                11
>
> In this case I need to keep the record with newweapon code 11 in my
> data
> file.

Assuming you want to keep the last record for a particular erefnum:

sort cases erefnum.
compute recnum = $casenum.
compute keepnext = 0.
if erefnum ne lag(erefnum) keepnext = 1.
sort cases recnum (d).
select if lag(keepnext) = 1.
sort cases recnum (a).

In this case aggregate would be much easier:

agg outfile * /break erefnum /personcni = last(personcni) /newifc =
last(newifc) / newweapon= last(newweapon).
execute.

Cheers,
Simon

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: using lag function to keep records based on certain conditions

Richard Ristow
In reply to this post by thara vardhan-2
At 09:26 PM 11/30/2008, Thara Vardhan wrote:

>Can anyone teach me how to use the lag function
>to select records in my data file based on certain conditions please?

Did you ever get a solution to this? I'm not
attempting one; I don't understand just what your
selection logic is meant to be.

Selecting records using LAG is tricky, and
sometimes requires using EXECUTE. See example
below, and section "Use EXECUTE Sparingly" in any
edition of "SPSSĀ® Programming and Data Management".

|-----------------------------|---------------------------|
|Output Created               |06-DEC-2008 15:48:22       |
|-----------------------------|---------------------------|
Group Letter

    1  A
    1  B
    1  C
    2  D
    3  E
    3  F
    3  G
    3  H
    4  I

Number of cases read:  9    Number of cases listed:  9


SELECT IF Group NE LAG(Group).
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |06-DEC-2008 15:48:58       |
|-----------------------------|---------------------------|

Number of cases read:  0    Number of cases listed:  0

=============================
APPENDIX: Test data, and code
=============================
DATA LIST FIXED /
    Group  03-04
    Letter 06    (A).
*---|---10----|---20.
BEGIN DATA
    1 A
    1 B
    1 C
    2 D
    3 E
    3 F
    3 G
    3 H
    4 I
END DATA.

LIST.

SELECT IF Group NE LAG(Group).
LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD