I really don't understand 'Lag'

I really don't understand 'Lag'

Paul Frankel
Friends,

I'm an SPSS novice and really don't understand the 'Lag' functions.  Even
when I rely on the SPSS descriptions, they still don't make a lot of sense
to me, e.g., "LAG(arg,n)  The value of the variable n cases before. The
first argument is a variable. The second argument, if specified, is a
constant and must be a positive integer."

Can you offer a simpler explanation of the Lag functions?  I know they are
very useful (e.g., for finding duplicates, which is very important in my
field of child welfare), but I just don't get it!

Any advice you could offer would be much appreciated.

Paul

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: I really don't understand 'Lag'

Peck, Jon
As a simplified explanation of lag: lag(x) is the value of x in the preceding case.  That said, if you are hunting for duplicates, I would recommend that you use Data/Identify Duplicate Cases.  In most situations, you can define what you mean by a duplicate and find them much more easily than by handcrafting syntax.  And, of course, you can keep the syntax the dialog generates for future use or tweaking if needed.
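
To make that definition concrete, here is a minimal sketch. The variable names x, prev1, and prev2 and the four data values are made up for illustration.

DATA LIST FREE / x.
BEGIN DATA
10 20 30 40
END DATA.

* prev1 is x from the preceding case and prev2 is x from two cases back.
COMPUTE prev1 = LAG(x).
COMPUTE prev2 = LAG(x, 2).
LIST.

With this data, prev1 should be system-missing for the first case and then 10, 20, 30, while prev2 should be system-missing for the first two cases and then 10, 20. The first case has no preceding case to look at, which is why LAG returns a missing value there.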

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Paul Frankel
Sent: Wednesday, October 08, 2008 10:39 AM
To: [hidden email]
Subject: [SPSSX-L] I really don't understand 'Lag'

Re: I really don't understand 'Lag'

Catherine Kubitschek
In reply to this post by Paul Frankel
Paul,

My suggestion is to start with a very simple data set and some very simple
computes.  Guess what you think the answer will be, then run the code and
see if you're right.  If not, try to figure out why you were wrong.  Then,
change the data or change the code and try again.  Eventually, there will
be a flash above your head.  If necessary, repeat daily until it sticks.

You might start with the following code:

data list /a 1 b 3 c 5 d 7 .
begin data
1 3 1 8
3 2 3 7
5 7 2 6
7 8 5 5
end data .

compute e=lag(d) .
list var=all .
if (b=lag(a) + 1) f=lag(d) .
list var=all .
if ($casenum > 2) g=lag(a,2) .
list var=all .
if ($casenum > 1) h=lag(a)+1 .
list var=all .
compute c=lag(c) .
list var=all .

/* i=lag(a,c) note that this is illegal */ .
/* j=lag(a+1,1) note that this is illegal */ .

Catherine Kubitschek         ([hidden email])
Sr. Programmer/Analyst        574/631-3550
Institutional Research
210 Flanner Hall
University of Notre Dame
Notre Dame, IN  46556-5611

Re: I really don't understand 'Lag'

Raffe, Sydelle, SSA
In reply to this post by Peck, Jon
I was excited about Data/Identify Duplicate Cases until I realized that the result is only 0 or 1. If you have more than a duplicate, e.g., a triplicate or more, that's not exactly what I want.

I have always used the following:

        COMPUTE SEQ=1.
        IF (ssn=LAG(ssn)) SEQ=LAG(SEQ)+1.

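When you look at a frequency distribution of SEQ, you will see how many records share the same value of, in this case, ssn. (The file has to be sorted by ssn first, so that matching records sit next to each other.) If nothing else about the records matters, then keep only the first record for each ssn: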

        SELECT IF (SEQ=1).
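
Put together, the steps above might look like the following sketch. It assumes a numeric variable named ssn; the six sample values are invented, and the SORT CASES step is spelled out because LAG only compares a record with the one directly above it.

DATA LIST FREE / ssn.
BEGIN DATA
111 222 111 333 222 111
END DATA.

* Bring records with the same ssn next to each other.
SORT CASES BY ssn.

* Number the records within each ssn group, starting at 1.
COMPUTE SEQ = 1.
IF (ssn = LAG(ssn)) SEQ = LAG(SEQ) + 1.

* FREQUENCIES reads the data, so SEQ is fully computed before SELECT IF runs.
FREQUENCIES VARIABLES = SEQ.

* Keep only the first record for each ssn.
SELECT IF (SEQ = 1).
LIST.

With these six records the frequency table for SEQ should show three 1s, two 2s, and one 3: three distinct ssn values, two of which occur at least twice, and one of which occurs three times. The final listing should contain one record each for 111, 222, and 333.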

Another way to check the results of your syntax is to use LIST VARIABLES. By listing all the input variables and your result variable, you can see how your syntax behaved in a variety of situations, even when it is looking n records above.

I feel much the same about rtrim and lpad etc.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Peck, Jon
Sent: Wednesday, October 08, 2008 9:58 AM
To: [hidden email]
Subject: Re: I really don't understand 'Lag'


Re: I really don't understand 'Lag'

Peck, Jon
Identify Duplicate Cases (IDC) has facilities for dealing with triplicates, etc.

First, all the duplicates get marked as such, so you can tabulate those as needed.  In fact, IDC by default does tabulate these.
Second, checking the box "sequential count of matching case..." tells you how many duplicates were encountered.
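
For anyone who prefers syntax, duplicates can also be flagged with MATCH FILES and its FIRST and LAST subcommands. The following is a hand-written sketch under stated assumptions, not the syntax the dialog pastes: ssn, first_rec, last_rec, and is_dup are illustrative names, and the data values are invented.

DATA LIST FREE / ssn.
BEGIN DATA
111 222 111 333 222 111
END DATA.

* Matching records must be adjacent before MATCH FILES can flag them.
SORT CASES BY ssn.
MATCH FILES
  /FILE=*
  /BY ssn
  /FIRST=first_rec
  /LAST=last_rec.

* A record that is both first and last in its ssn group has no duplicates.
COMPUTE is_dup = NOT (first_rec AND last_rec).
FREQUENCIES VARIABLES = is_dup.

Here is_dup is 1 for every record whose ssn occurs more than once, whether the group is a pair, a triplicate, or larger, and 0 for records with a unique ssn.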

HTH,
Jon Peck
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Raffe, Sydelle, SSA
Sent: Wednesday, October 08, 2008 11:24 AM
To: [hidden email]
Subject: Re: [SPSSX-L] I really don't understand 'Lag'
