Shift values command

Maguin, Eugene
If anyone has used the SHIFT VALUES command, which first appeared, I believe,
in version 17, I'd like some help in understanding the command and, in
particular, one keyword.

The command syntax is

SHIFT VALUES VARIABLE=varname RESULT=varname {LEAD=non-negative integer}
                                             {LAG=non-negative integer }
                                             {SHIFT=integer            }

Where I previously might have written

Compute y=lag(x).

I now write

Shift values variable=x result=y lag=1.

So while the old lag function is really more useful, there never was a lead
function and so the new command is a nice improvement (but a lead function
would really have been better, I think).

My commentary aside, what is the shift keyword for? The menu system says

"Number of cases to shift. Get the value from the nth preceding or
subsequent case, where n is the value specified. The value must be a
non-negative integer."

But how is that different from the lead or lag keywords? I find the
explanation unintelligible. Also, by the way, there is, I think, a
contradiction in the stated parameter range (shift=integer vs.
'value must be a non-negative integer').

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Shift values command

Garry Gelade
Dear Gene

There was a lead function, but it lives, helpfully(!), in the CREATE command
rather than COMPUTE. E.g., you could write:
CREATE newvar = LEAD(oldvar,1).

Garry Gelade
Business Analytic Ltd.

Re: Shift values command

Oliver, Richard
In reply to this post by Maguin, Eugene
You quote the UI help, which does not generate syntax that uses the SHIFT keyword. Here's what the syntax help has to say about the SHIFT keyword:

SHIFT=n. Get the value from a preceding or subsequent case. The value must be an integer that indicates the number of cases. For example, SHIFT=1 returns the value of the existing variable from the case immediately following the current case, and SHIFT=-1 returns the value of the existing variable from the case immediately before the current case.

So SHIFT is just an alternative method for specifying lags and leads, using positive numbers for one and negative numbers for the other. There is even an example in the syntax help that shows that SHIFT=-1 is equivalent to LAG=1.
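
To make the sign convention concrete, here is a toy model in Python. It is only a sketch of the behavior described in the syntax help quoted above, not SPSS code: the function name `shift_values` is made up for illustration, and `None` stands in for a missing value where no source case exists (an assumption of this sketch).

```python
from typing import List, Optional, Sequence


def shift_values(values: Sequence[float], shift: int) -> List[Optional[float]]:
    """Toy model of the SHIFT keyword (hypothetical helper, not an SPSS API).

    A positive shift reads from a following case (a lead); a negative
    shift reads from a preceding case (a lag). None marks cases for
    which the source case does not exist.
    """
    n = len(values)
    result: List[Optional[float]] = []
    for i in range(n):
        j = i + shift  # index of the case the value is taken from
        result.append(values[j] if 0 <= j < n else None)
    return result


x = [10, 20, 30, 40]
lag1 = shift_values(x, -1)   # plays the role of LAG=1, i.e. SHIFT=-1
lead1 = shift_values(x, 1)   # plays the role of LEAD=1, i.e. SHIFT=1
```

In this model a single signed argument covers both directions, which is the point of the keyword: generated syntax does not have to switch between LAG and LEAD.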

The CREATE command has a LEAD function, but CREATE has some memory limitations, and the LAG function has some documented non-standard transformation behavior.

Re: Shift values command

Jon K Peck
In reply to this post by Maguin, Eugene

See below.
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: "Gene Maguin" <[hidden email]>
To: Jon K Peck/Chicago/IBM@IBMUS
Date: 11/05/2009 12:52 PM
Subject: RE: [SPSSX-L] Shift values command





Jon,

Thank you for your reply. I'd like to ask two followup questions.

First, I've always been mystified by the statement that SPSS could look
backwards but not forwards. I'm mystified because I envision the data being
held in a matrix or matrix-type structure, with all transformation commands
being just do loops. My understanding evidently isn't correct. Where is it wrong?


>>>The data are being passed from disk, so until a case is read, future values are still just spinning around on the disk, while values from past cases can be retained as needed.  Logically, transformations could have been implemented with everything relative to the most-future case needed, but that's not how the engine works, and it would be too big a change to be worth it.  If the data were all held in memory, accessing in both directions would be straightforward, but then you are limited to data that can be held in memory.

>>>Traditional time-series statistics and econometrics packages, which use leads and lags a lot, were designed to keep all data in memory.  This wasn't much of a limitation, since time series data volumes were much less.
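
The single-pass idea can be sketched in Python. This is a toy model under stated assumptions, not how the SPSS engine is implemented: the function names are hypothetical, cases arrive one at a time from an iterator, and `None` stands in for a missing value. A lag only needs a small buffer of past values, while a lead forces output to be delayed until the later case has been read.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Optional, Tuple


def stream_with_lag(cases: Iterable[float], lag: int = 1) -> Iterator[Tuple[float, Optional[float]]]:
    """One forward pass; only the last `lag` values are retained."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    history: Deque[float] = deque(maxlen=lag)
    for value in cases:
        # The oldest retained value is exactly `lag` cases back once the buffer is full.
        lagged = history[0] if len(history) == lag else None
        yield value, lagged
        history.append(value)


def stream_with_lead(cases: Iterable[float], lead: int = 1) -> Iterator[Tuple[float, Optional[float]]]:
    """One forward pass; each case is held back until `lead` later cases are read."""
    if lead < 1:
        raise ValueError("lead must be at least 1")
    pending: Deque[float] = deque()
    for value in cases:
        pending.append(value)
        if len(pending) > lead:
            # The newest value read is the lead for the oldest pending case.
            yield pending.popleft(), value
    while pending:
        # No later case exists for the final `lead` cases.
        yield pending.popleft(), None


lagged_pairs = list(stream_with_lag(iter([10, 20, 30])))
lead_pairs = list(stream_with_lead(iter([10, 20, 30])))
```

The second function is the "everything relative to the most-future case needed" design: it works in one pass, but every case is emitted late, which is the kind of change to the processing order that the reply describes as too big to be worth it.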


Second, you saw Garry Gelade's reply. There used to be an ARIMA command in
SPSS that, I guess, has been moved to some other add-on module. From how
CREATE is presented, it has always seemed to me that CREATE was part of
ARIMA. If that is not so, is this a case where CREATE overlaps with the LAG
function, in that CREATE has a lag keyword, and with the new SHIFT VALUES
command, in that both have lead and lag keywords, BUT all three behave
differently in the presence of certain commands such as SELECT IF, SPLIT FILE,
or FILTER, as you pointed out in your initial reply?


>>>CREATE is in the Base.  Because it was born as part of Trends and incorporates various transformations that sometimes need extensive lead and lag values, it can handle both directions, but it does not work well with large datasets.

>>>CREATE was never part of ARIMA, although both were part of Trends. Since both were meant for time series, they handle data differently from the transformation system.

HTH,
Jon

Thank you, Gene Maguin



Of course you can continue to use the LAG function.  SHIFT VALUES is more
general in that it covers leads and lags, but it also has some subtle
differences in behavior that make it work more intuitively.  A Lead function
in the transformation system was not practical to do, because the
transformation system can look backwards through cases but not forwards.

The SHIFT parameter can be positive or negative, so, while it is redundant,
if you are generating syntax where you might need to go either forwards or
backwards, it's easier to use.

Note these statements about Lag from the help:
When LAG is used with commands that select cases (for example, SELECT IF and
SAMPLE), LAG counts cases after case selection, even if specified before
these commands.

In a series of transformation commands without any intervening EXECUTE
commands or other commands that read the data, lag functions are calculated
after all other transformations, regardless of command order.

For SHIFT VALUES,
If split file processing is on, the scope of the shift is limited to each
split group. A shift value cannot be obtained from a case in a preceding or
subsequent split group.

Filter status is ignored.

This makes SHIFT VALUES more convenient than Lag for many scenarios.
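
As an illustration of the split-group rule quoted above, here is a toy Python model. It is a sketch only, not SPSS code: the helper name is hypothetical, the data are assumed to be sorted so that each split group is one contiguous run of cases, and `None` stands in for a missing value.

```python
from typing import Hashable, List, Optional, Sequence, Tuple


def shift_within_splits(rows: Sequence[Tuple[Hashable, float]], shift: int) -> List[Optional[float]]:
    """Shift values without crossing split-group boundaries.

    rows holds (group, value) pairs, assumed sorted so that each group
    is contiguous. A source case in a different group yields None.
    """
    n = len(rows)
    out: List[Optional[float]] = []
    for i, (group, _) in enumerate(rows):
        j = i + shift  # index of the case the value would come from
        if 0 <= j < n and rows[j][0] == group:
            out.append(rows[j][1])
        else:
            out.append(None)
    return out


data = [("a", 1), ("a", 2), ("b", 3), ("b", 4)]
lag_by_group = shift_within_splits(data, -1)
lead_by_group = shift_within_splits(data, 1)
```

In this model the first case of each group gets no lagged value and the last case of each group gets no lead value, so values never leak from one group into the next.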

HTH,

Jon Peck



