I'm starting a new thread, because the topic has shifted some.
At 06:04 AM 4/3/2014, Moon Kid wrote, in thread "time-diff in minutes":

>On 2014-04-02 19:22 Richard Ristow <[hidden email]> wrote:
>
>>. EXECUTE isn't necessary. I remark, because if those who are list
>>guri post unnecessary EXECUTEs, we'll never teach everybody else not to.
>
>Can you specify that? In my understanding and observation COMPUTE
>has no effect without an EXECUTE.

That's easy to get confused about. As Jon Peck wrote recently(*),

>Statistics does lazy evaluation of transformations. That means that
>they are executed the next time the data has to be passed such as
>with SAVE or a statistical procedure. This saves a usually
>unnecessary data pass just for the transformations.

That is, when you 'run' a transformation command like

   COMPUTE v0v = CTIME.MINUTES(v04bis + 86400 - v04von).

it doesn't perform the computations; it just adds the command to the
transformation program that's being built, which will be run the next
time you make a pass through the data. When you run a COMPUTE
interactively, indeed you won't see the results. That doesn't mean it
hasn't worked; it means it hasn't been run yet. It will be run, and
the results made available, when it is needed.

"EXECUTE" is a null procedure -- it makes a pass through the data,
but does nothing with it. If you write,

   COMPUTE vov = v04bis - v04von.
   EXECUTE.
   IF vov LT 0
      vov = vov + TIME.HMS(24).
   EXECUTE.
   COMPUTE vov = CTIME.MINUTES(vov).
   EXECUTE.
   DESCRIPTIVES VARIABLES=vov.

then SPSS makes a complete pass through the data to compute the first
value of 'vov'; then, another pass to execute the IF statement and
correct for crossing a midnight boundary; then a third, to convert to
minutes; and, finally, a fourth to run the DESCRIPTIVES. You've
forced the transformations to be run in variable-order.

If, instead, you write,

   COMPUTE vov = v04bis - v04von.
   IF vov LT 0
      vov = vov + TIME.HMS(24).
   COMPUTE vov = CTIME.MINUTES(vov).
   DESCRIPTIVES VARIABLES=vov.

then, SPSS makes a single pass through the data to run the
DESCRIPTIVES; but, as part of that pass, all the transformation
commands are executed, in order by cases (or records). That can be a
drastic time saving, on a big file.

You'll want to read Levesque, Raynald and IBM Corp., *Programming and
Data Management for IBM SPSS Statistics 20: A Guide for IBM SPSS
Statistics and SAS Users*, IBM Corporation, 2011. It's available for
free download; there's a link on page
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/We70df3195ec8_4f95_9773_42e448fa9029/page/Books%20and%20Articles.
(And, although it refers to release 20, it seems to be the latest
edition available.) There's a section "Use EXECUTE sparingly" in "2.
Best Practices and Efficiency Tips".

And you also wrote,

>Thx, I really like clean and beautiful code.

Thank you! Appreciation is most appreciated.

===========================================
(*) See posting
    Date:     Sat, 29 Mar 2014 06:44:24 -0600
    From:     Jon K Peck <[hidden email]>
    Subject:  Re: understand transformations
    Comments: To: [hidden email]
    To:       [hidden email]

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
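To make the single-pass version concrete, here is a minimal sketch with two made-up cases, one of which crosses midnight. The inline data and the TIME5 input format are assumptions for illustration; only the three transformation commands come from the post above. LIST serves as the procedure that triggers the data pass, so no EXECUTE appears.

```spss
* Hypothetical test data: start (v04von) and end (v04bis) clock times.
DATA LIST LIST / v04von (TIME5) v04bis (TIME5).
BEGIN DATA
23:30 00:15
08:00 09:30
END DATA.

* Transformations are only queued here and nothing is computed yet.
COMPUTE vov = v04bis - v04von.
IF vov LT 0 vov = vov + TIME.HMS(24).
COMPUTE vov = CTIME.MINUTES(vov).

* LIST reads the data, so the queued transformations run in this one pass.
LIST.
```

Worked by hand for the first case: 00:15 minus 23:30 is -83,700 seconds; adding the 86,400 seconds of TIME.HMS(24) gives 2,700 seconds, which is 45 minutes. The second case needs no midnight correction and comes to 90 minutes.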
Administrator
Moved this post from thread "time-diff in minutes" to here.
I provide the following without comment for rumination.
--
DATA LIST FREE/ a .
BEGIN DATA
1 2 3 4 5 6
END DATA.
LIST.
SELECT IF $CASENUM GT 1.
LIST.

DATA LIST FREE/ a .
BEGIN DATA
1 2 3 4 5 6
END DATA.
SELECT IF a*LAG(a) NE 0.
LIST.

DATA LIST FREE/ a .
BEGIN DATA
1 2 3 4 5 6
END DATA.
COMPUTE b=a*LAG(a) NE 0.
EXECUTE.
SELECT IF b.
LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
In reply to this post by Richard Ristow
I would ask:
"Why EVER do an EXECUTE rather than running an informative procedure?".

AND.
People best watch out for the LAG function if you don't know about certain counterintuitive properties.
From the FM Universals section:
"Note: In a series of transformation commands without any intervening EXECUTE commands or other commands that read the data, lag functions are calculated after all other transformations, regardless of command order. ...."

Also consider SELECT IF very carefully. Again from the FM:
"System variable $CASENUM is the sequence number of a case in the active dataset. Although it is syntactically correct to use $CASENUM on SELECT IF, it does not produce the expected results. To select a set of cases based on their sequence in a file, create your own sequence variable with the transformation language prior to making the selection".

The fine folks in the publications department could have been a little more specific.
SELECT IF ($CASENUM GT n) is impossible (for n >=1).
SELECT IF ($CASENUM LT n) is quite reasonable and will NOT produce any unexpected results.
Also, simply creating one's own sequence variable does not allow one to get away with case 1 without an intervening data pass (preferably one which provides information).
The value of $CASENUM represents the current position of the case in file order. SELECT IF $CASENUM <= n is functionally equivalent to N OF CASES = n, so I would suggest using N OF CASES if you just want the first n cases.

SELECT IF permanently deletes cases (if you save the file after SELECT IF, those cases are gone). I would recommend using FILTER instead, unless you really want to permanently delete cases.

SELECT IF $CASENUM > [positive integer] deletes all cases, because the value of $CASENUM changes dynamically as SELECT IF is processed. So when the first case is deleted because it doesn't meet the condition, the second case becomes case #1 and is consequently deleted since it doesn't meet the condition, etc.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From:    David Marso <[hidden email]>
To:      [hidden email],
Date:    04/03/2014 02:28 PM
Subject: Re: Use of EXECUTE
Sent by: "SPSSX(r) Discussion" <[hidden email]>

<snip>
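A minimal sketch of the two workarounds discussed above, using the same six-value toy data posted earlier in the thread. The variable names seq and keep are made up for illustration. The first variant stores $CASENUM in an ordinary variable and lets a procedure make a data pass before SELECT IF runs; the second uses FILTER, so no cases are deleted at all. In both variants the final LIST should show only the cases with a = 2 through 6, but treat this as a sketch to try on your own release rather than a guarantee.

```spss
* Variant 1: freeze the case sequence, then select on the stored value.
DATA LIST FREE / a.
BEGIN DATA
1 2 3 4 5 6
END DATA.
COMPUTE seq = $CASENUM.
* This LIST forces a data pass, so seq is stored before any case is dropped.
LIST.
SELECT IF seq GT 1.
LIST.

* Variant 2: flag cases instead of deleting them.
DATA LIST FREE / a.
BEGIN DATA
1 2 3 4 5 6
END DATA.
COMPUTE keep = ($CASENUM GT 1).
FILTER BY keep.
LIST.
FILTER OFF.
```

Variant 1 uses an informative procedure (LIST) rather than EXECUTE for the intervening pass, in line with the suggestion earlier in the thread. Variant 2 leaves all six cases in the active dataset, which matters if the file is saved afterwards.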
In reply to this post by David Marso
At 03:24 PM 4/3/2014, David Marso wrote:

>Why EVER do an EXECUTE rather than running an informative procedure?

Well, I use EXECUTE to run transformation programs whose purpose is to create an output .SAV file (with XSAVE)(1) or external file (with WRITE)(2). Tastes vary; my own is, not to put in a procedure I don't really want, just to make use of the data pass.

In that connection, I occasionally add an EXECUTE before an informative procedure, if the transformation program is likely to produce a number of warning messages. That 'wastes' a data pass, but makes the output a lot clearer -- transformation-program warnings are separated from the procedure output.

(1) XSAVE example:
    Date:    Mon, 3 Feb 2014 16:59:05 -0500
    From:    Richard Ristow <[hidden email]>
    Subject: Re: Determining number of days used per month with a beginning & ending date
    To:      [hidden email]

(2) WRITE example:
    Date:    Mon, 17 Feb 2014 14:06:41 -0500
    From:    Richard Ristow <[hidden email]>
    Subject: Re: Automating Adding Value Labels
    To:      [hidden email]
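As a rough illustration of the XSAVE case, here is a sketch in which EXECUTE is the only command that reads the data. The file names and the variables id, x and flag are hypothetical and not taken from the postings cited above. XSAVE is itself a transformation, so it writes nothing until something makes a data pass.

```spss
* Hypothetical input file and variables.
GET FILE='input.sav'.
COMPUTE flag = (x GT 0).
* XSAVE is queued like any other transformation.
XSAVE OUTFILE='flagged.sav' /KEEP=id x flag.
* No procedure output is wanted here, so EXECUTE supplies the data pass.
EXECUTE.
```

If a procedure were going to run at this point anyway, it would trigger the XSAVE just as well and the EXECUTE could be dropped.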
In reply to this post by Rick Oliver-3
<snip>

>AND.
>People best watch out for the LAG function if you don't know about certain
>counterintuitive properties.

In addition, beware of MISSING VALUES and other commands that take effect immediately. The Data Management book has a good example of it.

   COMPUTE somevar = 999.
   IF (MISSING(somevar)) othervar = 1.
   *EXECUTE.
   MISSING VALUES somevar (999).
   FREQUENCIES othervar.

IIRC, this would result in ones, unless you use an execute or a procedure.