All listers,

I have gotten something to work for me, but I'd like to better understand what is going on here. Here is the background to what I'm doing.

I have used the syntax produced by the SPSS point-and-click dialog to identify duplicates. Now I want to identify both the duplicates and the primary cases and get rid of both (or however many there are) of them. I am deduping three times, using three different scenarios in which duplicates could occur. This is to ensure that all dups are dealt a deadly blow.

When one of the variables being deduped on is missing, it was recorded in the data (not by me) as a zero. What happens after the first two deduping processes are run is that all that is left for the third deduping variable is zeros. Any of the matching cases for this third deduping scenario have already been taken out by the first two procedures. So of course if I identify duplicates and get rid of duplicates, I wind up with an empty dataset.

So I decided to use AGGREGATE and create a new variable for the sum of all of my primarylast (the variable SPSS creates, giving 0 to dups and 1 to primary cases). I would only proceed to dedup and get rid of all primary and duplicate cases if that aggregated variable did not equal 1 (a sum of 1 indicating that all cases had the same value).

The goal is getting rid of cases where a client was charged (debited) and then credited (charge removed); those are credit_debit_washes. The cases where credit_debit_washes = 1 are the ones I want to get rid of. These are duplicates on other fields that have been used in the three deduplication procedures. Hopefully that all sets the stage.

My question is: how does the following syntax result in giving me only those cases where primarylast = 1 (my primary cases)? Whereas if I insert an exe after the do-if I get what I want: a removal of all dups and primary cases, those instances that are a credit debit wash. I don't even see where primary cases are being selected here. Hopefully my story is not too convoluted and someone has followed it.

do if (primarylast_sum <> 1).
  if (primarylast = 1) & (lag(primarylast) = 0) credit_debit_washes1 = 1.
  if (primarylast = 0) credit_debit_washes1 = 1.
  if (sysmis(credit_debit_washes1) = 1) credit_debit_washes1 = 0.
else.
  compute credit_debit_washes1 = 0.
end if.
exe.
select if credit_debit_washes1 = 0.
exe.

Thanks
Matt

Matthew Pirritano, Ph.D.
Research Analyst IV
Medical Services Initiative (MSI)
Orange County Health Care Agency
(714) 568-5648 |
Administrator
|
"Whereas if I insert an exe after the do-if I get what I want, a removal of all dups and primary cases, those instances that are a credit debit wash. I don’t even see where primary cases are being selected here. Hopefully my story is not too convoluted and someone has followed it."
Matt,

It has *NOTHING* specifically to do with DO IF and *EVERYTHING* to do with the LAG!!!

Yeah, your story is way too convoluted. Since I don't use that silly dialog to DEDUP, I have *NO* idea what convoluted syntax you are beating your head against. An example of the data and the specific desired result might be useful.

OTOH, whenever selection is done based on variables created by LAG, you typically need a procedure to pass the data so things don't get FUBARed. (Might be best to use that data pass to get something informative? Perhaps freqs/XTABS on some variables you have just created as reality checks ;-)

HTH, David
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
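The data-pass point can be sketched on a toy file. This is only an illustration under assumptions: the variables id, flag and drop_me and the five cases are made up, with flag standing in for primarylast, and FREQUENCIES is used as the procedure that passes the data.

```spss
* Hypothetical toy file: flag plays the role of primarylast.
DATA LIST FREE / id flag.
BEGIN DATA
1 1
2 0
3 1
4 0
5 1
END DATA.

* Mark every 0, and every 1 that directly follows a 0.
COMPUTE drop_me = 0.
IF (flag EQ 0) drop_me = 1.
IF (flag EQ 1 AND LAG(flag) EQ 0) drop_me = 1.

* Data pass: drop_me is finalized for all cases before any selection.
FREQUENCIES VARIABLES = drop_me.

SELECT IF (drop_me EQ 0).
LIST.
```

With the procedure in place, each LAG comparison is made against the complete file, so cases 2 to 5 are flagged and only case 1 is kept. If the FREQUENCIES line is removed, the LAG and the SELECT IF share one data pass; as I read the documentation, LAG then looks back only at cases that survived selection, so a 1 that follows a dropped 0 would no longer be flagged. That would be consistent with ending up with primary cases only.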
I don't know what the original problem
was. But if lag is an issue, go here:
http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp

And then enter "lag function" in the Search field. It may shed some light on the problem.

From: David Marso <[hidden email]>
To: [hidden email]
Date: 09/23/2011 12:12 PM
Subject: Re: the intricacies of using do if-end if and execute
Sent by: "SPSSX(r) Discussion" <[hidden email]>

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/the-intricacies-of-using-do-if-end-if-and-execute-tp4834252p4834370.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
Thanks David & Rick,

It looks like it probably has something to do with the fact that lag is calculated after all other transformations, even if it precedes them. So my select if is actually happening before the compute using lag that the select if is supposed to be based on.

Thanks
Matt
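A small sketch of that ordering effect, assuming a made-up variable var1 with the values 1 to 5 (all names here are hypothetical). It contrasts a LAG that shares a transformation block with a later COMPUTE against one that is separated from it by EXECUTE.

```spss
* Hypothetical data: var1 and var2 both start as 1 2 3 4 5.
DATA LIST FREE / var1.
BEGIN DATA
1 2 3 4 5
END DATA.
COMPUTE var2 = var1.

* Same block: the doubling is pending together with the LAG.
COMPUTE lagvar1 = LAG(var1).
COMPUTE var1 = var1 * 2.
EXECUTE.

* Separate blocks: the LAG is resolved before the doubling is defined.
COMPUTE lagvar2 = LAG(var2).
EXECUTE.
COMPUTE var2 = var2 * 2.
EXECUTE.

LIST.
```

If lag values really are resolved after the other pending transformations, lagvar1 should carry lagged doubled values while lagvar2 carries lagged original values. Comparing the two columns in the LIST output is a quick way to confirm how a given SPSS version behaves.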
|
Lag is tricky. You might find the
SHIFT VALUES behavior to be more intuitive.
Jon Peck (no "h")
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621 |
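For comparison, a hedged sketch of the SHIFT VALUES route. The variable names id, flag, flag_prev and drop_me are hypothetical, and the subcommand layout follows my reading of the Command Syntax Reference, so it is worth checking against your version. Because SHIFT VALUES is a procedure, flag_prev is an ordinary stored variable by the time any selection runs.

```spss
* Hypothetical toy file: flag plays the role of primarylast.
DATA LIST FREE / id flag.
BEGIN DATA
1 1
2 0
3 1
4 0
5 1
END DATA.

* Procedure: writes the previous value of flag into flag_prev.
SHIFT VALUES VARIABLE=flag RESULT=flag_prev LAG=1.

* flag_prev is stored data, so no LAG function is pending here.
COMPUTE drop_me = 0.
IF (flag EQ 0) drop_me = 1.
IF (flag EQ 1 AND flag_prev EQ 0) drop_me = 1.
SELECT IF (drop_me EQ 0).
LIST.
```

The selection can sit in the same block as the comparison here because the shifted values were fixed in an earlier data pass.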
Administrator
|
In reply to this post by mpirritano
FWIW:
I believe the following reasonably comprehensible ONE-LINER can replace that DO IF!!!

COMPUTE credit_debit_washes1 = primarylast_sum NE 1 AND ( ( primarylast EQ 1 AND LAG(primarylast) EQ 0) OR (primarylast EQ 0) ) .
EXE.
----------
do if (primarylast_sum <> 1).
if (primarylast = 1) & (lag(primarylast) = 0) credit_debit_washes1 = 1.
if (primarylast = 0) credit_debit_washes1 = 1.
if (sysmis(credit_debit_washes1) = 1) credit_debit_washes1=0.
else.
compute credit_debit_washes1 = 0.
end if.
exe.
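One thing that may be worth checking before swapping the two forms: on the first case LAG(primarylast) is system-missing, and in SPSS logic "true AND missing" and "false OR missing" both evaluate to missing. A primary case in that position would therefore get a system-missing flag from the COMPUTE, where the DO IF version assigns 0. A minimal sketch of one possible guard, on made-up data (the RECODE and the three toy cases are my additions, not part of the posted syntax):

```spss
* Hypothetical toy file: two primaries and one dup, so the sum is 2.
DATA LIST FREE / primarylast primarylast_sum.
BEGIN DATA
1 2
0 2
1 2
END DATA.

COMPUTE credit_debit_washes1 =
    primarylast_sum NE 1 AND
    ( (primarylast EQ 1 AND LAG(primarylast) EQ 0) OR (primarylast EQ 0) ).
* Assumed guard: a flag that could not be evaluated is treated as 0.
RECODE credit_debit_washes1 (SYSMIS = 0).
LIST.
```

The guard would also zero out cases left missing for other reasons, such as a missing primarylast_sum, which the DO IF version leaves system-missing, so whether it is appropriate depends on the data.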
Administrator
|
In reply to this post by Jon K Peck
OTOH: From the FM, SHIFT VALUES results in a data pass!!
Hence I suspect it cannot be done in a LOOP or DO REPEAT, and with HUGE data sets it WILL result in unacceptable performance hits. For example:

DO REPEAT X = X1 TO X10 / Y=Y1 TO Y10 / L=1 TO 10 .
+ COMPUTE X=LAG(Xvar,L).
+ COMPUTE Y=LAG(Yvar,L).
END REPEAT.

If recast as a set of SHIFT VALUES commands, this would result in 20 data passes and require either a MACRO or 20 separate commands. Sacrificing performance for a minor gain in "intuitiveness" is hardly laudable ;-)
SHIFT VALUES is a procedure, of course.
But you can shift any number of variables in a single data pass.
SV was motivated by two requirements:
- ability to handle leads, which the transformation data feed is not set up to do.
- ability to handle unlimited file sizes. CREATE, which can do leads and lags, requires the data to be in memory. That is not because of lead or lag functionality per se but because of the other functions that CREATE supports.

Jon Peck (no "h")
Senior Software Engineer, IBM |
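The single-pass point might look like the sketch below, which shifts two variables by different amounts in one command. The data and the result names (X1, X2, Y1, Ylead1) are hypothetical, and the repeated /VARIABLE subcommand follows my reading of the Command Syntax Reference rather than anything posted in this thread.

```spss
* Hypothetical data with two numeric variables, Xvar and Yvar.
DATA LIST FREE / Xvar Yvar.
BEGIN DATA
1 10  2 20  3 30  4 40  5 50
END DATA.

* One procedure, hence one data pass, with a separate shift per result.
SHIFT VALUES VARIABLE=Xvar RESULT=X1 LAG=1
  /VARIABLE=Xvar RESULT=X2 LAG=2
  /VARIABLE=Yvar RESULT=Y1 LAG=1
  /VARIABLE=Yvar RESULT=Ylead1 LEAD=1.
LIST.
```

Each result variable needs its own subcommand, which is the verbosity being weighed against the DO REPEAT form, but a lead can be requested alongside the lags.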
Administrator
|
"But you can shift any number of variables in a single data pass."
OK, I can see that now. Current syntax seems a PIA ;-)
Any chance for some later version to provide the following sort of syntactic flexibility?

SHIFT VALUES varlag_1 TO varlag_10=LAG(singlevar,1,10).
or
SHIFT VALUES lag_1_var1 TO lag_1_var10 = LAG(oldvar1 TO oldvar10,1).
-------
In reply to this post by David Marso
Not likely that the syntax will get enhanced,
but you can submit a suggestion to [hidden email]. It would be
trivial, though, to create an extension command that supported TO. The
expectation in the original design was that users would rarely be shifting
a lot of variables, and if they shifted multiples, they shouldn't be constrained
to use the same lag/lead for all of them.
Jon Peck (no "h")
Senior Software Engineer, IBM |