This is my first question here, so I apologize for any simplicity within it.
I'm looking for a way to mark cases that share a value on one variable when at least one of those cases has a certain value on another variable. (I probably can't find my answer because I can't construct my question properly.)

E.g.:

X    Y    Z
123  Yes  1
123       1
123       1
111
145  Yes  1
145       1

I want to create a variable Z that is 1 for every case sharing its X value with a case that has 'Yes' in Y. |
Administrator
|
Why do I get the feeling that this is weird and there will be other wrinkles popping out of the edges of my Friday?
--
SORT CASES BY X (A) Y (D).
IF Y='Yes' Z=1.
IF X=LAG(X) Y=LAG(Y).
IF X=LAG(X) Z=LAG(Z).

For the peanut gallery. I had considered the following but it is 2 lines more '-) You all know how I enjoy concise.

SORT CASES BY X (A) Y (D).
IF Y='Yes' Z=1.
DO IF ( X=LAG(X)).
COMPUTE Y=LAG(Y).
COMPUTE Z=LAG(Z).
END IF.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by pkcdust
You'll need to (or at least should) review the AGGREGATE command <http://www-01.ibm.com/support/knowledgecenter/SSLVMB_21.0.0/com.ibm.spss.statistics.help/syn_aggregate_overview.htm> if you are not familiar with it, to understand a little of how this works. Below is a demonstration which you can run via syntax at the start of a new session in SPSS. If you have any questions feel free to ask.

DATA LIST FREE / X (F3.0) Y(A3).
BEGIN DATA
123 Yes
123 No
123 No
111 No
145 Yes
145 No
END DATA.

COMPUTE @Y=Y="Yes".
AGGREGATE OUTFILE=* MODE=ADDVARIABLES /BREAK=X /Z=MAX(@Y).
ADD FILES FILE=* /DROP=@Y.
EXE.

(David, I've taken up your convention of prefixing temporary variable names with the @ symbol, :-D!)

On 17 April 2015 at 19:51, pkcdust <[hidden email]> wrote:
> This is my first question here, so I apologize for any simplicity within it. |
Administrator
|
I'd say just go for the jugular and assume Z does not already exist.
OVERWRITE=YES allows you to omit the ADD FILES. In any case, I believe it is good practice to use variable names which are unlikely to exist in the data file. My convention is to prefix with @. In this case it is simpler to just go for it ;-) and eliminate the mop up.

IF Y='Yes' Z=1.
AGGREGATE OUTFILE * MODE=ADDVARIABLES OVERWRITE=YES /BREAK X /Z=MAX(Z).
|
Any reason why you'd favour IF over COMPUTE?
On Fri, 17 Apr 2015 at 20:59, David Marso <[hidden email]> wrote:
> I'd say just go for the jugular and assume Z does not already exist.
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
Administrator
|
No particular reason aside from it being a bit more transparent to less experienced users.
OTOH: In retrospect I would likely concede to Art's 'soapboxing' and use EQ for the comparison, as distinct from = for assignment.

IF (Y EQ 'Yes') Z=1.

alternatively

COMPUTE Z=(Y EQ 'Yes').

OTOH, the Boolean assignment can bite you (and sometimes the bug is elusive). In the first block below, flag is reassigned on every pass, so it ends up reflecting only the last variable; in the second, flag is set to 1 once and never reset.

DO REPEAT var=var1 TO var100.
COMPUTE flag = var EQ value.
END REPEAT.

vs

DO REPEAT var=var1 TO var100.
IF ( var EQ value) flag = 1.
END REPEAT.

Which I prefer to do in a loop with escape anyhow (so Boolean can't bite in this case and you bail once you have located the condition of interest).

VECTOR vars=var1 TO var100.
LOOP #=1 TO 100.
COMPUTE flag = (vars(#) EQ value).
END LOOP IF flag.

vs

VECTOR vars=var1 TO var100.
LOOP #=1 TO 100.
IF (vars(#) EQ value) flag =1.
END LOOP IF flag.
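A small sketch that may make the Boolean pitfall above easier to see. The three variables, the target value 5 and the names flag_bool and flag_if are made up for illustration and do not come from the thread.

```spss
* Hypothetical data with the target value 5 in var1 for the first case.
DATA LIST FREE / var1 var2 var3.
BEGIN DATA
5 1 2
1 2 3
1 2 5
END DATA.

* Boolean assignment, where flag_bool is reassigned on every pass.
DO REPEAT var=var1 TO var3.
COMPUTE flag_bool = (var EQ 5).
END REPEAT.

* Conditional assignment, where flag_if is set once and never reset.
DO REPEAT var=var1 TO var3.
IF (var EQ 5) flag_if = 1.
END REPEAT.

LIST.
```

With these data, flag_bool should come out 0 for the first case even though var1 holds a 5, because the last pass (var3) overwrites the earlier result, while flag_if is 1. For the second case flag_if stays system-missing rather than 0, which is the other difference between the two forms.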
|
What's wrong with using COUNT?

COUNT <newvar> = <var1> (<value1>) <var2> (<value2>).
FREQ <newvar>.

Should yield 0, 1 or 2. The 2s are your marker.

John F Hall (Mr) [Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/1-survey-analysis-workshop

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: 17 April 2015 23:02
To: [hidden email]
Subject: Re: Identifying values across records |
In reply to this post by David Marso
David, why are you so fond of double controls? Is there any hidden
secret esoterism in the guru head that you might agree to reveal?
VECTOR vars=var1 TO var100.
LOOP #=1 TO 100.
IF (vars(#) EQ value) flag =1.
END LOOP IF flag.

Why not use BREAK under DO IF here?

VECTOR vars=var1 TO var100.
LOOP #=1 TO 100.
DO IF (vars(#) EQ value).
COMP flag =1.
BREAK.
END IF.
END LOOP.

Sometimes (it depends on the data structure) a version without the IF interruption will be faster:

COMPUTE flag= 0.
DO REPEAT var=var1 TO var100.
COMPUTE flag = flag or (var EQ value).
END REPEAT. |
Administrator
|
Aesthetics (4 lines vs 7) and Intuitiveness?
Note the DO REPEAT will search all 100 variables even if the flag trigger is already located in the 1st variable. How can that possibly be faster than the broken loop?
|
I'm for shorter ("aesthetic") code too. Unfortunately, shorter code is not always faster code.

> Note the DO REPEAT will search all 100 variables

But on the other hand, checking the IF condition on every cycle takes time. That is why I said that sometimes blunt looping to the end may be faster than that additional operation at every step. I don't think you expect the search value to always be located in the first few variables.

18.04.2015 23:34, David Marso wrote:
> Aesthetics (4 lines vs 7) and Intuitiveness? Note the DO REPEAT will search all 100 variables even if the flag trigger is already located in the 1st variable. How can that possibly be faster than the broken loop? |
Administrator
|
I guess this is an empirical question.
Please demonstrate any instance in which the exhaustive search is faster than the immediate abort. I'm certain that this conditional test takes a tiny fraction of the time spent needlessly continuing on after already locating the answer.
|
OK. Here is an example, David.
Generate data (uncomment one of the two divers lines before running).

SET RNG=MC SEED=574352.
matrix.
*comp divers= 100. /*With low diversity of values
*comp divers= 100000. /*Or with high diversity of values
comp vars= rnd(uniform(100000,1000)*divers).
save vars /out= * /vari= var1 to var1000.
end matrix.

*******************.
*"Do repeat with OR" syntax.
cache.
VECTOR vars= var1 TO var1000.
COMPUTE flag= 0.
DO REPEAT var= var1 TO var1000.
COMPUTE flag= flag or (var EQ 10).
END REPEAT.
freq flag.
delete var flag.

******************.
*"Loop with IF" syntax.
cache.
VECTOR vars= var1 TO var1000.
COMPUTE flag= 0.
LOOP #= 1 TO 1000.
DO IF (vars(#) EQ 10).
COMP flag= 1.
BREAK.
END IF.
END LOOP.
freq flag.
delete var flag.

I inserted the VECTOR command in both pieces to keep "all other things being equal". With the divers=100 data, the Loop syntax is faster. With the divers=100000 data, the Do repeat syntax becomes faster.

19.04.2015 17:35, David Marso wrote:
> I guess this is an empirical question. Please demonstrate any instance in which the exhaustive search is faster than the immediate abort. I'm certain that this conditional test takes a tiny fraction of needlessly continuing on after already locating the answer. |
In reply to this post by David Marso
It would be interesting to see whether it takes more CPU time to do the process with (a) fewer machine instructions per case but an exhaustive search, or (b) a loop with an escape command.

Perhaps modify the syntax below to answer: how many cases does it take to make a detectable difference between the times for each of (1) a loop with an escape specification, (2) a loop that searches exhaustively, and (3) a DO REPEAT? And how many cases does the data set need to have to make a meaningful difference, e.g., 10 seconds?

* cases and items.
new file.
input program.
vector x (100,f3).
loop id = 1 to 10000.
loop #p = 1 to 100.
compute x(#p) = rnd(rv.normal(50,10)).
end loop.
end case.
end loop.
end file.
end input program.
DO IF $CASENUM=1.
PRINT /"'start generation'" $time (time20.3).
END IF.
execute.
DO IF $CASENUM=1.
PRINT /"'start descriptives 1'" $time (time20.3).
END IF.
descriptives variables = x1 to x10 /statistics=all.
DO IF $CASENUM=1.
PRINT /"'start descriptives 2'" $time (time20.3).
END IF.
descriptives variables = x1 to x10 /statistics=all.
Art Kendall
Social Research Consultants |
In reply to this post by Kirill Orlov
Benchmarking is tricky. I used the
STATS BENCHMRK extension command to test the alternative syntax approaches.
This extension command runs each job multiple times and interleaves
execution of the two versions in order to minimize environmental effects.
The command can record various measures of times, memory usage, including
paging, and i/o for each Statistics process in the session. In this
case I measured only the total time for the spssengine process, ignoring
the stats process, since spssengine time is the only interesting measure
here.
I factored out the data generation and followed it with an execute in order to isolate the differences to the transformation alternatives. I would not expect the results to be affected by generating a different dataset for each comparison round. I ran 5 repetitions of each version.

In the results below, group 1 is the do repeat syntax, and group 2 is the loop. The t test results are below. For Kirill's first case, with divers=100, loop is much faster, and the difference is highly significant. For Kirill's second case, with divers=100,000, do repeat is significantly faster, but the difference is smaller. As you can see, the do repeat time does not vary much between the two tests, but the loop time goes up a lot in the second scenario. That makes intuitive sense, since the breakout from the loop will generally occur later in the second scenario.

divers=100
[t test results table not preserved in the archive]

divers=100,000
[t test results table not preserved in the archive]
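The intuition that the breakout occurs later in the second scenario can be put in rough numbers. The following is only a back-of-the-envelope sketch, assuming the 1,000 values in a case are independent and each equals 10 with probability p = 1/divers; the symbols p, n and T are introduced here for illustration and are not from the thread.

```latex
% T = position of the first variable equal to 10, n = 1000 variables per case.
\[
  P(\text{no 10 among } n \text{ values}) = (1-p)^{n},
  \qquad
  E[\min(T, n)] = \sum_{k=0}^{n-1} (1-p)^{k} = \frac{1-(1-p)^{n}}{p}.
\]
% divers = 100, so p = 10^{-2}:
\[
  (0.99)^{1000} \approx 4.3 \times 10^{-5},
  \qquad
  E[\min(T, n)] \approx 100 .
\]
% divers = 100000, so p = 10^{-5}:
\[
  (1-10^{-5})^{1000} \approx e^{-0.01} \approx 0.990,
  \qquad
  E[\min(T, n)] \approx 995 .
\]
```

Under these assumptions the loop with an escape does about 100 passes per case in the first scenario but nearly the full 1,000 in the second, so there it pays the per-pass cost of the conditional without saving many passes.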
The STATS BENCHMRK extension command is available from the SPSS Community website or, with V22 or 23, it can be installed from the Utilities menu. However, it requires the free Python Extensions for Windows, which you can find by Googling. That may require a registered Python in order to install. The command only works on Windows, since it uses system measures that are Windows specific.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: Kirill Orlov <[hidden email]>
To: [hidden email]
Date: 04/19/2015 10:05 AM
Subject: Re: [SPSSX-L] Identifying values across records
Sent by: "SPSSX(r) Discussion" <[hidden email]> |
Am I correct in reading the means as being in seconds:
7 vs 3 seconds, and 7.4 vs 9.9 seconds?
Art Kendall
Social Research Consultants |
Yes, but the ratios are more interesting,
since the absolute times will obviously depend on the dataset size. For
modest dataset sizes as here, of course, the differences are trivial.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: Art Kendall <[hidden email]>
To: [hidden email]
Date: 04/20/2015 09:05 AM
Subject: Re: [SPSSX-L] Identifying values across records
Sent by: "SPSSX(r) Discussion" <[hidden email]> |
In reply to this post by Jon K Peck
At 02:25 PM 4/19/2015, Jon K Peck wrote:

> I used the STATS BENCHMRK extension command to test the alternative syntax approaches.
> For Kirill's first case with divers = 100, loop is much faster, and the difference is highly significant.

That makes sense. In both cases, the loop terminates when value 10 is found. With divers=100, values are 10 with probability .01, and the mean number of loop passes to reach one is 100, much less than the 1,000 passes needed for a full search.

> for Kirill's second case, with divers=100,000, do repeat is significantly faster, but the difference is smaller.

In this case, values are 10 with probability 1E-5, and in about 99% of the cases, the LOOP will check all 1,000 values. |
In reply to this post by David Marso
Thank you so much for all of the great code/help with this! It's worked great so far.
I've tried to introduce multiple variables into this matching equation now, but without success. Suppose I want to match on more than 1 variable:

W   X    Y
12  123  Yes
12  123
13  123
12  111
14  145  Yes
14  145

and create Z to identify the cases whose W and X have the same values as those in a row that has Y='Yes':

row  W   X    Y    Z
1    12  123  Yes  1
2    12  123       1
3    13  123
4    12  111
5    14  145  Yes  1
6    14  145       1 |
Administrator
|
"I've tried to introduce multiple variables into this matching equation now but without success."
OK, what exactly have you tried? Nudge towards cliff-- Have you bothered to read up on LAG and AGGREGATE? That is what the original code involved. Feel free to post back with your 'try' and people will help you sort it. |
In reply to this post by pkcdust
It is possible to match on more than one variable.
(e.g. MATCH FILES FILE=File1 /FILE=File2 /BY Var1 Var2. This requires that both files HAVE Var1 and Var2 and are sorted by Var1 Var2. I do this quite frequently and have not had a problem.)

HOWEVER, you aren't really trying to match files since you have only one file. Version 22 or higher can do something like this. (UNTESTED; CIN counts values between a lower and an upper bound, so "Yes" is given for both.)

Aggregate outfile=* mode=addvariables overwrite=yes /break=w x /z=cin(y,"Yes","Yes").

Z will be the number of Yes values for cases with the same W and X.

Or (also UNTESTED) something like this.

Compute z=y="Yes".
Sort cases by w x (a) y (d).
If ( y="" and lag(w)=w and lag(x)=x) z=lag(z).

Melissa

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of pkcdust
Sent: Wednesday, May 13, 2015 4:59 PM
To: [hidden email]
Subject: Re: [SPSSX-L] Identifying values across records |
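For reference, the IF plus AGGREGATE approach posted earlier in the thread appears to carry over to two matching variables simply by listing both on BREAK. The sketch below is an untested illustration, not a confirmed solution: it reuses the sample data from the question and, as an assumption, puts "No" in place of the blank Y values so that DATA LIST FREE can read them.

```spss
* Sample data from the question, with No standing in for the blank Y values.
DATA LIST FREE / W (F2.0) X (F3.0) Y (A3).
BEGIN DATA
12 123 Yes
12 123 No
13 123 No
12 111 No
14 145 Yes
14 145 No
END DATA.

* Flag the Yes rows, then spread the flag to every case in the same W by X group.
IF (Y EQ 'Yes') Z=1.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES
  /BREAK=W X
  /Z=MAX(Z).
LIST.
```

Groups without any 'Yes' row keep Z system-missing, since MAX has no valid value to return for them, which matches the blanks in the desired output. No sorting is needed, because MODE=ADDVARIABLES does not require the file to be ordered by the BREAK variables.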