Hello,
I am trying to figure out how to count consecutive identical responses within a case using SPSS.
data:
obs 1: 1123512422222211344122215   longest string: 6
obs 2: 5555555555555555555555555   longest string: 25
obs 3: 3333333333333443231111111   longest string: 13
Any guidance will be greatly appreciated.
Thank you!
Jenny
Below is an example that finds the length of the longest run when all of the data are in one long string. Nearly the same logic would work if the values were in separate variables.
***********************************************.
data list free / X (A25).
begin data
1123512422222211344122215
5555555555555555555555555
3333333333333443231111111
end data.
dataset name test.
COMPUTE #len = LENGTH(X).
COMPUTE #run = 1.
NUMERIC MaxRun (F3.0).
LOOP #i = 2 to #len.
  DO IF (SUBSTR(X,#i,1) EQ SUBSTR(X,#i-1,1)).
    COMPUTE #run = #run + 1.
    COMPUTE MaxRun = MAX(MaxRun,#run).
  ELSE.
    COMPUTE #run = 1.
  END IF.
END LOOP.
EXE.
***********************************************.
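As a quick cross-check outside SPSS, the expected answers from the original question can be reproduced with a few lines of Python. This is only a sketch for verifying results, not part of the posted syntax; the function name `longest_run` is made up for illustration.

```python
from itertools import groupby


def longest_run(s: str) -> int:
    """Length of the longest run of identical consecutive characters (0 if empty)."""
    # groupby splits the string into maximal blocks of equal adjacent characters.
    return max((sum(1 for _ in group) for _, group in groupby(s)), default=0)


# The three observations from the original question.
cases = [
    "1123512422222211344122215",
    "5555555555555555555555555",
    "3333333333333443231111111",
]
results = [longest_run(c) for c in cases]
print(results)  # [6, 25, 13], matching the "longest string" values in the question
```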
Notes:
If Statistics is in Unicode mode, length will appropriately report the length of the blank-trimmed string. In code page mode, it reports the declared length, so if there could be trailing blanks, use length(rtrim(X)) unless those should count as a run.
You can just put the length call as the upper loop limit. No need to construct an extra variable.
If there are no runs, the code returns sysmis for MaxRun, but 2 for a string like 11234. You might prefer to initialize MaxRun to 1 for consistency.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
In reply to: Andy W, 12/31/2013 08:45 AM, Subject: Re: [SPSSX-L] counting long string
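The inconsistency Jon describes (system-missing when there are no repeats, but 2 for a string like 11234) can be illustrated by mirroring the posted loop in Python. This is a sketch under assumptions: `None` stands in for SPSS system-missing, and the helper `spss_max` is a hypothetical imitation of how MAX skips missing arguments.

```python
from typing import Optional


def spss_max(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Imitate SPSS MAX: ignore missing (None) arguments; None only if all are missing."""
    valid = [v for v in (a, b) if v is not None]
    return max(valid) if valid else None


def max_run(s: str, initial: Optional[int] = None) -> Optional[int]:
    """Mirror of the posted loop; `initial` is MaxRun's starting value (None = sysmis)."""
    best = initial
    run = 1
    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            run += 1
            best = spss_max(best, run)  # only updated inside the "equal" branch
        else:
            run = 1
    return best


print(max_run("12345"))             # None: no adjacent repeats, MaxRun is never assigned
print(max_run("11234"))             # 2
print(max_run("12345", initial=1))  # 1: initializing MaxRun to 1 makes the results consistent
```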
In reply to this post by Andy W
Amended noting Jon's quip re UNICODE mode and LENGTH (added LTRIM and RTRIM) and SYSMIS fix for singlet. Also added data for amusement ;-)
--------------------------------
You could also go slightly 'black arts' on the beast ;-)
Noting that the EQ goes either 1 or 0, so * either retains and increments or blasts and increments.
data list free / X (A25).
begin data
1123512422222211344122215
5555555555555555555555555
3333333333333443231111111
1
12345678
12234
end data.
dataset name test.
COMPUTE #run=1.
LOOP #i = 2 to LENGTH(LTRIM(RTRIM(X))).
  COMPUTE #run = SUM(1,#run*(SUBSTR(X,#i,1) EQ SUBSTR(X,#i-1,1))).
  COMPUTE MaxRun = MAX(MaxRun,#run).
END LOOP.
IF SYSMIS(MaxRun) MaxRun=1.
EXECUTE.
--
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
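The multiplication trick in the post above (the comparison yields 1 or 0, so the product either carries the running count forward or resets it) can be sketched in Python, where a boolean also behaves as 1 or 0 in arithmetic. The function name is hypothetical, and unlike the SPSS version this sketch does no blank trimming.

```python
def max_run_arithmetic(s: str) -> int:
    """Longest run via run = 1 + run * (current == previous); returns 1 for short strings."""
    run = best = 1
    for i in range(1, len(s)):
        # True counts as 1 (keep the count and add one); False counts as 0 (reset to 1).
        run = 1 + run * (s[i] == s[i - 1])
        best = max(best, run)
    return best


# Same six strings as the data in the post above.
data = [
    "1123512422222211344122215",
    "5555555555555555555555555",
    "3333333333333443231111111",
    "1",
    "12345678",
    "12234",
]
print([max_run_arithmetic(x) for x in data])  # [6, 25, 13, 1, 1, 2]
```

Starting `best` at 1 plays the same role as the `IF SYSMIS(MaxRun) MaxRun=1.` line in the syntax.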
In reply to this post by Jon K Peck
Good calls, Jon. Here is the code, for those interested, with Jon's suggestions incorporated (note that I use RTRIM and LTRIM, which assumes that blanks at the beginning or the end of the string should not count as a run).
***********************************************.
data list fixed / X (A25).
begin data
1123512422222211344122215
5555555555555555555555555
3333333333333443231111111
11234
111
55555555
123456
6666
abcd1234
end data.
dataset name test.
COMPUTE #run = 1.
COMPUTE MaxRun = 1.
LOOP #i = 2 to LENGTH(LTRIM(RTRIM(X))).
  DO IF (SUBSTR(X,#i,1) EQ SUBSTR(X,#i-1,1)).
    COMPUTE #run = #run + 1.
    COMPUTE MaxRun = MAX(MaxRun,#run).
  ELSE.
    COMPUTE #run = 1.
  END IF.
END LOOP.
FORMATS MaxRun (F3.0).
EXE.
***********************************************.
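Earlier in the thread Andy notes that nearly the same logic would work if the values were in separate variables. A rough Python sketch of that case follows, treating one case as a list of item responses. The function name and the rule that a missing response (`None`) breaks a run are assumptions for illustration; the thread does not say how missing values should be handled.

```python
from typing import Optional, Sequence


def longest_run_values(values: Sequence[Optional[int]]) -> int:
    """Longest run of identical adjacent non-missing responses (0 if all are missing)."""
    best = run = 0
    previous: Optional[int] = None
    for v in values:
        if v is None:
            run = 0          # assumption: a missing response ends the current run
        elif v == previous:
            run += 1         # same answer as the previous item: extend the run
        else:
            run = 1          # different answer: start a new run
        previous = v
        best = max(best, run)
    return best


# One hypothetical case with ten item responses and one missing value.
responses = [3, 3, 3, 4, 4, None, 4, 4, 4, 4]
print(longest_run_values(responses))     # 4
print(longest_run_values([None, None]))  # 0
```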