Administrator
In an effort to make gradual concessions to the oft-posted claim that Python can sometimes result in more compact, easier-to-read code, I attempted to amend my profligate ways by replacing my tried and true loopy-loop string parsing technique with the newfangled "string.split()" function residing in the bowels of the SPSSINC TRANS extension. After searching for hours in this group I uncovered "string.split" (since I was unsuccessful in my attempts to locate documentation on the goodies offered by the elves within this mystery tour).
After applying it to the following very simple file I was dismayed by the following result. Using SPSS 22.0.1 on Vista 32 bit.

I do get the following brain dead warning: "Only string variables are allowed." OK, that makes a great deal of sense ;-) *NOT*

NOTE: the original code does an "inline" VARSTOCASES which would have followed the string.split, so please ignore the obvious non-equivalence.

So, to python or not to python? That is the question! My initial response is a resounding NAY!!!!!!! I'll stick to my tried and true!

NEW FILE.
DATASET CLOSE ALL.
DATA LIST /Sentence (A30).
BEGIN DATA
It is a big Chair and table
It is a chair
It was a table
It is a Big Char
It is a Tabl and Char
END DATA.
DATASET NAME Source.
COMPUTE ID=$CASENUM.
DATASET COPY temp.
DATASET ACTIVATE temp.
SPSSINC TRANS RESULT = word1 TO word10 TYPE=5
  /FORMULA "string.split(Sentence)".
LIST.

Sentence                       ID    word1 word2 word3 word4 word5 word6 word7 word8 word9 word10
It is a big Chair and table    1.00  It    is    a     big   Chair and   table
It is a chair                  2.00
It was a table                 3.00
It is a Big Char               4.00
It is a Tabl and Char          5.00

Number of cases read: 5 Number of cases listed: 5

Intended to replace the following.

SET MXLOOPS=1000.
STRING #Cpy (A30).
STRING Word (A8).
COMPUTE #Cpy= UPCASE(Sentence).
LOOP.
COMPUTE #=CHAR.INDEX(#Cpy," ").
DO IF # GT 0.
COMPUTE Word=CHAR.SUBSTR(#Cpy,1,#-1).
COMPUTE #Cpy=CHAR.SUBSTR(#Cpy,#+1).
ELSE.
COMPUTE Word=#Cpy.
END IF.
XSAVE OUTFILE "C:\Temp\Parsed" / KEEP ID Word .
END LOOP IF # EQ 0.
EXECUTE.
DELETE VARIABLES Word.
GET FILE "C:\Temp\Parsed" .
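For comparison, here is a minimal plain-Python sketch of what the LOOP/XSAVE block appears intended to produce: one (ID, Word) row per word, upper-cased. It runs outside SPSS, and the name `parse_words` is hypothetical, not part of any extension. Two differences are assumptions worth noting: `split()` with no argument treats a run of blanks as one separator, whereas the CHAR.INDEX loop cuts at every single blank, and nothing here truncates words to 8 characters the way the A8 `Word` variable would.

```python
def parse_words(rows):
    """Yield one (case_id, word) pair per word, upper-cased like UPCASE(Sentence)."""
    for case_id, sentence in rows:
        for word in sentence.upper().split():
            yield case_id, word


sentences = [
    "It is a big Chair and table",
    "It is a chair",
    "It was a table",
    "It is a Big Char",
    "It is a Tabl and Char",
]

# ID mirrors COMPUTE ID=$CASENUM (1-based case number).
rows = list(enumerate(sentences, start=1))
parsed = list(parse_words(rows))

for case_id, word in parsed:
    print(case_id, word)
```

The result is already in the "long" layout that the XSAVE file holds, so no separate VARSTOCASES step is needed in this sketch.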
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
I'm surprised it returned anything. In python:
string.split() operates on the object named *string* and splits it into a list of substrings. So in theory you want (assuming *Sentence* is the object of interest):

Sentence.split()

But the TRANS extension is not smart enough to figure this out (see here for a slightly different example: http://spssx-discussion.1045642.n5.nabble.com/How-to-transform-alpha-numerical-values-in-a-variable-in-lower-case-letters-so-that-the-first-letter-td5724418.html#a5724425).

My SPSS is being consumed right now by other calculations, but what happens if you try to define your own function and then pass that to TRANS? Something like:

**********************************************.
BEGIN PROGRAM.
def Split(s):
    return s.split()
test = "It is a big Chair and table"
print Split(test)
test2 = "It is a"
print Split(test2)
END PROGRAM.
SPSSINC TRANS RESULT = word1 TO word10 TYPE=5 5 5 5 5 5 5 5 5 5
  /FORMULA "Split(Sentence)".
**********************************************.

I'm not 100% sure if you need to define the type for every variable.
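A small standalone sketch of the wrapper idea, written so it also runs under Python 3 (the BEGIN PROGRAM block above uses Python 2 print statements, as SPSS 22 does). One point that may help: in Python 2, `string.split(s)` is also a legitimate call to a function in the standard `string` module and behaves like `s.split()`; that module-level function was removed in Python 3, where only the method form remains. Whether SPSSINC TRANS accepts a user-defined function in exactly this way is not verified here.

```python
def Split(s):
    """Return the whitespace-separated words of s as a list."""
    return s.split()


sentence = "It is a big Chair and table"
words = Split(sentence)
print(words)  # ['It', 'is', 'a', 'big', 'Chair', 'and', 'table']

# Shorter input simply yields a shorter list.
print(Split("It is a"))

# With no argument, split() ignores leading/trailing blanks and collapses
# runs of blanks, which suits blank-padded fixed-width SPSS strings.
print(Split("  padded   value  "))  # ['padded', 'value']
```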
Administrator
Hi Andy,
I'm using what I found here:
http://spssx-discussion.1045642.n5.nabble.com/Parsing-String-Name-Variable-td5714037.html#a5714041
Notice my usage works correctly for the first case but not the remaining 4. Weird, eh?
In reply to this post by David Marso
It's nice to see that you are so open minded
to not so new technology, David. SPSS has had Python integration now for
almost nine years.
The SPSSINC TRANS command makes no attempt to document the thousands of Python standard library functions that you could use in SPSSINC TRANS.

Sorry to say, since you always love to aim an arrow at my back, you found a bug. The particular API that SPSSINC TRANS uses to update the result dataset has apparently changed with regard to None values, so when the number of string results produced is less than the number of result variables specified, it could produce the error.

However, the benefits of the Python integration mean that many bugs can be quickly fixed. If you download and install the latest version of this extension via the Utilities menu (be sure to start Statistics using Run As Administrator), you will find that this code works fine.

DATASET CLOSE ALL.
DATA LIST /Sentence (A50).
BEGIN DATA
It is a big Chair and table eight nine ten
It is a chair
It was a table
It is a Big Char
It is a Tabl and Char
END DATA.
DATASET NAME Source.
SPSSINC TRANS RESULT = word1 TO word10 TYPE=5
  /FORMULA "string.split(Sentence)".
LIST.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: David Marso <[hidden email]>
To: [hidden email]
Date: 06/12/2014 11:40 AM
Subject: [SPSSX-L] Issue with SPSSINC TRANS string.split
Sent by: "SPSSX(r) Discussion" <[hidden email]>
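The situation described in the reply above (fewer split results than result variables) can be illustrated outside SPSS with a sketch that always returns exactly as many values as there are result variables. This is only a possible user-side workaround under the assumption that the command wants one value per result variable; it is not what the extension does internally, and the name `split_padded` and its parameters are hypothetical.

```python
def split_padded(s, n=10, fill=""):
    """Split s on whitespace and return exactly n items.

    Extra words beyond n are dropped; missing ones are filled with `fill`
    (an empty string here, standing in for a blank string variable).
    """
    words = s.split()[:n]
    return words + [fill] * (n - len(words))


short = split_padded("It is a chair")
full = split_padded("It is a big Chair and table eight nine ten")

print(len(short), short)
print(len(full), full)
```

With the updated extension mentioned above this padding should not be necessary; the sketch only makes the length mismatch concrete.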
Administrator
Thanks Jon.