Hi all,

I want to scan the content of a variable using the syntax below. My problem is that when I run the Python part for defining 'Head', 'Shoulder', and 'Knee', I get the following error:

Unrecoverable application error in the Statistics processor.

This also happens when I re-run the same syntax several times just for the Python statement defining 'Head' (including the 'NEW FILE. DATASET CLOSE all.' part).

I use SPSS Version 19 and Python version 2.6.

The problem is that I need to extract various parts of the variable and my file contains more than 300'000 cases.

Can someone help me?
Christian

NEW FILE.
DATASET CLOSE all.
DATA LIST FREE / id (a2) StringText (a240).
BEGIN DATA
01 "395,353,311,354,396,313,312,270,271,269"
02 "62"
03 "21,64,22,63"
04 "395,356,353,311,354,355,396,313,312,314,270,271,269"
05 "353,311,354,313,312,314,270,271,269"
06 "353,311,312"
07 " "
08 "353,311,354,355,313,312,270,269"
END DATA.
DATASET NAME Work2.
DATASET ACTIVATE Work2.

* Head.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(62|270|313)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Head type = 0
  /formula "func(StringText)".

* Shoulder.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(353|271|269|270)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Shoulder type = 0
  /formula "func(StringText)".

* Knee.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(355|311|312|22)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Knee type = 0
  /formula "func(StringText)".

**********************************
|
Administrator
|
KISS!
Why not simply use regular old fashioned SPSS CHAR.INDEX and CHAR.SUBSTR in a LOOP? I will not repost the code I have done so many times so search this group for Parse. You will find numerous examples which are unencumbered by python.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Christian Schmidhauser
The oldest version of Statistics I have installed is 23 (64-bit, Unicode mode). I replicated your data up to 400,000 cases and ran your code. It completed successfully, so I can only guess that there was a problem with Python 2.6 or the V19 Python plugin or that there is an environmental issue. SPSSINC TRANS is slower than native processing due to the use of the spss.Dataset class to write the results, but it runs fine in the versions I can test. On Wed, May 4, 2016 at 6:00 AM, Schmidhauser <[hidden email]> wrote:
|
In reply to this post by David Marso
Thanks David
But I have a hard time distinguishing between 322 and 22 when I search for the string 22.
Christian

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Wednesday, 4 May 2016 15:46
To: [hidden email]
Subject: Re: Expression matches

Christian Schmidhauser wrote
> My problem is that when I run the Python part for defining 'Head',
> 'Shoulder', and 'Knee' I get the following error:
> Unrecoverable application error in the Statistics processor.
|
In reply to this post by Christian Schmidhauser
It is unclear what you are trying to do.
Perhaps try AUTORECODE and RECODE, or perhaps ALTER TYPE, so there is no confusion about substrings.
Art Kendall
Social Research Consultants |
In reply to this post by Christian Schmidhauser
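FYI, your regexes don't make that distinction either...
If your strings are really as you have shown, you could distinguish "22" from "322" by searching for ",22". (To cover the case where it is the first number, just prepend a "," to every string.) Note that ",22" would still match a longer code such as "221"; appending a "," to the end of every string as well and searching for ",22," rules that out. |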
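The same distinction can also be made inside the regular expression itself. The following is a minimal sketch, assuming the values are always comma-separated integers as in the sample data; the names `KNEE_PATTERN` and `has_knee_code` are hypothetical and do not appear in the original syntax.

```python
import re

# Hypothetical pattern: \b anchors each code at a word boundary, so "22"
# no longer matches inside "322" or "221".
KNEE_PATTERN = re.compile(r"\b(?:355|311|312|22)\b")

def has_knee_code(text):
    """Return True if any listed code appears as a whole number in text."""
    return bool(KNEE_PATTERN.search(text))

print(has_knee_code("21,64,22,63"))   # True
print(has_knee_code("21,64,322,63"))  # False
```

The only change from the pattern in the original post is the pair of `\b` anchors (and a non-capturing group); the `re.I` flag is dropped here because it has no effect on digits.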
In reply to this post by Christian Schmidhauser
Perhaps you should post your faulty code and it can be redeemed?
Meanwhile chew on this:

SET MXLOOPS 100.
STRING #copy (A240) #part (A3).
COMPUTE #copy=StringText.
LOOP.
* Split off the next comma-separated part.
  COMPUTE #comma=CHAR.INDEX(#copy,",").
  DO IF #comma GT 0.
    COMPUTE #part=CHAR.SUBSTR(#copy,1,#comma - 1).
    COMPUTE #copy=CHAR.SUBSTR(#copy,#comma + 1).
  ELSE.
* Last part: take it and blank the copy so the loop ends here rather than at MXLOOPS.
    COMPUTE #part=#copy.
    COMPUTE #copy=" ".
  END IF.
  COMPUTE FOUND=ANY(#part,"22","313","222").
END LOOP IF #copy EQ " " OR FOUND.
LIST.
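For comparison, the same part-by-part logic can be sketched in Python. This is only an illustration, assuming the codes are separated by commas; `CODE_SETS` and `flag_codes` are hypothetical names, and the code lists are copied from the three patterns in the original post. It avoids set literals and dict comprehensions so that it should also run on Python 2.6.

```python
# Hypothetical lookup table built from the three patterns in the original post.
CODE_SETS = {
    "Head": set(["62", "270", "313"]),
    "Shoulder": set(["353", "271", "269", "270"]),
    "Knee": set(["355", "311", "312", "22"]),
}

def flag_codes(text):
    """Split text on commas and report, per group, whether any part matches exactly."""
    parts = set(p.strip() for p in text.split(",") if p.strip())
    return dict((name, bool(parts & codes)) for name, codes in CODE_SETS.items())

flags = flag_codes("21,64,22,63")
print(flags["Head"], flags["Shoulder"], flags["Knee"])
```

Because each part is compared as a whole string, "22" cannot match inside "322", which is the same guarantee the LOOP gets from comparing `#part` with ANY.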