Regular expression matches

Regular expression matches

Jignesh Sutar
I'm trying to find/match certain substrings across a set of string variables. Below are a couple of ways to do this. The first isn't an option for me, as I am not looking for exact matches. The second is a little better, but better still would be matching on regular expressions. Is this possible, either natively or otherwise? The third option below isn't supported and doesn't work, but something as simple as that would be very useful in SPSS. Is this currently possible in some way?


* matches exact matches only .
compute Text1Found=any("Text1",Str01 to Str10).
compute Text2Found=any("Text2",Str01 to Str10).

* matches for occurrence at any start position.
do repeat Str=Str01 to Str10.
    if (index(Str,"Text1")>0) Text1Found=1.
    if (index(Str,"Text2")>0) Text2Found=1.
end repeat.

* ideal solution to be able to use ANY with regular expression.
compute Text1Found=any(".*Text1.*",Str01 to Str10).
compute Text2Found=any(".*Text2.*",Str01 to Str10).

Thanks in advance.
Jignesh
Re: Regular expression matches

Albert-Jan Roskam
Why doesn't INDEX work for you (possibly combined with LOWER for case-insensitive matching)?

compute Text1Found = 0.
compute Text2Found = 0.
do repeat Str=Str01 to Str10.
    if (index(Str, "Text1") > 0) Text1Found = Text1Found + 1.
    if (index(Str, "Text2") > 0) Text2Found = Text2Found + 1.
end repeat.
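
For comparison, the same counting logic can be sketched in plain Python. This is only an illustration of the INDEX-plus-LOWER idea, not SPSS code: the function name count_hits and the sample row are made up for the example.

```python
def count_hits(needle, values, ignore_case=True):
    """Count how many strings in `values` contain `needle` as a substring."""
    if ignore_case:
        # Lower-case both sides, like wrapping both INDEX arguments in LOWER.
        needle = needle.lower()
        values = [v.lower() for v in values]
    return sum(needle in v for v in values)

# Hypothetical values of Str01 to Str04 for a single case.
row = ["xx Text1 yy", "TEXT1", "text2 only", ""]
print(count_hits("Text1", row))
print(count_hits("Text1", row, ignore_case=False))
```

With ignore_case left on, "TEXT1" counts as a hit; with it off, only the exact-case occurrence does.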

Of course, with real regexes you can do fancier things (untested):

begin program.
# match string 'Text' unless it is preceded by 'beeh'.
import re
def func(*args):
    return any([bool(re.search(r"(?<!beeh)Text", arg, re.I)) for arg in args])
end program.
spssinc trans result = Text1Found type = 0
    /formula "func(Str01, Str02, Str03)".
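
To check the pattern before wiring it into SPSSINC TRANS, the function can be tried in an ordinary Python session. The sketch below assumes nothing about SPSS; the sample strings are invented, and the pattern is precompiled only for convenience.

```python
import re

# Same pattern as above: 'Text' not immediately preceded by 'beeh',
# matched case-insensitively.
PATTERN = re.compile(r"(?<!beeh)Text", re.I)

def func(*args):
    """Return True if at least one argument contains a match."""
    return any(PATTERN.search(arg) is not None for arg in args)

# Hypothetical rows, each standing in for (Str01, Str02, Str03).
samples = [
    ("some Text1 here", "", ""),
    ("beehText only", "", ""),
    ("nothing", "beehtext", "TEXT2"),
]
for row in samples:
    print(row, func(*row))
```

Because re.search scans the whole string, there is no need to wrap the pattern in ".*" on either side.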
 
Regards,

Albert-Jan



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



From: Jignesh Sutar <[hidden email]>
To: [hidden email]
Sent: Friday, March 7, 2014 2:11 PM
Subject: [SPSSX-L] Regular expression matches
