Python: do I need RE for this?

Ruben Geert van den Berg
Dear all,
 
I sometimes use an 'if' statement to extract the variables I need, like
 
tarlist=[k for k in spssaux.GetVariableNamesList() if 'v42' in k.lower()]

Is it right that I can only specify a single, simple character string behind 'if'? Or, put differently, if I have variables [v42 av42 bV42 V42_blah v420] and I'd like to extract the v42 variables (v42 and V42_blah) from them, am I going to need an RE statement ('starts with (case-insensitive) V42 and then no more digits (to exclude v420)')?

TIA,

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com



Re: Python: do I need RE for this?

Jon K Peck

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Ruben van den Berg <[hidden email]>
To: [hidden email]
Date: 08/06/2010 04:06 AM
Subject: [SPSSX-L] Python: do I need RE for this?
Sent by: "SPSSX(r) Discussion" <[hidden email]>





Dear all,

I sometimes use an 'if' statement to extract the variables I need, like

tarlist=[k for k in spssaux.GetVariableNamesList() if 'v42' in k.lower()]

Is it right that I can only specify a single, simple character string behind 'if'? Or, put differently, if I have variables [v42 av42 bV42 V42_blah v420] and I'd like to extract the v42 variables (v42 and V42_blah) from them, am I going to need an RE statement ('starts with (case-insensitive) V42 and then no more digits (to exclude v420)')?

>>>You can write something more general that captures a "contains" idea, for example
tarlist = [k for k in spssaux.GetVariableNamesList() if k.find('42') >= 0]
That would select any name containing the string "42".  You can also put other conditions after the if.
Or, [k for k in spssaux.GetVariableNamesList() if k.endswith('42')]

These are technically not regular expressions.
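
To illustrate the point about combining conditions after the if, here is a minimal sketch for the names in the question that uses only string methods. The hard-coded list is a stand-in (an assumption, so the snippet runs outside SPSS) for what spssaux.GetVariableNamesList() would return.

```python
# Stand-in for spssaux.GetVariableNamesList(), which needs a running SPSS session.
names = ['v42', 'av42', 'bV42', 'V42_blah', 'v420']

# Keep names that start with "v42" in any case and whose next character,
# if there is one, is not a digit (this is what excludes v420).
tarlist = [k for k in names
           if k.lower().startswith('v42') and not k[3:4].isdigit()]

print(tarlist)
```

The slice k[3:4] is used instead of k[3] so that a name of exactly three characters yields an empty string rather than an IndexError.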

Another way to do this is to use the pattern capabilities of the VariableDict object.  E.g.,
vardict = spssaux.VariableDict()
all42 = vardict.variables(pattern=".*42$")
would select all variables in the dictionary ending in 42.
These are true regular expressions, so they can be used to match much more complicated name patterns.
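
For the specific case in the question (starts with v42 in any case, not followed by another digit), a regular expression could look roughly like the sketch below. It uses Python's standard re module on a hard-coded list that stands in for the SPSS variable names; whether VariableDict accepts the same lookahead pattern and a case-insensitive flag is not confirmed here, so treat that as an assumption to test.

```python
import re

# Stand-in for the variable names in the active SPSS dataset.
names = ['v42', 'av42', 'bV42', 'V42_blah', 'v420']

# "v42" in any case, with a negative lookahead that rejects a following digit.
# re.match only matches at the start of the string, so av42 and bV42 are excluded.
pattern = re.compile(r'v42(?!\d)', re.IGNORECASE)

v42_vars = [k for k in names if pattern.match(k)]

print(v42_vars)
```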

HTH,
Jon Peck
