Expression matches

Expression matches

Christian Schmidhauser

Hi all

 

I want to scan the content of a variable using the syntax below. My problem is that when I run the Python part that defines 'Head', 'Shoulder', and 'Knee', I get the following error:
Unrecoverable application error in the Statistics processor.

This also happens when I re-run the same syntax several times with only the Python statement defining 'Head' (including the 'NEW FILE. DATASET CLOSE all.' part).

I use SPSS version 19 and Python version 2.6.

The problem is that I need to extract various parts of the variable, and my file contains more than 300,000 cases.

 

Can someone help me?

Christian

 

NEW FILE.
DATASET CLOSE all.
DATA LIST FREE / id (a2) StringText (a240).
BEGIN DATA
01 "395,353,311,354,396,313,312,270,271,269"
02 "62"
03 "21,64,22,63"
04 "395,356,353,311,354,355,396,313,312,314,270,271,269"
05 "353,311,354,313,312,314,270,271,269"
06 "353,311,312"
07 " "
08 "353,311,354,355,313,312,270,269"
END DATA.
DATASET NAME Work2.
DATASET ACTIVATE Work2.

* Head.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(62|270|313)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Head type = 0
    /formula "func(StringText)".

* Shoulder.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(353|271|269|270)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Shoulder type = 0
    /formula "func(StringText)".

* Knee.
begin program.
import re
def func(*args):
    return any([bool(re.search(r"(355|311|312|22)", arg, re.I)) for arg in args])
end program.

spssinc trans result = Knee type = 0
    /formula "func(StringText)".

 

**********************************
la volta statistics

Christian Schmidhauser, Dr.phil.II
Im Gubel 29
CH-8706 Feldmeilen

 

 

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Re: Expression matches

David Marso
Administrator
KISS!
Why not simply use regular old-fashioned SPSS CHAR.INDEX and CHAR.SUBSTR in a LOOP?
I will not repost the code I have written so many times, so search this group for "Parse".
You will find numerous examples that are unencumbered by Python.



Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Expression matches

Jon Peck
In reply to this post by Christian Schmidhauser
The oldest version of Statistics I have installed is 23 (64-bit, Unicode mode).  I replicated your data up to 400,000 cases and ran your code.  It completed successfully, so I can only guess that there was a problem with Python 2.6 or the V19 Python plugin or that there is an environmental issue.

SPSSINC TRANS is slower than native processing due to the use of the spss.Dataset class to write the results, but it runs fine in the versions I can test.

On Wed, May 4, 2016 at 6:00 AM, Schmidhauser <[hidden email]> wrote:




--
Jon K Peck
[hidden email]


AW: Expression matches

Christian Schmidhauser
In reply to this post by David Marso
Thanks David

But I have a hard time distinguishing between 322 and 22 when I search for
the string 22.

Christian

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of
David Marso
Sent: Wednesday, May 4, 2016 15:46
To: [hidden email]
Subject: Re: Expression matches

Re: Expression matches

Art Kendall
In reply to this post by Christian Schmidhauser
It is unclear what you are trying to do.

Perhaps try AUTORECODE and RECODE, or perhaps ALTER TYPE, so that there is no confusion about substrings.
Art Kendall
Social Research Consultants
Re: AW: Expression matches

Andy W
In reply to this post by Christian Schmidhauser
FYI, your regexes don't make that distinction either...

If your strings are really as you have shown, you could distinguish "22" from "322" by searching for ",22". (To account for if it is the first number, just append a "," to the front of every string.)
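
A minimal sketch of the delimiter idea above, written in plain Python outside SPSS. The helper name has_code is hypothetical, and padding both ends of the string with a comma (rather than only the front) is an assumption added here so that a search for 22 also rejects a value such as 221.

```python
import re

def has_code(text, codes):
    """Return True if any of `codes` occurs as a whole comma-separated item."""
    # Drop blanks and wrap the string in commas so that every item, including
    # the first and the last, is delimited on both sides: "21,64" -> ",21,64,".
    padded = "," + text.replace(" ", "") + ","
    return any("," + code + "," in padded for code in codes)

knee = ["355", "311", "312", "22"]

# The unanchored pattern from the original post also fires on 322.
print(bool(re.search(r"(355|311|312|22)", "322,400")))  # True
# The delimited search does not.
print(has_code("322,400", knee))                        # False
print(has_code("21,64,22,63", knee))                    # True
```

The same function could be called from the formula in place of the regex, provided the plugin problem described in the original post is resolved.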
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Re: AW: Expression matches

David Marso
Administrator
In reply to this post by Christian Schmidhauser
Perhaps you should post your faulty code and it can be redeemed?

Meanwhile chew on this:
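SET MXLOOPS 100.
* Scratch variables: #copy holds the unparsed remainder, #part the current item.
STRING #copy (A240) #part (A3).
COMPUTE #copy=StringText.
LOOP.
* Find the next comma and split off the item in front of it.
COMPUTE #comma=CHAR.INDEX(#copy,",").
DO IF #comma GT 0.
COMPUTE #part=CHAR.SUBSTR(#copy,1,#comma - 1).
COMPUTE #copy=CHAR.SUBSTR(#copy,#comma + 1).
ELSE.
* Last item: take it and blank the remainder so the loop ends without relying on MXLOOPS.
COMPUTE #part= #copy.
COMPUTE #copy=" ".
END IF.
COMPUTE FOUND=ANY(#part,"22","313","222").
END LOOP IF #copy EQ " " OR FOUND.
LIST.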
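
For comparison, the same item-by-item test can be sketched in Python. This only illustrates the logic of the loop and is not meant to replace it; the function name found and the default tuple of codes (copied from the ANY call) are assumptions made for the example.

```python
def found(string_text, codes=("22", "313", "222")):
    """Mimic the LOOP: split on commas and test each piece for an exact match."""
    for part in string_text.split(","):
        # strip() drops any blanks around the piece before comparing
        if part.strip() in codes:
            return True  # stop at the first hit, like END LOOP IF ... OR FOUND
    return False

rows = [
    "395,353,311,354,396,313,312,270,271,269",
    "62",
    "21,64,22,63",
    " ",
    "353,311,312",
]
print([found(r) for r in rows])  # [True, False, True, False, False]
```

Because each item is compared as a whole, a value such as 322 is never mistaken for 22.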

