Splitting a string

Splitting a string

John F Hall
Not often I ask for advice, but I am helping someone out with a data set
which has a string variable that needs splitting.  The strings consist of
one letter or two, followed by three digits, each of which indicates
something different.  I've sorted the letter codes manually, but I still
need to generate three new variables, one for each digit eg:

f502       5 0 2
f503       5 0 3
f504       5 0 2
f521       5 2 1
fy101     1 0 1
fy102     1 0 2
fy111     1 1 1
fy121     1 2 1

I could do it manually in the Data Editor by scrolling down and deleting the
letters  then creating three variables arithmetically with

compute x = trunc (y/100) .
compute z = mod (y, 10) .

etc etc, but someone out there will have a much neater solution.

John Hall
[hidden email]
http://surveyresearch.weebly.com

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Splitting a string

Kylie
Hi John,

If it is always the last three characters of the string that hold the digits
you need, then the following should create the numeric variable you can do
the arithmetic on:

COMPUTE y=NUMBER(CHAR.SUBSTR(str,CHAR.LENGTH(str)-2),f8).

Cheers,
Kylie.
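
For readers who want to check the logic outside SPSS, here is a rough Python sketch of the same idea: trim the string, take its last three characters, and convert them to a number. The function name last_three_digits is made up for illustration and is not part of SPSS; the error handling is an added assumption.

```python
def last_three_digits(code: str) -> int:
    """Return the last three characters of `code` as an integer."""
    # SPSS string values are padded with trailing blanks, so trim first.
    trimmed = code.strip()
    tail = trimmed[-3:]
    if len(tail) < 3 or not tail.isdecimal():
        raise ValueError(f"expected three trailing digits in {code!r}")
    return int(tail)


if __name__ == "__main__":
    for code in ["f502", "fy101", "fy121   "]:
        print(code.strip(), last_three_digits(code))
```

Like the COMPUTE above, this only works if the digits really are the final three characters of every value.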



Re: Splitting a string

Richard Ristow
In reply to this post by John F Hall
At 01:53 AM 9/14/2010, John F Hall wrote:

I am helping someone out with a data set which has a string variable that needs splitting.  The strings consist of one letter or two, followed by three digits, each of which indicates something different.  I've sorted the letter codes, but I still need to generate three new variables, one for each digit eg:
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:30:50       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3

f502        5     0     2
f503        5     0     3
f504        5     0     2
f521        5     2     1
fy101       1     0     1
fy102       1     0     2
fy111       1     1     1
fy121       1     2     1

Number of cases read:  8    Number of cases listed:  8

This is tested:

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).

STRING  Prefix              (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).

COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer).
COMPUTE #Buffer  = SUBSTR(#Buffer,  #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.

 
List
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:30:50       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3 Prefix My_n1 My_n2 My_n3

f502        5     0     2  f5        5     0     2
f503        5     0     3  f5        5     0     3
f504        5     0     2  f5        5     0     4
f521        5     2     1  f5        5     2     1
fy101       1     0     1  fy1       1     0     1
fy102       1     0     2  fy1       1     0     2
fy111       1     1     1  fy1       1     1     1
fy121       1     2     1  fy1       1     2     1

Number of cases read:  8    Number of cases listed:  8
=============================
APPENDIX: Test data, and code
=============================
DATA LIST LIST/
   Input,   Yr_n1, Yr_n2, Yr_n3
   (A8,      3F2).
BEGIN DATA
   f502       5 0 2
   f503       5 0 3
   f504       5 0 2
   f521       5 2 1
   fy101     1 0 1
   fy102     1 0 2
   fy111     1 1 1
   fy121     1 2 1
END DATA.
LIST.

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).

STRING  Prefix              (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).

COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer).
COMPUTE #Buffer  = SUBSTR(#Buffer,  #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.


Re: Splitting a string

Richard Ristow
In reply to this post by John F Hall
BOTHER. I posted with a bug that appended the first digit to the 'prefix'.

At 01:53 AM 9/14/2010, John F Hall wrote:
I have a string variable that needs splitting.  The strings consist of one letter or two, followed by three digits, each of which indicates something different.  I need to generate three new variables, one for each digit eg:
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:45:10       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3

f502        5     0     2
f503        5     0     3
f504        5     0     2
f521        5     2     1
fy101       1     0     1
fy102       1     0     2
fy111       1     1     1
fy121       1     2     1

Number of cases read:  8    Number of cases listed:  8


Here's a correction; the only change is #Pointer-1 in place of #Pointer in the COMPUTE Prefix line:

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).

STRING  Prefix              (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).

COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer-1).
COMPUTE #Buffer  = SUBSTR(#Buffer,  #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:45:11       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3 Prefix My_n1 My_n2 My_n3

f502        5     0     2  f         5     0     2
f503        5     0     3  f         5     0     3
f504        5     0     2  f         5     0     4
f521        5     2     1  f         5     2     1
fy101       1     0     1  fy        1     0     1
fy102       1     0     2  fy        1     0     2
fy111       1     1     1  fy        1     1     1
fy121       1     2     1  fy        1     2     1

Number of cases read:  8    Number of cases listed:  8
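
The off-by-one is easy to see in a language with slices. The following is only a Python sketch of the same split, not a translation of the SPSS program: split_code is a hypothetical name, and None stands in for a system-missing value. Note that SPSS's INDEX is 1-based while Python positions are 0-based, so the prefix here is simply everything before the first digit.

```python
DIGITS = "0123456789"


def split_code(code: str, n_digits: int = 3):
    """Split e.g. 'fy101' into ('fy', [1, 0, 1])."""
    buffer = code.strip()
    # 0-based position of the first digit, or len(buffer) if there is none.
    first = next((i for i, ch in enumerate(buffer) if ch in DIGITS), len(buffer))
    prefix = buffer[:first]  # stops just before the first digit
    # Collect digits from the first digit onwards, skipping anything else.
    digits = [int(ch) for ch in buffer[first:] if ch in DIGITS][:n_digits]
    # Pad with None where fewer than n_digits digits were found.
    digits += [None] * (n_digits - len(digits))
    return prefix, digits


if __name__ == "__main__":
    for code in ["f502", "f504", "fy101", "fy121"]:
        print(code, split_code(code))
```

Slicing up to and including the first digit, buffer[:first + 1], reproduces the original bug and gives prefixes such as 'f5' and 'fy1'.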
=============================
APPENDIX: Test data, and code
(Revised)
=============================
DATA LIST LIST/
   Input,   Yr_n1, Yr_n2, Yr_n3
   (A8,      3F2).
BEGIN DATA
   f502       5 0 2
   f503       5 0 3
   f504       5 0 2
   f521       5 2 1
   fy101     1 0 1
   fy102     1 0 2
   fy111     1 1 1
   fy121     1 2 1
END DATA.
LIST.

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).

STRING  Prefix              (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).

COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer-1).
COMPUTE #Buffer  = SUBSTR(#Buffer,  #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.


Re: Splitting a string

John F Hall
In reply to this post by Kylie
Kylie, Richard
 
Barely had time to get my breakfast and already 2 solutions.  Richard's will save some complex arithmetic, but Kylie's syntax will come in handy should I need something similar in future.  Always something new to learn!
 
Mille fois merci.
 
John

Re: Splitting a string

John F Hall
In reply to this post by John F Hall
Just tried Richard's original, but it generated an error. 
 

COMPUTE #Buffer=LTRIM (Input).

Error # 4285 in column 24. Text: Input
Incorrect variable name: either the name is more than 64 characters, or it is
not defined by a previous command.
Execution of this command stops.

Tried again with a couple of variations, but new variables contain only sysmis.  Perhaps I should have said I'm reading from a *.sav file and the variable to split is called student.
 
Will do Kylie's as well: give me some practice in logic.
 

Re: Splitting a string

Jon K Peck
In reply to this post by John F Hall

Just for fun, here is a one-command solution using the SPSSINC TRANS extension command available from Developer Central.  (I know John won't like this, but here it is anyway.)

spssinc trans result=x y z
/formula 're.search(r"(\d)(\d)(\d)",f).groups()'.

It creates numeric variables x, y, and z each holding a single digit, where f is the input variable.

Regular expressions such as these provide powerful pattern-based string manipulation techniques.
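
The formula is an ordinary Python expression, so it can be tried in plain Python as well. This is a minimal sketch outside SPSS: three_digits is a made-up helper, and the conversion to int and the None guard are additions for illustration (groups() itself returns strings, and re.search returns None when nothing matches).

```python
import re


def three_digits(code: str):
    """Return the first run of three digits in `code` as a tuple of ints."""
    match = re.search(r"(\d)(\d)(\d)", code)
    if match is None:
        return None  # no three consecutive digits found
    return tuple(int(d) for d in match.groups())


if __name__ == "__main__":
    for code in ["f502", "fy101", "fy"]:
        print(code, three_digits(code))
```

Because the pattern looks for three consecutive digits anywhere in the value, it does not care how many letters come first.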

Regards,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435




Re: Splitting a string

willers
Perl-style regular expressions are powerful, but the current problem is simple.
Just cut off the last three characters. 
This can be done with original SPSS-functions:

compute length=length(rtrim(string)) /*count the characters*/.
string stringnum (a3)  /*declare a string*/.
compute stringnum=char.substr(string,length-2) /*cut off*/.  
compute numbers=NUMBER(stringnum,f3 )     /*transform into a number*/.
execute.

Reinhart Willers


Re: Splitting a string

Jon K Peck

Sure, but you still have to split the remaining 3-digit value into separate variables to meet the original specification.  Easy enough to do with separate computes and trunc, but the code is starting to add up, and it is a bit less robust to formatting errors.
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435




Re: Splitting a string

John F Hall
In reply to this post by Jon K Peck

Jon
 
I love the fancy stuff: it's just that the kind of students I taught (and still help) have enough problems running basic SPSS syntax without having to get their heads round additional bells and whistles, especially when they have to download extensions etc.  Perhaps in 5 years I'll get round to a second and third level of tutorials, but things like attitude measurement and scale construction will take precedence.  Meanwhile I have to finish blocks 2 and 3 and decide how much plain English introduction to write for inferential stats without using any equations.
 
Neither solution from Kylie and Richard worked first time.  Kylie's worked when I substituted a variable name for str.  Meanwhile I had already stripped the letters off manually by copying the column to Word,  using find/replace and then pasting it back into a new column.  Only 15 codes and it only took a couple of minutes. 
 
*Kylie's (amended) .
COMPUTE y=NUMBER(CHAR.SUBSTR(student,CHAR.LENGTH(student)-2),f8).

* My check .
compute check = y - var00004 .
freq check .
 

check
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0             318     100.0           100.0                100.0

 
The following arithmetic produced the variables needed once the letters were stripped off.
 
compute year = trunc (y/100) .
compute perform = trunc ((y - year*100)/10) .
compute groupid = y - (year*100) - (perform * 10) .
freq year to groupid .
 

year
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1              78      24.5            24.5                 24.5
        2              74      23.3            23.3                 47.8
        3              62      19.5            19.5                 67.3
        4              66      20.8            20.8                 88.1
        5              38      11.9            11.9                100.0
        Total         318     100.0           100.0

 
 

perform
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0             112      35.2            35.2                 35.2
        1              79      24.8            24.8                 60.1
        2             127      39.9            39.9                100.0
        Total         318     100.0           100.0

 
 

groupid
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1             172      54.1            54.1                 54.1
        2             118      37.1            37.1                 91.2
        3              11       3.5             3.5                 94.7
        4              11       3.5             3.5                 98.1
        5               4       1.3             1.3                 99.4
        6               2        .6              .6                100.0
        Total         318     100.0           100.0
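
The TRUNC arithmetic used for year, perform and groupid is plain integer division with remainders. As a sketch only, the same decomposition in Python looks like this; split_three_digit is a hypothetical name, and the variable names are borrowed from the syntax in this post.

```python
def split_three_digit(y: int):
    """Split a three-digit number such as 521 into (5, 2, 1)."""
    if not 0 <= y <= 999:
        raise ValueError("expected a value between 0 and 999")
    year, rest = divmod(y, 100)          # hundreds digit, then what is left
    perform, groupid = divmod(rest, 10)  # tens digit and units digit
    return year, perform, groupid


if __name__ == "__main__":
    for y in (101, 502, 521):
        print(y, split_three_digit(y))
```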

 
 
Thanks to everyone else for their suggestions.
 
Why does the font size change irrevocably after we copy output from SPSS?
 

Re: Splitting a string

Richard Ristow
In reply to this post by John F Hall
At 03:55 AM 9/14/2010, John F Hall wrote:

Just tried Richard's original, but it generated an error. 
 

COMPUTE #Buffer=LTRIM (Input).
Error # 4285 in column 24. Text: Input
Incorrect variable name: either the name is more than 64 characters, or it is not defined by a previous command.

Execution of this command stops.

Tried again with a couple of variations, but new variables contain only sysmis.  Perhaps I should have said I'm reading from a *.sav file and the variable to split is called student.

That'll do it, all right. You'll see that, in my test data, I named the variable to split "Input", since you hadn't given another name. So,
Error # 4285 in column 24. Text: Input
Incorrect variable name: ... it is not defined by a previous command.

Nope, in your code "Input" wasn't defined by a previous command, nor part of the input file. Try running my code (I recommend the corrected version), but replacing variable name "Input" by "student". String variable "#Buffer" should be at least as long as variable "student".

Re: Splitting a string

John F Hall
Richard
 
I'll have to try it later on a copy of the file.  I substituted the wrong bit in my amended syntax.  Meanwhile the problem has been sorted.  See my reply to Jon.
 
John

Re: Splitting a string

David Marso
Administrator
In reply to this post by John F Hall
John,
Use your variable name INSTEAD OF JUNK.  Or rename your variable to JUNK.
----------------
VECTOR A (3).
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+  COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

Might need CHAR.SUBSTR rather than SUBSTR with newer version.

The increment BEFORE the compute is DELIBERATE and deals with the BASE1
vectors EVIL LAUGH!!!
DON'T YOU LOVE the IMPLICIT initialization of scratch variables!
Usually one would say COMPUTE ##=1 before using it, but scratch variables
are initialized to 0.

I could alternatively have used the following, but somehow I find the first
to be aesthetically preferable as it at least resembles some sort of
explicit initialization :

VECTOR A (3).
LOOP #=INDEX(junk,"0123456789",1) to Len(RTRIM(JUNK)).
+  COMPUTE A(##+1)=NUMBER(SUBSTR(JUNK,#,1),F1).
COMPUTE ##=##+1 .
END LOOP.

I mean there is less of the WTF? Where in the hell did ## come from.
OTOH, there is always the FM!!!

HTH, David

Jon Peck,
Python ;-))) On this it's like taking a chainsaw to a toothpick ;-))))
LOL!

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Splitting a string

Richard Ristow
OK, a bit of aesthetics -- I take the liberty, David, because you are good at programming aesthetics; and both of us think about them a good deal.

At 03:10 AM 9/15/2010, David Marso wrote:

VECTOR A (3).
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+  COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

The increment BEFORE the compute is DELIBERATE and deals with the BASE-1 vectors EVIL LAUGH!!!
DON'T YOU LOVE the IMPLICIT initialization of scratch variables! Usually one would say COMPUTE ##=1 before using it, but scratch variables are initialized to 0.

Actually, whether or not I love how scratch variables are initialized, I'd avoid writing code that relies on it where that might be obscure. Explicitly initializing, as in

VECTOR A (3).
*  Add the following line: .
COMPUTE ##=0.
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+  COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

clarifies how you expect "##" to be computed, and used.

Besides, if you add the line, the code works; if you don't, it doesn't. :-(

Scratch variables are initialized to 0; but only once per transformation program, not once per case. If you don't initialize, "##" starts the second case with its last value from the previous case, namely 3; it's incremented to 4, which is an invalid index for vector A; and all blows up in a flurry of warnings.
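
As a rough Python analogy for the paragraph above (not SPSS syntax; the function and variable names are made up for illustration), the sketch below keeps a counter `slot` that plays the role of `##`. With `reset_per_case=False` the counter survives from one case to the next, so the second case asks for slot 4 of a three-slot vector. Python raises an exception at that point, whereas SPSS issues warnings.

```python
def extract_digits(codes, reset_per_case):
    """Mimic the per-case loop; `slot` stands in for the scratch variable ##."""
    results = []
    slot = 0                       # initialised once, before any case is read
    for code in codes:             # one pass per case
        if reset_per_case:
            slot = 0               # the explicit COMPUTE ##=0
        a = [None, None, None]     # stands in for VECTOR A (3)
        for ch in code:
            if not ch.isdigit():   # skip the leading letters
                continue
            slot += 1              # increment before use: the vector is 1-based
            if slot > len(a):
                raise IndexError(f"slot {slot} is outside A(1..3) for {code!r}")
            a[slot - 1] = int(ch)
        results.append(a)
    return results


print(extract_digits(["f502", "fy101"], reset_per_case=True))
```

With the reset in place this prints `[[5, 0, 2], [1, 0, 1]]`; calling it with `reset_per_case=False` fails on the second code, which is the situation described above.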

=============================
APPENDIX: Test data, and code
(output not posted)
=============================
DATA LIST LIST/
   JUNK,     Yr_n1, Yr_n2, Yr_n3
   (A8,      3F2).
BEGIN DATA
   f502       5 0 2
   f503       5 0 3
   f504       5 0 2
   f521       5 2 1
   fy101     1 0 1
   fy102     1 0 2
   fy111     1 1 1
   fy121     1 2 1
END DATA.
LIST.

VECTOR A (3).
COMPUTE ##=0.
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+  COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

LIST.


Re: Splitting a string

David Marso
Good catch, Richard.
The implicit initialization was probably a symptom of the hour ;-)
> At 03:10 AM 9/15/2010, David Marso wrote:
I should probably not touch my send button after midnight.
David



Re: Splitting a string

Roberts, Michael-2
In reply to this post by Jon K Peck

Jon (and anyone else who may have a clue!),

I have been trying to get this method (Python) to work, but cannot do so.  I am using ver. 17.0.2 and have the SPSSINC_TRANS extension downloaded and installed.  Also, when I try to get the help information, I get the following error message: TypeError: unsupported operand type(s) for /: 'module' and '_Helper'.  I ran the command for help as part of a program.  How should I invoke this extension?

TIA
Mike

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jon K Peck
Sent: Tuesday, September 14, 2010 8:59 AM
To: [hidden email]
Subject: Re: Splitting a string

 


Just for fun, here is a one-command solution using the SPSSINC TRANS extension command available from Developer Central.  (I know John won't like this, but here it is anyway.)

spssinc trans result=x y z
/formula 're.search(r"(\d)(\d)(\d)",f).groups()'.

It creates numeric variables x, y, and z each holding a single digit, where f is the input variable.

Regular expressions such as these provide powerful pattern-based string manipulation techniques.

Regards,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
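
To see what the regular expression in that formula does on its own, here is a small plain-Python sketch run outside SPSS. The helper name `split_digits` and the conversion to `int` are additions for illustration; the formula above simply hands back the tuple from `groups()`.

```python
import re

# Three capture groups, each matching exactly one digit.
THREE_DIGITS = re.compile(r"(\d)(\d)(\d)")


def split_digits(code):
    """Return the first three consecutive digits of `code` as integers."""
    match = THREE_DIGITS.search(code)
    if match is None:
        # re.search returns None when nothing matches, so calling
        # .groups() directly would raise AttributeError in that case.
        return None
    return tuple(int(d) for d in match.groups())


for code in ["f502", "fy121"]:
    print(code, split_digits(code))
```

A value with fewer than three consecutive digits yields `None` here rather than an error, which is a design choice of this sketch and not something the one-line formula does.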



Re: Splitting a string

Jon K Peck

Please send me your actual code.  Note that SPSSINC TRANS is an extension command and should be run like regular syntax, that is, not in a program.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
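
One plausible reading of the TypeError, offered as a guess since the actual code was not posted: if the help request was typed inside a Python program block as something like `SPSSINC_TRANS /help`, Python parses it as a division of a module by the built-in `help` object. The sketch below reproduces the same kind of message in plain Python 3, using the standard `re` module as a stand-in for the extension module.

```python
import re  # stand-in for any imported module; not the SPSS extension itself

try:
    # Python reads this as "module divided by the built-in help object".
    re /help
    message = ""
except TypeError as exc:
    message = str(exc)

print(message)
```

That would fit the advice above: run as regular syntax, the `/help` is a subcommand rather than an arithmetic expression.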




Re: Splitting a string

Roberts, Michael-2

Hi Jon,

Never mind… I got it to work once I restarted my system and ran the program as regular syntax.  Your spssinc_trans.py file explains its workings, which I tried to follow (somewhat), but the syntax brevity is very attractive.  FWIW, below is the syntax I used.  I am still not sure how to get the “help” command/function to return anything other than “error” messages(?)

data list list/
 testdata
  (a8).
begin data
 e8021
 v2950
 e80051
 e8003
 e72103
 e6241
 e80231
 v90050
end data.
list.

spssinc trans result=pt1 pt2 pt3 pt4
/formula 're.search(r"(\d)(\d)(\d)(\d)",testdata).groups()'.
formats pt1 pt2 pt3 pt4 (N).
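
A note on the test data above: some values carry five digits, and a four-group pattern keeps only the first four it finds. The plain-Python sketch below (run outside SPSS; the names `first_four` and `all_digits` are made up for illustration) shows this and, as one possible alternative, collects every digit with `re.findall`.

```python
import re

codes = ["e8021", "v2950", "e80051", "e8003",
         "e72103", "e6241", "e80231", "v90050"]

FOUR_DIGITS = re.compile(r"(\d)(\d)(\d)(\d)")

# The four-group pattern stops after the first four consecutive digits.
first_four = {c: FOUR_DIGITS.search(c).groups() for c in codes}

# findall returns every digit, however many there are.
all_digits = {c: tuple(re.findall(r"\d", c)) for c in codes}

for c in codes:
    print(c, first_four[c], all_digits[c])
```

Whether dropping the fifth digit matters depends on what the codes mean, which the thread does not say.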

 

Best Regards
Mike

 


Re: Splitting a string

Jon K Peck

Glad you got this to work.  You can get help from the dialog box, but to get the syntax help, just do this.


spssinc trans /help.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435


