|
Not often I ask for advice, but I am helping someone out with a data set
which has a string variable that needs splitting. The strings consist of one or two letters, followed by three digits, each of which indicates something different. I've sorted the letter codes manually, but I still need to generate three new variables, one for each digit, e.g.:

f502   5 0 2
f503   5 0 3
f504   5 0 2
f521   5 2 1
fy101  1 0 1
fy102  1 0 2
fy111  1 1 1
fy121  1 2 1

I could do it manually in the Data Editor by scrolling down and deleting the letters, then creating three variables arithmetically with

compute x = trunc (y/100)
compute z = mod (y,10)

etc. etc., but someone out there will have a much neater solution.

John Hall
[hidden email]
http://surveyresearch.weebly.com

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
|
Hi John,
If it is always the last three characters of the string that hold the digits you need, then the following should create the numeric variable you can do the arithmetic on:

COMPUTE y=NUMBER(CHAR.SUBSTR(str,CHAR.LENGTH(str)-2),f8).

Cheers,
Kylie.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of John F Hall
Sent: Tuesday, 14 September 2010 3:23 PM
To: [hidden email]
Subject: Splitting a string
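For anyone who finds it easier to follow the logic in Python than in SPSS syntax, here is a minimal sketch of what Kylie's expression does. It is only an illustration, not SPSS code: the function name `last_three_as_number` is made up for this example, and it assumes, as Kylie does, that the digits are always the last three non-blank characters.

```python
def last_three_as_number(code: str) -> int:
    """Rough Python analogue of NUMBER(CHAR.SUBSTR(str, CHAR.LENGTH(str) - 2), f8)."""
    trimmed = code.rstrip()       # CHAR.LENGTH does not count trailing blanks
    return int(trimmed[-3:])      # last three characters, read as a number


for code in ["f502", "fy101", "fy121   "]:
    print(code.strip(), last_three_as_number(code))
```

The result is a single three-digit number, so the individual digits still have to be separated afterwards.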
|
In reply to this post by John F Hall
At 01:53 AM 9/14/2010, John F Hall wrote:
> I am helping someone out with a data set which has a string variable that needs splitting. The strings consist of one letter or two, followed by three digits, each of which indicates something different. I've sorted the letter codes, but I still need to generate three new variables, one for each digit eg:

This is tested:

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).
STRING  Prefix   (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).
COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer).
COMPUTE #Buffer  = SUBSTR(#Buffer, #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:30:50       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3 Prefix My_n1 My_n2 My_n3

f502         5     0     2 f5         5     0     2
f503         5     0     3 f5         5     0     3
f504         5     0     2 f5         5     0     4
f521         5     2     1 f5         5     2     1
fy101        1     0     1 fy1        1     0     1
fy102        1     0     2 fy1        1     0     2
fy111        1     1     1 fy1        1     1     1
fy121        1     2     1 fy1        1     2     1

Number of cases read:  8    Number of cases listed:  8

=============================
APPENDIX: Test data, and code
=============================
DATA LIST LIST/
   Input, Yr_n1, Yr_n2, Yr_n3
   (A8,   3F2).
BEGIN DATA
f502 5 0 2
f503 5 0 3
f504 5 0 2
f521 5 2 1
fy101 1 0 1
fy102 1 0 2
fy111 1 1 1
fy121 1 2 1
END DATA.
LIST.

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).
STRING  Prefix   (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).
COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer).
COMPUTE #Buffer  = SUBSTR(#Buffer, #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.
|
In reply to this post by John F Hall
BOTHER. I posted with a bug that appended the first digit to the
'prefix'.
At 01:53 AM 9/14/2010, John F Hall wrote:
> I have a string variable that needs splitting. The strings consist of one letter or two, followed by three digits, each of which indicates something different. I need to generate three new variables, one for each digit eg:

Here's a correction; the change is the "-1" in the COMPUTE Prefix line:

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).
STRING  Prefix   (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).
COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer-1).
COMPUTE #Buffer  = SUBSTR(#Buffer, #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-SEP-2010 02:45:11       |
|-----------------------------|---------------------------|
Input    Yr_n1 Yr_n2 Yr_n3 Prefix My_n1 My_n2 My_n3

f502         5     0     2 f          5     0     2
f503         5     0     3 f          5     0     3
f504         5     0     2 f          5     0     4
f521         5     2     1 f          5     2     1
fy101        1     0     1 fy         1     0     1
fy102        1     0     2 fy         1     0     2
fy111        1     1     1 fy         1     1     1
fy121        1     2     1 fy         1     2     1

Number of cases read:  8    Number of cases listed:  8

=============================
APPENDIX: Test data, and code
(Revised)
=============================
DATA LIST LIST/
   Input, Yr_n1, Yr_n2, Yr_n3
   (A8,   3F2).
BEGIN DATA
f502 5 0 2
f503 5 0 3
f504 5 0 2
f521 5 2 1
fy101 1 0 1
fy102 1 0 2
fy111 1 1 1
fy121 1 2 1
END DATA.
LIST.

STRING  #Buffer  (A8).
NUMERIC #Pointer (F3).
STRING  Prefix   (A5).
NUMERIC My_n1, My_n2, My_n3 (F2).

COMPUTE #Buffer=LTRIM(Input).
COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
COMPUTE Prefix   = SUBSTR(#Buffer,1,#Pointer-1).
COMPUTE #Buffer  = SUBSTR(#Buffer, #Pointer).

DO REPEAT TARGET = My_n1 My_n2 My_n3.
.  COMPUTE #Pointer = INDEX (#Buffer,'0123456789',1).
.  DO IF   #Pointer GT 0.
.     COMPUTE Target  = NUMBER(SUBSTR(#Buffer,#Pointer,1),F1).
.     COMPUTE #Buffer = SUBSTR(#Buffer,#Pointer+1).
.  END IF.
END REPEAT.

LIST.
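The scanning logic of the corrected version (find the first digit, keep everything before it as the prefix, then peel off one digit at a time) can be sketched in Python as follows. This is an illustration under assumptions, not a translation that SPSS would run: `split_code` is a made-up name, `None` stands in for system-missing, and returning the whole string as the prefix when there is no digit at all is a choice made for this sketch rather than something the SPSS code is shown to do.

```python
DIGITS = "0123456789"


def first_digit_pos(text):
    """0-based position of the first digit in text, or None if there is none."""
    return next((i for i, ch in enumerate(text) if ch in DIGITS), None)


def split_code(code, n_digits=3):
    """Return (prefix, [digit, ...]); missing digits are reported as None."""
    buffer = code.strip()
    pointer = first_digit_pos(buffer)
    if pointer is None:                      # assumption: no digits at all
        return buffer, [None] * n_digits
    prefix = buffer[:pointer]                # everything before the first digit
    buffer = buffer[pointer:]
    digits = []
    for _ in range(n_digits):                # plays the role of DO REPEAT
        pos = first_digit_pos(buffer)
        if pos is None:                      # nothing left: leave as missing
            digits.append(None)
            continue
        digits.append(int(buffer[pos]))
        buffer = buffer[pos + 1:]            # drop the digit just consumed
    return prefix, digits


for code in ["f502", "fy121", "f5"]:
    print(code, split_code(code))
```

Slicing with `buffer[:pointer]` excludes the first digit, which is the same effect as the `#Pointer-1` fix above.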
|
In reply to this post by Kylie
Kylie, Richard
Barely had time to get my breakfast and already 2
solutions. Richard's will save some complex arithmetic, but Kylie's syntax
will come in handy should I need something similar in future. Always
something new to learn!
Mille fois merci.
John
|
|
In reply to this post by John F Hall
Just tried Richard's original, but it generated an
error.

COMPUTE #Buffer=LTRIM (Input).

Error # 4285 in column 24. Text: Input
Incorrect variable name: either the name is more than 64 characters, or it is not defined by a previous command.
Execution of this command stops.

Tried again with a couple of variations, but new variables
contain only sysmis. Perhaps I should have said I'm reading from a *.sav
file and the variable to split is called student.
Will do Kylie's as well: give me some practice in logic.
|
|
In reply to this post by John F Hall
Just for fun, here is a one-command solution using the SPSSINC TRANS extension command available from Developer Central. (I know John won't like this, but here it is anyway.)

spssinc trans result=x y z
/formula 're.search(r"(\d)(\d)(\d)",f).groups()'.

It creates numeric variables x, y, and z each holding a single digit, where f is the input variable. Regular expressions such as these provide powerful pattern-based string manipulation techniques.

Regards,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
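The formula in that command is ordinary Python, so it can be tried outside SPSS. The sketch below is only an illustration: the wrapper `three_digits` is a made-up name, and the `int()` conversion and the `None` return for non-matching strings are additions for this example (in plain Python, `groups()` returns strings, and calling it on a failed search raises an error).

```python
import re


def three_digits(code):
    """First run of three consecutive digits in code, as a tuple of ints."""
    match = re.search(r"(\d)(\d)(\d)", code)
    if match is None:          # no three consecutive digits found
        return None
    return tuple(int(d) for d in match.groups())


for code in ["f502", "fy121", "fy"]:
    print(code, three_digits(code))
```

Each parenthesised `(\d)` is one capture group, which is why the pattern yields three separate values rather than one three-digit number.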
|
Perl-style regular expressions are powerful, but the current problem is simple.
Just cut off the last three characters. This can be done with built-in SPSS functions:

compute length=length(rtrim(string)) /*count the characters*/.
string stringnum (a3) /*declare a string*/.
compute stringnum=char.substr(string,length-2) /*cut off*/.
compute numbers=NUMBER(stringnum,f3 ) /*transform into a number*/.
execute.

Reinhart Willers

On 14.09.2010 at 14:58, Jon K Peck wrote:
|
|
Sure, but you still have to split the remaining 3-digit value into separate variables to meet the original specification. Easy enough to do with separate computes and trunc, but the code is starting to add up, and it is a bit less robust to formatting errors.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
|
In reply to this post by Jon K Peck

Jon
I love the fancy stuff: it's just that the kind of
students I taught (and still help) have enough problems running basic SPSS
syntax without having to get their heads round additional bells and whistles,
especially when they have to download extensions etc. Perhaps in 5 years
I'll get round to a second and third level of tutorials, but things like
attitude measurement and scale construction will take precedence.
Meanwhile I have to finish blocks 2 and 3 and decide how much plain English
introduction to write for inferential stats without using any
equations.
Neither solution from Kylie and Richard worked
first time. Kylie's worked when I substituted a variable name for
str. Meanwhile I had already stripped the letters off manually by copying
the column to Word, using find/replace and then pasting it back into a new
column. Only 15 codes and it only took a couple of minutes.
* Kylie's (amended) .
COMPUTE y=NUMBER(CHAR.SUBSTR(student,CHAR.LENGTH(student)-2),f8).
* My check .
compute check = y - var00004 .
freq check .

The following arithmetic produced the variables needed once the letters
were stripped off.

compute year = trunc (y/100) .
compute perform = trunc ((y - year*100)/10) .
compute groupid = y - (year*100) - (perform * 10) .
freq year to groupid .
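As a quick check on that arithmetic, the same three steps can be written in Python. This is only a sketch: the function name `split_digits` is made up, and it assumes y is a non-negative whole number below 1000, so that integer division (`//`) behaves like `trunc`.

```python
def split_digits(y):
    """Split a three-digit number into (hundreds, tens, units)."""
    year = y // 100                             # trunc(y/100)
    perform = (y - year * 100) // 10            # trunc((y - year*100)/10)
    groupid = y - year * 100 - perform * 10     # what is left over
    return year, perform, groupid


for y in [502, 521, 101, 121]:
    print(y, split_digits(y))
```

For example, 521 gives year 5, perform 2 and groupid 1, matching the row "f521 5 2 1" in the original question.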
Thanks to everyone else for their
suggestions.
Why does the font size change irrevocably after we
copy output from SPSS?
|
|
In reply to this post by John F Hall
At 03:55 AM 9/14/2010, John F Hall wrote:
> Just tried Richard's original, but it generated an error.

That'll do it, all right. You'll see that, in my test data, I named the variable to split "Input", since you hadn't given another name. So,

> Error # 4285 in column 24. Text: Input

Nope, in your code "Input" wasn't defined by a previous command, nor part of the input file.

Try running my code (I recommend the corrected version), but replacing variable name "Input" by "student". String variable "#Buffer" should be at least as long as variable "student".
|
Richard
I'll have to try it later on a copy of the
file. I substituted the wrong bit in my amended syntax. Meanwhile
the problem has been sorted. See my reply to Jon.
John
|
|
|
In reply to this post by John F Hall
John,
Use your variable name INSTEAD OF JUNK. Or rename your variable to JUNK.
----------------
VECTOR A (3).
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+ COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

Might need CHAR.SUBSTR rather than SUBSTR with newer versions.

The increment BEFORE the compute is DELIBERATE and deals with the BASE-1 vectors. EVIL LAUGH!!!
DON'T YOU LOVE the IMPLICIT initialization of scratch variables! Usually one would say COMPUTE ##=1 before using it, but scratch variables are initialized to 0.

I could alternatively have used the following, but somehow I find the first to be aesthetically preferable as it at least resembles some sort of explicit initialization:

VECTOR A (3).
LOOP #=INDEX(junk,"0123456789",1) to Len(RTRIM(JUNK)).
+ COMPUTE A(##+1)=NUMBER(SUBSTR(JUNK,#,1),F1).
COMPUTE ##=##+1 .
END LOOP.

I mean there is less of the WTF? Where in the hell did ## come from. OTOH, there is always the FM!!!

HTH, David

Jon Peck, Python ;-))) On this it's like taking a chainsaw to a toothpick ;-)))) LOL!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
OK, a bit of aesthetics -- I take the liberty, David, because you are
good at programming aesthetics; and both of us think about them a good
deal.
At 03:10 AM 9/15/2010, David Marso wrote:
> VECTOR A (3).
> LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
> COMPUTE ##=##+1 .
> + COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
> END LOOP.
>
> DON'T YOU LOVE the IMPLICIT initialization of scratch variables! Usually one
> would say COMPUTE ##=1 before using it, but scratch variables are
> initialized to 0.

Actually, whether or not I love how scratch variables are initialized, I'd avoid writing code that relies on it where that might be obscure. Explicitly initializing, as in

VECTOR A (3).
* Add the following line: .
COMPUTE ##=0.
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+ COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

clarifies how you expect "##" to be computed, and used.

Besides, if you add the line, the code works; if you don't, it doesn't. :-(

Scratch variables are initialized to 0; but only once per transformation program, not once per case. If you don't initialize, "##" starts the second case with its last value from the previous case, namely 3; it's incremented to 4, which is an invalid index for vector A; and all blows up in a flurry of warnings.

=============================
APPENDIX: Test data, and code
(output not posted)
=============================
DATA LIST LIST/
   JUNK, Yr_n1, Yr_n2, Yr_n3
   (A8,  3F2).
BEGIN DATA
f502 5 0 2
f503 5 0 3
f504 5 0 2
f521 5 2 1
fy101 1 0 1
fy102 1 0 2
fy111 1 1 1
fy121 1 2 1
END DATA.
LIST.

VECTOR A (3).
COMPUTE ##=0.
LOOP #=INDEX(JUNK,"0123456789",1) to Len(RTRIM(JUNK)).
COMPUTE ##=##+1 .
+ COMPUTE A(##)=NUMBER(SUBSTR(JUNK,#,1),F1).
END LOOP.

LIST.
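The once-per-program versus once-per-case point can be mimicked in Python with a counter that lives outside the per-case loop. This is a simulation under assumptions, not SPSS behaviour reproduced exactly: `run_cases` is a made-up name, `None` stands in for system-missing, and an out-of-range index is simply skipped where SPSS would issue a warning.

```python
DIGITS = "0123456789"


def run_cases(cases, reset_each_case):
    """Simulate the LOOP for each case; return one [A1, A2, A3] row per case."""
    counter = 0                       # "##": set to 0 once, before the first case
    rows = []
    for junk in cases:
        if reset_each_case:
            counter = 0               # the added COMPUTE ##=0.
        a = [None, None, None]        # VECTOR A (3): missing until assigned
        text = junk.rstrip()
        first = next((i for i, ch in enumerate(text) if ch in DIGITS), None)
        if first is not None:
            for pos in range(first, len(text)):
                counter += 1
                # Only indexes 1..3 are valid for A; anything else is skipped.
                if 1 <= counter <= 3 and text[pos] in DIGITS:
                    a[counter - 1] = int(text[pos])
        rows.append(a)
    return rows


print(run_cases(["f502", "f503"], reset_each_case=True))
print(run_cases(["f502", "f503"], reset_each_case=False))
```

With the reset, both cases are filled in; without it, the counter enters the second case at 3, so every index it produces is out of range and the second row stays missing.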
|
|
Good catch, Richard.
The implicit initialization was probably a symptom of the hour ;-)

> At 03:10 AM 9/15/2010, David Marso wrote:

I should probably not touch my send button after midnight.

David

On Tue, Sep 21, 2010 at 11:38 PM, Richard Ristow <[hidden email]> wrote:
> OK, a bit of aesthetics -- I take the liberty, David, because you are good
> at programming aesthetics; and both of us think about them a good deal.
|
In reply to this post by Jon K Peck
Jon (and anyone else who may have a clue!),

I have been trying to get this method (Python) to work, but
cannot do so. I am using ver. 17.0.2 and have the SPSSINC_TRANS extension
downloaded and installed. Also, when I try to get the help information, I
get the following error message:

TypeError: unsupported operand type(s) for /: 'module' and '_Helper'.

I ran the command for help as part of a
program. How should I invoke this extension?

TIA
Mike

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jon K Peck
Sent: Tuesday, September 14, 2010 8:59 AM
To: [hidden email]
Subject: Re: Splitting a string
|
|
Please send me your actual code. Note that SPSSINC TRANS is an extension command and should be run like regular syntax, that is, not in a program.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
|
Hi Jon,

Never mind… I got it to work once I restarted my system
and ran the program as regular syntax. Your spssinc_trans.py file explains
its workings, which I tried to follow (somewhat), but the syntax brevity is
very attractive. FWIW, below is the syntax I used. I am still not
sure how to get the “help” command/function to return anything
other than “error” messages(?)

data list list/ testdata (a8).
begin data
e8021
v2950
e80051
e8003
e72103
e6241
e80231
v90050
end data.
list.

spssinc trans result=pt1 pt2 pt3 pt4
/formula 're.search(r"(\d)(\d)(\d)(\d)",testdata).groups()'.
formats pt1 pt2 pt3 pt4 (N).

Best Regards
Mike

From: Jon K Peck [mailto:[hidden email]]
Sent: Wednesday, September 22, 2010 6:39 PM
To: Roberts, Michael
Cc: [hidden email]
Subject: RE: Splitting a string
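Mike's four-digit formula can also be tried directly in Python on his test strings. The sketch below is an illustration only; `first_four_digits` is a made-up wrapper, and the `int()` conversion and `None` return are additions for this example. One thing it makes visible is that `re.search` stops at the first run of four consecutive digits, so any further digits in longer codes are not captured.

```python
import re

CODES = ["e8021", "v2950", "e80051", "e8003", "e72103", "e6241", "e80231", "v90050"]


def first_four_digits(code):
    """First four consecutive digits in code, as a tuple of ints (or None)."""
    match = re.search(r"(\d)(\d)(\d)(\d)", code)
    if match is None:
        return None
    return tuple(int(d) for d in match.groups())


for code in CODES:
    print(code, first_four_digits(code))
```

For instance, "e80051" yields (8, 0, 0, 5) and the trailing 1 is dropped; whether that is the desired behaviour depends on what the fifth digit means in the data.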
|
|
Glad you got this to work. You can get help from the dialog box, but to get the syntax help, just do this.

spssinc trans /help.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
