|
Hi dear Listers.
I would appreciate your help in obtaining SPSS v15 syntax to normalize a string variable (millions of cases). Sorry for the mistake in my previous post.

The problem is that the parts of the string are separated by spaces (1, 2, 3, ...) and tabs (1, 2, ...), and each run of separators must be replaced with a single space. The string may have up to 12 parts. Let me show some examples:

Sara   Smith  Dockter
  Lee Hunter    Casidy
George  Harvey   Mora Aito
 Marck     Mack

After normalization:

Sara Smith Dockter
Lee Hunter Casidy
George Harvey Mora Aito
Marck Mack

Thanks for your help,
Libardo

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
Hi,

A not-so-elegant solution is below. It assumes that the string variable with the names is the first variable in the file and that the directory d:/temp exists. It would be nicer to skip the step with the auxiliary text file.

Cheers!!
Albert-Jan

* sample data.
data list free / oldvalue (a60).
begin data
'Sara   Smith  Dockter '
'  Lee Hunter    Casidy'
'George  Harvey   Mora Aito '
' Marck     Mack '
end data.

* actual code.
compute casenum = $casenum.
exe.
dataset name source.

begin program.
import spss, re
# Write one line per case: the case number and the cleaned string.
f = open('d:/temp/workfile.txt', 'w')
dataCursor = spss.Cursor([0])
# Text mode translates "\n" to the platform line ending.
f.write("casenum\tnewvalue\n")
for i in range(spss.GetCaseCount()):
    oldval = dataCursor.fetchone()[0]
    # Collapse every run of whitespace (spaces, tabs) to one space, then trim.
    newval = re.sub(r"\s+", ' ', oldval).strip()
    f.write(str(i + 1) + "\t" + newval + "\n")
dataCursor.close()
f.close()
end program.

* Read the cleaned values back and merge them with the original data.
* The formats are wide enough for millions of cases and for the a60 source string.
GET DATA /TYPE = TXT
 /FILE = 'D:\temp\workfile.txt'
 /DELCASE = LINE
 /DELIMITERS = "\t"
 /ARRANGEMENT = DELIMITED
 /FIRSTCASE = 2
 /IMPORTCASE = ALL
 /VARIABLES = casenum F8.0 newvalue A60.
match files / file = * / file = source / by = casenum.
exe.
dataset close all.

----- Original Message ----
From: Libardo Lopez <[hidden email]>
To: [hidden email]
Sent: Saturday, January 24, 2009 2:09:47 PM
Subject: Normalize a String1 |
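The regular-expression step in the program above can be tried on its own, outside SPSS. The following is only a minimal sketch: the function name `normalize_whitespace` and the test strings are made up for illustration, and the spacing in them is not taken from the original data.

```python
import re

def normalize_whitespace(value):
    """Collapse every run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()

# Hypothetical test strings mixing spaces and tabs.
samples = [
    "Sara   Smith \t Dockter  ",
    "\tLee  Hunter\t\tCasidy",
]
for s in samples:
    print(repr(normalize_whitespace(s)))
# prints 'Sara Smith Dockter' and then 'Lee Hunter Casidy'
```

Because `\s` matches tabs as well as spaces, one substitution handles both separators, and `strip()` removes whatever single space is left at either end.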
|
The multiple space problem can be solved fairly easily without resorting to Python:
* The search string in INDEX and REPLACE is two spaces, the replacement is one space.
loop if index(rtrim(stringvar), "  ") > 0.
compute stringvar = replace(stringvar, "  ", " ").
end loop.

I thought there was a way to specify tab characters in command syntax, but it isn't what I thought it was, so I don't have a simple syntax solution for the tab problem.

________________________________
From: SPSSX(r) Discussion on behalf of Albert-jan Roskam
Sent: Sat 1/24/2009 12:39 PM
To: [hidden email]
Subject: Re: Normalize a String1 |
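For readers who want to see how the loop approach behaves, here is a rough Python equivalent. It is a sketch under assumptions: `collapse_double_spaces` is a hypothetical name, and Python strings are variable-length, so the trailing padding is simply stripped first instead of being ignored by RTRIM on every pass.

```python
def collapse_double_spaces(value):
    """Replace two spaces with one until no pair of spaces is left."""
    value = value.rstrip(" ")          # trailing blanks are not of interest
    while "  " in value:               # plays the role of INDEX(...) > 0
        value = value.replace("  ", " ")
    return value

print(repr(collapse_double_spaces("Lee     Hunter  Casidy   ")))
# prints 'Lee Hunter Casidy'
print(repr(collapse_double_spaces("Lee\t\tHunter")))
# prints 'Lee\t\tHunter' (tabs are not touched)
```

Each pass roughly halves every run of spaces, so even long runs need only a few passes. As the second call shows, this approach on its own leaves tabs in place.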
|
Richard,
Are you saying that after running the syntax below, the string

'multiple   space   problem'

will go to

'multiple space problem    '?

Thus, the replace command will pull all characters to the right of the '  ' to the left by one place.

>>>The multiple space problem can be solved fairly easily without resorting to Python:

loop if index(rtrim(stringvar), "  ") > 0.
compute stringvar = replace(stringvar, "  ", " ").
end loop.

Gene Maguin |
|
In reply to this post by Oliver, Richard
To get rid of tabs, run something like
compute strvar = replace(strvar, string(09, pib1), ' ').

Do this before replacing blanks in case there are mixed tabs and blanks. (Of course, there is a nicer way to do all this in Python with regular expressions, but I won't go there today.)

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Oliver, Richard
Sent: Saturday, January 24, 2009 12:55 PM
To: [hidden email]
Subject: Re: [SPSSX-L] Normalize a String1 |
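The advice to handle tabs before blanks can be illustrated with a small Python sketch. It assumes that `string(09, pib1)` yields the single character with code 9 (the tab), which is `chr(9)` in Python; the two function names and the test string are hypothetical.

```python
TAB = chr(9)  # the tab character, same as "\t"

def collapse(value):
    """Replace two spaces with one until no pair of spaces is left."""
    while "  " in value:
        value = value.replace("  ", " ")
    return value

def tabs_then_spaces(value):
    """Recommended order: turn tabs into spaces, then collapse runs."""
    return collapse(value.replace(TAB, " "))

def spaces_then_tabs(value):
    """Reverse order: collapse runs first, then turn tabs into spaces."""
    return collapse(value).replace(TAB, " ")

mixed = "Mora" + " " + TAB + " " + "Aito"   # space, tab, space between parts
print(repr(tabs_then_spaces(mixed)))   # prints 'Mora Aito'
print(repr(spaces_then_tabs(mixed)))   # prints 'Mora   Aito'
```

In the reverse order, the tab sits between two single spaces, so the collapse step finds nothing to do, and the later tab replacement leaves three spaces behind.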
|
In reply to this post by Maguin, Eugene
It will keep going through the string value, replacing all instances of two spaces with one space until there are no more instances of two consecutive spaces.
________________________________
From: SPSSX(r) Discussion on behalf of Gene Maguin
Sent: Sat 1/24/2009 2:16 PM
To: [hidden email]
Subject: Re: Normalize a String1 |
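To make the pass-by-pass behaviour concrete, here is a small Python simulation. It assumes that an SPSS string variable keeps its declared width and is padded on the right with blanks, which is what produces the trailing blanks Gene asks about. The helper names and the example spacing are hypothetical.

```python
def spss_replace(value, old, new):
    """Mimic storing a REPLACE result back into a fixed-width string variable."""
    width = len(value)
    return value.replace(old, new)[:width].ljust(width)

def collapse_fixed_width(value):
    """Run the two-spaces-to-one loop, counting passes."""
    passes = 0
    while "  " in value.rstrip(" "):        # like INDEX(RTRIM(...), "  ") > 0
        value = spss_replace(value, "  ", " ")
        passes += 1
    return value, passes

result, passes = collapse_fixed_width("multiple   space   problem")
print(repr(result), passes)
# prints 'multiple space problem    ' 2
```

The text shifts left as the inner runs shrink, and the positions that are freed end up as trailing blanks, so the value keeps its original width of 26 characters. The RTRIM in the loop condition is what stops those trailing blanks from keeping the loop alive.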
|
Thanks so much to all of you.
With your help, the syntax that solves my problem is:

* To get rid of tabs.
compute oldvalue = replace(oldvalue, string(09, pib1), ' ').
* The search string in INDEX and REPLACE is two spaces, the replacement is one space.
loop if index(rtrim(oldvalue), "  ") > 0.
compute oldvalue = replace(oldvalue, "  ", " ").
end loop.
* Trim leading blanks last, so that a blank left by a leading tab is removed too.
compute oldvalue = ltrim(oldvalue).
execute.

Thanks again,
Libardo

On Sat, Jan 24, 2009 at 5:53 PM, Oliver, Richard <[hidden email]> wrote:
> It will keep going through the string value, replacing all instances of two
> spaces with one space until there are no more instances of two consecutive
> spaces. |
