How to use programmability transformation to change all multiple blanks to single blanks.


Art Kendall
  I have a 1200-character text field. It is supposed to be boilerplate, but when it was entered in an old version of Excel it was split across 20 columns. Different people did the entry in different years and varied in how they broke the text across the 20 columns. They always split between words. When the columns were concatenated into one column, the trailing blanks were not trimmed, so different cases have extra blanks interspersed.

Is there a programmability command to change multiple blanks to single blanks or, even better, to compare long strings for being the same except for blanks?

Here is a kludge that does what I need:
new file.
data list list /instring (a100) want (f1).
begin data
"This is   a string. This is another. Some more words. Again.     Again." 0
"This is a string. This   is another. Some more words. Again. Again." 0
"This is a string. This is another.     Some more words. Again. Again." 0
"This is a     string. This is another. Some more words. Again. Again." 0
"This is a string. This is another. Some more words. Again. Again." 0
"This    is a string. This is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words." 1
"This is a string. This     is another. Some more words." 0
"This is a string. This     is another. Some more words." 0
end data.
string newstring(a100).
compute newstring= replace(instring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute changeflag= $casenum ne 1 and newstring ne lag(newstring).
list variables= changeflag want newstring.

In Teco or WordPerfect it would be something like
search for white space, change it to a single blank.
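
For the "same except for blanks" comparison asked about above, a minimal Python sketch is to normalize both strings before comparing them. The helper names `normalize_blanks` and `same_except_blanks` are made up for illustration, and note that `str.split()` with no argument treats tabs and newlines as blanks too, which may or may not be wanted.

```python
def normalize_blanks(text):
    """Collapse every run of whitespace to one blank and trim both ends."""
    # split() with no argument splits on any whitespace run and drops empties
    return " ".join(text.split())


def same_except_blanks(a, b):
    """True when the two strings differ only in their spacing."""
    return normalize_blanks(a) == normalize_blanks(b)


first = "This is   a string. This is another."
second = "This is a string. This   is another."
third = "This is a string. This is different."

print(same_except_blanks(first, second))
print(same_except_blanks(first, third))
```

Normalizing first and comparing afterwards keeps the original strings untouched, so the cleaned value only needs to be stored if it is wanted for its own sake.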

-- 
Art Kendall
Social Research Consultants
Re: How to use programmability transformation to change all multiple blanks to single blanks.

Maguin, Eugene

Art, how about the REPLACE function in syntax? Or how about doing a search and replace on the concatenated variable in the Data Editor window? Gene Maguin

 

Re: How to use programmability transformation to change all multiple blanks to single blanks.

David Marso
Administrator
In reply to this post by Art Kendall
STRING newstring(A1200).
COMPUTE newstring=instring.
LOOP.
COMPUTE newstring= replace(newstring,'  ',' ').
END LOOP IF (CHAR.INDEX(newstring,"  ") EQ 0).
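
For readers more at home in Python, the same repeat-until-clean idea can be sketched as below. This is an analogy, not SPSS code: `collapse_blanks` is a hypothetical helper, and it tests for a double blank before each pass, whereas the LOOP above tests at the end of each pass.

```python
def collapse_blanks(text):
    """Replace pairs of blanks with one blank until no pair remains."""
    while "  " in text:
        # One pass replaces non-overlapping pairs only, so longer runs
        # need more than one pass.
        text = text.replace("  ", " ")
    return text


print(collapse_blanks("Again.     Again."))
```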



Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: How to use programmability transformation to change all multiple blanks to single blanks.

Art Kendall
In reply to this post by Maguin, Eugene
Thank you for getting back to me. The kludge syntax I posted uses the replace function.  Since I do not include the fourth argument of replace it "does all occurrences". However, the function does not back up one character before searching for the next occurrence. That is why the kludge I did calls replace several times.

I ran the first part of the syntax I posted, then used Edit > Replace. It still had to be run several times, and it did not paste syntax.

If there is no programmability way to do it, another approach would be something like
loop.
compute newstring = replace(newstring,'  ',' ').
end loop if index(newstring,'  ') eq 0.

I thought this would be a place to start learning Python's character pattern routines.
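
Why one replace call is not enough can be seen in a small Python sketch. It assumes Python's `str.replace`, which also scans left to right without backing up; whether SPSS REPLACE behaves identically in every case is not verified here. The helper `longest_blank_run` is made up for the illustration.

```python
import re


def longest_blank_run(text):
    """Length of the longest run of blanks in text (0 if there is none)."""
    runs = re.findall(" +", text)
    return max((len(run) for run in runs), default=0)


text = "Again." + " " * 5 + "Again."
lengths = [longest_blank_run(text)]
while "  " in text:
    # Each pass turns every non-overlapping pair of blanks into one blank.
    text = text.replace("  ", " ")
    lengths.append(longest_blank_run(text))

print(lengths)  # run length before the first pass and after each pass
```

A run of five blanks shrinks to three, then two, then one, so three passes are needed for this case.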

Re: How to use programmability transformation to change all multiple blanks to single blanks.

Rick Oliver-3
In reply to this post by Maguin, Eugene
*a bit inelegant because index function doesn't seem to recognize difference between one space and two spaces.
*choose a replacement character that you are reasonably confident won't appear in the string.
compute stringvar=replace(stringvar, " ", "@").
loop if char.index(stringvar, "@@")<>0.
compute stringvar=replace(stringvar,"@@", "@").
end loop.
compute stringvar=replace(stringvar,"@", " ").
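
The placeholder idea can also be sketched in Python. This is an illustrative variant under stated assumptions, not a translation of the syntax above: `collapse_with_sentinel` is a hypothetical helper, the sentinel is assumed to be a single character, and every blank is marked (rather than each pair) so that odd-length runs cannot leave a stray double blank behind.

```python
def collapse_with_sentinel(text, sentinel="@"):
    """Collapse runs of blanks by way of a placeholder character."""
    if sentinel in text:
        # The trick only works if the placeholder is absent from the data.
        raise ValueError("sentinel already occurs in the text")
    marked = text.replace(" ", sentinel)          # mark every blank
    while sentinel * 2 in marked:
        marked = marked.replace(sentinel * 2, sentinel)  # shrink the runs
    return marked.replace(sentinel, " ")          # restore single blanks


print(collapse_with_sentinel("This is   a string.     Again."))
```

Checking for the sentinel up front turns a silent corruption into an explicit error, which is easier to notice on a 1200-character field.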

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




Re: How to use programmability transformation to change all multiple blanks to single blanks.

Jon K Peck
In reply to this post by Art Kendall
spssinc trans result=newstring type=100
/formula "re.sub(' +', ' ', instring)".

The regular expression ' +' matches any sequence of 1 or more blanks.  The replacement is a single blank.  The command creates a new 100-byte string variable called newstring.
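
The pattern can be tried outside SPSS in plain Python. The sketch below applies the same `re.sub` call to a few shortened versions of the sample strings; the list name `rows` is only for illustration.

```python
import re

rows = [
    "This is   a string. This is another.",
    "This is a string. This   is another.",
    "This    is a string. This is another.",
]

# ' +' matches one or more blanks; each match becomes a single blank.
cleaned = [re.sub(" +", " ", row) for row in rows]

for row in cleaned:
    print(row)
```

Because the pattern consumes a whole run of blanks at once, a single call is enough and no loop is needed.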


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




Re: How to use programmability transformation to change all multiple blanks to single blanks.

David Marso
Administrator
In reply to this post by Rick Oliver-3
I am using version 21.0.1.

For very long strings, you may need to run
SET MXLOOPS=10000.
before running the code.
--
new file.
data list list /instring (a100) want (f1).
begin data
"This is   a string. This is another. Some more words. Again.     Again." 0
"This is a string. This   is another. Some more words. Again. Again." 0
"This is a string. This is another.     Some more words. Again. Again." 0
"This is a     string. This is another. Some more words. Again. Again." 0
"This is a string. This is another. Some more words. Again. Again." 0
"This    is a string. This is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words." 1
"This is a string. This     is another. Some more words." 0
"This is a string. This     is another. Some more words." 0
end data.

STRING newstring(A100).
COMPUTE newstring=instring.
LOOP.
COMPUTE newstring= replace(newstring,'  ',' ').
END LOOP IF (CHAR.INDEX(newstring,"  ") EQ 0).

list variables= instring newstring.
The variables are listed in the following order:

LINE   1: instring

LINE   2: newstring




    instring: This is   a string. This is another. Some more words. Again.     Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This   is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This is another.     Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a     string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This    is a string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This     is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.


Number of cases read:  10    Number of cases listed:  10
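
How many passes a capped loop needs can be estimated with a Python sketch. This is an analogy only: `collapse_bounded` and its `max_loops` parameter are hypothetical stand-ins for the loop and the MXLOOPS setting, and the sketch does not reproduce SPSS string padding or its exact loop-termination rules.

```python
def collapse_bounded(text, max_loops):
    """Collapse double blanks, giving up after max_loops passes."""
    passes = 0
    while "  " in text and passes < max_loops:
        text = text.replace("  ", " ")  # roughly halves every run of blanks
        passes += 1
    return text, passes


long_gap = "start" + " " * 1200 + "end"

result, passes = collapse_bounded(long_gap, max_loops=10000)
print(passes, repr(result))

truncated, used = collapse_bounded(long_gap, max_loops=3)
print(used, "  " in truncated)  # a low cap leaves double blanks behind
```

Since each pass roughly halves the longest run, the number of passes grows slowly with the run length; a cap that is too low stops the loop before the string is clean.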
Re: How to use programmability transformation to change all multiple blanks to single blanks.

David Marso
Administrator
In reply to this post by Jon K Peck
My question is: why bother with Python if it can be done with four lines of standard SPSS syntax and you don't need to drag Python around?

Re: How to use programmability transformation to change all multiple blanks to single blanks.

Art Kendall
In reply to this post by David Marso
If the fourth argument of replace is omitted, it is supposed to change all instances of the second argument to the value of the third argument. But it does not back up a character before it looks for the next occurrence.
Art Kendall
Social Research Consultants
On 5/9/2013 2:03 PM, David Marso [via SPSSX Discussion] wrote:
I am using version 21.0.1

For very LONG strings may need to use:
SET MXLOOPS=10000.
prior to running the code.
--
new file.
data list list /instring (a100) want (f1).
begin data
"This is   a string. This is another. Some more words. Again.     Again." 0
"This is a string. This   is another. Some more words. Again. Again." 0
"This is a string. This is another.     Some more words. Again. Again." 0
"This is a     string. This is another. Some more words. Again. Again." 0
"This is a string. This is another. Some more words. Again. Again." 0
"This    is a string. This is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words." 1
"This is a string. This     is another. Some more words." 0
"This is a string. This     is another. Some more words." 0
end data.

STRING newstring(A100).
COMPUTE newstring=instring.
LOOP.
COMPUTE newstring= replace(newstring,'  ',' ').
END LOOP IF (CHAR.INDEX(newstring,"  ") EQ 0).

list variables= instring newstring.
The variables are listed in the following order:

LINE   1: instring

LINE   2: newstring




    instring: This is   a string. This is another. Some more words. Again.     Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This   is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This is another.     Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a     string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This    is a string. This is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This     is another. Some more words. Again. Again.
   newstring: This is a string. This is another. Some more words. Again. Again.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.

    instring: This is a string. This     is another. Some more words.
   newstring: This is a string. This is another. Some more words.


Number of cases read:  10    Number of cases listed:  10
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"




Art Kendall
Social Research Consultants

Re: How to use programmability transformation to change all multiple blanks to single blanks.

Art Kendall
In reply to this post by David Marso
Because I want to learn pattern matching in strings.
This is a simple example. It does exactly what the 1970 version of the text editor TECO did with "Cntl+EW" (hold the Control key and hit E, then hit W).
Art Kendall
Social Research Consultants
On 5/9/2013 2:05 PM, David Marso [via SPSSX Discussion] wrote:
My question is why bother with Python if it can be done with 4 lines of standard SPSS syntax and you don't need to drag Python around?

Jon K Peck wrote
spssinc trans result=newstring type=100
/formula "re.sub(' +', ' ', instring)".

The regular expression ' +' matches any sequence of 1 or more blanks.  The
replacement is a single blank.  The command creates a new 100-byte string
variable called newstring.
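
For anyone who wants to see what that formula does before running it through SPSS, the same call can be tried in plain Python. This is only a sketch: the sample strings are taken from the data in this thread, and the helper name `squeeze_blanks` is made up for the illustration, not part of SPSSINC TRANS.

```python
import re

def squeeze_blanks(text):
    """Replace every run of one or more blanks with a single blank."""
    return re.sub(' +', ' ', text)

# Two of the sample strings from the thread, differing only in extra blanks.
samples = [
    "This is   a string. This is another. Some more words. Again.     Again.",
    "This    is a string. This is another. Some more words. Again. Again.",
]
cleaned = [squeeze_blanks(s) for s in samples]
for s in cleaned:
    print(s)
# Both printed lines are identical once the runs of blanks are collapsed.
```

Because the pattern consumes the whole run of blanks in one match, a single pass is enough, however long the run is.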


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:   Art Kendall <[hidden email]>
To:     [hidden email],
Date:   05/09/2013 11:26 AM
Subject:        Re: [SPSSX-L] How to use programmability transformation to change all multiple blanks to single blanks.
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



Thank you for getting back to me. The kludge syntax I posted uses the
replace function.  Since I do not include the fourth argument of replace
it "does all occurrences". However, the function does not back up one
character before searching for the next occurrence. That is why the kludge
I did calls replace several times.
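
The "does not back up" behavior can be illustrated with Python's `str.replace`, which likewise replaces non-overlapping occurrences in a single left-to-right pass. This is an analogy under that assumption, not a statement about the internals of the SPSS REPLACE function, and the helper name is hypothetical.

```python
def collapse_by_repeated_replace(text):
    """Replace pairs of blanks with one blank until no pair is left."""
    passes = 0
    while '  ' in text:
        # One left-to-right pass; the scan does not back up after a replacement.
        text = text.replace('  ', ' ')
        passes += 1
    return text, passes

# A single pass over five blanks leaves three, because the pairs do not overlap.
one_pass = "Again.     Again.".replace('  ', ' ')
print(repr(one_pass))

# Repeating the pass until no double blank remains finishes the job.
result, passes = collapse_by_repeated_replace("Again.     Again.")
print(repr(result), passes)
```

Each pass roughly halves a run of blanks, so a handful of repetitions covers even long runs, which is why calling replace six times worked on the sample data.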

I ran the first part of the syntax I posted, then tried <edit> <replace>. It
still had to be run several times, and it did not paste syntax.

If there is not a programmability way to do it, another approach would be
something like
loop.
compute newstring = replace(newstring,'  ',' ').
end loop if index(newstring, '  ') eq 0.

 I thought this would be a place to start learning Python's character
pattern routines.

Art Kendall
Social Research Consultants
On 5/9/2013 12:38 PM, Maguin, Eugene wrote:
Art, how about the replace function for syntax code? Or how about doing
search and replace on the concatenated variable in the data window? Gene
Maguin

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Art Kendall
Sent: Thursday, May 09, 2013 12:26 PM
To: [hidden email]
Subject: How to use programmability transformation to change all multiple
blanks to single blanks.

  I have a 1200 character field with text.  It is supposed to be
boilerplate, but when it was put in an old version of Excel it was split into 20
columns.  Different people doing the entry in different years varied on how
they broke the text into the 20 columns. They always split between words.
When the columns were concatenated into one column they did not trim
trailing blanks.  So different cases have interspersed extra blanks.

Is there a programmability command to change multiple blanks to single
blanks or even better to compare long strings for being the same except
for blanks.

Kludge way of doing what I need to do. only the
new file.
data list list /instring (a100) want (f1).
begin data
"This is   a string. This is another. Some more words. Again.     Again." 0
"This is a string. This   is another. Some more words. Again. Again." 0
"This is a string. This is another.     Some more words. Again. Again." 0
"This is a     string. This is another. Some more words. Again. Again." 0
"This is a string. This is another. Some more words. Again. Again." 0
"This    is a string. This is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words. Again. Again." 0
"This is a string. This     is another. Some more words." 1
"This is a string. This     is another. Some more words." 0
"This is a string. This     is another. Some more words." 0
end data.
string newstring(a100).
compute newstring= replace(instring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute newstring= replace(newstring,'  ',' ').
compute changeflag= $casenum ne 1 and newstring ne lag(newstring).
list variables= changeflag want newstring.

In Teco or WordPerfect it would be something like
search for white space, change it to a single blank.
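
For the second half of the question, comparing strings that should be the same except for blanks, one possible Python sketch is to compare the word lists rather than the raw strings. The function name `same_except_blanks` is hypothetical, and the flag list only mimics the idea of the changeflag computation above; it is not SPSS code.

```python
def same_except_blanks(a, b):
    """True when two strings match once runs of whitespace are ignored."""
    # str.split() with no argument splits on any whitespace run and
    # drops leading and trailing whitespace.
    return a.split() == b.split()

# Three consecutive cases from the sample data.
rows = [
    "This is a string. This     is another. Some more words. Again. Again.",
    "This is a string. This     is another. Some more words.",
    "This is a string. This     is another. Some more words.",
]

# Flag a case when it differs from the previous one (the first case is never flagged).
change_flags = [
    i > 0 and not same_except_blanks(rows[i], rows[i - 1])
    for i in range(len(rows))
]
print(change_flags)  # [False, True, False]
```

Note that this treats tabs and other whitespace the same as blanks, which may or may not be what is wanted for the real 1200-character field.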


--
Art Kendall
Social Research Consultants