Removing Punctuation From Last Name & First Name

Kreischer,Resha M

Hello,

Does anyone have syntax that would remove punctuation from a variable?

For example, I would like to remove hyphens, apostrophes, etc., from first names and last names. Currently, I search by the punctuation and delete it student by student, which takes some time when dealing with 40,000 records.

This is what I would like to do:

Last Name 1          Last Name 2

Adams-Anderson       AdamsAnderson
Kennedy-Shankle      KennedyShankle
Oden-Miller1         OdenMiller
O'Donnell            Odonnell

All ideas are welcome. Thanks.

Resha


Re: Removing Punctuation From Last Name & First Name

Bruce Weaver
Administrator
data list list / lastname1 (a25).
begin data.
"Adams-Anderson"
"Kennedy-Shankle"
"Oden-Miller1"
"O'Donnell"
end data.

string lastname2 (a25).
* The first COMPUTE reads LASTNAME1; the later ones read LASTNAME2.
compute lastname2 = replace(lastname1,"-","").
compute lastname2 = replace(lastname2,"'","").
compute lastname2 = replace(lastname2,"1","").
list.

Note that LASTNAME2 is used in the REPLACE function on the 2nd and 3rd COMPUTE commands.  If you use LASTNAME1 again, you lose what you did on any previous COMPUTE line (as I learned the hard way).
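
The same chaining can also be written as a single nested expression, which makes it explicit that each REPLACE works on the result of the one inside it. This is only a sketch, meant to be run in place of the three COMPUTE commands above, after LASTNAME2 has been declared with STRING:

```spss
* Sketch: the innermost REPLACE runs first and feeds the next one out.
* Assumes LASTNAME1 exists and LASTNAME2 was declared as a string.
compute lastname2 = replace(replace(replace(lastname1,"-",""),"'",""),"1","").
list.
```

Whether this reads better than three separate COMPUTE lines is a matter of taste; with more than a few characters to strip, the nesting gets hard to follow.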

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Removing Punctuation From Last Name & First Name

Bruce Weaver
Administrator
My apologies if this appears twice.  I was getting a "not accepted" message first time.

Bruce Weaver wrote
Note that LASTNAME2 is used in the REPLACE function on the 2nd and 3rd COMPUTE commands.  If you use LASTNAME1 again, you lose what you did on any previous COMPUTE line (as I learned the hard way).
An off-list reply suggests that what I meant by that last comment was not terribly clear.  So here's my attempt to clarify!

If I had used LASTNAME1 on the right side of each COMPUTE, like this:

string lastname2 (a25).
compute lastname2 = replace(lastname1,"-","").
compute lastname2 = replace(lastname1,"'","").
compute lastname2 = replace(lastname1,"1","").
list.

...then LASTNAME2 would show only the last replacement in the list (the removal of "1"), because each COMPUTE starts over from the unmodified LASTNAME1 and overwrites the previous result.  What I meant by learning the hard way, of course, was that I tried it this way when developing my syntax, and then sat scratching my head for a while trying to work out what had gone wrong.  ;-)

Alternatively, I could have added another COMPUTE at the beginning, setting LASTNAME2 = LASTNAME1, and then proceeded to do the replacements.  If I had done that, then I would want LASTNAME2 on both sides of the equals sign for all three REPLACE calls.

string lastname2 (a25).

* Set LASTNAME2 = LASTNAME1 .
compute lastname2 = lastname1.

* And now do the replacements.
compute lastname2 = replace(lastname2,"-","").
compute lastname2 = replace(lastname2,"'","").
compute lastname2 = replace(lastname2,"1","").
list.
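
Since the original question asked about hyphens, apostrophes, "etc.", the copy-then-replace pattern can probably be shortened with DO REPEAT when the list of characters grows. The following is a sketch rather than tested production syntax: the variable name LASTNAME3 and the particular characters in the list are assumptions chosen for illustration, so edit both to suit your data.

```spss
* Sketch only: strip each character in a list from a copy of LASTNAME1.
* LASTNAME3 is a hypothetical new variable, and the character list is an example.
string lastname3 (a25).
compute lastname3 = lastname1.
do repeat ch = "-" "'" "." ",".
compute lastname3 = replace(lastname3, ch, "").
end repeat.
list.
```

Because every pass reads and writes LASTNAME3, each replacement builds on the previous one. If the names were typed in a word processor or an e-mail client, the apostrophe may be a typographic one (’) rather than a straight one ('), in which case it would need its own entry in the list.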
