SPSSX Discussion

Parsing String Name Variable


Maggie Greiner

Parsing String Name Variable

Dear List.

I know this is a simple question, but I've messed around with substr and
rindex and can't get the syntax to work. Truth be told, I'm not sure what
I'm doing anyway.

I have a file with 2300 records. The First Middle Last names are in one
variable (name). I want to parse out the first, middle and last into three
separate variables. Spaces separate (one space) each of the name
components. The name components are of varying lengths.

This is what I have:

Name
Brenda Jones
Chuck Fred Smith
Alyssa Gwen Mulder
Steven Patrick Leesman

This is what I want

First    Middle    Last
Brenda             Jones
Steven   Patrick   Leesman

Thanks for any help.

Maggie

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Bruce Weaver

Re: Parsing String Name Variable

Administrator

How about this?

data list / name(a25).
begin data
Brenda Jones
Chuck Fred Smith
Alyssa Gwen Mulder
Steven Patrick Leesman
end data.

string first middle last (a15).
compute #sp1 = index(name," ").
compute #sp2 = rindex(rtrim(name)," ").
compute first = substr(name,1,#sp1-1).
compute last = substr(name,#sp2+1).
if #sp2 NE #sp1 middle = substr(name,#sp1+1,#sp2-#sp1-1).
list.

OUTPUT:
name                    first   middle   last

Brenda Jones            Brenda           Jones
Chuck Fred Smith        Chuck   Fred     Smith
Alyssa Gwen Mulder      Alyssa  Gwen     Mulder
Steven Patrick Leesman  Steven  Patrick  Leesman

Number of cases read: 4 Number of cases listed: 4


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Rick Oliver-3

Re: Parsing String Name Variable

In reply to this post by Maggie Greiner

There are numerous ways to accomplish this. Here's one:

*sample data.
data list list (",") /name (a100).
begin data
Brenda Jones
Chuck Fred Smith
Alyssa Gwen Mulder
Steven Patrick Leesman
end data.
*end sample data, start code.
string fname mname lname #name (a100).
compute fname=substr(name, 1, index(name, " ")-1).
* rtrim() keeps trailing blanks that pad the a100 field out of the search.
compute lname=substr(name, rindex(rtrim(name), " ")+1).
compute #name=replace(name, fname, "").
compute #name=replace(#name, lname, "").
compute mname=ltrim(#name).
list.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: Maggie <[hidden email]>
To: [hidden email]
Date: 07/05/2012 04:13 PM
Subject: Parsing String Name Variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>


David Marso

Re: Parsing String Name Variable

Administrator

In reply to this post by Maggie Greiner


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Jon K Peck

Re: Parsing String Name Variable

Or this, using a Python extension command.
data list / name 1-40 (A).
begin data
john stuart mills
captain crunch
end data
dataset name names.

spssinc trans result = first middle last type=15
/formula "string.split(name)".
do if last = "".
compute last = middle.
compute middle = "".
end if.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: David Marso <[hidden email]>
To: [hidden email]
Date: 07/05/2012 05:55 PM
Subject: Re: [SPSSX-L] Parsing String Name Variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>

For simple 2/3 element parsing use Bruce's or Rick's solutions. OTOH if you encounter general issues where you need to split out multiple elements from arbitrarily long sequences the following ideas are useful.

data list / str 1-40 (A).
begin data
john stuart mills
captain crunch
end data
string fname mname lname (A20) #str (a40).
VECTOR names=fname TO lname.
COMPUTE #str=LTRIM(str).
COMPUTE #I=1.
LOOP .
+ COMPUTE #Found=INDEX(#str," ").
+ COMPUTE names(#I)=SUBSTR(#str,1,#Found).
+ COMPUTE #str=LTRIM(SUBSTR(#str,#Found+1)).
+ COMPUTE #I=#I+1.
END LOOP IF #str=" ".
** Fix middle and last names **.
DO IF #I=3.
+ COMPUTE names(3)=names(2).
+ COMPUTE names(2)=" ".
END IF.
list.
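For readers who find the LOOP easier to follow in another language, here is a rough Python sketch of the same idea: peel off one token at a time, drop it into the next slot, and move the second token to the last slot when only two were found. The function name parse_tokens and the slots parameter are made up for illustration. One assumption that differs from the SPSS code: tokens beyond the available slots are simply ignored here.

```python
def parse_tokens(text, slots=3):
    """Fill up to `slots` fields from space-separated tokens, left to right."""
    names = [""] * slots
    rest = text.strip()
    i = 0
    while rest and i < slots:
        found = rest.find(" ")          # 0-based position of the next space, -1 if none
        if found == -1:                 # no further space: this is the final token
            names[i], rest = rest, ""
        else:
            names[i] = rest[:found]
            rest = rest[found + 1:].lstrip()
        i += 1
    # Only two tokens found: the second one is really the last name.
    if i == 2 and slots == 3:
        names[2], names[1] = names[1], ""
    return names


print(parse_tokens("john stuart mills"))   # ['john', 'stuart', 'mills']
print(parse_tokens("captain crunch"))      # ['captain', '', 'crunch']
```

The slot count plays the role of the VECTOR length, so widening it is the only change needed to capture more elements.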

Albert-Jan Roskam

Re: Parsing String Name Variable

Cool, but it won't work with double names (double middle names, noble names). Here are a bunch of names I plucked from Wikipedia; the last few are noble names.

data list / naam 1-50 (A).
begin data
Henk Baars
Arjen de Baat
W.G. del Baere
Maarten den Bakker
Cees Bal
Frank van Bakel
Maas van Beek
Daniëlle Bekkering
Dirk Bellemakers
Chantal Beltman
Henk Benjamins
Camiel van den Bergh
Jan van Benthem van den Bergh
Johan van den Berch van Heemstede
Jan-Willem van Beresteyn
Hans-Willem van Bevervoorden van Oldemeule
end data
dataset name namen.

begin program python.
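import re
def giveNames(name):
    # First name: everything before the first run of whitespace.
    first = re.split(r"\s+(?=[a-z]?)", name)[0]
    # Last name: everything from the first capitalized word after the first name.
    last = " ".join(re.split(r"\s+(?=[A-Z])", name)[1:])
    # Middle part: what is left once first and last names are removed.
    # re.escape keeps characters such as "." in "W.G." from acting as wildcards.
    middle = re.sub("(%s|%s)" % (re.escape(first), re.escape(last)), "", name).strip()
    return first, middle, last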
end program.

spssinc trans result = first middle last type=30
/formula "giveNames(naam)".
execute.
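For anyone who wants to try the capital-letter heuristic outside SPSS, here is a minimal standalone sketch. It is not the code above: the function name split_particles is made up for illustration, and it works on whitespace tokens instead of regular expressions. It assumes the first token is the first name and that the last name starts at the first capitalized token after it; lowercase particles in between (de, van den, ...) go to the middle field. Unlike the [A-Z] pattern, str.isupper also accepts accented capitals.

```python
def split_particles(name):
    """Split a name into (first, middle, last) using capitalization as a hint."""
    tokens = name.split()
    if not tokens:
        return "", "", ""
    first, rest = tokens[0], tokens[1:]
    # Index of the first capitalized token after the first name;
    # if there is none, everything left is treated as middle.
    start = next((i for i, tok in enumerate(rest) if tok[:1].isupper()), len(rest))
    middle = " ".join(rest[:start])
    last = " ".join(rest[start:])
    return first, middle, last


for sample in ["Henk Baars", "Arjen de Baat", "Camiel van den Bergh",
               "Jan van Benthem van den Bergh"]:
    print(split_particles(sample))
```

Working on tokens also sidesteps a corner case of the substitution approach, where a first name that recurs inside the last name (for example "Jan Janssen") would be removed from it as well.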

Regards,
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Jon K Peck <[hidden email]>
To: [hidden email]
Sent: Friday, July 6, 2012 3:00 AM
Subject: Re: [SPSSX-L] Parsing String Name Variable


Kylie

Automatic reply: Parsing String Name Variable

I am currently out of the office at a conference and will return on Friday July 13th. I will be monitoring my email for urgent issues, but otherwise I will get back to you as soon as possible after my return.

Thank you.

Maggie Greiner

Re: Parsing String Name Variable

In reply to this post by Rick Oliver-3

Bruce, I'm using your syntax, but there are a couple of things I don't understand, and I want to learn. I understand everything up to "name,1,#sp1-1" and "#sp2+1". What does that syntax do?

Again, thanks to everyone for your help.

Maggie.

From: Bruce Weaver <[hidden email]>
To: [hidden email]
Date: 07/05/2012 05:03 PM
Subject: Re: Parsing String Name Variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>


Rick Oliver-3

Re: Parsing String Name Variable

They are index values that identify the position (character count) of the first space encountered working from the left and the first space (rindex) working from the right.

The general form of the SUBSTR function is SUBSTR(varname, start location, number of characters). The third argument is optional. If it's omitted, all characters from start location to the end will be included.

Everything between the first and last space will be included in the middle name.
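The same logic can be mirrored in Python as a rough sketch, which may help if the SPSS function names are unfamiliar. Assumptions: split_name is a hypothetical name; Python's str.find and str.rfind are 0-based, so 1 is added to match the 1-based positions that INDEX and RINDEX return; and a name with no space at all is returned as a first name only, a case the thread does not cover.

```python
def split_name(name):
    """Split 'First [Middle ...] Last' using the first and last space positions."""
    name = name.rstrip()              # like RTRIM: drop trailing padding
    sp1 = name.find(" ") + 1          # 1-based position of the first space (0 if none)
    sp2 = name.rfind(" ") + 1         # 1-based position of the last space (0 if none)
    if sp1 == 0:                      # no space: treat the whole value as the first name
        return name, "", ""
    first = name[:sp1 - 1]            # SUBSTR(name, 1, sp1-1)
    last = name[sp2:]                 # SUBSTR(name, sp2+1): third argument omitted
    # SUBSTR(name, sp1+1, sp2-sp1-1), only when the two positions differ
    middle = name[sp1:sp2 - 1] if sp2 != sp1 else ""
    return first, middle, last


print(split_name("Brenda Jones"))        # ('Brenda', '', 'Jones')
print(split_name("Chuck Fred Smith"))    # ('Chuck', 'Fred', 'Smith')
```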

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]
Phone: 312.893.4922 | T/L: 206-4922

From: Maggie Greiner <[hidden email]>
To: [hidden email]
Date: 07/06/2012 08:57 AM
Subject: Re: Parsing String Name Variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Wow! This has been helpful. Thanks to all of you for your syntax help. This has saved me hours and probably days because I sometimes have a short attention span. Thanks Albert, I didn't think about double names, and even where I live names are not as straight forward as they used to be (e.g. Fred Jones); the names of people coming into our systems are more and more complicated. I'll check my list and take that into account.

Bruce, I'm using your syntax, but there are a couple of things I don't understand, and I want to learn. I understand everything up to "name,1,#sp1-1" and "#sp2+1". What does that syntax do?

Again, thanks to everyone for your help.

Maggie.


Bruce Weaver

Re: Parsing String Name Variable

Administrator

In reply to this post by Maggie Greiner


David Marso

Re: Parsing String Name Variable

Administrator

Alternatively keep them as #scratch variables and use PRINT to inspect them in the output.
Note a few mods (CHAR.) to Bruce's code and a fix in the middle computation.
I like/tend to use CAPITAL letters for SPSS commands and lower case for vars.
I find it easier to quickly scan code for basic operations and logic.

STRING first middle last (A15).
COMPUTE #sp1 = CHAR.INDEX(name," "). /* position of 1st space.
COMPUTE #sp2 = CHAR.RINDEX(RTRIM(name)," "). /* position of *last* space.
COMPUTE first = CHAR.SUBSTR(name,1,#sp1-1). /* Sub-string from 1 to #SP1-1.
COMPUTE last = CHAR.SUBSTR(name,#sp2+1). /* Sub-string from #SP2+1 to the end.
IF #sp2 NE #sp1 middle = CHAR.SUBSTR(name,#sp1+1,#sp2-#sp1-1).

PRINT / name first last #sp1 #sp2.
EXE.

Bruce Weaver wrote

Hi Maggie. As Rick explained, my #sp1 and #sp2 scratch variables were indices giving the positions of the first and last spaces. To see this explicitly, you could run a version of my syntax that removes the hash characters (#) from those variables. That will make them regular variables that you can then inspect in the data file afterwards to confirm that they give the positions of the first and last spaces -- and when there is only one space (i.e., only first and last names), they will be equal.

string first middle last (a15).
compute sp1 = index(name," "). /* position of 1st space.
compute sp2 = rindex(rtrim(name)," "). /* position of last space.
compute first = substr(name,1,sp1-1). /* Sub-string from 1 to SP1-1.
compute last = substr(name,sp2+1). /* Sub-string from SP2+1 to the end.
* If sp2 EQ sp1, there are only two names, so no middle name.
* If sp2 NE sp1, grab everything between the 2 spaces and assign to MIDDLE.
if sp2 NE sp1 middle = substr(name,sp1+1,sp2-sp1-1).
list.

HTH.
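To see the two positions side by side without running SPSS, a small Python sketch along these lines prints them for the sample names. The helper name space_positions is hypothetical, and 1 is added to Python's 0-based results so the numbers line up with what INDEX and RINDEX would report.

```python
names = ["Brenda Jones", "Chuck Fred Smith",
         "Alyssa Gwen Mulder", "Steven Patrick Leesman"]


def space_positions(name):
    """Return the 1-based positions of the first and last space in a name."""
    name = name.rstrip()                          # ignore trailing padding, like RTRIM
    return name.find(" ") + 1, name.rfind(" ") + 1


for n in names:
    sp1, sp2 = space_positions(n)
    note = "no middle name" if sp1 == sp2 else "has middle name"
    print(n, sp1, sp2, note)
```

For "Brenda Jones" both positions come out as 7, which is the equal-positions case described above.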


Bruce Weaver

Re: Parsing String Name Variable

Administrator

Thanks David. I didn't think of PRINT.

By the way, I LIKE the uppercase letters for commands too, but am usually too lazy to hold down the shift key while I type them. ;-)

And before someone jumps in suggesting that I use auto-complete (or whatever it's called), I've turned it off because I find it gets in the way.


Maggie Greiner

Re: Parsing String Name Variable

In reply to this post by Maggie Greiner

This helped. I did remove the # and I could follow the logic of the code. Thanks!

Maggie

From: Bruce Weaver <[hidden email]>
To: [hidden email]
Date: 07/06/2012 01:13 PM
Subject: Re: Parsing String Name Variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Hi Maggie. As Rick explained, my #sp1 and #sp2 scratch variables were
indices giving the positions of the first and second spaces. To see this
explicitly, you could run a version of my syntax that removes the hash
characters (#) from those variables. That will make them regular variables
that you can then inspect in the data file afterwards to confirm that they
give the positions of the 1st and 2nd spaces -- and when there is only one
space (i.e., only first and last names), they will be equal.

string first middle last (a15).
compute sp1 = index(name," "). /* position of 1st space.
compute sp2 = rindex(rtrim(name)," "). /* position of last space.
compute first = substr(name,1,sp1-1). /* Sub-string from 1 to SP1-1.
compute last = substr(name,sp2+1). /* Sub-string from SP2+1 to the end.
* If sp2 EQ sp1, there are only two names, so no middle name.
* If sp2 NE sp1, grab everything between the 2 spaces and assign to MIDDLE.
if sp2 NE sp1 middle = substr(name,sp1+1,sp2-sp1-1).
list.
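
For anyone who finds the arithmetic easier to follow outside SPSS, here is a minimal Python sketch of the same index logic. It is not part of the original syntax: the function name split_name and the handling of a name with no space at all are assumptions made for illustration. Python strings are 0-based, so 1 is added to the find/rfind results to mimic the 1-based positions that INDEX and RINDEX return.

```python
def split_name(name):
    """Mirror the SPSS index/rindex/substr steps for 'First [Middle] Last'."""
    trimmed = name.rstrip()          # like RTRIM(name): drop trailing padding
    sp1 = trimmed.find(" ") + 1      # like INDEX: 1-based position of the first space
    sp2 = trimmed.rfind(" ") + 1     # like RINDEX: 1-based position of the last space

    if sp1 == 0:
        # Assumption: a name with no space is not covered by the SPSS syntax;
        # here it is simply returned as a lone first name.
        return trimmed, "", ""

    first = trimmed[:sp1 - 1]        # like SUBSTR(name, 1, sp1-1)
    last = trimmed[sp2:]             # like SUBSTR(name, sp2+1)
    # Like SUBSTR(name, sp1+1, sp2-sp1-1), but only when the two spaces differ.
    middle = trimmed[sp1:sp2 - 1] if sp2 != sp1 else ""
    return first, middle, last


if __name__ == "__main__":
    for raw in ["Brenda Jones", "Chuck Fred Smith",
                "Alyssa Gwen Mulder", "Steven Patrick Leesman"]:
        print(raw, "->", split_name(raw))
```

With only one space the two positions are equal and the middle name stays empty. A name with more than three parts keeps everything between the first and last space in the middle slot, which is worth checking for the double names Albert mentioned.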

HTH.

Art Kendall

Re: Parsing String Name Variable

In reply to this post by Bruce Weaver

The use of color in the syntax editor and the auto indent tool do a lot to help readability.

However, since long before the age of PCs, I have often asked SPSS to implement these things, but they have never been implemented.

Perhaps someone up on Python could write something to do what PRETTY did for FORTRAN in the 1972 era.
1) have all variables be lower case
2) have all functions have initial caps
3) have all reserved words be all caps.
4) leave comments, labels, etc. alone
5) be sure there is a blank space on each side of any symbolic operator.

If my vague understanding of Python is correct, someone who knew what (s)he was doing could also
1) expand abbreviated syntax, e.g., somebody posts syntax that says "freq" it would be expanded to "FREQUENCIES".
2) convert symbolic operators to the conventional operators. "&" to "AND", ">=" to "GE", etc.
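
As a rough idea of what such a tool might look like, here is a small Python sketch that applies these rules one line at a time. Everything in it is an assumption for illustration: the names prettify_line, ABBREVIATIONS, UPPERCASE_WORDS, FUNCTIONS and OPERATORS are made up, the word lists are tiny samples rather than the full SPSS vocabulary, and it only understands comment lines that start with "*" and inline "/*" comments. It also leaves "=", "+" and "-" alone.

```python
import re

# Illustrative samples only; a real tool would need the complete lists.
ABBREVIATIONS = {"freq": "FREQUENCIES", "exe": "EXECUTE"}
UPPERCASE_WORDS = {"compute", "if", "string", "list", "print",
                   "and", "or", "not", "eq", "ne", "ge", "le", "gt", "lt"}
FUNCTIONS = {"index", "rindex", "rtrim", "substr"}
OPERATORS = {">=": "GE", "<=": "LE", "~=": "NE", "<>": "NE",
             "&": "AND", "|": "OR", ">": "GT", "<": "LT"}

# Group 1: quoted literal, group 2: a name, group 3: a symbolic operator
# (the operator alternative also swallows any blanks around it).
TOKEN = re.compile(
    r"""("[^"]*"|'[^']*')|([A-Za-z#@$](?:[\w.#@$]*[\w#@$])?)|\s*(>=|<=|~=|<>|&|\||>|<)\s*"""
)


def prettify_line(line):
    """Apply the formatting rules to one line of syntax (simplified sketch)."""
    if line.lstrip().startswith("*"):          # comment lines are left alone
        return line
    code, sep, comment = line.partition("/*")  # inline comments are left alone
    seen_word = False

    def fix(match):
        nonlocal seen_word
        literal, word, op = match.groups()
        if literal is not None:                # quoted text is left alone
            return literal
        if op is not None:                     # "&" -> " AND ", ">=" -> " GE ", ...
            return " " + OPERATORS[op] + " "
        lower = word.lower()
        is_first, seen_word = not seen_word, True
        if is_first and lower in ABBREVIATIONS:  # expand an abbreviated command
            return ABBREVIATIONS[lower]
        if lower in UPPERCASE_WORDS:           # commands and keywords in capitals
            return lower.upper()
        if lower in FUNCTIONS:                 # functions get an initial capital
            return lower.capitalize()
        return lower                           # everything else: lower case

    return TOKEN.sub(fix, code) + sep + comment


if __name__ == "__main__":
    print(prettify_line('if SP2>=sp1 & x<3 Middle = SUBSTR(name,1,2).'))
    print(prettify_line('freq VAR1.'))
```

Abbreviations are expanded only when they are the first word on a line, so a variable that happens to be called freq later in a command is not touched. Multi-line comments and "/*" inside a quoted string are not handled in this sketch.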

Art Kendall
Social Research Consultants

On 7/6/2012 5:02 PM, Bruce Weaver wrote:

Thanks David.  I didn't think of PRINT.

By the way, I LIKE the uppercase letters for commands too, but am usually
too lazy to hold down the shift key while I type them.  ;-)

And before someone jumps in suggesting that I use auto-complete (or whatever
it's called), I've turned it off because I find it gets in the way.




David Marso wrote

Alternatively keep them as #scratch variables and use PRINT to inspect
them in the output.
Note a few mods (CHAR.) to Bruce's code and a fix in the middle
computation.
I like/tend to use CAPITAL letters for SPSS commands and lower case for
vars.
I find it easier to quickly scan code for basic operations and logic.

STRING first middle last (A15).
COMPUTE #sp1 = CHAR.INDEX(name," "). /* position of 1st space.
COMPUTE #sp2 = CHAR.RINDEX(RTRIM(name)," "). /* position of *last* space.
COMPUTE first = CHAR.SUBSTR(name,1,#sp1-1). /* Sub-string from 1 to #SP1-1.
COMPUTE last = CHAR.SUBSTR(name,#sp2+1). /* Sub-string from #SP2+1 to the end.
IF #sp2 NE #sp1 middle = CHAR.SUBSTR(name,#sp1+1,#sp2-#sp1-1).

PRINT / name first last #sp1 #sp2.
EXE.



--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Parsing-String-Name-Variable-tp5714037p5714061.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
