|
Colleagues
I hope this forum can help. I am looking to investigate job requirements by carrying out a content analysis of job ads on popular job sites in Australia (e.g. SEEK, MYCAREER, CAREERONE). Can anyone recommend suitable software that has been used to do this? Can anyone also suggest a convenient way of automatically reading and storing the advertisements' content in a form suitable for later analysis?
Thanks
Rod |
|
Rod,
What you are thinking of is quite a large undertaking. It is going to require a lot of time to do all of that content analysis and to get everything set up to analyze. I think your search for a quick way to do the analysis will lead you down one of two paths: either creating a custom program, or putting a great many man-hours into the work.

If you don't mind investing the time, there are some programs out there that can help with storing web pages for analysis and coding at a later date. The one program I would recommend is called Website Extractor. It goes to web pages and archives each entire page onto your hard drive for offline viewing. This way you can go to the web at regular intervals, download the content of the page, and spend time coding your findings at a later date.

In terms of analyzing the text of the advertisements, you can copy the text off the web pages and store it in a program such as "SPSS - Text Analysis" or NVIVO. These programs do not do all of the work for you, but they can speed the process along. Even with these programs you have to read all of your entries to ensure they are classified correctly, and to assign them to categories the text analyzer did not detect. Do not forget that in content analysis a lot of the information you take away comes not only from the number of times individual words are used, but also from the tone of the piece and how it is positioned.

One final question: are you sure content analysis is the best way to go? Perhaps if you let us know what your research goals are, we can come up with alternative methodologies for you and your research partners to employ. Notice how I assume you have research partners working on this project with you... make sure that your research design includes a way to measure inter-rater reliability. That is something the list can definitely help you with.
Don |
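On the inter-rater reliability point, one common measure for two coders assigning each ad to a single category is Cohen's kappa. The sketch below is only an illustration in plain Python, not something posted to the list: the function name `cohens_kappa`, the category labels, and the two coders' codings are all made up for the example, and it assumes both coders rated the same ads in the same order.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codings of six job ads by two coders.
coder_1 = ["IT", "IT", "sales", "admin", "sales", "IT"]
coder_2 = ["IT", "sales", "sales", "admin", "sales", "IT"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa corrects raw percent agreement for the agreement two coders would reach by chance, which matters when a few categories dominate the ads.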
|
Hi all,
Easy question. I have a string variable, which we'll call "location", in the form "Ithica-NY-US." I need to extract the city component of this variable to create a string "city" variable, so for present purposes all I am really concerned with is Ithica.

I tried the following (below), which uses RTRIM, figuring that I could just conveniently "trim away" from the right. The first step was to compute citystat (e.g., Ithica-NY), thinking that RTRIM might trim off everything from the first hyphen on the right. Since I want only the city, I would then compute the city variable by applying RTRIM again, this time to citystat. But it doesn't quite do the job (actually, it doesn't do much of anything), so I'm likely either using RTRIM incorrectly or I need a different function. This seems like a pretty routine procedure, so I took a quick browse through the user's guide, but couldn't find much. What is the easiest way to do this?

STRING citystat (A20) .
COMPUTE citystat=RTRIM(location, "-").
execute .

STRING city (A20) .
COMPUTE city=RTRIM(citystat, "-").
execute .

Thanks,
- Matt |
|
Hi Matthew:
Try this:

* Sample dataset*.
DATA LIST LIST/location(A20).
BEGIN DATA
"Ithica-NY-US."
END DATA.

STRING city (A20) .
COMPUTE city=SUBSTR(location,1,INDEX(location,"-")-1).
LIST.

HTH,
Marta |
|
Watch out for "Wilkes-Barre-PA-US"
:~)
--jim |
|
Marks, Jim wrote:
> Watch out for "Wilkes-Barre-PA-US"

Thanks for pointing it out. How about this then?:

* Sample dataset*.
DATA LIST LIST/location(A20).
BEGIN DATA
"Ithica-NY-US."
"Wilkes-Barre-PA-US"
END DATA.

STRING #step city (A20).
* In two steps *.
COMPUTE #step = SUBSTR(location,1,RINDEX(location,"-")-1).
COMPUTE city = SUBSTR(#step,1,RINDEX(#step,"-")-1).
LIST. |
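For anyone who wants to check the two-step, work-from-the-right logic outside SPSS, here is a rough plain-Python analogue. It is only a sketch: the helper names `drop_last_field` and `city_from_location` are invented for this illustration, and `str.rpartition` stands in for the RINDEX/SUBSTR pair rather than reproducing SPSS behaviour exactly.

```python
def drop_last_field(text, sep="-"):
    """Return everything before the last separator (empty string if none)."""
    head, _, _ = text.rpartition(sep)
    return head


def city_from_location(location):
    """Strip the country, then the state, working from the right."""
    step = drop_last_field(location.rstrip())  # removes the country
    return drop_last_field(step)               # removes the state


for value in ["Ithica-NY-US.", "Wilkes-Barre-PA-US", "aix-en-provence-pa-us"]:
    print(value, "->", city_from_location(value))

# Cutting at the first hyphen instead loses part of a hyphenated city name.
print("Wilkes-Barre-PA-US".split("-")[0])
```

Because only the last two fields are removed, any number of hyphens inside the city name is left alone.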
|
In reply to this post by Marks, Jim
Here is a little Python code to do this job - and it allows for Wilkes-Barre. Explanation follows the code. This approach does not use regular expressions, since the problem is so simple, but a regular expression could be used to do a much more complex extraction.
This code assumes that the values always have the specified form. It would need a little modification to handle blank or more irregular values.

data list free/cityStateCountry(a30).
begin data.
ithaca-NY-US
chicago-IL-US
wilkes-barre-pa-ca
end data.

begin program.
import spss
from spssdata import *
cursor = Spssdata(indexes='cityStateCountry', accessType='w')
cursor.append(vdef("city", vtype=20))
cursor.commitdict()
for case in cursor:
    city = "-".join(case.cityStateCountry.split("-")[:-2])
    cursor.casevalues([city])
cursor.CClose()
end program.

- The cursor = line gets access to the cityStateCountry variable in the active dataset and specifies write mode.
- cursor.append defines a new string variable named city with a width of 20 (A20).
- cursor.commitdict gets ready to pass the data.
- The for case.. line loops over the data.
- city = gets the city value out of the cityStateCountry variable. It first splits up the variable at each "-" into a list of values. Then it takes all but the last two items (the state and country) and joins the items back together with the "-" again. That's how it accommodates Wilkes-Barre.
- cursor.casevalues passes the city value to SPSS.
- cursor.CClose ends the access to the data.

This code requires SPSS 14.0.1 or later, Python, the Python programmability plug-in, and a few modules from SPSS Developer Central (www.spss.com/devcentral).

HTH.
Jon Peck |
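The post mentions that a regular expression could handle a more complex extraction. As a hedged illustration of what that might look like in plain Python (outside SPSS, using only the standard `re` module), the sketch below pulls out all three parts at once. The pattern, the name `CITY_PATTERN`, and the function `split_location` are assumptions made for this example, not code from the thread.

```python
import re

# Greedy city part, then exactly two hyphen-free fields at the end.
CITY_PATTERN = re.compile(r"^(?P<city>.+)-(?P<state>[^-]+)-(?P<country>[^-]+)$")


def split_location(value):
    """Return (city, state, country), or None if the value lacks two hyphens."""
    match = CITY_PATTERN.match(value.strip())
    if match is None:
        return None
    return match.group("city"), match.group("state"), match.group("country")


for value in ["ithaca-NY-US", "chicago-IL-US", "wilkes-barre-pa-ca", ""]:
    print(repr(value), "->", split_location(value))
```

Returning None for values that do not match is one way to deal with the blank or irregular values the post warns about; what to store for those cases is left open here.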
|
In reply to this post by Marta Garcia-Granero
Well, what about
aix-en-provence-pa-us
? |
|
In reply to this post by Matthew Reeder
It may be better to work from the right, rather than from the left, to find the characters prior to the second-to-last hyphen.
Melissa |
|
In reply to this post by Matthew Reeder
Or...if the last 6 characters are always the ones to drop (2-character state and 2-character country codes plus 2 hyphens), you could do something like

compute city=substr(location,1,length(rtrim(location))-6).

Then it never matters how many hyphens there are within the city name. (The RTRIM is there because LENGTH can return the declared width of the variable, trailing blanks included, rather than the length of the value itself.)
Melissa |
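A quick way to see where the fixed-width idea holds and where it breaks is to try it in plain Python. This is just a sketch: `city_fixed_suffix` is a made-up name, and it assumes every value ends in exactly "-SS-CC" once trailing blanks are stripped.

```python
def city_fixed_suffix(location, suffix_len=6):
    """Drop a fixed-length '-SS-CC' suffix after removing trailing blanks."""
    trimmed = location.rstrip()
    if len(trimmed) <= suffix_len:
        return ""
    return trimmed[:-suffix_len]


print(city_fixed_suffix("Wilkes-Barre-PA-US"))
print(city_fixed_suffix("aix-en-provence-pa-us   "))
# The sample value from the original question ends in a period, so the
# suffix is 7 characters long and a stray hyphen is left behind.
print(city_fixed_suffix("Ithica-NY-US."))
```

The last call shows the trade-off: counting characters is simple and ignores hyphens in the city name, but any extra trailing character or a longer country code shifts the cut point.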
|
In reply to this post by Peck, Jon
On 7/19/07, Hal 9000 <[hidden email]> wrote:
> Haha, the obvious solution is to work backwards from the 2 hyphens separating state/country. There are no hyphenated states, right? |
|
In reply to this post by Melissa Ives
Melissa Ives wrote:
> It may be better to work from the right, rather than from the left to find the characters prior to the 2nd to last hyphen.

That's exactly what my solution does (uses RINDEX instead of INDEX).
Marta |
|
In reply to this post by Peck, Jon
Peck, Jon wrote:
> Well, what about
> aix-en-provence-pa-us
> ?

No problem with the solution I provided (using RINDEX), since it works in two steps, eliminating the country first and then the state, leaving the city untouched (even if it has a bunch of hyphens).
Regards,
Marta |
