Text manipulation

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Text manipulation

Chris-315
I've got a text field that contains alpha, numerics and symbols and I want
to subdivide this one field into many. Problem is the breaks between
characters are not consistent from record to record (e.g., "simvastatin--
PO 80mg TAB" and "mevacor--PO 20mg TAB").  Ideally I would like to make 2
fields with these data. One for drug name as string
(containing "simvastatin" and "mevacor) and another containing the dose as
numerics (80 and 20).  In other words, using the characters "--" as a
reference point, I'd like to extract all to the left into a strig field
NEWFIELD1 AND characters number 4 and 5 to the right of "--" into a
numeric field NEWFIELD2.  How would I go about doing this??? HELP!  Thanks
in advance for any help!
Reply | Threaded
Open this post in threaded view
|

Re: Text manipulation

Ken Chui
Suppose your variable is called "drug"; the result drug name is called
"drugname", and dosage as "dosage"; both 20 spaces long... then this syntax
should be able to do the job:

STRING drugname (A20) .
COMPUTE drugname=SUBSTR(drug,1,INDEX(drug,"-")-1).
EXE .
STRING dosage (A20) .
COMPUTE dosage=SUBSTR(drug,INDEX(drug,"-")+2).
EXE .

If the durg name is too short, you can change that (A20) to anything higher
as long as it's smaller than (A255).

On Tue, 10 Oct 2006 18:01:19 -0400, Chris <[hidden email]> wrote:

>I've got a text field that contains alpha, numerics and symbols and I want
>to subdivide this one field into many. Problem is the breaks between
>characters are not consistent from record to record (e.g., "simvastatin--
>PO 80mg TAB" and "mevacor--PO 20mg TAB").  Ideally I would like to make 2
>fields with these data. One for drug name as string
>(containing "simvastatin" and "mevacor) and another containing the dose as
>numerics (80 and 20).  In other words, using the characters "--" as a
>reference point, I'd like to extract all to the left into a strig field
>NEWFIELD1 AND characters number 4 and 5 to the right of "--" into a
>numeric field NEWFIELD2.  How would I go about doing this??? HELP!  Thanks
>in advance for any help!
Reply | Threaded
Open this post in threaded view
|

Re: Text manipulation

Marks, Jim
In reply to this post by Chris-315
Chris:

You can do this with INDEX and SUBSTR:

*** (not tested).
STRING  newfield1 (a20).

COMPUTE newfield1 =
  SUBSTR(oldfield,1,INDEX(oldfield,'--')-1).

COMPUTE newfield2 =
        NUMBER( SUBSTR(oldfield,INDEX(oldfield,'--')+4,2),F8.0).

EXECUTE.

SUBSTR returns part of a string-- SUBSTR(stringvar, startposition,
endposition).
INDEX returns a number based on the location of a target within a
string-- INDEX(stringvar,'target').

The output of INDEX gives a number for SUBSTR.

You can do arithmetic with the results of INDEX-- INDEX( )-1 gives one
position before the target.

You might need to adjust the start and end positions in the syntax above
to get the exact position you need.

Note that in your examples the start position is different:

        simvastatin-- PO 80mg TAB
                --> '-- PO 80' the number is 7 characters after --
        mevacor--PO 20mg TAB
        --> '--PO 20' the number is 6 characters after --

--jim


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Chris
Sent: Tuesday, October 10, 2006 5:01 PM
To: [hidden email]
Subject: Text manipulation

I've got a text field that contains alpha, numerics and symbols and I
want to subdivide this one field into many. Problem is the breaks
between characters are not consistent from record to record (e.g.,
"simvastatin-- PO 80mg TAB" and "mevacor--PO 20mg TAB").  Ideally I
would like to make 2 fields with these data. One for drug name as string
(containing "simvastatin" and "mevacor) and another containing the dose
as numerics (80 and 20).  In other words, using the characters "--" as a
reference point, I'd like to extract all to the left into a strig field
NEWFIELD1 AND characters number 4 and 5 to the right of "--" into a
numeric field NEWFIELD2.  How would I go about doing this??? HELP!
Thanks in advance for any help!
Reply | Threaded
Open this post in threaded view
|

Re: Text manipulation

Maguin, Eugene
In reply to this post by Chris-315
Chris,

Read carefully the documentation (syntax reference) for the various string
functions. Anyway. Let field be your existing text string.

Compute b=index(field,'--').
String newfield1(a__). /* you'll need to set the length here.
Do if (b ne 0).
+  compute newfield1=substr(field,1,b-1).
+  compute newfield2=numeric(substr(field,b+5,2),f2.0).
End if.
*  I assume all dosages are 2 digits. If not, you'll need to
*  adjust the read format. Read the syntax reference.


Gene Maguin