Trailing negative signs Explained

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Trailing negative signs Explained

Norton, John
Hi List,

What follows is a long and somewhat elementary explanation for the solution I just posted.  I thought to provide it now to offer more information so that those unfamiliar with, or just starting out using SPSS command language can better understand what I've suggested as a solution to the question about extracting the numeric information from a string variable when there are trailing negative signs.  Keep in mind, however, that mine is one way to skin this cat (apologies to the ASPCA); as is frequently the case with managing data and composing code, there are other ways too.

The solution I suggested included 3 functions.  I embedded one function - the INDEX() function - within another function - the SUBSTR() function.  These I embedded within the NUMBER() function.  Essentially, the thought process is this:

First, I assumed that there was a constant characteristic among all values, which is the trailing negative sign. I did not assume that it's location was constant, and therefore, I needed to use the INDEX() function to locate the position of the minus sign.  The INDEX() function has two arguments contained within the parentheses: first the target variable is specified, and then within quotations (which are required) I identified the target character.  All elements in the arguments for functions are separated by commas.  If I were interested in creating a variable which just records the location of the target character, I could use a single COMPUTE statement thus:

COMPUTE locatn = INDEX(str_var,"-").


Secondly, I wanted to extract all the information which appears to the left of the minus sign.  For this I used the substring, or SUBSTR() function.  This function has 3 elements which must be specified within the parentheses.  They are the target variable, the position within the string variable at which I want to begin my extractions, and the length or number of characters I want to extract.  So, when I use a SUBSTR() function, my thought processes are as follows: "extract from this variable and starting in this position, these many characters".  So, a substring function which looks like: SUBSTR(target_var,1,5) could be read like, "extract from the variable 'target_var' starting in the first position, and take 5 characters".  So if the target value is "abcdefg" the result would be "abcde".  The COMPUTE statement for this could look like:

COMPUTE extraction = SUBSTR(str_var,1,5).

But the challenge now is what to do when the number of characters you want to extract is not constant.  In the example above, the number of characters I wanted to extract was 5.  But what to do when that number is variable?  The thought process is to determine whether there is a common character - common to all values within the target variable - on which you can locate and use as a reference point.  In Dan's request, it was the negative or minus sign, but other times, it could be the space between a first and last name, or the comma separating last name from the first.

Remember above, I used the INDEX() function to locate the position of that common character.  The though process here is that I want all characters preceding that common character.  So, if that common character is lo9cated in the 4th position in the string, then I want the first 3 characters.  Put another way, if the target character is located in the 4th position, then I want to extract all characters starting from the first position through the 3rd position (or the location of the target character minus one position).

So, now using Dan's data (or who I assumed it would look) we have:
1-
12-
123-
1234-
12345-

The target character is the minus sign, and exists in positions 2, 3, 4, 5 and 6 of the target variable.  So here, I embed the INDEX() function to define the 3rd element of the SUBSTR(), thus:

COMPUTE new_var = SUBSTR(str_var,1,INDEX(str_var,"-")-1).

The thought processes here may be read as follows: Extract from "str_var" starting in the 1st position, and take all characters up to the position of the negative sign, minus 1.   So, if the negative sign is in the 4th position, take 4-1, or a total of 3 characters from the target variable, starting with the first character.

Now, keep in mind that at this point in the game, the target variable is string, and the characters - though they happen to be numbers - remain defined as a string.  If we want those numbers to be defined as a number in SPSS, we need the NUMERIC() function to do this.  This function has 2 elements, the target variable, and the numeric format.

Again, I embed the functions thus:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

The thought process can be read from the above thus: The SUBSTR() function will extract characters from "str_var" starting in the 1st position, and continue up to the position of the negative sign, minus 1.  Then the NUMBER function will convert the characters from their current definition of string to numeric data.  (The "F8" instructs SPSS to use a floating point format).

I hope this was useful; I recognize many who contribute to this list are already quite conversant in SPSS command language, almost to the extent composing poetic code.  However, for those new to SPSS and/or its command language (often called "syntax") I hope deconstructing the fairly complex command into its elements and breaking it down was useful.

John Norton
SPSS Inc.

-----Original Message-----
From: Norton, John
Sent: Monday, July 02, 2007 9:08 AM
To: Dan Meir; [hidden email]
Subject: RE: Trailing negative signs

Hi Dan,

Assuming that your data look like the following, you can use an INDEX(), NUMBER() and a SUBSTR() function:

Assume your data look like this (and is called "str_var"):
1-
12-
123-
1234-
12345-

You can extract the numeric information by using the following COMPUTE statement:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

HTH

John Norton
SPSS Inc.



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dan Meir
Sent: Monday, July 02, 2007 8:18 AM
To: [hidden email]
Subject: Trailing negative signs

Hello listers,

I have been looking for a solution to convert (string) data that includes
trailing negative signs into regular numeric data.  REPLACE is not an
option as I am using version 13.0.  Is there a way to use RTRIM?  It
hasn't  been useful so far.

Thanks everyone.

Dan
Reply | Threaded
Open this post in threaded view
|

Re: Trailing negative signs Explained

Dan Meir
John,

This solution makes a lot of sense (and the explanation will help as well).

Thanks!

Dan




----- Original Message ----
From: "Norton, John" <[hidden email]>
To: Dan Meir <[hidden email]>; [hidden email]
Sent: Monday, July 2, 2007 10:15:26 AM
Subject: Trailing negative signs Explained


Hi List,

What follows is a long and somewhat elementary explanation for the solution I just posted.  I thought to provide it now to offer more information so that those unfamiliar with, or just starting out using SPSS command language can better understand what I've suggested as a solution to the question about extracting the numeric information from a string variable when there are trailing negative signs.  Keep in mind, however, that mine is one way to skin this cat (apologies to the ASPCA); as is frequently the case with managing data and composing code, there are other ways too.

The solution I suggested included 3 functions.  I embedded one function - the INDEX() function - within another function - the SUBSTR() function.  These I embedded within the NUMBER() function.  Essentially, the thought process is this:

First, I assumed that there was a constant characteristic among all values, which is the trailing negative sign. I did not assume that it's location was constant, and therefore, I needed to use the INDEX() function to locate the position of the minus sign.  The INDEX() function has two arguments contained within the parentheses: first the target variable is specified, and then within quotations (which are required) I identified the target character.  All elements in the arguments for functions are separated by commas.  If I were interested in creating a variable which just records the location of the target character, I could use a single COMPUTE statement thus:

COMPUTE locatn = INDEX(str_var,"-").


Secondly, I wanted to extract all the information which appears to the left of the minus sign.  For this I used the substring, or SUBSTR() function.  This function has 3 elements which must be specified within the parentheses.  They are the target variable, the position within the string variable at which I want to begin my extractions, and the length or number of characters I want to extract.  So, when I use a SUBSTR() function, my thought processes are as follows: "extract from this variable and starting in this position, these many characters".  So, a substring function which looks like: SUBSTR(target_var,1,5) could be read like, "extract from the variable 'target_var' starting in the first position, and take 5 characters".  So if the target value is "abcdefg" the result would be "abcde".  The COMPUTE statement for this could look like:

COMPUTE extraction = SUBSTR(str_var,1,5).

But the challenge now is what to do when the number of characters you want to extract is not constant.  In the example above, the number of characters I wanted to extract was 5.  But what to do when that number is variable?  The thought process is to determine whether there is a common character - common to all values within the target variable - on which you can locate and use as a reference point.  In Dan's request, it was the negative or minus sign, but other times, it could be the space between a first and last name, or the comma separating last name from the first.

Remember above, I used the INDEX() function to locate the position of that common character.  The though process here is that I want all characters preceding that common character.  So, if that common character is lo9cated in the 4th position in the string, then I want the first 3 characters.  Put another way, if the target character is located in the 4th position, then I want to extract all characters starting from the first position through the 3rd position (or the location of the target character minus one position).

So, now using Dan's data (or who I assumed it would look) we have:
1-
12-
123-
1234-
12345-

The target character is the minus sign, and exists in positions 2, 3, 4, 5 and 6 of the target variable.  So here, I embed the INDEX() function to define the 3rd element of the SUBSTR(), thus:

COMPUTE new_var = SUBSTR(str_var,1,INDEX(str_var,"-")-1).

The thought processes here may be read as follows: Extract from "str_var" starting in the 1st position, and take all characters up to the position of the negative sign, minus 1.   So, if the negative sign is in the 4th position, take 4-1, or a total of 3 characters from the target variable, starting with the first character.

Now, keep in mind that at this point in the game, the target variable is string, and the characters - though they happen to be numbers - remain defined as a string.  If we want those numbers to be defined as a number in SPSS, we need the NUMERIC() function to do this.  This function has 2 elements, the target variable, and the numeric format.

Again, I embed the functions thus:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

The thought process can be read from the above thus: The SUBSTR() function will extract characters from "str_var" starting in the 1st position, and continue up to the position of the negative sign, minus 1.  Then the NUMBER function will convert the characters from their current definition of string to numeric data.  (The "F8" instructs SPSS to use a floating point format).

I hope this was useful; I recognize many who contribute to this list are already quite conversant in SPSS command language, almost to the extent composing poetic code.  However, for those new to SPSS and/or its command language (often called "syntax") I hope deconstructing the fairly complex command into its elements and breaking it down was useful.

John Norton
SPSS Inc.

-----Original Message-----
From: Norton, John
Sent: Monday, July 02, 2007 9:08 AM
To: Dan Meir; [hidden email]
Subject: RE: Trailing negative signs

Hi Dan,

Assuming that your data look like the following, you can use an INDEX(), NUMBER() and a SUBSTR() function:

Assume your data look like this (and is called "str_var"):
1-
12-
123-
1234-
12345-

You can extract the numeric information by using the following COMPUTE statement:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

HTH

John Norton
SPSS Inc.



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dan Meir
Sent: Monday, July 02, 2007 8:18 AM
To: [hidden email]
Subject: Trailing negative signs

Hello listers,

I have been looking for a solution to convert (string) data that includes
trailing negative signs into regular numeric data.  REPLACE is not an
option as I am using version 13.0.  Is there a way to use RTRIM?  It
hasn't  been useful so far.

Thanks everyone.

Dan



____________________________________________________________________________________
Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news, photos & more.
http://mobile.yahoo.com/go?refer=1GNXIC
Reply | Threaded
Open this post in threaded view
|

Re: Trailing negative signs Explained

Dan Meir
In reply to this post by Norton, John
John,

Actually, there indeed are both negative and positive values.  Now I have been able to create 2 variables, one for the positive and one for the negatives and computed a third which is a COMPUTE.

RECODE var1 (CONVERT) INTO var1_num.
COMPUTE var2= NUMBER(SUBSTR(var1,1,INDEX(var1,"-")-1),F20).
RECODE var2 var1_num (SYSMIS=0) .
COMPUTE final_variable = var1_num + (var2 * - 1) .
Format final_variable (f30.0).
EXECUTE .


It's a little long, but it definitely works.  Do you have a more elegant solution?

Thanks again,

Dan



----- Original Message ----
From: "Norton, John" <[hidden email]>
To: Dan Meir <[hidden email]>
Sent: Monday, July 2, 2007 10:51:15 AM
Subject: RE: Trailing negative signs Explained


My pleasure.  But I’m realizing that I’m assuming that all cases have that negative sign.  Is that the case?  Or, are some of the values positive while others are negative?  If the latter is the case, then you’ll have a bigger head ache to deal with, because if no negative sign exists, then all your positive values will be returned as system missing in your new variable!!  (Oops!!)
 
If this is a problem – if I’m correct in suspecting that your variable has both negative *and* positive values – please post a response to the list and me stating this, and I’ll post a solution.  I encourage posting to the LIST so all can benefit.
 
Thanks,
 
John
 



From: Dan Meir [mailto:[hidden email]]
Sent: Monday, July 02, 2007 10:44 AM
To: Norton, John; [hidden email]
Subject: Re: Trailing negative signs Explained
 
John,
 
This solution makes a lot of sense (and the explanation will help as well).
 
Thanks!

Dan
 


 
----- Original Message ----
From: "Norton, John" <[hidden email]>
To: Dan Meir <[hidden email]>; [hidden email]
Sent: Monday, July 2, 2007 10:15:26 AM
Subject: Trailing negative signs Explained
Hi List,

What follows is a long and somewhat elementary explanation for the solution I just posted.  I thought to provide it now to offer more information so that those unfamiliar with, or just starting out using SPSS command language can better understand what I've suggested as a solution to the question about extracting the numeric information from a string variable when there are trailing negative signs.  Keep in mind, however, that mine is one way to skin this cat (apologies to the ASPCA); as is frequently the case with managing data and composing code, there are other ways too.  

The solution I suggested included 3 functions.  I embedded one function - the INDEX() function - within another function - the SUBSTR() function.  These I embedded within the NUMBER() function.  Essentially, the thought process is this:

First, I assumed that there was a constant characteristic among all values, which is the trailing negative sign. I did not assume that it's location was constant, and therefore, I needed to use the INDEX() function to locate the position of the minus sign.  The INDEX() function has two arguments contained within the parentheses: first the target variable is specified, and then within quotations (which are required) I identified the target character.  All elements in the arguments for functions are separated by commas.  If I were interested in creating a variable which just records the location of the target character, I could use a single COMPUTE statement thus:

COMPUTE locatn = INDEX(str_var,"-").


Secondly, I wanted to extract all the information which appears to the left of the minus sign.  For this I used the substring, or SUBSTR() function.  This function has 3 elements which must be specified within the parentheses.  They are the target variable, the position within the string variable at which I want to begin my extractions, and the length or number of characters I want to extract.  So, when I use a SUBSTR() function, my thought processes are as follows: "extract from this variable and starting in this position, these many characters".  So, a substring function which looks like: SUBSTR(target_var,1,5) could be read like, "extract from the variable 'target_var' starting in the first position, and take 5 characters".  So if the target value is "abcdefg" the result would be "abcde".  The COMPUTE statement for this could look like:

COMPUTE extraction = SUBSTR(str_var,1,5).

But the challenge now is what to do when the number of characters you want to extract is not constant.  In the example above, the number of characters I wanted to extract was 5.  But what to do when that number is variable?  The thought process is to determine whether there is a common character - common to all values within the target variable - on which you can locate and use as a reference point.  In Dan's request, it was the negative or minus sign, but other times, it could be the space between a first and last name, or the comma separating last name from the first.

Remember above, I used the INDEX() function to locate the position of that common character.  The though process here is that I want all characters preceding that common character.  So, if that common character is lo9cated in the 4th position in the string, then I want the first 3 characters.  Put another way, if the target character is located in the 4th position, then I want to extract all characters starting from the first position through the 3rd position (or the location of the target character minus one position).

So, now using Dan's data (or who I assumed it would look) we have:
1-
12-
123-
1234-
12345-

The target character is the minus sign, and exists in positions 2, 3, 4, 5 and 6 of the target variable.  So here, I embed the INDEX() function to define the 3rd element of the SUBSTR(), thus:

COMPUTE new_var = SUBSTR(str_var,1,INDEX(str_var,"-")-1).

The thought processes here may be read as follows: Extract from "str_var" starting in the 1st position, and take all characters up to the position of the negative sign, minus 1.   So, if the negative sign is in the 4th position, take 4-1, or a total of 3 characters from the target variable, starting with the first character.

Now, keep in mind that at this point in the game, the target variable is string, and the characters - though they happen to be numbers - remain defined as a string.  If we want those numbers to be defined as a number in SPSS, we need the NUMERIC() function to do this.  This function has 2 elements, the target variable, and the numeric format.

Again, I embed the functions thus:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

The thought process can be read from the above thus: The SUBSTR() function will extract characters from "str_var" starting in the 1st position, and continue up to the position of the negative sign, minus 1.  Then the NUMBER function will convert the characters from their current definition of string to numeric data.  (The "F8" instructs SPSS to use a floating point format).

I hope this was useful; I recognize many who contribute to this list are already quite conversant in SPSS command language, almost to the extent composing poetic code.  However, for those new to SPSS and/or its command language (often called "syntax") I hope deconstructing the fairly complex command into its elements and breaking it down was useful.

John Norton
SPSS Inc.

-----Original Message-----
From: Norton, John
Sent: Monday, July 02, 2007 9:08 AM
To: Dan Meir; [hidden email]
Subject: RE: Trailing negative signs

Hi Dan,

Assuming that your data look like the following, you can use an INDEX(), NUMBER() and a SUBSTR() function:

Assume your data look like this (and is called "str_var"):
1-
12-
123-
1234-
12345-

You can extract the numeric information by using the following COMPUTE statement:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

HTH

John Norton
SPSS Inc.



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dan Meir
Sent: Monday, July 02, 2007 8:18 AM
To: [hidden email]
Subject: Trailing negative signs

Hello listers,

I have been looking for a solution to convert (string) data that includes
trailing negative signs into regular numeric data.  REPLACE is not an
option as I am using version 13.0.  Is there a way to use RTRIM?  It
hasn't  been useful so far.

Thanks everyone.

Dan
 
 



Park yourself in front of a world of choices in alternative vehicles.
Visit the Yahoo! Auto Green Center.


       
____________________________________________________________________________________Ready for the edge of your seat?
Check out tonight's top picks on Yahoo! TV.
http://tv.yahoo.com/
Reply | Threaded
Open this post in threaded view
|

Re: Trailing negative signs Explained

Norton, John
Hi Dan,

 

Use a DO IF procedure to execute the COMPUTE statement which uses the INDEX() function when there is a negative sign, and a different COMPUTE (which doesn't use the INDEX() function) when the value in the target variable is positive:

 

DO IF (INDEX(str_var,"-") <> 0).

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

ELSE.

COMPUTE num_var = NUMBER(SUBSTR(str_var,1),F8).

END IF.

EXE.

 

Looking at the second COMPUTE statement, notice that the SUBSTR() function only defines two of the elements.  In my earlier post, I mentioned that there are 3 elements in this argument.  But in this second COMPUTE statement, the last element - which specifies the number of characters to be extracted - is not defined.  By default, then, the extraction will take all characters.

 

Note also that the two COMPUTE statements create and populate only one variable, here called "num_var." Given that the DO IF employs a condition to the processing, we can execute one COMPUTE under certain conditions, and another COMPUTE which will populate the same variable in a different way when alternate conditions exist.

 

The DO IF/ELSE IF process is perfect for conditional processing, and is essentially a logical structure within which transformations (and only transformations) can be executed.  So, reading the DO IF might sound something like this:  "Do the following transformations if this condition is true...  If this condition is not true, then skip across all the conditional transformations and evaluate whether the alternative condition is true.."   So, if it's true that the location of the negative sign in str_var is in a position greater than 0, then the first COMPUTE is executed.  But if that condition is evaluated as false, then skip to the next criteria - in this case, the "ELSE" covers all other situations - and execute that COMPUTE statement.

 

HTH,

 

John Norton

SPSS Inc.

 

 

________________________________

From: Dan Meir [mailto:[hidden email]]
Sent: Monday, July 02, 2007 11:14 AM
To: Norton, John; [hidden email]
Subject: Re: Trailing negative signs Explained

 

John,

 

Actually, there indeed are both negative and positive values.  Now I have been able to create 2 variables, one for the positive and one for the negatives and computed a third which is a COMPUTE.

 

RECODE var1 (CONVERT) INTO var1_num.

COMPUTE var2= NUMBER(SUBSTR(var1,1,INDEX(var1,"-")-1),F20).

RECODE var2 var1_num (SYSMIS=0) .

COMPUTE final_variable = var1_num + (var2 * - 1) .

Format final_variable (f30.0).

EXECUTE .

 

 

It's a little long, but it definitely works.  Do you have a more elegant solution?

 

Thanks again,


Dan

 

----- Original Message ----
From: "Norton, John" <[hidden email]>
To: Dan Meir <[hidden email]>
Sent: Monday, July 2, 2007 10:51:15 AM
Subject: RE: Trailing negative signs Explained

My pleasure.  But I'm realizing that I'm assuming that all cases have that negative sign.  Is that the case?  Or, are some of the values positive while others are negative?  If the latter is the case, then you'll have a bigger head ache to deal with, because if no negative sign exists, then all your positive values will be returned as system missing in your new variable!!  (Oops!!)

 

If this is a problem - if I'm correct in suspecting that your variable has both negative *and* positive values - please post a response to the list and me stating this, and I'll post a solution.  I encourage posting to the LIST so all can benefit.

 

Thanks,

 

John

 

________________________________

From: Dan Meir [mailto:[hidden email]]
Sent: Monday, July 02, 2007 10:44 AM
To: Norton, John; [hidden email]
Subject: Re: Trailing negative signs Explained

 

John,

 

This solution makes a lot of sense (and the explanation will help as well).

 

Thanks!


Dan

 



 

----- Original Message ----
From: "Norton, John" <[hidden email]>
To: Dan Meir <[hidden email]>; [hidden email]
Sent: Monday, July 2, 2007 10:15:26 AM
Subject: Trailing negative signs Explained

Hi List,

What follows is a long and somewhat elementary explanation for the solution I just posted.  I thought to provide it now to offer more information so that those unfamiliar with, or just starting out using SPSS command language can better understand what I've suggested as a solution to the question about extracting the numeric information from a string variable when there are trailing negative signs.  Keep in mind, however, that mine is one way to skin this cat (apologies to the ASPCA); as is frequently the case with managing data and composing code, there are other ways too.  

The solution I suggested included 3 functions.  I embedded one function - the INDEX() function - within another function - the SUBSTR() function.  These I embedded within the NUMBER() function.  Essentially, the thought process is this:

First, I assumed that there was a constant characteristic among all values, which is the trailing negative sign. I did not assume that it's location was constant, and therefore, I needed to use the INDEX() function to locate the position of the minus sign.  The INDEX() function has two arguments contained within the parentheses: first the target variable is specified, and then within quotations (which are required) I identified the target character.  All elements in the arguments for functions are separated by commas.  If I were interested in creating a variable which just records the location of the target character, I could use a single COMPUTE statement thus:

COMPUTE locatn = INDEX(str_var,"-").


Secondly, I wanted to extract all the information which appears to the left of the minus sign.  For this I used the substring, or SUBSTR() function.  This function has 3 elements which must be specified within the parentheses.  They are the target variable, the position within the string variable at which I want to begin my extractions, and the length or number of characters I want to extract.  So, when I use a SUBSTR() function, my thought processes are as follows: "extract from this variable and starting in this position, these many characters".  So, a substring function which looks like: SUBSTR(target_var,1,5) could be read like, "extract from the variable 'target_var' starting in the first position, and take 5 characters".  So if the target value is "abcdefg" the result would be "abcde".  The COMPUTE statement for this could look like:

COMPUTE extraction = SUBSTR(str_var,1,5).

But the challenge now is what to do when the number of characters you want to extract is not constant.  In the example above, the number of characters I wanted to extract was 5.  But what to do when that number is variable?  The thought process is to determine whether there is a common character - common to all values within the target variable - on which you can locate and use as a reference point.  In Dan's request, it was the negative or minus sign, but other times, it could be the space between a first and last name, or the comma separating last name from the first.

Remember above, I used the INDEX() function to locate the position of that common character.  The though process here is that I want all characters preceding that common character.  So, if that common character is lo9cated in the 4th position in the string, then I want the first 3 characters.  Put another way, if the target character is located in the 4th position, then I want to extract all characters starting from the first position through the 3rd position (or the location of the target character minus one position).

So, now using Dan's data (or who I assumed it would look) we have:
1-
12-
123-
1234-
12345-

The target character is the minus sign, and exists in positions 2, 3, 4, 5 and 6 of the target variable.  So here, I embed the INDEX() function to define the 3rd element of the SUBSTR(), thus:

COMPUTE new_var = SUBSTR(str_var,1,INDEX(str_var,"-")-1).

The thought processes here may be read as follows: Extract from "str_var" starting in the 1st position, and take all characters up to the position of the negative sign, minus 1.   So, if the negative sign is in the 4th position, take 4-1, or a total of 3 characters from the target variable, starting with the first character.

Now, keep in mind that at this point in the game, the target variable is string, and the characters - though they happen to be numbers - remain defined as a string.  If we want those numbers to be defined as a number in SPSS, we need the NUMERIC() function to do this.  This function has 2 elements, the target variable, and the numeric format.

Again, I embed the functions thus:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

The thought process can be read from the above thus: The SUBSTR() function will extract characters from "str_var" starting in the 1st position, and continue up to the position of the negative sign, minus 1.  Then the NUMBER function will convert the characters from their current definition of string to numeric data.  (The "F8" instructs SPSS to use a floating point format).

I hope this was useful; I recognize many who contribute to this list are already quite conversant in SPSS command language, almost to the extent composing poetic code.  However, for those new to SPSS and/or its command language (often called "syntax") I hope deconstructing the fairly complex command into its elements and breaking it down was useful.

John Norton
SPSS Inc.

-----Original Message-----
From: Norton, John
Sent: Monday, July 02, 2007 9:08 AM
To: Dan Meir; [hidden email]
Subject: RE: Trailing negative signs

Hi Dan,

Assuming that your data look like the following, you can use an INDEX(), NUMBER() and a SUBSTR() function:

Assume your data look like this (and is called "str_var"):
1-
12-
123-
1234-
12345-

You can extract the numeric information by using the following COMPUTE statement:

COMPUTE num_var = NUMBER(SUBSTR(str_var,1,INDEX(str_var,"-")-1),F8).

HTH

John Norton
SPSS Inc.



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Dan Meir
Sent: Monday, July 02, 2007 8:18 AM
To: [hidden email]
Subject: Trailing negative signs

Hello listers,

I have been looking for a solution to convert (string) data that includes
trailing negative signs into regular numeric data.  REPLACE is not an
option as I am using version 13.0.  Is there a way to use RTRIM?  It
hasn't  been useful so far.

Thanks everyone.

Dan

 

 

________________________________

Park yourself in front of a world of choices in alternative vehicles.
Visit the Yahoo! Auto Green Center. <http://us.rd.yahoo.com/evt=48246/*http:/autos.yahoo.com/green_center/;_ylc=X3oDMTE5cDF2bXZzBF9TAzk3MTA3MDc2BHNlYwNtYWlsdGFncwRzbGsDZ3JlZW4tY2VudGVy>

 

 

________________________________

Be a better Heartthrob. Get better relationship answers <http://us.rd.yahoo.com/evt=48255/*http:/answers.yahoo.com/dir/_ylc=X3oDMTI5MGx2aThyBF9TAzIxMTU1MDAzNTIEX3MDMzk2NTQ1MTAzBHNlYwNCQUJwaWxsYXJfTklfMzYwBHNsawNQcm9kdWN0X3F1ZXN0aW9uX3BhZ2U-?link=list&sid=396545433> from someone who knows.
Yahoo! Answers - Check it out.