Computing multiple string variables into one numeric column

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Computing multiple string variables into one numeric column

Jay Vicknair
Hello,

I'm trying to SPSS to convert multiple STRING dates into one column that
gives me either a 1 or a 0.

Usually if I just wanted to look for one case and compute it into my new
variable after running CASETOVARS, I would use something like this...

COMPUTE NUMERIC = ANY("01/20/12",STRING.1 TO STRING.20)

but now I have identical dates with trailing alpha characters

so I tried

COMPUTE NUMERIC = ANY("01/20/12A","01/20/12B","01/20/12C",STRING.1 TO
STRING.20)

but it only pulls from 01/20/12A.

What is the function to help me say I want to look at all the dates in my
variables that match my syntax?

Thank You

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

Bruce Weaver
Administrator
You could compute a set of scratch variables corresponding to string.1 to string.20, but extract only the first 8 characters.  Then your ANY approach should work.  Here's an example using only 3 variables to illustrate.

data list free / string.1 to string.3 (3a9) .
begin data
01/20/12A 02/20/12B 01/21/12C
02/20/12A 01/20/12B 04/20/12C
01/21/12A 01/22/12B 01/23/12C
01/20/12 05/20/12 01/22/12
06/20/12 05/20/12 01/22/12
end data.
list.

string #s1 to #s3 (a8).
do repeat old = string.1 to string.3 / new = #s1 to #s3 .
- compute new = substr(old,1,8).
end repeat.
compute flag = ANY("01/20/12",#s1 to #s3).
formats flag (f1.0).
list flag string.1 to string.3 .

OUTPUT:

flag string.1  string.2  string.3

  1  01/20/12A 02/20/12B 01/21/12C
  1  02/20/12A 01/20/12B 04/20/12C
  0  01/21/12A 01/22/12B 01/23/12C
  1  01/20/12  05/20/12  01/22/12
  0  06/20/12  05/20/12  01/22/12

Number of cases read:  5    Number of cases listed:  5


Jay Vicknair wrote
Hello,

I'm trying to SPSS to convert multiple STRING dates into one column that
gives me either a 1 or a 0.

Usually if I just wanted to look for one case and compute it into my new
variable after running CASETOVARS, I would use something like this...

COMPUTE NUMERIC = ANY("01/20/12",STRING.1 TO STRING.20)

but now I have identical dates with trailing alpha characters

so I tried

COMPUTE NUMERIC = ANY("01/20/12A","01/20/12B","01/20/12C",STRING.1 TO
STRING.20)

but it only pulls from 01/20/12A.

What is the function to help me say I want to look at all the dates in my
variables that match my syntax?

Thank You

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

Albert-Jan Roskam
In reply to this post by Jay Vicknair
Hi,

Perhaps:
compute num = 0.
do repeat #x = "01/20/12A","01/20/12B","01/20/12C".
+if( any(#x,STRING.1 TO STRING.20) ) num = num + 1.
end repeat.
recode num (1 thru hi = 1) (else = copy).
fre num.
 
Cheers!! Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Jay Vicknair <[hidden email]>
To: [hidden email]
Sent: Friday, January 20, 2012 2:27 PM
Subject: [SPSSX-L] Computing multiple string variables into one numeric column

Hello,

I'm trying to SPSS to convert multiple STRING dates into one column that
gives me either a 1 or a 0.

Usually if I just wanted to look for one case and compute it into my new
variable after running CASETOVARS, I would use something like this...

COMPUTE NUMERIC = ANY("01/20/12",STRING.1 TO STRING.20)

but now I have identical dates with trailing alpha characters

so I tried

COMPUTE NUMERIC = ANY("01/20/12A","01/20/12B","01/20/12C",STRING.1 TO
STRING.20)

but it only pulls from 01/20/12A.

What is the function to help me say I want to look at all the dates in my
variables that match my syntax?

Thank You

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

Bruce Weaver
Administrator
I assumed A, B, and C were given as examples of trailing characters.  My solution sets FLAG = 1 if the date is correct after stripping off any trailing character--or even if there is no trailing character.   Albert-Jan's solution, on the other hand, looks only for the correct date with  A, B or C tacked on to the end.  Re-reading the original post, I think Albert-Jan's solution is giving exactly what you asked for.

Cheers,
Bruce


Albert-Jan Roskam wrote
Hi,

Perhaps:
compute num = 0.
do repeat #x = "01/20/12A","01/20/12B","01/20/12C".
+if( any(#x,STRING.1 TO STRING.20) ) num = num + 1.
end repeat.
recode num (1 thru hi = 1) (else = copy).
fre num.

 
Cheers!!
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


>________________________________
> From: Jay Vicknair <[hidden email]>
>To: [hidden email] 
>Sent: Friday, January 20, 2012 2:27 PM
>Subject: [SPSSX-L] Computing multiple string variables into one numeric              column
>
>Hello,
>
>I'm trying to SPSS to convert multiple STRING dates into one column that
>gives me either a 1 or a 0.
>
>Usually if I just wanted to look for one case and compute it into my new
>variable after running CASETOVARS, I would use something like this...
>
>COMPUTE NUMERIC = ANY("01/20/12",STRING.1 TO STRING.20)
>
>but now I have identical dates with trailing alpha characters
>
>so I tried
>
>COMPUTE NUMERIC = ANY("01/20/12A","01/20/12B","01/20/12C",STRING.1 TO
>STRING.20)
>
>but it only pulls from 01/20/12A.
>
>What is the function to help me say I want to look at all the dates in my
>variables that match my syntax?
>
>Thank You
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD
>
>
>
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

David Marso
Administrator
You people are all thinking too hard!;-)
Key here is the INDEX function !!!!!
This will pick up any instances with leading fudge as well.
It believe it incumbent upon Jay (OP) to serve up some data samples if the following doesn't work.
--

data list free / string.1 to string.3 (3a9) .
begin data
01/20/12A 02/20/12B 01/21/12C
02/20/12A 01/20/12B 04/20/12C
01/21/12A 01/22/12B 01/23/12C
01/20/12 05/20/12 01/22/12
06/20/12 05/20/12 01/22/12
end data.
list.
* Don't forget to change the two 3's to 20!!!.
VECTOR STR = string.1 TO string.3.
LOOP #=1 TO 3.
COMPUTE NUMSTR = INDEX(STR(#), "01/20/12" ) GE 1.
END LOOP IF NUMSTR.
LIST.
 
STRING.1  STRING.2  STRING.3    NUMSTR

01/20/12A 02/20/12B 01/21/12C     1.00
02/20/12A 01/20/12B 04/20/12C     1.00
01/21/12A 01/22/12B 01/23/12C      .00
01/20/12  05/20/12  01/22/12      1.00
06/20/12  05/20/12  01/22/12       .00


Number of cases read:  5    Number of cases listed:  5

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

David Marso
Administrator
In reply to this post by Bruce Weaver
"Re-reading the original post, I think Albert-Jan's solution is giving exactly what you asked for."
After rereading the original post 3 or 4 times I believe your original reading is correct (any trailer).
AFAICT: My syntax behaves the same as yours.
OTOH:  My syntax goes for the jugular with INDEX rather than SUBSTR and immediately bails when found rather than tediously continuing to plod through the remaining list in the DO REPEAT.
OTOH(2).  It would be nice if Jay were to clarify his data situation with sample input and any qualifications of what trailing characters qualify.
NOTE on ANY:  You *CAN NOT* search a list within a list without some outer LOOPing context.
Albert-Jan and Bruce seem to prefer DO REPEAT.  I tend to opt for the VECTOR/LOOP due to the capability of immediate termination rather than the exhaustive search of the DO REPEAT
HTH, David

Bruce Weaver wrote
I assumed A, B, and C were given as examples of trailing characters.  My solution sets FLAG = 1 if the date is correct after stripping off any trailing character--or even if there is no trailing character.   Albert-Jan's solution, on the other hand, looks only for the correct date with  A, B or C tacked on to the end.  Re-reading the original post, I think Albert-Jan's solution is giving exactly what you asked for.

Cheers,
Bruce


Albert-Jan Roskam wrote
Hi,

Perhaps:
compute num = 0.
do repeat #x = "01/20/12A","01/20/12B","01/20/12C".
+if( any(#x,STRING.1 TO STRING.20) ) num = num + 1.
end repeat.
recode num (1 thru hi = 1) (else = copy).
fre num.

 
Cheers!!
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


>________________________________
> From: Jay Vicknair <[hidden email]>
>To: [hidden email] 
>Sent: Friday, January 20, 2012 2:27 PM
>Subject: [SPSSX-L] Computing multiple string variables into one numeric              column
>
>Hello,
>
>I'm trying to SPSS to convert multiple STRING dates into one column that
>gives me either a 1 or a 0.
>
>Usually if I just wanted to look for one case and compute it into my new
>variable after running CASETOVARS, I would use something like this...
>
>COMPUTE NUMERIC = ANY("01/20/12",STRING.1 TO STRING.20)
>
>but now I have identical dates with trailing alpha characters
>
>so I tried
>
>COMPUTE NUMERIC = ANY("01/20/12A","01/20/12B","01/20/12C",STRING.1 TO
>STRING.20)
>
>but it only pulls from 01/20/12A.
>
>What is the function to help me say I want to look at all the dates in my
>variables that match my syntax?
>
>Thank You
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD
>
>
>
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

Bruce Weaver
Administrator
In reply to this post by David Marso
Yep...that's definitely a nicer way of flagging cases where "01/20/12" is present in one of the string variables.  (INDEX crossed my mind, oddly enough; but I didn't think of sticking it in a loop with "END LOOP IF".  Then by the time I figured out I needed to loop, it didn't occur to me to try INDEX again.)

If Jay wants to flag only those cases where "01/20/12" has A, B or C tacked on the end (which is still not clear to me), the COMPUTE in the middle of the loop would be:

COMPUTE NUMSTR = ANY(STR(#),"01/20/12A","01/20/12B","01/20/12C").


David Marso wrote
You people are all thinking too hard!;-)
Key here is the INDEX function !!!!!
This will pick up any instances with leading fudge as well.
It believe it incumbent upon Jay (OP) to serve up some data samples if the following doesn't work.
--

data list free / string.1 to string.3 (3a9) .
begin data
01/20/12A 02/20/12B 01/21/12C
02/20/12A 01/20/12B 04/20/12C
01/21/12A 01/22/12B 01/23/12C
01/20/12 05/20/12 01/22/12
06/20/12 05/20/12 01/22/12
end data.
list.
* Don't forget to change the two 3's to 20!!!.
VECTOR STR = string.1 TO string.3.
LOOP #=1 TO 3.
COMPUTE NUMSTR = INDEX(STR(#), "01/20/12" ) GE 1.
END LOOP IF NUMSTR.
LIST.
 
STRING.1  STRING.2  STRING.3    NUMSTR

01/20/12A 02/20/12B 01/21/12C     1.00
02/20/12A 01/20/12B 04/20/12C     1.00
01/21/12A 01/22/12B 01/23/12C      .00
01/20/12  05/20/12  01/22/12      1.00
06/20/12  05/20/12  01/22/12       .00


Number of cases read:  5    Number of cases listed:  5
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Computing multiple string variables into one numeric column

David Marso
Administrator
If there are MANY acceptable trailers I would probably do that as:
COMPUTE NUMSTR=INDEX(STR(#), "01/20/12" ) GE 1 AND ANY(SUBSTR(STR(#),9,1),"A","B","C").
Then simply list them in the second clause.
--

Bruce Weaver wrote
Yep...that's definitely a nicer way of flagging cases where "01/20/12" is present in one of the string variables.  (INDEX crossed my mind, oddly enough; but I didn't think of sticking it in a loop with "END LOOP IF".  Then by the time I figured out I needed to loop, it didn't occur to me to try INDEX again.)

If Jay wants to flag only those cases where "01/20/12" has A, B or C tacked on the end (which is still not clear to me), the COMPUTE in the middle of the loop would be:

COMPUTE NUMSTR = ANY(STR(#),"01/20/12A","01/20/12B","01/20/12C").


David Marso wrote
You people are all thinking too hard!;-)
Key here is the INDEX function !!!!!
This will pick up any instances with leading fudge as well.
It believe it incumbent upon Jay (OP) to serve up some data samples if the following doesn't work.
--

data list free / string.1 to string.3 (3a9) .
begin data
01/20/12A 02/20/12B 01/21/12C
02/20/12A 01/20/12B 04/20/12C
01/21/12A 01/22/12B 01/23/12C
01/20/12 05/20/12 01/22/12
06/20/12 05/20/12 01/22/12
end data.
list.
* Don't forget to change the two 3's to 20!!!.
VECTOR STR = string.1 TO string.3.
LOOP #=1 TO 3.
COMPUTE NUMSTR = INDEX(STR(#), "01/20/12" ) GE 1.
END LOOP IF NUMSTR.
LIST.
 
STRING.1  STRING.2  STRING.3    NUMSTR

01/20/12A 02/20/12B 01/21/12C     1.00
02/20/12A 01/20/12B 04/20/12C     1.00
01/21/12A 01/22/12B 01/23/12C      .00
01/20/12  05/20/12  01/22/12      1.00
06/20/12  05/20/12  01/22/12       .00


Number of cases read:  5    Number of cases listed:  5
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"