Clean up a string variable

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Date calculations

David Marso
Administrator
Hi Joan,
  Here is a slightly different approach using string parsing rather than the complicated MOD TRUNC / biz.
Makes me dizzy trying to mentally parse that.
HTH, David
--
DATA LIST LIST / id , times, timef .
BEGIN DATA
1, 20110127202010, 20110127202521,
2, 20110128223019, 20110129001031,
3, 20110129012051, 20110129012541,
4, 20110130202010, 20110130202521,
5, 20110131143059, 20110131144531,
6, 20110201162041, 20110201162531,
7, 20110202022031, 20110202023501,
8, 20110203152021, 20110203154531,
9, 20110204232011, 20110205000521
END DATA.
DO REPEAT #1=#Y1 #M1 #D1 #H1 #MN1 #S1
        / #2=#Y2 #M2 #D2 #H2 #MN2 #S2
        / B=   1   5   7   9   11  13
        / L=   4   2   2   2    2   2 .
COMPUTE #1 = NUMBER(SUBSTR(STRING(times,F14) ,B,L),F4).
COMPUTE #2 = NUMBER(SUBSTR(STRING(timef,F14) ,B,L),F4).
END REPEAT.
**DTTMDIFF is the interval in seconds.
COMPUTE DTTMDIFF= DATE.MDY(#M2,#D2,#Y2)+TIME.HMS(#H2,#MN2,#S2)
                -(DATE.MDY(#M1,#D1,#Y1)+TIME.HMS(#H1,#MN1,#S1)).
FORMATS times timef (F14.0).
LIST.

      ID          TIMES          TIMEF DTTMDIFF

    1.00 20110127202010 20110127202521   311.00
    2.00 20110128223019 20110129001031  6012.00
    3.00 20110129012051 20110129012541   290.00
    4.00 20110130202010 20110130202521   311.00
    5.00 20110131143059 20110131144531   872.00
    6.00 20110201162041 20110201162531   290.00
    7.00 20110202022031 20110202023501   870.00
    8.00 20110203152021 20110203154531  1510.00
    9.00 20110204232011 20110205000521  2710.00


Number of cases read:  9    Number of cases listed:  9





joan casellas wrote
Hi,

I'm trying to calculate time spent completing a questionnaire. By default, I get the variables in a weird format (numeric), so before I can start to do any calculations I have to do some transformations. So far, I think I've done quite well, but when calculating time spent (delta=time2-time1), I get negative numbers due the 24 hours issue. Any suggestions?

I also would like to recode time1, time2, date1 and date2 into a specific band ranges. Is it possible?

For example:

Val lab time1 time2
1 'less than 5 min'
2 '6 to 10 min'
3 '11 to 15 min'
4 '16 to 20 min'
5 '21 to 25 min'
6 '26 to 30 min'
7 'more than 30 min'.

For example:

Val lab date1 date2
1 'from 27/01/2011 to 28/01/2011'
2 'from 29/01/2011 to 31/01/2011'
3 'from 01/02/2011 to 01/02/2011'
4 'from 02/02/2011 to 03/02/2011'.


****times=time started questionnaire***.
****timef=time finished questionnaire****.
****20110127202010= year, month, day, hours, minutes, seconds)****.


**********************SYNTAX***************************************************.

DATA LIST LIST / id , times, timef .
BEGIN DATA
1, 20110127202010, 20110127202521,
2, 20110128223019, 20110129001031,
3, 20110129012051, 20110129012541,
4, 20110130202010, 20110130202521,
5, 20110131143059, 20110131144531,
6, 20110201162041, 20110201162531,
7, 20110202022031, 20110202023501,
8, 20110203152021, 20110203154531,
9, 20110204232011, 20110205000521
END DATA.

**********************SYNTAX***************************************************.

COMPUTE time1=TIME.HMS(MOD(TRUNC(times/10000),100),MOD(TRUNC(times/100),100),MOD(times,100)).
COMPUTE time2=TIME.HMS(MOD(TRUNC(timef/10000),100),MOD(TRUNC(timef/100),100),MOD(timef,100)).
COMPUTE delta=time2-time1.
FORMATS time1 time2 delta (TIME10).
EXECUTE.


COMPUTE date1=DATE.DMY(MOD(TRUNC(times/1000000),100), MOD(TRUNC(times/100000000),100), MOD(TRUNC(times/10000000000),10000)).
COMPUTE date2=DATE.DMY(MOD(TRUNC(timef/1000000),100), MOD(TRUNC(timef/100000000),100), MOD(TRUNC(timef/10000000000),10000)).
FORMATS date1(DATE10) date2(DATE10).
EXECUTE.

*******************************************************************************.

Any improvements on the syntax would be much appreciated. I would like to find a link with all date formats available for spss. Any suggestions?


Thanks in advance!!!!

Joan Casellas Vega
Media Research Analyst
Phone: +44 7920 761 870

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Date calculations

Bruce Weaver
Administrator
Very nice, David!  I was getting a bit dizzy too, and thought about using SUBSTR.  But without the looping, it was not going to improve much on the original approach.


David Marso wrote
Hi Joan,
  Here is a slightly different approach using string parsing rather than the complicated MOD TRUNC / biz.
Makes me dizzy trying to mentally parse that.
HTH, David
--
DATA LIST LIST / id , times, timef .
BEGIN DATA
1, 20110127202010, 20110127202521,
2, 20110128223019, 20110129001031,
3, 20110129012051, 20110129012541,
4, 20110130202010, 20110130202521,
5, 20110131143059, 20110131144531,
6, 20110201162041, 20110201162531,
7, 20110202022031, 20110202023501,
8, 20110203152021, 20110203154531,
9, 20110204232011, 20110205000521
END DATA.
DO REPEAT #1=#Y1 #M1 #D1 #H1 #MN1 #S1
        / #2=#Y2 #M2 #D2 #H2 #MN2 #S2
        / B=   1   5   7   9   11  13
        / L=   4   2   2   2    2   2 .
COMPUTE #1 = NUMBER(SUBSTR(STRING(times,F14) ,B,L),F4).
COMPUTE #2 = NUMBER(SUBSTR(STRING(timef,F14) ,B,L),F4).
END REPEAT.
**DTTMDIFF is the interval in seconds.
COMPUTE DTTMDIFF= DATE.MDY(#M2,#D2,#Y2)+TIME.HMS(#H2,#MN2,#S2)
                -(DATE.MDY(#M1,#D1,#Y1)+TIME.HMS(#H1,#MN1,#S1)).
FORMATS times timef (F14.0).
LIST.

      ID          TIMES          TIMEF DTTMDIFF

    1.00 20110127202010 20110127202521   311.00
    2.00 20110128223019 20110129001031  6012.00
    3.00 20110129012051 20110129012541   290.00
    4.00 20110130202010 20110130202521   311.00
    5.00 20110131143059 20110131144531   872.00
    6.00 20110201162041 20110201162531   290.00
    7.00 20110202022031 20110202023501   870.00
    8.00 20110203152021 20110203154531  1510.00
    9.00 20110204232011 20110205000521  2710.00


Number of cases read:  9    Number of cases listed:  9





joan casellas wrote
Hi,

I'm trying to calculate time spent completing a questionnaire. By default, I get the variables in a weird format (numeric), so before I can start to do any calculations I have to do some transformations. So far, I think I've done quite well, but when calculating time spent (delta=time2-time1), I get negative numbers due the 24 hours issue. Any suggestions?

I also would like to recode time1, time2, date1 and date2 into a specific band ranges. Is it possible?

For example:

Val lab time1 time2
1 'less than 5 min'
2 '6 to 10 min'
3 '11 to 15 min'
4 '16 to 20 min'
5 '21 to 25 min'
6 '26 to 30 min'
7 'more than 30 min'.

For example:

Val lab date1 date2
1 'from 27/01/2011 to 28/01/2011'
2 'from 29/01/2011 to 31/01/2011'
3 'from 01/02/2011 to 01/02/2011'
4 'from 02/02/2011 to 03/02/2011'.


****times=time started questionnaire***.
****timef=time finished questionnaire****.
****20110127202010= year, month, day, hours, minutes, seconds)****.


**********************SYNTAX***************************************************.

DATA LIST LIST / id , times, timef .
BEGIN DATA
1, 20110127202010, 20110127202521,
2, 20110128223019, 20110129001031,
3, 20110129012051, 20110129012541,
4, 20110130202010, 20110130202521,
5, 20110131143059, 20110131144531,
6, 20110201162041, 20110201162531,
7, 20110202022031, 20110202023501,
8, 20110203152021, 20110203154531,
9, 20110204232011, 20110205000521
END DATA.

**********************SYNTAX***************************************************.

COMPUTE time1=TIME.HMS(MOD(TRUNC(times/10000),100),MOD(TRUNC(times/100),100),MOD(times,100)).
COMPUTE time2=TIME.HMS(MOD(TRUNC(timef/10000),100),MOD(TRUNC(timef/100),100),MOD(timef,100)).
COMPUTE delta=time2-time1.
FORMATS time1 time2 delta (TIME10).
EXECUTE.


COMPUTE date1=DATE.DMY(MOD(TRUNC(times/1000000),100), MOD(TRUNC(times/100000000),100), MOD(TRUNC(times/10000000000),10000)).
COMPUTE date2=DATE.DMY(MOD(TRUNC(timef/1000000),100), MOD(TRUNC(timef/100000000),100), MOD(TRUNC(timef/10000000000),10000)).
FORMATS date1(DATE10) date2(DATE10).
EXECUTE.

*******************************************************************************.

Any improvements on the syntax would be much appreciated. I would like to find a link with all date formats available for spss. Any suggestions?


Thanks in advance!!!!

Joan Casellas Vega
Media Research Analyst
Phone: +44 7920 761 870

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
12