Reading Date & Time from CSV File

classic Classic list List threaded Threaded
38 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Reading Date & Time from CSV File

John Fiedler
I am trying to parse a CSV tile with a time and date variable that looks
like this:
2011-01-18 00:54:19.000
What format do I use in my GET DATA command to read this?
Thanks in advance!
JOHN

John Fiedler
[hidden email]
Oreon Inc.
195 Wilderness Way
Boise ID 83716-3383
www.oreon.net   Please visit our evolving website!
+1.208.344.3255 (v)
+1.208.381.0944 (f)
+1.208.344.9979 (h
+1.208.863.3727 (m) No cellular service at or near place of business nor at
home.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

mpirritano
Read the variable in as a string. Not sure if other folks here would
agree but as far as I can tell best practice for reading in data is to
read everything in initially as strings. You can always change a string
to a number but you can't always change a number to a string. Make
sense?

Then you can extract info from the string variable. For example you
could create an SPSS readable date variable. You can run the following
(datevar is the new version of the variable, and datestr is the original
string version of the variables):

numeric datevar(adate10).

compute datevar =
date.mdy(number(char.substr(ltrim(rtrim(datestr)),6,2),f2.0),
number(char.substr(ltrim(rtrim(datestr)), 9,2), f2.0),
number(char.substr(ltrim(rtrim(datestr)),1,4), f4.0)).

exe.

This will create a variable that looks like 01/18/2011. That's the
American Date format.

Don't forget about the rule that you don't overwrite a variable, create
a new revised version.


Matthew Pirritano, Ph.D.
Research Analyst IV
Medical Services Initiative (MSI)
Orange County Health Care Agency
(714) 568-5648
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
John Fiedler
Sent: Friday, February 11, 2011 7:47 AM
To: [hidden email]
Subject: Reading Date & Time from CSV File

I am trying to parse a CSV tile with a time and date variable that looks
like this:
2011-01-18 00:54:19.000
What format do I use in my GET DATA command to read this?
Thanks in advance!
JOHN

John Fiedler
[hidden email]
Oreon Inc.
195 Wilderness Way
Boise ID 83716-3383
www.oreon.net   Please visit our evolving website!
+1.208.344.3255 (v)
+1.208.381.0944 (f)
+1.208.344.9979 (h
+1.208.863.3727 (m) No cellular service at or near place of business nor
at
home.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

David Marso
Administrator
Matthew,
I must respectfully disagree with everything you have just said ;-)
data list / dt (sdate) tm (time).
begin data
2011-01-18 00:54:19.000
end data.
COMPUTE datetime=dt+tm.
FORMATS datetime (DATETIME).
list.

        DT       TM             DATETIME

2011/01/18  0:54:01 18-JAN-2011 00:54:01


Number of cases read:  1    Number of cases listed:  1

BTW:  If the data were formatted as follows
DD-MM-YYYY HH:MM:SS you could read it as a DATETIME variable.
data list / dttm (datetime) .
begin data
18-01-2011 00:54:19
end data.
list.
                DTTM

18-JAN-2011 00:54:19


Number of cases read:  2    Number of cases listed:  2

Pirritano, Matthew-2 wrote
"Read the variable in as a string. Not sure if other folks here would
agree but as far as I can tell best practice for reading in data is to
read everything in initially as strings. "
--
>  I can't see why you would believe this!
>  If it can be read as a numeric field then do so rather than the overhead of converting.

"You can always change a string to a number but you can't always change a number to a string. Make
sense?"
> You can NOT always change a string to a NUMBER.  Only if the string has the appropriate format!

"Then you can extract info from the string variable. For example you
could create an SPSS readable date variable. You can run the following
(datevar is the new version of the variable, and datestr is the original
string version of the variables):"

numeric datevar(adate10).

compute datevar =
date.mdy(number(char.substr(ltrim(rtrim(datestr)),6,2),f2.0),
number(char.substr(ltrim(rtrim(datestr)), 9,2), f2.0),
number(char.substr(ltrim(rtrim(datestr)),1,4), f4.0)).

exe.

This will create a variable that looks like 01/18/2011. That's the
American Date format.

"Don't forget about the rule that you don't overwrite a variable, create
a new revised version."
I'm not sure I would call this a "rule" so much as a personal preference and some sort of security against code which doesn't work correctly.


Matthew Pirritano, Ph.D.
Research Analyst IV
Medical Services Initiative (MSI)
Orange County Health Care Agency
(714) 568-5648
-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
John Fiedler
Sent: Friday, February 11, 2011 7:47 AM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Reading Date & Time from CSV File

I am trying to parse a CSV tile with a time and date variable that looks
like this:
2011-01-18 00:54:19.000
What format do I use in my GET DATA command to read this?
Thanks in advance!
JOHN

John Fiedler
JohnFiedler@oreon.net
Oreon Inc.
195 Wilderness Way
Boise ID 83716-3383
www.oreon.net   Please visit our evolving website!
+1.208.344.3255 (v)
+1.208.381.0944 (f)
+1.208.344.9979 (h
+1.208.863.3727 (m) No cellular service at or near place of business nor
at
home.

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

Richard Ristow
In reply to this post by John Fiedler
At 10:46 AM 2/11/2011, John Fiedler wrote:

>I am trying to parse a CSV tile with a time and date variable that looks
>like this:
>2011-01-18 00:54:19.000
>What format do I use in my GET DATA command to read this?

Matthew Pirritano's quite right that you have to read it as a text
field and parse it.

To read the file

         ID,Text,Timestamp
         101,Posted date,2011-01-18 00:54:19.000

I used this code:
GET DATA TYPE=TXT
         /FILE='C:\Documents and Settings\Richard\My Documents'         +
                 '\Technical\spssx-l\Z-2011\'                           +
                 '2011-02-11 Fiedler-Reading Date & Time from CSV File' +
                 '-'                                                    +
                 'TEST DATA edited.csv'
         /ARRANGEMENT=DELIMITED /DELIMITERS=','
         /FIRSTCASE  =2
         /VARIABLES  =ID          F3
                      Text        A12
                      TimeString  A25.

FORMATS ID   (N3).
DATASET NAME TestData WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |11-FEB-2011 13:16:36       |
|-----------------------------|---------------------------|
[TestData]

  ID Text         TimeString

101 Posted date  2011-01-18 00:54:19.000

Number of cases read:  1    Number of cases listed:  1


Second, I parsed it like this -- it's easier than suggested. You
can't read the whole string as a date-time value, but you can read
the date and time in one step each. Like this (assuming there are no
leading blanks in the input):

NUMERIC Date      (SDATE10)
         Time      (TIME12.3)
         TimeStamp (DATETIME25.3).

COMPUTE #BREAK    = INDEX(TimeString,' ').
COMPUTE Date      = NUMBER(SUBSTR(TimeString,1,#BREAK),SDATE10).
COMPUTE Time      = NUMBER(SUBSTR(TimeString,  #BREAK),TIME20.3).
COMPUTE TimeStamp = Date + Time.

LIST /VARIABLES=ID TimeString Date Time TimeStamp.

List
|-----------------------------|---------------------------|
|Output Created               |11-FEB-2011 13:16:36       |
|-----------------------------|---------------------------|
[TestData]

  ID
TimeString                      Date         Time                 TimeStamp

101 2011-01-18 00:54:19.000   2011/01/18  0:54:19.000  18-JAN-2011 00:54:19.000

Number of cases read:  1    Number of cases listed:  1

==============================
APPENDIX: Test data, then code
==============================
ID,Text,Timestamp
101,Posted date,2011-01-18 00:54:19.000
==============================
*  C:\Documents and Settings\Richard\My Documents                    .
*    \Technical\spssx-l\Z-2011\                                      .
*    2011-02-11 Fiedler-Reading Date & Time from CSV File.SPS        .

*  In response to posting                                            .
*  Date:    Fri, 11 Feb 2011 08:46:34 -0700                          .
*  From:    John Fiedler <[hidden email]>                     .
*  Subject: Reading Date & Time from CSV File                        .
*  To: [hidden email]                                      .

*  "I am trying to parse a CSV tile with a time and date variable    .
*  that looks like this:                                             .
*  2011-01-18 00:54:19.000                                           .
*  What format do I use in my GET DATA command to read this?"        .
*                                                                    .
*  There'll be only a few SPSS statements, but they're complex       .
*  enough -- notably, giving the full path for the test file --      .
*  that it's worth keeping the code in more than just a scratch      .
*  file.                                                             .

NEW FILE.
GET DATA TYPE=TXT
         /FILE='C:\Documents and Settings\Richard\My Documents'         +
                 '\Technical\spssx-l\Z-2011\'                           +
                 '2011-02-11 Fiedler-Reading Date & Time from CSV File' +
                 '-'                                                    +
                 'TEST DATA edited.csv'
         /ARRANGEMENT=DELIMITED /DELIMITERS=','
         /FIRSTCASE  =2
         /VARIABLES  =ID          F3
                      Text        A12
                      TimeString  A25.

FORMATS ID   (N3).
DATASET NAME TestData WINDOW=FRONT.

LIST.

NUMERIC Date      (SDATE10)
         Time      (TIME12.3)
         TimeStamp (DATETIME25.3).

COMPUTE #BREAK    = INDEX(TimeString,' ').
COMPUTE Date      = NUMBER(SUBSTR(TimeString,1,#BREAK),SDATE10).
COMPUTE Time      = NUMBER(SUBSTR(TimeString,  #BREAK),TIME20.3).
COMPUTE TimeStamp = Date + Time.

LIST /VARIABLES=ID TimeString Date Time TimeStamp.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

David Marso
Administrator
<quote author="Richard Ristow">
At 10:46 AM 2/11/2011, John Fiedler wrote:

>I am trying to parse a CSV tile with a time and date variable that looks
>like this:
>2011-01-18 00:54:19.000
>What format do I use in my GET DATA command to read this?

"Matthew Pirritano's quite right that you have to read it as a text
field and parse it."

You *DO NOT* have to it read as a text field ;-)  .  
Read directly as two separate fields (a DATE and a TIME) then sum them.
-----------
<SNIP>
Why not just forget about GET DATA and just use good "old fashioned" DATA LIST.
Read the date field as SDATE and the time field as TIME formats then sum them to get the DATETIME representation.  Why bother with a STRING variable at all?
In this case I read the time field as a scratch variable and add it to the date field,
reformatting it later from SDATE to DATETIME.  
Personally I have NEVER been all that fond of GET DATA.

data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
COMPUTE dttm=dttm+#tm.
FORMATS dttm (DATETIME).
LIST.

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

David Marso
Administrator
In reply to this post by Richard Ristow
"Matthew Pirritano's quite right that you have to read it as a text
field and parse it."

You *DO NOT* have to it read as a text field ;-)  .
Read directly as two separate fields (a DATE and a TIME) then sum them.
-----------
<SNIP>
Why not just forget about GET DATA and just use good "old fashioned" DATA LIST.
Read the date field as SDATE and the time field as TIME formats then
sum them to get the DATETIME representation.  Why bother with a STRING
variable at all?
In this case I read the time field as a scratch variable and add it to
the date field,
reformatting it later from SDATE to DATETIME.
Personally I have NEVER been all that fond of GET DATA.

data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
COMPUTE dttm=dttm+#tm.
FORMATS dttm (DATETIME).
LIST.

On Fri, Feb 11, 2011 at 1:36 PM, Richard Ristow [via SPSSX Discussion]
<[hidden email]> wrote:

> At 10:46 AM 2/11/2011, John Fiedler wrote:
>
>>I am trying to parse a CSV tile with a time and date variable that looks
>>like this:
>>2011-01-18 00:54:19.000
>>What format do I use in my GET DATA command to read this?
>
> Matthew Pirritano's quite right that you have to read it as a text
> field and parse it.
>
> To read the file
>
>          ID,Text,Timestamp
>          101,Posted date,2011-01-18 00:54:19.000
>
> I used this code:
> GET DATA TYPE=TXT
>          /FILE='C:\Documents and Settings\Richard\My Documents'         +
>                  '\Technical\spssx-l\Z-2011\'                           +
>                  '2011-02-11 Fiedler-Reading Date & Time from CSV File' +
>                  '-'                                                    +
>                  'TEST DATA edited.csv'
>          /ARRANGEMENT=DELIMITED /DELIMITERS=','
>          /FIRSTCASE  =2
>          /VARIABLES  =ID          F3
>                       Text        A12
>                       TimeString  A25.
>
> FORMATS ID   (N3).
> DATASET NAME TestData WINDOW=FRONT.
> LIST.
> List
> |-----------------------------|---------------------------|
> |Output Created               |11-FEB-2011 13:16:36       |
> |-----------------------------|---------------------------|
> [TestData]
>
>   ID Text         TimeString
>
> 101 Posted date  2011-01-18 00:54:19.000
>
> Number of cases read:  1    Number of cases listed:  1
>
>
> Second, I parsed it like this -- it's easier than suggested. You
> can't read the whole string as a date-time value, but you can read
> the date and time in one step each. Like this (assuming there are no
> leading blanks in the input):
>
> NUMERIC Date      (SDATE10)
>          Time      (TIME12.3)
>          TimeStamp (DATETIME25.3).
>
> COMPUTE #BREAK    = INDEX(TimeString,' ').
> COMPUTE Date      = NUMBER(SUBSTR(TimeString,1,#BREAK),SDATE10).
> COMPUTE Time      = NUMBER(SUBSTR(TimeString,  #BREAK),TIME20.3).
> COMPUTE TimeStamp = Date + Time.
>
> LIST /VARIABLES=ID TimeString Date Time TimeStamp.
>
> List
> |-----------------------------|---------------------------|
> |Output Created               |11-FEB-2011 13:16:36       |
> |-----------------------------|---------------------------|
> [TestData]
>
>   ID
> TimeString                      Date         Time                 TimeStamp
>
> 101 2011-01-18 00:54:19.000   2011/01/18  0:54:19.000  18-JAN-2011
> 00:54:19.000
>
> Number of cases read:  1    Number of cases listed:  1
>
> ==============================
> APPENDIX: Test data, then code
> ==============================
> ID,Text,Timestamp
> 101,Posted date,2011-01-18 00:54:19.000
> ==============================
> *  C:\Documents and Settings\Richard\My Documents                    .
> *    \Technical\spssx-l\Z-2011\                                      .
> *    2011-02-11 Fiedler-Reading Date & Time from CSV File.SPS        .
>
> *  In response to posting                                            .
> *  Date:    Fri, 11 Feb 2011 08:46:34 -0700                          .
> *  From:    John Fiedler <[hidden email]>                     .
> *  Subject: Reading Date & Time from CSV File                        .
> *  To: [hidden email]                                      .
>
> *  "I am trying to parse a CSV tile with a time and date variable    .
> *  that looks like this:                                             .
> *  2011-01-18 00:54:19.000                                           .
> *  What format do I use in my GET DATA command to read this?"        .
> *                                                                    .
> *  There'll be only a few SPSS statements, but they're complex       .
> *  enough -- notably, giving the full path for the test file --      .
> *  that it's worth keeping the code in more than just a scratch      .
> *  file.                                                             .
>
> NEW FILE.
> GET DATA TYPE=TXT
>          /FILE='C:\Documents and Settings\Richard\My Documents'         +
>                  '\Technical\spssx-l\Z-2011\'                           +
>                  '2011-02-11 Fiedler-Reading Date & Time from CSV File' +
>                  '-'                                                    +
>                  'TEST DATA edited.csv'
>          /ARRANGEMENT=DELIMITED /DELIMITERS=','
>          /FIRSTCASE  =2
>          /VARIABLES  =ID          F3
>                       Text        A12
>                       TimeString  A25.
>
> FORMATS ID   (N3).
> DATASET NAME TestData WINDOW=FRONT.
>
> LIST.
>
> NUMERIC Date      (SDATE10)
>          Time      (TIME12.3)
>          TimeStamp (DATETIME25.3).
>
> COMPUTE #BREAK    = INDEX(TimeString,' ').
> COMPUTE Date      = NUMBER(SUBSTR(TimeString,1,#BREAK),SDATE10).
> COMPUTE Time      = NUMBER(SUBSTR(TimeString,  #BREAK),TIME20.3).
> COMPUTE TimeStamp = Date + Time.
>
> LIST /VARIABLES=ID TimeString Date Time TimeStamp.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>
>
> ________________________________
> If you reply to this email, your message will be added to the discussion
> below:
> http://spssx-discussion.1045642.n5.nabble.com/Reading-Date-Time-from-CSV-File-tp3381430p3381781.html
> To unsubscribe from Reading Date & Time from CSV File, click here.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

Bruce Weaver
Administrator
In reply to this post by David Marso
David, I can't get your method to work for a CSV file that contains lines like this:

101,Posted date,2011-01-18 00:54:19.000,more text out here

If I allow both commas and blank spaces as field separators, the string "Posted date" gets split into two variables, and everything gets messed up.  So it looks to me like your approach will work in a CSV file only if all blank spaces can be treated as field separators (in addition to commas).  Or am I missing something?

Bruce


<quote author="David Marso">
Richard Ristow wrote
At 10:46 AM 2/11/2011, John Fiedler wrote:

>I am trying to parse a CSV tile with a time and date variable that looks
>like this:
>2011-01-18 00:54:19.000
>What format do I use in my GET DATA command to read this?

"Matthew Pirritano's quite right that you have to read it as a text
field and parse it."

You *DO NOT* have to it read as a text field ;-)  .  
Read directly as two separate fields (a DATE and a TIME) then sum them.
-----------
<SNIP>
Why not just forget about GET DATA and just use good "old fashioned" DATA LIST.
Read the date field as SDATE and the time field as TIME formats then sum them to get the DATETIME representation.  Why bother with a STRING variable at all?
In this case I read the time field as a scratch variable and add it to the date field,
reformatting it later from SDATE to DATETIME.  
Personally I have NEVER been all that fond of GET DATA.

data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
COMPUTE dttm=dttm+#tm.
FORMATS dttm (DATETIME).
LIST.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

David Marso
Administrator
GACK!!!!   My bad!!!  I missed the (vital) CSV part of the question.
I suspect that if either the string literals are quoted it would be fine.
One reason I completely deplore free-formatted data in favor of fixed record fields.
OTOH, if the DATETIME field were in DD-MM-YY HH:MM:SS it would probably work perfectly fine reading as a single DATETIME formatted variable.
-----
Good catch Bruce!
...
<quote author="Bruce Weaver">
David, I can't get your method to work for a CSV file that contains lines like this:

101,Posted date,2011-01-18 00:54:19.000,more text out here

If I allow both commas and blank spaces as field separators, the string "Posted date" gets split into two variables, and everything gets messed up.  So it looks to me like your approach will work in a CSV file only if all blank spaces can be treated as field separators (in addition to commas).  Or am I missing something?

Bruce


David Marso wrote
Richard Ristow wrote
At 10:46 AM 2/11/2011, John Fiedler wrote:

>I am trying to parse a CSV tile with a time and date variable that looks
>like this:
>2011-01-18 00:54:19.000
>What format do I use in my GET DATA command to read this?

"Matthew Pirritano's quite right that you have to read it as a text
field and parse it."

You *DO NOT* have to it read as a text field ;-)  .  
Read directly as two separate fields (a DATE and a TIME) then sum them.
-----------
<SNIP>
Why not just forget about GET DATA and just use good "old fashioned" DATA LIST.
Read the date field as SDATE and the time field as TIME formats then sum them to get the DATETIME representation.  Why bother with a STRING variable at all?
In this case I read the time field as a scratch variable and add it to the date field,
reformatting it later from SDATE to DATETIME.  
Personally I have NEVER been all that fond of GET DATA.

data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
COMPUTE dttm=dttm+#tm.
FORMATS dttm (DATETIME).
LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

Jon K Peck
David is right to be queasy about this format.  It's not that it is free format.  A good mantra here is "don't guess".  It would be very easy for this file to be read incorrectly.  There is no text delimiter such as " used, so a comma in the text would split a field.  Obviously spaces don't work as separators either.  If you read the file into Excel, the date/time field displays as 54:19.0 although the field value contains everything between the commas.  (Excel inferred a format of mm:ss.0)  CSV format is not standardized.  I would be careful to validate any data read in this way.

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435




From:        David Marso <[hidden email]>
To:        [hidden email]
Date:        02/11/2011 12:43 PM
Subject:        Re: [SPSSX-L] Reading Date & Time from CSV File
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




GACK!!!!   My bad!!!  I missed the (vital) CSV part of the question.
I suspect that if either the string literals are quoted it would be fine.
One reason I completely deplore free-formatted data in favor of fixed record
fields.
OTOH, if the DATETIME field were in DD-MM-YY HH:MM:SS it would probably work
perfectly fine reading as a single DATETIME formatted variable.
-----
Good catch Bruce!
...

David, I can't get your method to work for a CSV file that contains lines
like this:

101,Posted date,2011-01-18 00:54:19.000,more text out here

If I allow both commas and blank spaces as field separators, the string
"Posted date" gets split into two variables, and everything gets messed up.
So it looks to me like your approach will work in a CSV file only if all
blank spaces can be treated as field separators (in addition to commas).  Or
am I missing something?

Bruce



David Marso wrote:
>
>
> Richard Ristow wrote:
>>
>> At 10:46 AM 2/11/2011, John Fiedler wrote:
>>
>>>I am trying to parse a CSV tile with a time and date variable that looks
>>>like this:
>>>2011-01-18 00:54:19.000
>>>What format do I use in my GET DATA command to read this?
>>
>> "Matthew Pirritano's quite right that you have to read it as a text
>> field and parse it."
>>
>> You *DO NOT* have to it read as a text field ;-)  .
>> Read directly as two separate fields (a DATE and a TIME) then sum them.
>> -----------
>> <SNIP>
>> Why not just forget about GET DATA and just use good "old fashioned" DATA
>> LIST.
>> Read the date field as SDATE and the time field as TIME formats then sum
>> them to get the DATETIME representation.  Why bother with a STRING
>> variable at all?
>> In this case I read the time field as a scratch variable and add it to
>> the date field,
>> reformatting it later from SDATE to DATETIME.
>> Personally I have NEVER been all that fond of GET DATA.
>>
>> data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
>> COMPUTE dttm=dttm+#tm.
>> FORMATS dttm (DATETIME).
>> LIST.
>>
>>
>>
>
>

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Reading-Date-Time-from-CSV-File-tp3381430p3381902.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

John Fiedler

Thanks everyone!

Being a cautious type, I read it as alpha and parsed the date and time

 

NUMERIC date(ADATE10).

COMPUTE date = DATE.MDY(NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),6,2),f2.0),NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)), 9,2), f2.0), NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),1,4), f4.0)).

NUMERIC time(TIME12.3).

COMPUTE time = TIME.HMS(NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),12,2),F2.0),NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)), 15,2), f2.0), NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),18,6), f6.3)).

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jon K Peck
Sent: Friday, February 11, 2011 1:45 PM
To: [hidden email]
Subject: Re: Reading Date & Time from CSV File

 

David is right to be queasy about this format.  It's not that it is free format.  A good mantra here is "don't guess".  It would be very easy for this file to be read incorrectly.  There is no text delimiter such as " used, so a comma in the text would split a field.  Obviously spaces don't work as separators either.  If you read the file into Excel, the date/time field displays as 54:19.0 although the field value contains everything between the commas.  (Excel inferred a format of mm:ss.0)  CSV format is not standardized.  I would be careful to validate any data read in this way.

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435




From:        David Marso <[hidden email]>
To:        [hidden email]
Date:        02/11/2011 12:43 PM
Subject:        Re: [SPSSX-L] Reading Date & Time from CSV File
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





GACK!!!!   My bad!!!  I missed the (vital) CSV part of the question.
I suspect that if either the string literals are quoted it would be fine.
One reason I completely deplore free-formatted data in favor of fixed record
fields.
OTOH, if the DATETIME field were in DD-MM-YY HH:MM:SS it would probably work
perfectly fine reading as a single DATETIME formatted variable.
-----
Good catch Bruce!
...

David, I can't get your method to work for a CSV file that contains lines
like this:

101,Posted date,2011-01-18 00:54:19.000,more text out here

If I allow both commas and blank spaces as field separators, the string
"Posted date" gets split into two variables, and everything gets messed up.
So it looks to me like your approach will work in a CSV file only if all
blank spaces can be treated as field separators (in addition to commas).  Or
am I missing something?

Bruce



David Marso wrote:
>
>
> Richard Ristow wrote:
>>
>> At 10:46 AM 2/11/2011, John Fiedler wrote:
>>
>>>I am trying to parse a CSV tile with a time and date variable that looks
>>>like this:
>>>2011-01-18 00:54:19.000
>>>What format do I use in my GET DATA command to read this?
>>
>> "Matthew Pirritano's quite right that you have to read it as a text
>> field and parse it."
>>
>> You *DO NOT* have to it read as a text field ;-)  .
>> Read directly as two separate fields (a DATE and a TIME) then sum them.
>> -----------
>> <SNIP>
>> Why not just forget about GET DATA and just use good "old fashioned" DATA
>> LIST.
>> Read the date field as SDATE and the time field as TIME formats then sum
>> them to get the DATETIME representation.  Why bother with a STRING
>> variable at all?
>> In this case I read the time field as a scratch variable and add it to
>> the date field,
>> reformatting it later from SDATE to DATETIME.
>> Personally I have NEVER been all that fond of GET DATA.
>>
>> data list FILE "C:\DATETIMETEST.txt" / dttm (sdate) #tm (time).
>> COMPUTE dttm=dttm+#tm.
>> FORMATS dttm (DATETIME).
>> LIST.
>>
>>
>>
>
>

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Reading-Date-Time-from-CSV-File-tp3381430p3381902.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: Reading Date & Time from CSV File

Richard Ristow
At 04:01 PM 2/11/2011, John Fiedler wrote:

>Being a cautious type, I read it as alpha and parsed the date and time
>
>NUMERIC date(ADATE10).
>COMPUTE date =
>DATE.MDY(NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),6,2),f2.0),NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),
>9,2), f2.0), NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),1,4), f4.0)).
>NUMERIC time(TIME12.3).
>COMPUTE time =
>TIME.HMS(NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),12,2),F2.0),NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),
>15,2), f2.0), NUMBER(CHAR.SUBSTR(LTRIM(RTRIM(date_time)),18,6), f6.3)).

Right. But remember that you can also convert the date and time with
one NUMBER function each, which also saves having to count character
positions. See my posting of Friday, 11 Feb.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

searching string fields

parisec
In reply to this post by David Marso
Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

Rick Oliver-3
Well, the second part is easy.

do if index(lower(stringvar), "urban")>0.
  compute numvar=1.
else if index(lower(stringvar), "rural")>0.
  compute numvar=2.
end if.

Note that any values of stringvar that do not contain either "urban" or "rural" will be system-missing for numvar.

If all values of stringvar contain either "urban" or "rural" this could be simplified to:

compute numvar=(index(lower(stringvar), "rural")>0)+1.

This time, all values that do not contain "rural" will be 1 for numvar, even if they don't contain "urban".



From:        "Parise, Carol A." <[hidden email]>
To:        [hidden email]
Date:        02/25/2011 01:54 PM
Subject:        searching string fields
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

Maguin, Eugene
In reply to this post by parisec
Hi Carol,
 
Question 1. It looks as if you want to convert a string date to a date format numeric. NewDate is A10 and structured as MM/DD/YYYY. Leading zeros need not be included, i.e., '04/05/2006' = '4/5/2006'. So, it's just a string to number (date) conversion.
 
compute numdate=number(stringdate,adate10).
 
Question 2. I'm assuming that urban and rural can not appear together in the field. Look at the index function, as in
 
compute urban=0.
if (index(stringvar,'urban') ge 1) urban=1.
compute rural=0.
if (index(stringvar,'rural') ge 1) rural=1.
 
Be aware that the above code should pick up 'suburban' or 'exurban' or 'urban-oasis'. So you may want to use ' urban ' as the test string. Same for rural.
 
Gene Maguin


From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Parise, Carol A.
Sent: Friday, February 25, 2011 2:54 PM
To: [hidden email]
Subject: searching string fields

Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

Mark Casazza-2
In reply to this post by parisec
This could probably be shortened (and the exe & list commands removed), but it works:

--
data list fixed/datevar 1-10 (a) locvar 12-17 (a).
begin data
11/02/2009 urban
99/10/2008 rural
03/99/2007 urban
end data.
exe.
list.

numeric newdate (adate).
if index(datevar,'99') = 0 newdate=number(datevar,adate).
if index(datevar,'99') = 1 newdate=number(concat('06',substr(datevar,3,10)),adate).
if index(datevar,'99') = 4 newdate=number(concat(substr(datevar,1,3),'15',substr(datevar,6,10)),adate).
exe.
list.

numeric location (f1).
if index(lower(locvar),'urban') > 0 location = 1.
if index(lower(locvar),'rural') > 0 location = 2.
exe.

list.
--

Mark

“That which can be asserted without evidence, can be dismissed without evidence.”
- Christopher Hitchens


Mark Casazza
Director of Academic Information
The City University of New York
555 West 57th Street, Suite 1240
New York, NY 10019

Phone: 212.541.0396
Fax: 212.541.0392
email: [hidden email]

On 2011-02-25 14:54, Parise, Carol A. wrote:
Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

parisec
In reply to this post by Rick Oliver-3
Thanks all,
 
This solution worked perfectly and will come in handy for future projects until i can find a way to ban string fields.
 
Carol
 
 
 

data list fixed/datevar 1-10 (a) locvar 12-17 (a).
begin data
11/02/2009 urban
99/10/2008 rural
03/99/2007 urban
end data.
exe.
list.

numeric newdate (adate).
if index(datevar,'99') = 0 newdate=number(datevar,adate).
if index(datevar,'99') = 1 newdate=number(concat('06',substr(datevar,3,10)),adate).
if index(datevar,'99') = 4 newdate=number(concat(substr(datevar,1,3),'15',substr(datevar,6,10)),adate).
exe.
list.


numeric location (f1).
if index(lower(locvar),'urban') > 0 location = 1.
if index(lower(locvar),'rural') > 0 location = 2.
exe.

list.
--



From:        "Parise, Carol A." <[hidden email]>
To:        [hidden email]
Date:        02/25/2011 01:54 PM
Subject:        searching string fields
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

David Marso
Administrator
OUCH!.  Even a cursory glance reveals this code to be a time bomb waiting to happen!
It may be fine for dates after 1999, but is wrong for an entire decade.
1990, 1991... 1999.(your first IF will fail).
I would at minimum suggest something like index(datevar,'99/')
but if you ever get another format such as SDATE (YY-MM-DD) the second one will be trouble too.
Gotta rush...
HTH, David
----
Parise, Carol A. wrote
Thanks all,

This solution worked perfectly and will come in handy for future projects until i can find a way to ban string fields.

Carol

data list fixed/datevar 1-10 (a) locvar 12-17 (a).
begin data
11/02/2009 urban
99/10/2008 rural
03/99/2007 urban
end data.
exe.
list.

numeric newdate (adate).
if index(datevar,'99') = 0 newdate=number(datevar,adate).
if index(datevar,'99') = 1 newdate=number(concat('06',substr(datevar,3,10)),adate).
if index(datevar,'99') = 4 newdate=number(concat(substr(datevar,1,3),'15',substr(datevar,6,10)),adate).
exe.
list.

numeric location (f1).
if index(lower(locvar),'urban') > 0 location = 1.
if index(lower(locvar),'rural') > 0 location = 2.
exe.
list.
--



From:        "Parise, Carol A." <PariseC@sutterhealth.org>
To:        SPSSX-L@LISTSERV.UGA.EDU
Date:        02/25/2011 01:54 PM
Subject:        searching string fields
Sent by:        "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
________________________________



Hi all,

I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.

1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.

After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.

I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:

a. Look at oldDate. If it has no 99s in it, then the newdate = olddate

b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.

c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.


2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:

Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.


I think the whole world would run more smoothly if everything were numeric.

Thanks for any assistance.
Carol






Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

parisec
Good to know! Fortunately, all of my dates ranged from 2004-2007 but will explore other options to make this work for pre 2000 since this may come up again.

Thanks


________________________________________
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of David Marso [[hidden email]]
Sent: Friday, February 25, 2011 4:47 PM
To: [hidden email]
Subject: Re: searching string fields

OUCH!.  Even a cursory glance reveals this code to be a time bomb waiting to
happen!
It may be fine for dates after 1999, but is wrong for an entire decade.
1990, 1991... 1999.(your first IF will fail).
I would at minimum suggest something like index(datevar,'99/')
but if you ever get another format such as SDATE (YY-MM-DD) the second one
will be trouble too.
Gotta rush...
HTH, David
----

Parise, Carol A. wrote:

>
> Thanks all,
>
> This solution worked perfectly and will come in handy for future projects
> until i can find a way to ban string fields.
>
> Carol
>
> data list fixed/datevar 1-10 (a) locvar 12-17 (a).
> begin data
> 11/02/2009 urban
> 99/10/2008 rural
> 03/99/2007 urban
> end data.
> exe.
> list.
>
> numeric newdate (adate).
> if index(datevar,'99') = 0 newdate=number(datevar,adate).
> if index(datevar,'99') = 1
> newdate=number(concat('06',substr(datevar,3,10)),adate).
> if index(datevar,'99') = 4
> newdate=number(concat(substr(datevar,1,3),'15',substr(datevar,6,10)),adate).
> exe.
> list.
>
> numeric location (f1).
> if index(lower(locvar),'urban') > 0 location = 1.
> if index(lower(locvar),'rural') > 0 location = 2.
> exe.
> list.
> --
>
>
>
> From:        "Parise, Carol A." <[hidden email]>
> To:        [hidden email]
> Date:        02/25/2011 01:54 PM
> Subject:        searching string fields
> Sent by:        "SPSSX(r) Discussion" <[hidden email]>
> ________________________________
>
>
>
> Hi all,
>
> I have 2 questions regarding string fields and i'm not too familar with
> how to tell SPSS to look at a string field and pull information from it.
>
> 1. I just recieved an excel file and imported into SPSS and noticed that
> my dates came in as numbers versus dates. Upon further inspection, i
> discovered that this was because if a date was unknown, 99 was used as a
> placeholder in a MMDDYYYY field. For example, if the month was unknown,
> the date was entered as 99/22/2011. If the day was unknown, the date was
> 02/99/2011. Because these dates were not legitimate, the data were assumed
> to be numeric.
>
> After discussion with colleagues, we decided to use the median DD or MM
> for these unknown values. So our missing MM would be entered as 06 and our
> missing DD would be 15 so that we can use these cases in our analysis.
>
> I converted the excel fields to text so i have the dates in SPSS in a
> string field and have computed a date field called "newdate".  i want to
> tell SPSS:
>
> a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
>
> b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
>
> c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
>
>
> 2. In the same file, I have a variable with a string of text that
> indicates whether a location is urban or rural. I want to create a new
> numeric variable called Location.  i want to tell SPSS:
>
> Look at the string variable and if it has the word "urban" anywhere in the
> field, the location = 1. If the word "rural" is anywhere in the field,
> Location=2.
>
>
> I think the whole world would run more smoothly if everything were
> numeric.
>
> Thanks for any assistance.
> Carol
>
>
>
>
>
>
>
>
>

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Reading-Date-Time-from-CSV-File-tp3381430p3401039.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

David Marso
Administrator
The "99/" amendment should work unless you get string "dates" as YY/MM/DD.  OTOH, YMMV.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: searching string fields

Art Kendall-2
In reply to this post by Rick Oliver-3
If you had done as in the previous post without assigning  the user missing value, you would have sysmis which would cue you to go back and repair your syntax. One way is like this


compute numvar = -1.
do if index(lower(stringvar), "urban")>0.

  compute numvar=1.
else if index(lower(stringvar), "rural")>0.
  compute numvar=2.
end if.
value labels numvar
 1 'urban'
2 'rural'
-1 'other'.
missing values numvar (lo thru 0).

You would know why the result was missing, ergo a User missing value.
It would not be missing because SPSS software could not follow your directions.

Art Kendall
Social Research Consultants

On 2/25/2011 3:15 PM, Rick Oliver wrote:
Well, the second part is easy.

do if index(lower(stringvar), "urban")>0.
  compute numvar=1.
else if index(lower(stringvar), "rural")>0.
  compute numvar=2.
end if.

Note that any values of stringvar that do not contain either "urban" or "rural" will be system-missing for numvar.

If all values of stringvar contain either "urban" or "rural" this could be simplified to:

compute numvar=(index(lower(stringvar), "rural")>0)+1.

This time, all values that do not contain "rural" will be 1 for numvar, even if they don't contain "urban".



From:        "Parise, Carol A." [hidden email]
To:        [hidden email]
Date:        02/25/2011 01:54 PM
Subject:        searching string fields
Sent by:        "SPSSX(r) Discussion" [hidden email]




Hi all,
 
I have 2 questions regarding string fields and i'm not too familar with how to tell SPSS to look at a string field and pull information from it.
 
1. I just recieved an excel file and imported into SPSS and noticed that my dates came in as numbers versus dates. Upon further inspection, i discovered that this was because if a date was unknown, 99 was used as a placeholder in a MMDDYYYY field. For example, if the month was unknown, the date was entered as 99/22/2011. If the day was unknown, the date was 02/99/2011. Because these dates were not legitimate, the data were assumed to be numeric.
 
After discussion with colleagues, we decided to use the median DD or MM for these unknown values. So our missing MM would be entered as 06 and our missing DD would be 15 so that we can use these cases in our analysis.
 
I converted the excel fields to text so i have the dates in SPSS in a string field and have computed a date field called "newdate".  i want to tell SPSS:
 
a. Look at oldDate. If it has no 99s in it, then the newdate = olddate
 
b. If MM in olddate=99, then newdate =06/oldMM/oldYYYY.
 
c. If DD in olddate =99, then newdate = oldDD/15/oldYYYY.
 
 
2. In the same file, I have a variable with a string of text that indicates whether a location is urban or rural. I want to create a new numeric variable called Location.  i want to tell SPSS:
 
Look at the string variable and if it has the word "urban" anywhere in the field, the location = 1. If the word "rural" is anywhere in the field, Location=2.
 
 
I think the whole world would run more smoothly if everything were numeric.
 
Thanks for any assistance.
Carol
 
 
 
 
 
 
 

12