Problem with date conversion in SPSS Python

mgriffiths
I am using SPSS v21 and Python Essentials to try to analyse a dataset that includes dates in SPSS dd-mmm-yyyy format.

There are 20 date fields in my SPSS file, named CalDay_1, CalDay_2, etc., which I want to convert to Python datetime objects to work with. These fields are not necessarily all filled, e.g. a record may have dates in the first three CalDay variables while the rest are missing.

I start by getting the 20 date fields into Python using the spssdata module, and attempt to use cvtDates to convert the formats.

   import spssdata

   # Build the variable names CalDay_1 ... CalDay_20
   get_vars = []
   for i in range(20):
      get_vars.append('CalDay_' + str(i+1))

   data = spssdata.Spssdata(indexes=get_vars, accessType='w', cvtDates='ALL')
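
For context on what the conversion involves: SPSS stores a date as the number of seconds since midnight on 14 October 1582, so a converter only has to add that many seconds to that starting point. The sketch below is a stand-alone illustration, not the spssdata implementation. The names SPSS_EPOCH, spss_to_datetime and datetime_to_spss are made up for the example, and it assumes a missing date arrives as None.

```python
import datetime

# SPSS date values count seconds from midnight, 14 October 1582.
SPSS_EPOCH = datetime.datetime(1582, 10, 14)


def spss_to_datetime(value):
    """Convert a raw SPSS date value (seconds) to a datetime.

    Assumption for this sketch: a missing date is represented by None
    and is passed through unchanged.
    """
    if value is None:
        return None
    return SPSS_EPOCH + datetime.timedelta(seconds=value)


def datetime_to_spss(dt):
    """Inverse of spss_to_datetime: seconds elapsed since the SPSS epoch."""
    if dt is None:
        return None
    return (dt - SPSS_EPOCH).total_seconds()


# Round trip on one filled value and one missing value.
raw = datetime_to_spss(datetime.datetime(2015, 3, 23))
row = [spss_to_datetime(v) for v in (raw, None)]
```

A row with gaps then converts cell by cell, with the missing cells left as None.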

And then start the case-by-case analysis with:

   for row in data:
      # analysis follows

Everything works as expected for several hundred cases, but then the loop stops with the following error:

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    attrition.classify_calls()
  File "C:\Python27\lib\site-packages\attrition.py", line 22, in classify_calls
    for row in data:
  File "C:\Python27\lib\site-packages\spssdata\spssdata.py", line 721, in __iter__
    yield self.rettype(self._dateconverter(row))
  File "C:\Python27\lib\site-packages\spssdata\spssdata.py", line 646, in _dateconverter
    row[index] = CvtSpssDatetime(row[index])
  File "C:\Python27\lib\site-packages\spssdata\spssdata.py", line 863, in CvtSpssDatetime
    if dt < 86400:
TypeError: can't compare datetime.datetime to int

When I trace the offending record, I find that it is the first one where CalDay_1 was missing, but CalDay_2 and subsequent variables are filled.

I'm confused by the error because when I look in spssdata.py, I find that the variable 'dt' should just be whatever the cursor reads in from the date field, so I don't understand how it can already be of type datetime.datetime.

Any ideas?
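
One way to see what the cursor is actually handing back is to read the values without conversion and tabulate their types row by row. This is a minimal sketch that uses a plain list of tuples as a stand-in for the cursor. classify and first_unexpected are hypothetical helpers, not part of spssdata, and the sketch assumes raw dates arrive as numbers and missing dates as None.

```python
import datetime
import numbers


def classify(value):
    """Return a short label for the kind of value found in a date cell."""
    if value is None:
        return 'missing'
    if isinstance(value, datetime.datetime):
        return 'datetime'
    if isinstance(value, numbers.Number):
        return 'number'
    return type(value).__name__


def first_unexpected(rows, expected=('missing', 'number')):
    """Return (row_number, labels) for the first row holding an unexpected
    kind of value, or None if every cell is of an expected kind."""
    for row_number, row in enumerate(rows, start=1):
        labels = [classify(value) for value in row]
        if any(label not in expected for label in labels):
            return row_number, labels
    return None


# Stand-in data: the second row already contains a datetime object.
rows = [
    (13646188800.0, 13646275200.0),
    (None, datetime.datetime(2015, 3, 23)),
]
result = first_unexpected(rows)
```

Run against the real data, the reported row number and labels would show whether a datetime is present before any conversion has been requested.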

Re: Problem with date conversion in SPSS Python

Jon K Peck
I'm not sure I have enough of your code to diagnose this. If you are calling fetchone or the equivalent, then because you specified ALL for cvtDates, the date variables in each case should already come back as datetime objects, and calling CvtSpssDatetime on those values again would be wrong.

If that isn't the problem, send me some code and data that show it, and I'll see if I can sort it out.
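
The double-conversion scenario is easy to reproduce outside SPSS. In this sketch, naive_convert is a made-up stand-in that, like the line shown in the traceback, compares its argument with 86400 and so assumes a raw number. It is not the spssdata code, and its handling of values under one day is an assumption of the example. safe_convert is one possible guard that passes through anything already converted.

```python
import datetime

SPSS_EPOCH = datetime.datetime(1582, 10, 14)
SECONDS_PER_DAY = 86400


def naive_convert(value):
    """Hypothetical converter that assumes a raw numeric SPSS value."""
    # Assumption of this sketch: values under one day are durations.
    if value < SECONDS_PER_DAY:
        return datetime.timedelta(seconds=value)
    return SPSS_EPOCH + datetime.timedelta(seconds=value)


def safe_convert(value):
    """Convert only raw numbers; leave missing or converted values alone."""
    already_done = (datetime.datetime, datetime.timedelta)
    if value is None or isinstance(value, already_done):
        return value
    return naive_convert(value)


once = naive_convert(13646188800.0)   # first conversion succeeds

try:
    naive_convert(once)               # second conversion: datetime < int
    second_failed = False
except TypeError:
    second_failed = True

twice = safe_convert(once)            # guarded version returns it unchanged
```

If a value reaches the converter twice for any reason, the unguarded version fails at the comparison, which matches the kind of error in the traceback.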


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        mgriffiths <[hidden email]>
To:        [hidden email]
Date:        03/23/2015 09:11 AM
Subject:        [SPSSX-L] Problem with date conversion in SPSS Python
Sent by:        "SPSSX(r) Discussion" <[hidden email]>







--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Problem-with-date-conversion-in-SPSS-Python-tp5729019.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: Problem with date conversion in SPSS Python

mgriffiths
Jon

I don't call CvtSpssDatetime myself. The conversion happens only once, through cvtDates='ALL'.

Martin

Re: Problem with date conversion in SPSS Python

Richard Ristow
In reply to this post by mgriffiths
At 11:10 AM 3/23/2015, mgriffiths wrote:

>I am using SPSS v21 and Python Essentials to try to analyse a dataset that
>includes dates in SPSS dd-mmm-yyyy format.

Is it fair to ask, what analysis will you be doing? Perhaps it could
be done in native SPSS, skipping the Python conversion altogether.


Re: Problem with date conversion in SPSS Python

Jon K Peck
This problem was resolved by an update to the spssdata.py module.  The updated version is available on the SPSS Community website.






From:        Richard Ristow <[hidden email]>
To:        [hidden email]
Date:        03/24/2015 08:34 PM
Subject:        Re: [SPSSX-L] Problem with date conversion in SPSS Python
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



