Can I force SPSS to release an .xls file from memory after reading its data?


Can I force SPSS to release an .xls file from memory after reading its data?

Ruben Geert van den Berg
Dear all,

I have a problem: I'm reading in an Excel workbook with 365 sheets of data. Every sheet becomes a new DataSet and every time 50 DataSets are open simultaneously, an ADD FILES command merges them into a single DataSet and then new sheets are read.

Unfortunately, I get an error and the weird thing is that every time I run the exact same code, I get it after a different number of sheets (20, 57, 84, 82, 67 and so on...) have been processed.

>Error.  Command name: GET DATA
>(2052) Error accessing the Excel file.  The file may be open by another
>application.  If so, close it and retry.
>* File: "d:\temp\testbook.xls"
>Execution of this command stops.

My first guess would be that SPSS is somehow 'too slow' in releasing the workbook from memory. I tried adding

cache.
exe.

after reading each sheet but that doesn't seem to make any difference. Does anyone have an idea whether/how this could be solved?

TIA!



Re: Can I force SPSS to release an .xls file from memory after reading its data?

David Marso
Administrator
In reply to this post by Ruben Geert van den Berg

At some point I would likely say screw it and use VBA to build a concatenated text file of the 365 sheets, then run SPSS against the text file.  I know this isn't the answer you are looking for but ...
After wasting about 2 hrs f'ing around, I would take 10 minutes to write the VBA code and it would probably finish running in about 20 seconds ;-)
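[A minimal sketch of the same concatenate-then-read idea, but in Python (with xlrd, which comes up later in this thread) instead of VBA. The file paths are hypothetical and every sheet is assumed to share one header row; the helper takes any xlrd-style workbook object.]

```python
import csv

def concatenate_sheets(wb, out_path):
    """Dump all sheets of an xlrd-style workbook into one CSV.

    The header row is kept once (from the first sheet) and a 'sheet'
    column records which sheet each data row came from, so SPSS can
    read the whole workbook with a single GET DATA /TYPE=TXT command.
    """
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for idx, sheet in enumerate(wb.sheets()):
            if idx == 0:
                writer.writerow(["sheet"] + sheet.row_values(0))
            for r in range(1, sheet.nrows):
                writer.writerow([sheet.name] + sheet.row_values(r))

# e.g.: concatenate_sheets(xlrd.open_workbook("d:/temp/testbook.xls"),
#                          "d:/temp/testbook.csv")
```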
--
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Can I force SPSS to release an .xls file from memory after reading its data?

Ruben Geert van den Berg
Thanks David!

Another list member had the same suggestion, but I'm completely (really 100%) unfamiliar with VBA. However, I think I can fix it with the Python xlrd and xlwt modules; at least I'm familiar with that syntax.

Best regards,

Ruben


Re: Can I force SPSS to release an .xls file from memory after reading its data?

Robert Walker
In reply to this post by Ruben Geert van den Berg

Ruben,

Do the sheets follow a sequential naming convention, as in Sheet_1, Sheet_2, etc.? Python or a macro should be able to read in each sheet without running into a memory issue.

Regards,

Bob Walker
Surveys & Forecasts, LLC
www.safllc.com

 



Re: Can I force SPSS to release an .xls file from memory after reading its data?

Ruben Geert van den Berg
Dear Bob,

Thanks for your reply. 

It's actually a Python block that should do the trick, because I want the 365 sheet names in my data without hard-coding them in my syntax. And the syntax that's printed back looks gorgeous! I agree that there should be no problem. But as a scientist, I must take seriously the observation that it keeps crashing, at least on my (crappy) system.

Kind regards,

Ruben



Re: Can I force SPSS to release an .xls file from memory after reading its data?

Albert-Jan Roskam
Hi Ruben,
 
I presume you've already checked out xlrd?
import xlrd
wb = xlrd.open_workbook('fatfile.xls')
for sheet in wb.sheets():
  data_of_one_sheet = [sheet.row_values(i) for i in xrange(sheet.nrows)]
  #... etc.
 
Regards,
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

Re: Can I force SPSS to release an .xls file from memory after reading its data?

Ruben Geert van den Berg
Hi Albert-Jan!

You presume correctly. I think my system resources rather than my syntax are the problem. Yesterday, I managed to process up to 338 sheets but then it crashed again (and it usually crashes much sooner). See below for the code.

Kind regards,

Ruben

*Create test data.
begin program.
rdir="d:/temp/"
import os
import xlwt
from random import choice
import datetime
from datetime import date
dates=[int((date(2011,1,1)+datetime.timedelta(days=x)).strftime("20%y%m%d")) for x in range(365)]
wb=xlwt.Workbook()
for date in dates:
    ws=wb.add_sheet("data_"+str(date))
    for col,cont in enumerate(['PersonID','type','date','events']):
        ws.write(0,col,cont)
    for row,resp in enumerate([104,21,60,2,1030]):
        ws.write(row+1,0,resp)
    ws.write(1,1,'production') # Guarantee 'production' occurs in every sheet, so string widths match across sheets
    for row in range(4):
        ws.write(row+2,1,choice(['beta','production','test']))
    for row in range(5):
        ws.write(row+1,2,date)
    for row in range(5):
        ws.write(row+1,3,choice(range(20)))
wb.save(os.path.join(rdir,'testbook.xls'))
end program.

*Should work in theory but system resources seem to run out.

begin program.
workbook=("d:/temp/testbook.xls")
import xlrd,spss
wb=xlrd.open_workbook(workbook)
vallabs=[str(cnt+1)+"'"+nam+"'" for cnt,nam in enumerate([s.name for s in wb.sheets()])]
datas=[]
for cnt,s in enumerate(wb.sheets()):
    datas.append("data_"+str(cnt+1))
    spss.Submit("""
GET DATA
/TYPE=XLS
/FILE='%s'
/SHEET=name '%s'
/CELLRANGE=full
/READNAMES=on
/ASSUMEDSTRWIDTH=32767.
cache.
exe.
dataset name data_%d.
comp sheet=%d.\nexe.
"""%(workbook,s.name,cnt+1,cnt+1))
    if cnt != 0 and (cnt%49==0 or cnt+1==len(wb.sheets())):
        spss.Submit("add files " + "/".join(["file="+fil for fil in datas]) + ".\nexe.\ndatas clo all.\ndatas nam data_1.")
        datas=["data_1"]
spss.Submit("val lab sheet %s."%" ".join(vallabs))
end program.



Re: Can I force SPSS to release an .xls file from memory after reading its data?

David Marso
Administrator
Why not start with the 365-sheet workbook, add a 366th sheet, and concatenate the 365 into the 366th?
Then read the 366th into SPSS?  That would eliminate having a huge number of SPSS data sets open.
Difficult to know what the source of the error is.  Probably having 365 data sets open in SPSS??? >;-(
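[That suggestion can also be sketched with the xlrd/xlwt modules already used in this thread: copy the rows of every sheet into a single output sheet, so SPSS only ever opens a one-sheet file. Untested against the real workbook; it assumes a shared header row and is written against the minimal xlrd/xlwt interfaces (.sheets(), .nrows, .row_values(), .write()).]

```python
def copy_all_sheets(wb_in, ws_out):
    """Copy every row of every input sheet into one output sheet.

    Keeps the header row only once (from the first sheet) and returns
    the number of rows written. Note the 65536-row limit of the .xls
    format if the real data is much larger than 365 x 5 rows.
    """
    out_row = 0
    for idx, sheet in enumerate(wb_in.sheets()):
        start = 0 if idx == 0 else 1  # skip repeated header rows
        for r in range(start, sheet.nrows):
            for c, value in enumerate(sheet.row_values(r)):
                ws_out.write(out_row, c, value)
            out_row += 1
    return out_row

# e.g.:
# wb_in = xlrd.open_workbook("d:/temp/testbook.xls")
# wb_out = xlwt.Workbook()
# copy_all_sheets(wb_in, wb_out.add_sheet("all_data"))
# wb_out.save("d:/temp/testbook_concat.xls")
```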

Re: Can I force SPSS to release an .xls file from memory after reading its data?

Albert-Jan Roskam
In reply to this post by Ruben Geert van den Berg
<snip>

>*Should work in theory but system resources seem to run out.
>
>
>begin program.
>workbook=("d:/temp/testbook.xls")
>import xlrd,spss
>wb=xlrd.open_workbook(workbook)
>vallabs=[str(cnt+1)+"'"+nam+"'" for cnt,nam in enumerate([s.name for s in wb.sheets()])]
>datas=[]
>for cnt,s in enumerate(wb.sheets()):
>    datas.append("data_"+str(cnt+1))

xlrd rocks. In fact it rocks so much that I wouldn't bother to use GET DATA. I suspect that you get an SPSS_FITAB_FULL error. In other words: too many datasets open at the same time. Maybe something like this? (entirely untested)

allData = (sheet.row_values(i) for sheet in wb.sheets() for i in xrange(sheet.nrows)) # generator expression --> small memory footprint
varNames = ['x', 'y', 'z']
varTypes = {'x': 0, 'y': 50, 'z': 20}
with spss.DataStep():
    datasetObj = spss.Dataset(name=None)
    for varName in varNames:
        if varTypes[varName] == 0:
            datasetObj.varlist.append(varName)
        else:
            datasetObj.varlist.append(varName, varTypes[varName])
    for line in allData:
        datasetObj.cases.append(list(line))

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Can I force SPSS to release an .xls file from memory after reading its data?

Ruben Geert van den Berg
Dear Albert-Jan,

I wasn't familiar with "datasetObj" but after some tinkering, the code below runs like an antelope. I'd almost say it's even better than my own attempt ;-)

Thanks a lot!!

Ruben

begin program.
# wb is the xlrd workbook opened in the earlier program block; spss and xlrd were imported there too
allData = ([sheet.name]+sheet.row_values(i) for sheet in wb.sheets() for i in xrange(1,sheet.nrows))
varNames = ['sheet','PersonID','type','date','events']
varTypes = {'sheet':20,'PersonID': 0, 'type': 12, 'date': 0,'events':0}
with spss.DataStep():
    datasetObj = spss.Dataset(name=None)
    for varName in varNames:
        if varTypes[varName] == 0:
            datasetObj.varlist.append(varName)
        else:
            datasetObj.varlist.append(varName, varTypes[varName])
    for line in allData:
        datasetObj.cases.append(list(line))
end program.



