memory problems


memory problems

Zach Creel
I seem to have a lot of memory problems in SPSS v18.0.1.  I'm wondering if
anyone can point me in the right direction.

I have a fairly long syntax file with lots of embedded Python.  When I run
this same file over and over, SPSS apparently grabs more and more
memory each time, as shown by the SPSS process in Task Manager.
Eventually, SPSS will just hang in the middle of processing.  My machine has
4 GB of RAM and I get the same problems whether on Windows Vista or 7.

Is it possible there's something I need to do in either syntax or python to
force spss to release memory?  Or maybe it's some kind of operating system
problem?  Thanks for any clues.

Zach

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: memory problems

Jason Burke
Hi Zach,

Which process, in your observation, is consuming all the memory? There are
two to look at:

paswstat, and
spssengine

Cheers,


Jason

On Wed, Mar 10, 2010 at 6:49 PM, Zach Creel <[hidden email]> wrote:



Re: memory problems

Zach Creel
In reply to this post by Zach Creel
Thanks for the reply Jason.  Here's what's happening.

This is on Vista 64 (but similar things seem to happen on Vista 32 & also
Windows 7)

When I open SPSS, here's what my processes are using:

spssengine.exe = 124,300K
spssengine.exe = 24,456K
startpythion.exe = 10,900K

Then I run my (fairly long 3000-line) SPSS/Python code on only 400 cases.

It finishes fine, without error.  But now look at the memory usage:

spssengine.exe = 1,504,112K
paswstat.exe = 428,056K
startpythion.exe = 11,920K

Now, if I run again, exact same code, it fails with this Python error:

Traceback (most recent call last):
  File "<string>", line 64, in <module>
  File "C:\Python26\lib\site-packages\spssaux\spssaux.py", line 1209, in attributes
    return getAttributesDict(self.VariableName(id))
  File "C:\Python26\lib\site-packages\spssaux\spssaux.py", line 463, in getAttributesDict
    attnames = spss.EvaluateXPath(tag, '/outputTree', attnamespath)
  File "C:\Python26\lib\site-packages\spss180\spss\spss.py", line 414, in EvaluateXPath
    raise SpssError,error
spss.errMsg.SpssError: [errLevel 12] Invalid handle object.

And SPSS itself crashes, throwing a Windows error message: "spssengine has
stopped working - A problem has caused the program to stop working
correctly.  Windows will close the program."

Thanks for any thoughts you may have on this.

Zach


Re: memory problems

Albert-Jan Roskam
Hi,

Interesting. What code do you have around line 64? Do you have very large objects? If so, you could delete them, then explicitly run garbage collection. If you have big loops, you could easily replace those with generators. I've been experimenting a bit with memory use on both Linux and Windows XP. Here is some Python code and Linux bash shell output. In XP, the results are comparable.

Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "copyright", "credits" or "license()" for more information.
   
IDLE 2.6.4      ==== No Subprocess ====
>>> import gc
>>> big = range(10**7)
>>> dir()
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'big', 'gc', 'main']

## This is the total amount of used and free memory:
$ free
             total       used       free     shared    buffers     cached
Mem:       1801744    1759280      42464          0       4464      34876
-/+ buffers/cache:    1719940      81804
Swap:       176672     121460      55212

>>> del(big)
>>> gc.collect()
0

## Now the total amount of used memory has decreased quite dramatically:

$ free
             total       used       free     shared    buffers     cached
Mem:       1801744    1367548     434196          0       4628      40372
-/+ buffers/cache:    1322548     479196
Swap:       176672     133376      43296
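
A rough, self-contained sketch of the two suggestions above: delete a large object and collect explicitly, and replace a fully built list with a generator. It is written for Python 3 rather than the 2.6 used in the session, and the names squares_list and squares_gen are made up for illustration. tracemalloc only sees Python-level allocations, so it says nothing about what the spssengine process is holding. Note also that gc.collect() returns the number of unreachable objects the cycle collector found, not the amount of memory freed, so a 0 like the one in the session is normal: in CPython the list is released by del itself once its last reference is gone.

```python
import gc
import tracemalloc


def squares_list(n):
    """Build the whole result in memory at once."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield one value at a time; nothing accumulates between iterations."""
    for i in range(n):
        yield i * i


N = 10**5

# Variant 1: materialize everything, then free it explicitly.
tracemalloc.start()
big = squares_list(N)
total_from_list = sum(big)
_, peak_list = tracemalloc.get_traced_memory()
del big                      # drops the last reference; CPython frees the list here
found = gc.collect()         # only matters for reference cycles
tracemalloc.stop()

# Variant 2: stream the same values through a generator.
tracemalloc.start()          # restarting clears the earlier traces and peak
total_from_gen = sum(squares_gen(N))
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("same result:", total_from_list == total_from_gen)
print("peak bytes, list:", peak_list)
print("peak bytes, generator:", peak_gen)
print("objects found by gc.collect():", found)
```

The generator version should show a far smaller peak because only one value is alive at a time. Whether the operating system actually gets the freed memory back depends on the allocator and platform.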


Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the face of ambiguity, refuse the temptation to guess.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--- On Sat, 3/20/10, Zach Creel <[hidden email]> wrote:

From: Zach Creel <[hidden email]>
Subject: Re: [SPSSX-L] memory problems
To: [hidden email]
Date: Saturday, March 20, 2010, 1:01 AM


Re: memory problems

Jon K Peck
In reply to this post by Zach Creel

It's unlikely, IMO, that the problem has anything to do with Python.  The problem seems to be in the spssengine process, which has grown to an enormous amount of memory.  I suspect that the Python failure is actually due to the backend having run out of resources and been unable to do some operations with the xmlworkspace.  So the question again comes down to what is that backend code doing?  There are a few known memory leaks in the backend that have been addressed in the forthcoming 18.0.2 patch.  

However, one thing to check: if you are writing a lot of output into the xmlworkspace, are you deleting these items after use via the spss.DeleteXPathHandle API?  You might also call spss.GetHandleList periodically to see whether you are accumulating items in the workspace that your Python code is not handling.
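
A minimal sketch of that housekeeping, under some assumptions: spss.GetHandleList is taken to return a list of handle names, and spss.DeleteXPathHandle to accept one name at a time. The helper purge_xml_workspace and the FakeSpss stand-in are hypothetical and not part of the spss module; the stand-in exists only so the sketch can run outside SPSS.

```python
def purge_xml_workspace(spss_module, keep=()):
    """Delete every XML workspace handle not named in `keep`.

    `spss_module` is the imported spss module (or any object exposing
    GetHandleList and DeleteXPathHandle). Returns the deleted names.
    """
    deleted = []
    for handle in list(spss_module.GetHandleList()):
        if handle not in keep:
            spss_module.DeleteXPathHandle(handle)
            deleted.append(handle)
    return deleted


class FakeSpss(object):
    """Hypothetical stand-in for the spss module, for running outside SPSS."""

    def __init__(self, handles):
        self._handles = list(handles)

    def GetHandleList(self):
        return list(self._handles)

    def DeleteXPathHandle(self, handle):
        self._handles.remove(handle)


fake = FakeSpss(["desc_table", "freq_table", "leftover"])
removed = purge_xml_workspace(fake, keep=("freq_table",))
print("deleted:", removed)
print("still in workspace:", fake.GetHandleList())
```

Inside a BEGIN PROGRAM block this would be called as purge_xml_workspace(spss) after import spss. Printing len(spss.GetHandleList()) at the start of each run is a cheap way to see whether handles pile up from one run to the next.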

HTH,

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Zach Creel <[hidden email]>
To: [hidden email]
Date: 03/19/2010 09:31 PM
Subject: Re: [SPSSX-L] memory problems
Sent by: "SPSSX(r) Discussion" <[hidden email]>






Re: memory problems

Zach Creel
In reply to this post by Zach Creel
Thanks Albert-Jan, I didn't know you could explicitly request the python
garbage collector to run.  I will definitely experiment with that ...


Re: memory problems

Zach Creel
In reply to this post by Zach Creel
Thanks for the info Jon.  It is possible I'm overwhelming the xmlworkspace;
I do produce a lot of output (mostly debugging info).  However, I tried disabling
all output to the Viewer and it had no effect.

I am not writing directly to the xmlworkspace via Python, but I suspect I am
indirectly writing to it somewhere.  What is it used for?  Is it just for
output, or is it also used for in-memory data processing?


Re: memory problems

Jon K Peck

The xml workspace is only used when you specify the xmlworkspace keyword, but suppressing Viewer output via VIEWER=NO does hold output in memory until OMSEND (just not in the xmlworkspace).  We recently found a bug where not all of the OMS memory was being returned, but it only occurs under special circumstances.  You could try eliminating OMS altogether just to see whether the memory growth can be pinned on that.  I think a fix for that problem is going into 18.0.2, which will be released very soon.

The xml workspace belongs to the backend process, not the Python process, by the way.
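
Since output is held until OMSEND, one way to make sure every OMS request really gets ended, even when the code in between raises, is a small context manager. This is only a sketch: oms_to_workspace is a hypothetical helper, the OMS subcommands shown are one plausible form and should be checked against the Command Syntax Reference for your version, and the tag and handle values are assumed to contain no quote characters. The submit argument would normally be spss.Submit; here a list's append method stands in so the sketch runs outside SPSS.

```python
from contextlib import contextmanager


@contextmanager
def oms_to_workspace(submit, tag, workspace_handle):
    """Start an OMS request that writes tables to the XML workspace,
    and always issue the matching OMSEND on the way out."""
    submit(
        "OMS /SELECT TABLES "
        "/DESTINATION FORMAT=OXML XMLWORKSPACE='%s' VIEWER=NO "
        "/TAG='%s'." % (workspace_handle, tag)
    )
    try:
        yield workspace_handle
    finally:
        # Runs on normal exit and on exceptions alike.
        submit("OMSEND TAG='%s'." % tag)


# Demonstration without SPSS: record the syntax that would be submitted.
sent = []
try:
    with oms_to_workspace(sent.append, "desc_out", "desc_table"):
        sent.append("DESCRIPTIVES VARIABLES=mpg.")
        raise RuntimeError("simulated failure in the middle of the job")
except RuntimeError:
    pass

for line in sent:
    print(line)
```

Ending the request does not remove the workspace item itself; that still needs spss.DeleteXPathHandle, as mentioned earlier in the thread.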
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Zach Creel <[hidden email]>
To: [hidden email], Jon K Peck/Chicago/IBM@IBMUS
Date: 03/23/2010 01:53 PM
Subject: Re: memory problems






A New Extension Command on SPSS Developer Central

Jon K Peck

I have posted a new extension command, SPSSINC SELECT VARIABLES, to SPSS Developer Central, www.spss.com/devcentral.  The new command creates an SPSS Statistics macro that lists variables selected according to criteria such as the variable type (numeric or string), the measurement level, patterns in the names (e.g., ends with "education"), and custom variable attributes.

Using this command can simplify your syntax, and it allows you to create jobs that are more general than if the syntax has to have knowledge of all the variable names in the dataset.

I have blogged about this command and its uses on insideout.spss.com.

The command, which as usual is free, can be downloaded from the Downloads section of Developer Central.  The package includes a dialog box interface as well as the syntax definition and implementation.  It requires the Python programmability plugin and at least Version 17.

I hope you find this useful.

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435