The first thing to consider is that Windows itself already caches a lot in memory. It tends to use unallocated memory for extra I/O buffers and to keep loaded modules in memory as long as that memory isn't needed for something else. So I doubt that there are many situations where a RAM disk would help. A RAM disk would partition off some of the memory specifically for file contents, but the tradeoff is that this memory would not be available for other purposes, so it might induce more paging to disk in other areas. I haven't used a RAM disk myself in quite a few years, but my guess is that it is no help in most situations.

There are two things that might help. First, a 64-bit OS with 64-bit SPSS would allow addressing more memory and, with more physical memory, could speed up processing. Second, a rewrite of the job using Python programmability could probably eliminate most of the data passes and build the hierarchical relationships much more efficiently. 40,000 cases is not a lot of data, so it's likely that all the hierarchy traversals could be done in memory.

So the tradeoff is between throwing more hardware at the problem and throwing more programming resources at it.

HTH,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
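As a rough illustration of the in-memory idea above (not Jon's code, and not using the SPSS Python plug-in API), here is a minimal plain-Python sketch. It assumes the employee data has already been read into a list of (employee_id, supervisor_id) pairs; the function names `build_reports` and `all_subordinates` and the sample IDs are hypothetical.

```python
from collections import defaultdict, deque

def build_reports(records):
    """Map each supervisor to a list of direct reports.

    records: iterable of (employee_id, supervisor_id) pairs;
    supervisor_id is None for someone with no supervisor (assumed convention).
    """
    direct = defaultdict(list)
    for emp, sup in records:
        if sup is not None:
            direct[sup].append(emp)
    return direct

def all_subordinates(direct, supervisor):
    """Breadth-first walk collecting everyone below the given supervisor."""
    seen = set()
    order = []
    queue = deque(direct.get(supervisor, []))
    while queue:
        emp = queue.popleft()
        # Skip repeats so that bad data with a reporting cycle cannot loop forever.
        if emp in seen or emp == supervisor:
            continue
        seen.add(emp)
        order.append(emp)
        queue.extend(direct.get(emp, []))
    return order

# Hypothetical sample data: A is at the top; B and D report to A,
# C reports to B, and E reports to C.
records = [("A", None), ("B", "A"), ("C", "B"), ("D", "A"), ("E", "C")]
direct = build_reports(records)
subs_a = all_subordinates(direct, "A")
```

The whole structure is built in one pass over the records, and each supervisor's subtree is then walked in memory, which should be comfortable for 40,000 cases.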
In reply to this post by David Futrell
At 09:37 AM 1/29/2010, David Futrell wrote:
>I have a syntax file that I run frequently that reads an employee database (40,000 records), looks at the employee-to-supervisor relationship, and ultimately creates a hierarchy such that the resulting file contains every supervisor above a certain level in the organization and everyone who reports to him and anyone else below him in the organization.

That's called 'transitive closure': in your case, if A supervises B and B supervises C, then A supervises C. I'm not digging out the last code I wrote to do it, but it takes a number of passes roughly equal to the length of the longest reporting chain. That's still a multiple-pass loop, and Python's a good choice to run it, including determining when it's time to stop.

Alternatively, Jon Peck may be thinking of stepping aside from native SPSS altogether, and managing the data entirely in Python.

-Looping onward,
Richard
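For readers who want to see the multiple-pass idea concretely, here is a minimal sketch in plain Python (not Richard's code, and not SPSS syntax). It assumes the direct relationships are available as (supervisor, employee) pairs; the function name `transitive_closure` and the sample IDs are hypothetical. Each pass joins the pairs found so far against themselves, and the loop stops on the first pass that adds nothing new. Because this variant joins against every pair found so far rather than only the direct ones, it may need fewer passes than the length of the longest chain.

```python
def transitive_closure(direct_pairs):
    """Return every (supervisor, employee) pair implied by the direct pairs,
    plus the number of passes taken, counting the final pass that finds nothing."""
    closure = set(direct_pairs)
    passes = 0
    while True:
        passes += 1
        # Index the pairs known so far: employee -> everyone known to be above them.
        above = {}
        for sup, emp in closure:
            above.setdefault(emp, set()).add(sup)
        # If X is above sup and sup is above emp, then X is above emp.
        new_pairs = set()
        for sup, emp in closure:
            for higher in above.get(sup, ()):
                if (higher, emp) not in closure:
                    new_pairs.add((higher, emp))
        if not new_pairs:
            # Nothing was added on this pass, so it is time to stop.
            return closure, passes
        closure |= new_pairs

# Hypothetical chain: A supervises B, B supervises C, C supervises D.
direct_pairs = [("A", "B"), ("B", "C"), ("C", "D")]
closure, passes = transitive_closure(direct_pairs)
```

The set of pairs can only grow and is finite, so the loop terminates even if the data contain a reporting cycle, although a cycle would then show up as someone "supervising" themselves.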
In reply to this post by David Futrell
In addition to files you name in your commands, Statistics uses lots of temporary disk files. They are all located in the same directory, and you can specify that directory. Look in Edit->Options->File Locations for the temporary folder name.
