|
It seems that the data files I analyze start out big and then become even bigger as I create variables and run various types of analysis. As a consequence, and as my c: drive becomes increasingly stacked with my saved data or miscellaneous files, the processing time becomes slower and slower. I keep my c: drive relatively clear and use an external hard drive for that purpose. Also, I use the temp file option when processing off of the external hard drive, but still the processing time seems to grow slower.
Currently, I use the SPSS 15 and SPSS 16 desktop versions and rely upon an Intel dual core processor, 2.33 GHz, 667 MHz, with 4.0 GB SDRAM and a 160 GB hard drive. I'm contemplating going to a more powerful desktop, but before I plunge into my pocketbook, I was wondering if an SPSS user could recommend a hardware set-up that will give me faster processing times. I've not used a server version, but I'm open to whatever the best recommendation is for processing bigger and bigger files. Thanks. [hidden email]
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Hi,
Your hardware configuration doesn't sound too shabby to me at all. Of course, you can always buy faster, bigger, better hardware, but still. I also work routinely with huge files, and one way to speed things up is to develop your syntax on a random sample, using the SAMPLE command in conjunction with the SET command. If your file is not sorted in a particular way, N OF CASES is also an option. Setting a seed allows you to reproduce the random sample, and you can use the Mersenne Twister algorithm for sampling (with RNG = MT, the seed is given by MTINDEX; SEED applies to the older generator):
SET RNG = MT MTINDEX = 4321.
SAMPLE .01.
In addition, you can increase the workspace:
SHOW WORKSPACE.
SET WORKSPACE 20000.
You'll get error messages when you set it too high. After you've tuned/debugged your syntax, you can run it on the entire dataset. Meanwhile, you may have to grab a cup of coffee -- not a punishment in my eyes ;-)
More generally, it saves a lot of time to cut down on data passes. One way to do that is to use EXECUTE sparingly (see SPSS Programming and Data Management by Raynald Levesque, freely downloadable; see also this list).
Cheers!! Albert-Jan |
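A minimal sketch of the sampling workflow described in the reply above, written as SPSS syntax. The file path and the variable names (bigfile.sav, age, income) are hypothetical placeholders. It also assumes that MTINDEX is the setting that seeds the Mersenne Twister generator, with SEED applying to the older MC generator; the SET entry in the Command Syntax Reference for your version is the place to confirm that.

```spss
* Hypothetical file and variable names, for illustration only.
GET FILE='C:\data\bigfile.sav'.

* Check the current workspace setting (in KB), then raise it.
SHOW WORKSPACE.
SET WORKSPACE=20000.

* Select the Mersenne Twister generator and fix its seed so the sample is reproducible.
SET RNG=MT MTINDEX=4321.

* TEMPORARY limits the 1 percent sample to the next procedure only.
TEMPORARY.
SAMPLE .01.
DESCRIPTIVES VARIABLES=age income.
```

Once the syntax is debugged, dropping the TEMPORARY and SAMPLE lines runs the same procedure on the full file. If the file is in no particular order, `N OF CASES 1000.` could stand in for SAMPLE as a quicker way to get a small test set.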
|
Your hardware configuration looks fine. What graphics card do you have?
I have been shopping for a laptop/desktop, and the salesperson mentioned that many analysis/modeling packages have begun to rely more heavily on graphics cards to do the computing, in order to reduce the processing time for very complex computations. Good luck! Joanne |
|
In reply to this post by Charley Trimble
At 01:01 PM 7/24/2008, Charley Trimble wrote:
>It seems that the type of data files I analyze become bigger at the outset and then become even bigger as I create variables and run various types of analysis. As a consequence and as my c: drive becomes increasingly stacked with my saved data or miscellaneous files, the processing time becomes slower and slower. I keep my c: drive relatively clear and use an external hard drive for that purpose.
>
>I rely upon an Intel dual core processor, 2.33 GHz, 667 MHz and 4.0GB SDRAM

Those are probably much more than adequate for your work. I doubt you'd see any improvement from an upgrade.

>... and a 160GB hard drive. As my c: drive becomes increasingly stacked with my saved data or miscellaneous files, the processing time becomes slower and slower. I keep my c: drive relatively clear and use an external hard drive for that purpose.

In most SPSS runs, most of the time is spent reading from disk or writing to disk. That's probably true of yours.

Always look first to change the code and logic to use less disk traffic. If you use EXECUTE (exe.) statements, remove all except those very few that are needed. (See "Use EXECUTE Sparingly" in any edition of Levesque, Raynald, "SPSS® Programming and Data Management, A Guide for SPSS® and SAS® Users". SPSS, Inc., Chicago, IL, various dates. It is downloadable free from the SPSS, Inc., Web site.)

Then, it's a good idea to use CACHE after any transformation program that drops most of the variables or most of the cases, and to create analysis files with only the variables needed for the analysis. Beyond that, one would have to look at your code.

>Also, I use the temp file option when processing off of the external hard drive

That may be a problem. Check the data transfer rate for the external drive; it may be much slower than for the internal drive.

>my c: drive becomes increasingly stacked with my saved data or miscellaneous files

Another warning sign. It's wise, simply on general principles, to have free disk space of several times the total needed for your data. Among other things, when space is tight, disk fragmentation will get worse more rapidly. (You could try de-fragmenting.)

But the hardware improvements that are likely to help are:

. A larger c:\ drive, looking for one with the highest possible transfer rate

. A second internal hard drive, also with a high transfer rate; and then adjusting your logic so that, where possible, data is being read from one of the drives and written to the other.

-Best of luck to you, Richard |
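The advice above about reducing disk traffic could look roughly like this in SPSS syntax. All paths and variable names are hypothetical, and the D: drive stands in for a second internal disk; treat it as a sketch under those assumptions, not a tuned job.

```spss
* Hypothetical paths and variable names, for illustration only.
* Read only the variables the analysis needs from the first drive.
GET FILE='C:\data\bigfile.sav'
  /KEEP=id age income region.

* Transformations are held pending, so no EXECUTE is needed after each one.
SELECT IF (age >= 18).
COMPUTE log_income = LN(income).

* The file was cut down above, so cache the reduced data for later passes.
CACHE.

* The first procedure carries out the pending transformations in its own data pass.
DESCRIPTIVES VARIABLES=age log_income.

* Write the reduced analysis file to the second drive.
SAVE OUTFILE='D:\work\analysis.sav'
  /COMPRESSED.
```

The point of the layout is that the large file is read once from one disk, while the smaller analysis file is written to the other, and no data pass is spent on an EXECUTE that a procedure would have triggered anyway.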
|
Just a couple of comments:
- There is a nice freeware utility called ATF-cleaner (easy to find; if you can't find it, I can send it to you, it's 50 KB) that cleans the drive of all garbage and temporary files. Follow the cleaning with a good defragmentation (I have that scheduled for Fridays) and disk speed will increase a lot.
- Set the temporary folder used by SPSS on a different drive from the Windows swap file. This works like a charm for Adobe Photoshop when handling very big picture files (in raw format).
Best regards, Marta
-- For miscellaneous statistical stuff, visit: http://gjyp.nl/marta/ |
