|
Hi all-
I *think* I need to increase my cache, but wasn't able to find anything in the documentation about the default value or what a good range might be to increase to or toward.

My file is roughly 1 million cases & 180 variables, and the former will be increasing over time.

Thanks,
Brian

PS- below syntax was run with Insert File (which generates the line numbers to the left of the code in the output)

 211 * Identify Duplicate Cases.
 212 SORT CASES BY cust(A) Order_Date_Overall(A) ProductType(D).
 213 MATCH FILES /FILE = * /BY cust
 214   /FIRST = PFirstOrderDate /LAST = PrimaryLast.
 215 DO IF (PFirstOrderDate).
 216 COMPUTE MatchSequence = 1 - PrimaryLast.
 217 ELSE.
 218 COMPUTE MatchSequence = MatchSequence + 1.
 219 END IF.
 220 LEAVE MatchSequence.
 221 FORMAT MatchSequence (f7).
 222 COMPUTE InDupGrp = MatchSequence > 0.
 223 SORT CASES InDupGrp(D).

>Error. Command name: SORT CASES
>File write error: file name C:\DOCUME~1\BMoore\LOCALS~1\Temp\spss2064\cache.33: No space left on device (DATA1002)
>This command not executed.

>Error # 5817. Command name: SORT CASES
>The SPSS file sort has failed. The file remains unsorted. The specific
>problem is printed below.

>Error # 5822. Command name: SORT CASES
>I/O error writing to the sort scratch file.

 224 MATCH FILES /FILE = * /DROP = PrimaryLast InDupGrp MatchSequence.
 225 VARIABLE LABELS PFirstOrderDate 'Indicator of each first matching case as Primary' .
 226 VALUE LABELS PFirstOrderDate 0 'Duplicate Case' 1 'Primary Case'.
 227 VARIABLE LEVEL PFirstOrderDate (ORDINAL).
 228 FREQUENCIES VARIABLES = PFirstOrderDate .

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
Hi Brian
My guess is you are out of room on the drive SPSS uses for its scratch files. On a standalone installation it creates a bunch of directories under C:\Documents and Settings\(your user name)\Local Settings\Temp for its scratch files. If your sort falls over, odds are SPSS will not have cleaned up its old scratch files.

The directory above tends to fill up with all sorts of rubbish that things like IE and the operating system itself leave around and rarely clean up, so you may find you can get back lots more room by fossicking around in there. Clean up as much space as you can - you will need several times the size of the file you are sorting.

In my experience with versions up to 15 (haven't got 16 so I can't say), SET CACHE makes little difference no matter how much physical memory you have. Big sorts just thrash the disk and take forever.

Regards
Adrian Barnett

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Brian Moore
Sent: Tuesday, 27 November 2007 2:07 AM
To: [hidden email]
Subject: set cache reasonable values |
|
In reply to this post by Brian Moore-3
At 10:36 AM 11/26/2007, Brian Moore wrote:

>I *think* I need to increase my cache, but wasn't able to find
>anything in the documentation about the default value or what a good
>range might be to increase to or toward.
>
>My file is roughly 1 million cases & 180 variables, and the former
>will be increasing over time.

Here's the problem you cite. I'm quoting the error messages in full:

> 222 COMPUTE InDupGrp = MatchSequence > 0.
> 223 SORT CASES InDupGrp(D).
>
>>Error. Command name: SORT CASES
>>File write error: file name
>>C:\DOCUME~1\BMoore\LOCALS~1\Temp\spss2064\cache.33: No space left
>>on device (DATA1002)
>>This command not executed.
>
>>Error # 5817. Command name: SORT CASES
>>The SPSS file sort has failed. The file remains unsorted. The
>>specific problem is printed below.
>
>>Error # 5822. Command name: SORT CASES
>>I/O error writing to the sort scratch file.

As Adrian Barnett said, on the face of it this is simple: your working disk filled during the sort. That means you need more free disk space, not a change in any SPSS setting. The usual estimate is that you should have free space 2-3 times the file size to run a sort well.

HOWEVER, in your case there are probably ways around it besides adding more disk space.

In the first place, the code you posted following this sort command doesn't rely on the data being sorted by 'InDupGrp':

> 224 MATCH FILES /FILE = *
>       /DROP = PrimaryLast InDupGrp MatchSequence.
> 225 VARIABLE LABELS
>       PFirstOrderDate
>       'Indicator of each first matching case as Primary' .
> 226 VALUE LABELS PFirstOrderDate
>       0 'Duplicate Case'
>       1 'Primary Case'.
> 227 VARIABLE LEVEL PFirstOrderDate (ORDINAL).
> 228 FREQUENCIES VARIABLES = PFirstOrderDate .

I suppose you do need the sort later for something, though.

Second, if you do want to sort by a binary variable, it's probably faster to split the file and merge. It should take storage for ONE copy of your file, and should be reasonably fast. Like this, assuming that InDupYes and InDupNo are file names or file handles (NOT dataset names) that can be used for the scratch files. Not tested:

DO IF InDupGrp EQ 1.
.  XSAVE OUTFILE=InDupYes.
ELSE.
.  XSAVE OUTFILE=InDupNo.
END IF.
EXECUTE.

ADD FILES
   /FILE=InDupYes
   /FILE=InDupNo.

By the way, I see that earlier in the job, command

> 212 SORT CASES BY cust(A) Order_Date_Overall(A) ProductType(D).

ran successfully, on the same file. It may have been happenstance, with the scratch files just making it the first time; but it's more likely that, if the file was already sorted by at least 'cust', SPSS used an adaptive-sort algorithm that took advantage of the file's already being nearly in order. |
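The split-and-merge suggestion above leaves the two scratch files undefined. Here is a minimal, untested sketch of one way to declare them as file handles first. The C:\scratch paths are hypothetical placeholders, not anything from Brian's setup; any folder on a drive with room for one more copy of the file should do.

```spss
* Hypothetical scratch locations (assumption: C:\scratch exists and has free space).
FILE HANDLE InDupYes /NAME='C:\scratch\InDupYes.sav'.
FILE HANDLE InDupNo  /NAME='C:\scratch\InDupNo.sav'.

* Write each case to one of the two files, depending on the flag.
* A case with a missing InDupGrp would be written to neither file.
DO IF InDupGrp EQ 1.
  XSAVE OUTFILE=InDupYes.
ELSE.
  XSAVE OUTFILE=InDupNo.
END IF.
EXECUTE.

* Stack the files with the InDupGrp = 1 cases first, as SORT CASES BY InDupGrp(D) would.
ADD FILES
  /FILE=InDupYes
  /FILE=InDupNo.
EXECUTE.
```

The space-indented lines assume the job runs under interactive syntax rules. Under batch rules, where every command must start in column 1, the leading-period style used in the original suggestion is the safer form.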
|
Richard,
I have talked to the SPSS Help Desk about this issue. Also, I deal with datafiles approximately the size you are describing.

Here are some of the suggestions, which I've passed on before.

1. Set Workspace (total RAM memory) to, say, 300M for Windows. As I have 2GB of RAM and a 93GB hard drive on a laptop, with about 22GB free, I use Set Workspace = 300000 (the units are kilobytes). However, there have been times when I apparently set Workspace too large. If one runs the command Show Workspace. before changing anything, one can find out what I assume is the default value. Another individual at the SPSS Help Desk suggested that 30M might be a more appropriate size for Set Workspace.

2. Under "File," select "Cache Data" before running the procedure.

3. Go to Edit >> Options, and determine the location of the temporary directory. Delete the SPSS files in this directory. In fact, it has been suggested that I create a new directory for the temporary files that is dedicated to SPSS. This step can be quite important if one hasn't cleared the temporary files recently.

4. If you are running a Sort command, first do Transform >> Automatic Recode for the string variables, and run the procedure using the recoded variables. Numeric variables are easier to sort.

After doing all this, if I still have problems I reboot the computer. Frequently, this seems to work best. There must be some way of emptying out the entire cache on the computer (the temporary storage I only dimly understand), but no one has ever been able to tell me how to do this.

Greg

On 11/26/07, Richard Ristow <[hidden email]> wrote:
> As Adrian Barnett, said, on the face of it, this is simple: your
> working disk filled, during the sort. That means you need more free
> disk space, not a change in any SPSS setting. The usual estimate is
> you should have free space 2-3 times the file size, to run a sort well. |
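Steps 2 and 4 of Greg's list can also be run from syntax rather than the menus. This is a rough sketch, not tested on Brian's data. `cust_str` and `cust_num` are hypothetical variable names used only for illustration, and step 3 (clearing the temporary directory) still has to be done outside SPSS.

```spss
* Step 2: syntax equivalent of File > Cache Data.
CACHE.
EXECUTE.

* Step 4: recode a string key to integer codes and sort on the numeric version.
* cust_str and cust_num are placeholder names, not variables from Brian's file.
AUTORECODE VARIABLES=cust_str /INTO cust_num.
SORT CASES BY cust_num (A).
```

By default AUTORECODE assigns its codes in ascending order of the original values, so sorting on the recoded variable should give the same case order as sorting on the string itself.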
|
Thanks to everyone for insights on this issue.
I have succeeded in cleaning out the cache, but I'm still getting warnings.

Apparently 2097151 KB is the maximum workspace (and, from the warning at the end of this message, it appears to be a software-determined maximum, not anything to do with computing power), and I'm still getting the error.

Other specs that may matter:
-using version 15 (but are looking at upgrading to 16)
-File size is 420 MB
-& I have ~20 gigs total free.

Any other ideas?
Thanks,
Brian

Extracts from the output file appear below.

>The parameter of the WORKSPACE subcommand is in terms of kilobytes (KB).
>It must be at least 6148 and not greater than 2097151.

set workspace 2097151.
show workspace.

System Settings
Keyword    Description                                  Setting
WORKSPACE  Special workspace memory limit in kilobytes  2097151

>Warning # 44
>The operating system could not allocate a memory segment of the size
>requested by SPSS. Therefore, SPSS can only use the largest block
>available in memory already allocated. Use SHOW WORKSPACE to determine the
>size of the request and SET WORKSPACE to change it.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gregory Hildebrandt
Sent: Tuesday, November 27, 2007 10:42 AM
To: [hidden email]
Subject: Re: set cache reasonable values |
|
At 02:28 PM 11/28/2007, Brian Moore wrote:

>Thanks to everyone for insights on this issue.
>
>I have succeeded in cleaning out the cache, but still getting
>warnings. Apparently [2097151 bytes] is a software determined
>maximum [and cannot always be reached]:
>
>>Warning # 44
>>The operating system could not allocate a memory segment of the
>>size requested by SPSS.
>
>Other specs that may matter:
>-using version 15 (but are looking at upgrading to 16)
>-File size is 420 MB
>-& I have ~20 gigs total free.

Well, that appears to rule out an *intrinsic* problem with disk space, although you previously got a message that SPSS couldn't get the disk space it *thought* it needed:

>>Error. Command name: SORT CASES
>>File write error: file name
>>C:\DOCUME~1\BMoore\LOCALS~1\Temp\spss2064\cache.33: No space left
>>on device (DATA1002) This command not executed.

Goodness knows how SPSS got there, but I doubt we can diagnose that, certainly not fix it.

>Any other ideas?

Well, the command sequence that blew up before was

>> 222 COMPUTE InDupGrp = MatchSequence > 0.
>> 223 SORT CASES InDupGrp(D).

Assuming that's still the case,

(a) Very wild chance: Do you need to do it at all? The code following the sort, in your original posting, didn't appear to rely on the data's being sorted by 'InDupGrp'. But I doubt this'll do it; presumably, you do need the sort, otherwise.

(b) Still relying on your original posting: It struck me that, earlier in your code, the command

>> 212 SORT CASES BY cust(A) Order_Date_Overall(A) ProductType(D).

apparently worked, on the same file. Sorting on the binary variable 'InDupGrp' gives a huge number of ties on the sort key, and I wonder whether that gives the sorting algorithm some trouble. (Yes, I can give a good argument why it shouldn't.) I'd try appending the previous key sequence:

COMPUTE InDupGrp = MatchSequence > 0.
SORT CASES BY InDupGrp(D)
              cust(A) Order_Date_Overall(A) ProductType(D).

(c) Have you tried the work-around I suggested? That is,

DO IF InDupGrp EQ 1.
.  XSAVE OUTFILE=InDupYes.
ELSE.
.  XSAVE OUTFILE=InDupNo.
END IF.
EXECUTE.

ADD FILES
   /FILE=InDupYes
   /FILE=InDupNo.

(Here, InDupYes and InDupNo are file names or file handles - NOT dataset names - for scratch files; and the code's still not tested.)

....................
Apologies for any crucial points I've missed. But, well, 'any ideas' this is.

-Best of luck and best wishes,
 Richard |
|
"Apparently [2097151 bytes] is a software determined maximum [and cannot always be reached]:"

Well, actually no. That is 2GB (it's measured in KB), and that is the maximum address space possible for any 32-bit Windows application. That is doubtless too big for reasonable use. This doesn't explain the original problem, but trying to make the workspace that big is probably not the solution.

-Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Wednesday, November 28, 2007 1:40 PM
To: [hidden email]
Subject: Re: [SPSSX-L] set cache reasonable values |
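The point about units can be checked by hand: the WORKSPACE value is in kilobytes, so the maximum Brian set is one kilobyte short of 2 GB. Below is a small sketch of stepping back to a more modest request. The 30000 figure is only the Help Desk suggestion relayed earlier in the thread, taken here as an assumption rather than a documented recommendation.

```spss
* 2097151 KB * 1024 = 2,147,482,624 bytes, which is 2**31 - 1024 (just under 2 GB).
* Request a much smaller block instead, then confirm the setting took effect.
SET WORKSPACE=30000.
SHOW WORKSPACE.
```

If Warning # 44 still appears at a value this small, the workspace setting is unlikely to be what is causing the sort to fail.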
|
In reply to this post by Brian Moore-3
Are you sure you are distinguishing CACHE and SCRATCH?
In the SPSS context:

CACHE usually refers to memory (RAM) or storage (DISK) used for buffering input and output. It is especially useful when dealing with storage (DISK) on another machine (e.g., a server). SPSS allows you to specify how you want to use the operating system function.

WORKSPACE usually refers to memory (RAM) used by a program for arrays and instructions. SPSS allows you to adjust this.

SCRATCH usually refers to storage (DISK) used temporarily to hold data. You can specify what storage you want SPSS to use for SCRATCH. If you share the storage you may have a quota (limit), or you may be filling up the device or partition you are using for scratch.

Sometimes people use the term cache loosely in specifying file locations.

Art Kendall
Social Research Consultants

Brian Moore wrote:
> I have succeeded in cleaning out the cache, but still getting warnings. |
|
This came out regarding tables & Java in V16. It might also help here.

====================================
Problem Description:
I am using SPSS 16.0. I produce output which can contain somewhat large tables. In previous versions of SPSS, the job runs to completion quickly (~15 seconds or less). On the same system with SPSS 16.0.0, this job takes a long time or never completes. The application may go gray and unresponsive. If the output generates, I have similar trouble editing, copying/pasting, exporting or printing the larger tables. For instance, it may take several seconds to copy the table or I am unable to print. Is there any way I can improve this behavior?

Resolution Summary:
The following workaround may help in this situation.

Resolution Description:
If you are running out of memory performing any operations with large tables or if your jobs are taking a long time to run, we recommend temporarily adjusting the Maximum Java Heap Size for the client system.

On Linux and Windows, this heap can be adjusted upwards by creating a system environment variable, called 'SPSSClientMaxHeapLevel', on each SPSS client. This variable can be set to the values 1, 2, or 3, where

1 = 512MB (the default)
2 = 768MB
3 = 1024MB

E.g.,
Variable: SPSSClientMaxHeapLevel
Value: 2

For Macintosh, the Info.plist file needs to be edited. Info.plist, found in /Applications/SPSSInc/SPSS16/SPSS16.0.app/Contents/Info.plist, is an XML file which contains the setting. If you wish to change the max heap value, you need to do the following:

1. In Terminal, type:
   $ cd /Applications/SPSSInc/SPSS16/SPSS16.0.app/Contents
2. Create a copy of the Info.plist file for safekeeping:
   $ cp Info.plist Info.plist.keep
3. Edit the Info.plist file. For instance, if you want to increase the max heap size to 768MB, you'd change -Xmx512M to -Xmx768M:
   $ vi Info.plist
   (search for "-Xmx512M", make the change in vi, then save Info.plist)

Please note: The drawback of setting Maximum Java Heap Size too high is that it will take available memory away from the backend in single-seat mode, potentially limiting other procedures that can be run (if they require a lot of RAM) and possibly introducing video display problems.
================
W
Will
Statistical Services ============ info.statman@earthlink.net http://home.earthlink.net/~z_statman/ ============ |
|
In reply to this post by Peck, Jon
I'd written,
>>"Apparently [2097151 bytes] is a software determined maximum [and
>>cannot always be reached]:"

At 03:59 PM 11/28/2007, Peck, Jon wrote:

>Well, actually no. That is 2GB (it's measured in K), and that is the
>maximum address space possible for any 32-bit Windows
>application. That is doubtless too big for reasonable use.

THANK you, Jon. |
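A quick check of the 2GB figure quoted above, assuming the limit is counted in units of 1024 bytes (the post says only "K", so the exact unit is an assumption):

```latex
\[
2097151\ \text{K} = (2^{21} - 1) \times 2^{10}\ \text{bytes}
                  = 2^{31} - 2^{10}\ \text{bytes}
                  = 2147482624\ \text{bytes} \approx 2\ \text{GB}
\]
```

Read as bytes, the same number would be only about 2 MB, which is why the unit matters here.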
|
In reply to this post by Richard Ristow
Thanks for the suggestions. As I continue to try to break up this problem (such as with the file separation idea) it gets more and more curious.

In fact, I'm now getting this warning even with much smaller files, and I have had others run my larger syntax file on their computers without error.

I'm leaning toward this being some kind of hang-up that may have been initially caused by the overflowing cache; but now SPSS is not even checking for free space anymore before warning me.

I've shut down and restarted, but haven't tried anything as drastic as reinstalling (yet). Anything in between I could try?

Thanks,
Brian

PS- one last oddity is that I can't find any obvious problems with the RESULTS I'm getting when warned. (The process is one I run every few weeks on a transactional database, and levels are roughly as expected.)

-----Original Message-----
From: Richard Ristow [mailto:[hidden email]]
Sent: Wednesday, November 28, 2007 1:40 PM
To: Brian Moore; [hidden email]
Cc: Gregory Hildebrandt
Subject: Re: set cache reasonable values

At 02:28 PM 11/28/2007, Brian Moore wrote:

>Thanks to everyone for insights on this issue.
>
>I have succeeded in cleaning out the cache, but still getting warnings.
>Apparently [2097151 bytes] is a software determined maximum [and cannot
>always be reached]:
>
>>Warning # 44
>>The operating system could not allocate a memory segment of the size
>>requested by SPSS.
>
>Other specs that may matter:
>-using version 15 (but are looking at upgrading to 16)
>-File size is 420 MB
>-& I have ~20 gigs total free.

Well, that appears to rule out an *intrinsic* problem with disk space, although you previously got a message that SPSS couldn't get the disk space it *thought* it needed:

>>Error. Command name: SORT CASES
>>File write error: file name
>>C:\DOCUME~1\BMoore\LOCALS~1\Temp\spss2064\cache.33: No space left on
>>device (DATA1002) This command not executed.

Goodness knows how SPSS got there, but I doubt we can diagnose that, certainly not fix it.

>Any other ideas?

Well, the command sequence that blew up before was

>> 222 COMPUTE InDupGrp = MatchSequence > 0.
>> 223 SORT CASES InDupGrp(D).

Assuming that's still the case,

(a) Very wild chance: Do you need to do it at all? The code following the sort, in your original posting, didn't appear to rely on the data's being sorted by 'InDupGrp'. But I doubt this'll do it; presumably, you do need the sort, otherwise.

(b) Still relying on your original posting: It struck me that, earlier in your code, the command

>> 212 SORT CASES BY cust(A) Order_Date_Overall(A) ProductType(D).

apparently worked, on the same file.

Sorting on the binary variable 'InDupGrp' gives a huge number of ties on the sort key, and I wonder whether that gives the sorting algorithm some trouble. (Yes, I can give a good argument why it shouldn't.) I'd try appending the previous key sequence:

   COMPUTE InDupGrp = MatchSequence > 0.
   SORT CASES BY InDupGrp(D)
                 cust(A) Order_Date_Overall(A) ProductType(D).

(c) Have you tried the work-around I suggested? That is,

   DO IF InDupGrp EQ 1.
   . XSAVE OUTFILE=InDupYes.
   ELSE.
   . XSAVE OUTFILE=InDupNo.
   END IF.
   EXECUTE.

   ADD FILES
     /FILE=InDupYes
     /FILE=InDupNo.

(Here, InDupYes and InDupNo are file names or file handles - NOT dataset names - for scratch files; and the code's still not tested.)
....................
Apologies for any crucial points I've missed. But, well, 'any ideas' this is.

-Best of luck and best wishes,
Richard |
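For anyone wanting to try work-around (c), here is a minimal sketch with the file handles spelled out. The C:\scratch paths are hypothetical; any folder on a drive with room for one copy of the data should do. Like the original, this has not been tested against Brian's file.

```spss
* Hypothetical scratch locations - replace with a folder that has enough free space.
FILE HANDLE InDupYes /NAME='C:\scratch\InDupYes.sav'.
FILE HANDLE InDupNo /NAME='C:\scratch\InDupNo.sav'.

* One data pass writes every case to exactly one of the two files.
* Assumes InDupGrp is never missing - a missing value skips both branches and the case is lost.
DO IF InDupGrp EQ 1.
. XSAVE OUTFILE=InDupYes.
ELSE.
. XSAVE OUTFILE=InDupNo.
END IF.
EXECUTE.

* Concatenate with the InDupGrp=1 cases first, the same order as SORT CASES BY InDupGrp(D).
ADD FILES
  /FILE=InDupYes
  /FILE=InDupNo.
EXECUTE.
```

Within each of the two groups the cases keep their existing order, so the earlier sort by cust, Order_Date_Overall and ProductType is preserved inside each group.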
|
Brian,
I think with one million data points and several hundred variables, SPSS starts to have problems. For example, in a similar sized dataset, I create a chart in the SPSS Viewer, but can't copy it into Excel or PowerPoint, which is my preference for tables rather than editing the SPSS table. The data seems to be behind the chart. Similar things happen with a large table that has fewer than 65,000 rows. However, I did move a moderate sized table into "SPSS Pivot Table Object."

The temporary directory in Edit >> Options may fill up very fast, so I have replaced the default with C:\SPSS14.0\temp. About every other time I open SPSS, I first delete all the files in the temporary directory.

For sorting, increasing the memory using Set Workspace = 600000 or more has helped, contrary to what I was told. However, there have been times in the past when I have had to reduce the memory to permit a procedure like sorting to work. You may want to start with the default (check "Show Workspace"), and gradually increase the size.

I wonder if more RAM would help, or if you have used up too high a proportion of your hard drive.

It may be time to reinstall. Make certain everything is off the hard drive. It only takes a few minutes. However, once when I went into regedit, with a member of the SPSS Help desk on the phone, I found remnants of an old version of SPSS still in the registry, which I manually removed.

With a large file, the SPSS viewer also seems to increase in size very quickly, so one can easily end up with a 20 MB viewer file. This might affect your ability to use the Sort procedure.

Contrary to the prevailing wisdom, I have also found situations in which the Syntax file is too large, and have had to begin a new one. This was with SPSS 11.5, so perhaps the problem has been corrected.

I also wonder if you can copy the file into Access and sort in Access. Then re-import into SPSS.

Hope this helps.
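The workspace suggestion above could be tried along these lines. This is only a sketch: 600000 is the figure from this post rather than a recommended value, and the sort shown is the one from Brian's original job.

```spss
* Report the current workspace allocation before changing anything.
SHOW WORKSPACE.

* Raise the allocation and confirm that the new setting took effect.
SET WORKSPACE=600000.
SHOW WORKSPACE.

* Retry the step that failed - if it still fails, adjust the value and try again.
SORT CASES BY InDupGrp(D).
```

If Warning # 44 appears after raising the value, the operating system could not supply that much memory, and a smaller setting may work better.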
Greg |
