This is what I am doing. In SPSS, I used Python to break a big data file into about 3000 files. I used "with spss.DataStep():" to generate these files one by one. The program works good for the first 1200 files, then an error message pop up. Please see the picture. Followed by: "An unknown error has terminated communication with the processor. The SPSS Statistics processor is unavailable." Can any expert help me about this type of error? Was it caused by limitation of memory? Thank you ! |
Where are you writing the output files? An SPSS dataset? an external file?
Are all of the cases being written to a file sequential in the input file? Please describe what you are trying to do and why.
Art Kendall
Social Research Consultants |
1. The original data is too big. I have to break it into about 3000 data files.
There is one variable called "unit", which will be used to break. I have about 3000 units. One file is for one unit. 2. I generate a new file and output it into a local directory. strdept = str(int(deptid)) name=list(dsNames.keys())[0] spss.Submit(r""" DATASET ACTIVATE %(name)s. SAVE OUTFILE='E:\WEI WAN\My SPSS\unitfile5\unit_%(strdept)s.sav'. DATASET CLOSE %(name)s. """ %locals()) spss.Submit(r""" DATASET ACTIVATE alldata. DATASET CLOSE ALL. """ %locals()) 3. Yes, the algorithm will write case by case into a new file. ------------- Dr. Wei Wan ________________________________________ From: Art Kendall [via SPSSX Discussion] [[hidden email]] Sent: Wednesday, July 05, 2017 9:59 AM To: Wei Wan Subject: Re: Serious and strange error from SPSS with Python code Where are you writing the output files? An SPSS dataset? an external file? Are all of the cases being written to a file sequential in the input file? Please describe what you are trying to do and why. Art Kendall Social Research Consultants ________________________________ If you reply to this email, your message will be added to the discussion below: http://spssx-discussion.1045642.n5.nabble.com/Serious-and-strange-error-from-SPSS-with-Python-code-tp5734504p5734505.html To unsubscribe from Serious and strange error from SPSS with Python code, click here< NAML<http://spssx-discussion.1045642.n5.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> |
The save is saving the entire file each time. I don't think this is what you want, although we are seeing just a fragment of the code, I think. Try instead the SPSSINC SPLIT DATASET (Data > Split into files) extension command. It uses XSAVE to select out the cases to save in each file. There is no limit on the number of files it can generate (unlike simple XSAVE). On Wed, Jul 5, 2017 at 8:39 AM, tongmeng_wan <[hidden email]> wrote: 1. The original data is too big. I have to break it into about 3000 data files. |
Jon Peck,
Thanks. I tried to use "split into files", and it was successful. Now suppose that those 3000 files are stored at: E:/WEI WAN/My SPSS/temp/ I am trying following program. The purpose is to read each of those files, and compute a new variable. But I got error, which said I could not open it. Basically "Get FILE" does not work. Can you help? ---------------------------------- begin program. import glob import spss i = 0 for fcount, f in enumerate(glob.glob("E:/WEI WAN/My SPSS/temp/*.sav")): if i <= 5: print(fcount), print(":"), print(f) spss.Submit(r""" GET FILE='f'. DATASET NAME tempdata. DATASET ACTIVATE tempdata. COMPUTE try1=indicator+2. EXECUTE. SAVE OUTFILE='f'. New file DATASET CLOSE ALL EXECUTE. """) i = i+1 end program. ------------- Dr. Wei Wan ________________________________________ From: Jon Peck [via SPSSX Discussion] [[hidden email]] Sent: Wednesday, July 05, 2017 12:55 PM To: Wei Wan Subject: Re: Serious and strange error from SPSS with Python code The save is saving the entire file each time. I don't think this is what you want, although we are seeing just a fragment of the code, I think. Try instead the SPSSINC SPLIT DATASET (Data > Split into files) extension command. It uses XSAVE to select out the cases to save in each file. There is no limit on the number of files it can generate (unlike simple XSAVE). On Wed, Jul 5, 2017 at 8:39 AM, tongmeng_wan <[hidden email]</user/SendEmail.jtp?type=node&node=5734510&i=0>> wrote: 1. The original data is too big. I have to break it into about 3000 data files. There is one variable called "unit", which will be used to break. I have about 3000 units. One file is for one unit. 2. I generate a new file and output it into a local directory. strdept = str(int(deptid)) name=list(dsNames.keys())[0] spss.Submit(r""" DATASET ACTIVATE %(name)s. SAVE OUTFILE='E:\WEI WAN\My SPSS\unitfile5\unit_%(strdept)s.sav'. DATASET CLOSE %(name)s. """ %locals()) spss.Submit(r""" DATASET ACTIVATE alldata. DATASET CLOSE ALL. """ %locals()) 3. Yes, the algorithm will write case by case into a new file. ------------- Dr. Wei Wan ________________________________________ From: Art Kendall [via SPSSX Discussion] [[hidden email]<http:///user/SendEmail.jtp?type=node&node=5734507&i=0>] Sent: Wednesday, July 05, 2017 9:59 AM To: Wei Wan Subject: Re: Serious and strange error from SPSS with Python code Where are you writing the output files? An SPSS dataset? an external file? Are all of the cases being written to a file sequential in the input file? Please describe what you are trying to do and why. Art Kendall Social Research Consultants ________________________________ If you reply to this email, your message will be added to the discussion below: http://spssx-discussion.1045642.n5.nabble.com/Serious-and-strange-error-from-SPSS-with-Python-code-tp5734504p5734505.html To unsubscribe from Serious and strange error from SPSS with Python code, click here< NAML<http://spssx-discussion.1045642.n5.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> ________________________________ View this message in context: RE: Serious and strange error from SPSS with Python code<http://spssx-discussion.1045642.n5.nabble.com/Serious-and-strange-error-from-SPSS-with-Python-code-tp5734504p5734507.html> Sent from the SPSSX Discussion mailing list archive<http://spssx-discussion.1045642.n5.nabble.com/> at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email]</user/SendEmail.jtp?type=node&node=5734510&i=1> (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD -- Jon K Peck [hidden email]</user/SendEmail.jtp?type=node&node=5734510&i=2> ===================== To manage your subscription to SPSSX-L, send a message to [hidden email]</user/SendEmail.jtp?type=node&node=5734510&i=3> (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ________________________________ If you reply to this email, your message will be added to the discussion below: http://spssx-discussion.1045642.n5.nabble.com/Serious-and-strange-error-from-SPSS-with-Python-code-tp5734504p5734510.html To unsubscribe from Serious and strange error from SPSS with Python code, click here< NAML<http://spssx-discussion.1045642.n5.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> |
In reply to this post by Jon Peck
Jon Peck,
Thanks. I tried to use "split into files", and it was successful. Now suppose that those 3000 files are stored at: E:/WEI WAN/My SPSS/temp/ I am trying following program. The purpose is to read each of those files, and compute a new variable. But I got error, which said I could not open it. Basically "Get FILE" does not work. Can you help? ---------------------------------- begin program. import glob import spss i = 0 for fcount, f in enumerate(glob.glob("E:/WEI WAN/My SPSS/temp/*.sav")): if i <= 5: print(fcount), print(":"), print(f) spss.Submit(r""" GET FILE='f'. DATASET NAME tempdata. DATASET ACTIVATE tempdata. COMPUTE try1=indicator+2. EXECUTE. SAVE OUTFILE='f'. New file DATASET CLOSE ALL EXECUTE. """) i = i+1 end program. |
The GET command refers to 'f' as the file to open, but that is a literal f character. You need to use a substitution mechanism to insert the current value of the variable named f. Write it like this fragment: r"""GET FILE= "%(f)". ...""" % locals() Similarly with the SAVE. That will substitute the value of variable f each time through the loop. Also, get rid of the EXECUTE commands. The statements above are enclosed in triple quotes - three " - so that quotes inside are handled correctly. On Wed, Jul 5, 2017 at 12:58 PM tongmeng_wan <[hidden email]> wrote: Jon Peck, |
Free forum by Nabble | Edit this page |