|
Hi all,
Is there any difference in performance (specifically in terms of speed) between using the Python plugin and the SPSS macro language? I have a macro that bogs down when the dataset is too large, and I am wondering whether the Python facility would let me get more mileage out of it with larger datasets. I haven't used the Python plugin before, but I'm curious whether moving the code to Python would improve performance.
Thanks,
Bryan
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
As the saying goes, your mileage may vary, but here are some general guidelines.
Sometimes a macro has to do things in a very roundabout way when a Python program could do them directly. That would offer an opportunity for a speedup using the plugin.

If most of the time is spent in relevant computation in SPSS, whether transformations or statistical procedures, using the plugin is unlikely to be faster. The overhead of macro expansion is unlikely to be a significant fraction of the run time required.

But if you can use the external (xd) mode of the plugin, where there is no SPSS user interface present (no dialogs, no menus, no Data Editor, no Viewer), there can be a huge performance gain, because the updating and synchronization between the SPSS Processor and all the user interface elements is avoided. Less memory is used as well. We have had reports of speedups of 4x to 10x from this approach.

The limitation, of course, is that with no Viewer, your human-readable output is limited to the formats supported by OMS: XML, HTML, Text, and a few others. But if you are mainly doing computation and producing new data files, or if simple text summaries will do, this can be a big win.

If you use external mode, you can even keep your macro as it is and run it from the xd interface. It is trivial to take an existing syntax job and turn it into a programmability job. There is an article on SPSS Developer Central and material in the downloadable Data Management book on this topic.

You may be able to divide a job into one portion that runs in external mode and a second portion that runs in the normal way in order to get ordinary Viewer output.

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bryan Tec
Sent: Friday, May 23, 2008 8:19 AM
To: [hidden email]
Subject: [SPSSX-L] Python vs. macro language performance |
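The point about turning an existing syntax job into a programmability job might look roughly like the sketch below. It is only an illustration under assumptions: the file paths are hypothetical, and the `spss` module is available only where the SPSS Python plugin is installed. The job wraps the unchanged syntax file in OMS so that text output is still captured when no Viewer is present.

```python
# Sketch only: wrap an existing syntax file so it can be run from Python.
# The paths used below are hypothetical placeholders.

def build_job(syntax_file, text_output):
    """Return SPSS commands that run syntax_file and route output through OMS."""
    return [
        # With no Viewer, OMS captures the human-readable output as text.
        "OMS /SELECT ALL /DESTINATION FORMAT=TEXT OUTFILE='%s'." % text_output,
        # INSERT runs the existing job, macros included, unchanged.
        "INSERT FILE='%s'." % syntax_file,
        "OMSEND.",
    ]


def run_job(commands):
    """Submit the commands to SPSS; return False if the plugin is missing."""
    try:
        import spss  # importable only where the plugin is installed
    except ImportError:
        return False
    spss.Submit(commands)
    return True


if __name__ == "__main__":
    job = build_job("C:/jobs/mymacro.sps", "C:/jobs/mymacro_output.txt")
    if not run_job(job):
        # No plugin here: show the syntax that would have been submitted.
        print("\n".join(job))
```

Keeping the command list separate from the submit step means the same job can be inspected or pasted into a syntax window before committing to external mode.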
|
In reply to this post by Bryan Tec
At 10:19 AM 5/23/2008, Bryan Tec wrote:
>Is there any difference in performance (specifically in terms of
>speed) between using the Python plugin and SPSS macro language? I
>have a macro that bogs down when the dataset is too large and I am
>wondering if the Python facility may allow me to get some more
>mileage out of it with larger datasets.

As Jon Peck wrote (Fri, 23 May 2008 09:44:05 -0500)

>If most of the time is spent in relevant computation in SPSS, using
>the plugin is unlikely to be faster.

That is, using Python to generate the same SPSS syntax -- using Python as you'd use a macro processor -- is very unlikely to be either better or worse. Quoting Jon again, "The overhead of macro expansion [or Python execution] is unlikely to be a significant fraction of the run time required."

You have "a macro that bogs down when the dataset is too large". That's a sure sign the problem's in the SPSS computation. The macro processing time doesn't depend on the file size.

Run the macro, on any size file, with SET MPRINT ON, and look for inefficiencies in the code generated. Among other things, this is the time to look for unnecessary EXECUTE statements. Most *are* unnecessary, and they can slow processing a lot, since each one forces a pass through the data.

You'll have to post your code before we can suggest anything more.

-Best of luck,
Richard |
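As a rough aid to the EXECUTE check Richard describes, the expanded syntax printed under SET MPRINT ON could be saved to a text file and scanned with something like the sketch below. The function name is made up for illustration, and the scan assumes each EXECUTE sits on its own line and may be abbreviated to three or more letters (EXE.). It only lists candidates; deciding which ones are actually unnecessary still means reading the code.

```python
import re

# A command line consisting of a single word and its terminating period.
_ONE_WORD_COMMAND = re.compile(r"^\s*([A-Za-z]+)\s*\.\s*$")


def find_executes(syntax_text):
    """Return 1-based line numbers of EXECUTE commands in expanded syntax."""
    hits = []
    for number, line in enumerate(syntax_text.splitlines(), 1):
        match = _ONE_WORD_COMMAND.match(line)
        if not match:
            continue
        word = match.group(1).lower()
        # Assumption: EXECUTE may be abbreviated to three or more letters.
        if len(word) >= 3 and "execute".startswith(word):
            hits.append(number)
    return hits


sample = "COMPUTE x = 1.\nEXECUTE.\nCOMPUTE y = x + 1.\nexe.\nFREQUENCIES x y.\n"
print(find_executes(sample))
```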
