SPSSX Discussion

multiprocessing

Jon K Peck

Re: multiprocessing

I am guessing that you want to run multiple SQL queries in parallel, build a dataset that joins the results in some fashion, and feed that to Statistics. Is that right?

There would be several obstacles to this.
- If you are using the IBM SPSS Data Access Pack drivers, they cannot be used in R or Python processes. You would need native drivers.
- If the result is a large amount of data, R probably could not handle it.
- If the queries return different cases, rbind could stack them, but if you are trying to join different fields for the same set of cases, you would have to worry about possible differences in the cases returned.
- If you are running this on a PC, using multiple processors via the parallel package might not help much, because I/O bandwidth would likely be the limiting factor. Parallel processing in the database, though, might help on that end of the job.
- Writing the data back to Statistics would mean an extra data pass.
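
To make the stacking and joining points concrete, here is a minimal sketch in Python rather than R. It is not a Statistics integration: it uses threads from the standard library instead of the R parallel package, and a throwaway SQLite file stands in for the real database. The table names, columns, and helper functions (make_demo_db, run_query, run_parallel, stack, key_mismatch) are all made up for illustration. It runs several queries concurrently, stacks results with the same columns (the analogue of do.call(rbind, ...)), and checks which case IDs differ between two results before any join is attempted.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_demo_db(path):
    """Create a small stand-in database (hypothetical tables and data)."""
    con = sqlite3.connect(path)
    con.executescript("""
        CREATE TABLE visits (id INTEGER, region TEXT, amount REAL);
        INSERT INTO visits VALUES (1, 'north', 10.0), (2, 'north', 12.5),
                                  (3, 'south', 7.0);
        CREATE TABLE scores (id INTEGER, score REAL);
        INSERT INTO scores VALUES (1, 0.9), (3, 0.4), (4, 0.7);
    """)
    con.commit()
    con.close()


def run_query(path, sql):
    """Run one query on its own connection (connections are not shared across threads)."""
    con = sqlite3.connect(path)
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


def run_parallel(path, queries, workers=4):
    """Run the queries concurrently; results come back in the order of `queries`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda sql: run_query(path, sql), queries))


def stack(results):
    """Row-wise stack of results that share the same columns."""
    return [row for rows in results for row in rows]


def key_mismatch(left, right):
    """Return the case IDs found in only one result (ID assumed to be column 0)."""
    left_ids = {row[0] for row in left}
    right_ids = {row[0] for row in right}
    return sorted(left_ids - right_ids), sorted(right_ids - left_ids)


# Demo on a temporary database file.
fd, db_path = tempfile.mkstemp(suffix=".db")
os.close(fd)
make_demo_db(db_path)

# Different cases, same fields: safe to stack.
north, south = run_parallel(db_path, [
    "SELECT id, region, amount FROM visits WHERE region = 'north' ORDER BY id",
    "SELECT id, region, amount FROM visits WHERE region = 'south' ORDER BY id",
])
stacked = stack([north, south])

# Different fields, nominally the same cases: check the IDs before joining.
visits, scores = run_parallel(db_path, [
    "SELECT id, amount FROM visits ORDER BY id",
    "SELECT id, score FROM scores ORDER BY id",
])
only_in_visits, only_in_scores = key_mismatch(visits, scores)

os.remove(db_path)
```

In this toy data the two results to be joined do not cover the same cases, which is exactly the situation that needs a decision (drop, keep with missing values, or investigate) before merging. Whether running the queries concurrently saves any time depends on the database and the I/O path, as noted in the list above.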

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: drfg2008 <[hidden email]>
To: [hidden email],
Date: 12/11/2013 06:59 AM
Subject: Re: [SPSSX-L] multiprocessing
Sent by: "SPSSX(r) Discussion" <[hidden email]>

No idea right now how to integrate it. I just learned of an R parallel package and that, via do.call(rbind, ...), the results of the different queries can be merged into one table. It would be nice to have one example of how to integrate this in IBM SPSS and run it from IBM SPSS syntax.

-----
Dr. Frank Gaeth
FU-Berlin
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/multiprocessing-tp5721935p5723583.html