I am guessing that you want to run multiple SQL queries in parallel, build a
dataset by joining the results in some fashion, and feed that to Statistics.
Is that right?

There would be several obstacles to this.
- If you are using the IBM SPSS Data Access Pack drivers, these cannot be
used in R or Python processes. You would need native drivers.
- If the result is a large amount of data, R probably could not handle it.
- If the queries return different cases, rbind could stack them, but if you
are trying to join different fields for the same case base, you would have to
worry about possible differences in the cases returned.
- If you are running this on a PC, using multiple processors via the parallel
package might not give much speedup, because I/O bandwidth would likely be
the limiting factor. Parallel processing in the database, though, might help
on that end of the job.
- Writing the data back to Statistics would mean an extra data pass.
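
To make the stack-versus-join point concrete, here is a minimal sketch in Python rather than R. Everything specific in it is an assumption for illustration: sqlite3 stands in for whatever native driver you would really use, and the tables (sales, age), their columns, and the helper names run_query and run_parallel are hypothetical. It uses threads rather than separate processes, since the work is mostly waiting on I/O.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical demo database standing in for the real server.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
con.execute("CREATE TABLE age (id INTEGER, age INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [(1, "north", 10.0), (2, "south", 20.0), (3, "north", 30.0)])
con.executemany("INSERT INTO age VALUES (?, ?)", [(1, 41), (2, 35), (4, 58)])
con.commit()
con.close()


def run_query(sql):
    """Run one query on its own connection and return the rows as a list."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def run_parallel(queries):
    """Run the queries in worker threads; results come back in input order."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(run_query, queries))


# Case 1: same fields, different cases. Stack the results,
# which is what do.call(rbind, ...) would do in R.
parts = run_parallel([
    "SELECT id, region, amount FROM sales WHERE region = 'north' ORDER BY id",
    "SELECT id, region, amount FROM sales WHERE region = 'south' ORDER BY id",
])
stacked = [row for part in parts for row in part]

# Case 2: different fields for the same case base. Join on id,
# but first find the cases that only one of the queries returned.
sales_rows, age_rows = run_parallel([
    "SELECT id, amount FROM sales ORDER BY id",
    "SELECT id, age FROM age ORDER BY id",
])
amount_by_id = dict(sales_rows)
age_by_id = dict(age_rows)
unmatched = sorted(set(amount_by_id) ^ set(age_by_id))
common = sorted(set(amount_by_id) & set(age_by_id))
joined = [(i, amount_by_id[i], age_by_id[i]) for i in common]

print("stacked:", stacked)
print("unmatched ids:", unmatched)
print("joined:", joined)
```

In this toy data, ids 3 and 4 each appear in only one of the two results, so they are reported as unmatched and left out of the join. Whether to drop such cases or keep them with missing values is a decision you would have to make before passing anything on to Statistics.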
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
From: drfg2008 <[hidden email]>
To: [hidden email]
Date: 12/11/2013 06:59 AM
Subject: Re: [SPSSX-L] multiprocessing
Sent by: "SPSSX(r) Discussion" <[hidden email]>

I have no idea right now how to integrate it. I just learned of an R parallel
package and that, via do.call(rbind, ...), the results of the different
queries can be merged into one table.
It would be nice to have one example of how to integrate this in IBM SPSS and
run it from IBM SPSS syntax.
-----
Dr. Frank Gaeth
FU-Berlin
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/multiprocessing-tp5721935p5723583.html