Does anyone know how fast (and how stable) the individual software packages SPSS / SAS / SPSS Modeler / Cognos are in comparison?
This is about data processing in the terabyte range. I have been told that SAS is very much faster than SPSS. All I know is that several years ago SPSS produced run-time errors in C. My own comparison of various procedures on large data sets showed only that SAS and SPSS are not far apart in speed. Does anyone have more experience with this? Frank
Dr. Frank Gaeth
|
Any ideas?
Dr. Frank Gaeth
|
In reply to this post by drfg2008
You are asking a pretty general question, which might be why no one responded. I can say, though, that Statistics and Modeler are commonly used with datasets in the gigabyte and terabyte range, so I would not expect problems. The Server versions can access more hardware, and, of course, algorithms that require all the data in memory may not scale, but behavior and performance should be competitive.
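To illustrate the point about in-memory algorithms, here is a minimal Python sketch. It is not taken from SPSS or SAS; the function names and the generated data are hypothetical. A mean and variance can be computed in one streaming pass with constant memory (Welford's algorithm), whereas an exact median computed the straightforward way needs every value held at once.

```python
import statistics
from typing import Iterable, Iterator, Tuple


def generate_values(n: int) -> Iterator[float]:
    """Hypothetical stand-in for reading one numeric column row by row."""
    for i in range(n):
        yield float(i % 100)


def streaming_mean_variance(values: Iterable[float]) -> Tuple[float, float]:
    """Welford's one-pass algorithm: memory use does not grow with row count."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("need at least two values for a sample variance")
    return mean, m2 / (count - 1)


def in_memory_median(values: Iterable[float]) -> float:
    """An exact median computed this way materializes every value in a list."""
    return statistics.median(list(values))


if __name__ == "__main__":
    mean, variance = streaming_mean_variance(generate_values(1_000_000))
    print(f"mean={mean:.3f} variance={variance:.3f}")
    print(f"median={in_memory_median(generate_values(1_000)):.1f}")
```

The streaming function would behave the same on a billion rows as on a thousand; the median function is the kind of procedure that depends on available memory.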
|
Thanks jkpeck,
Yes, the question is actually too general. However, this is how the discussion went here. The argument was to use SAS rather than SPSS because "SPSS is too slow for terabyte data". This is an important argument for bigger companies, and it determines whether they acquire SPSS or SAS. So I checked it with several procedures that are relevant for us (regression, count, ...) and found that SPSS is very well capable of processing data in the terabyte range. SPSS even seemed to be a bit faster than SAS in these small self-made experiments. Then I asked the company, SPSS in Munich, which programming language is used for the SPSS modules (I assume C programs). However, they declined to provide that information. So if you have to convince bigger companies to acquire SPSS licences, speed is an important argument (and, in my experience, one regularly emphasised by SAS). Frank
Dr. Frank Gaeth
|
In reply to this post by drfg2008
In my opinion the idea that SAS was "machine faster" than SPSS was just sales hype. In fact my experience was the opposite.
I do not know if it is still true, but this was the situation many years ago. SAS did a separate pass of the data in a data step to create the binary version (system file) and then ran whatever procedures were requested by reading the binary file back in. SPSS created the binary file while executing the first procedure, if and only if there was another procedure.
An extreme example: we had a big EBCDIC file. The task was to save a small file that contained just selected records. The charges for doing the task in SAS were just over double what they were for doing it in SPSS.
In SAS: read the data, write it to a binary file, read the binary file, write the selected file.
In SPSS: read the data, write the selected file.
Even when there were multiple procedures, SPSS still had an advantage in that on the first pass SPSS both ran a procedure and created the system file.
----
In the early 90s, for a half dozen tasks, we tried doing them both in SPSS and SAS. We found that it took about 15% more "people time" to do these tasks in SAS than in SPSS. Most of the difference was in getting the data ready. Once there was a final system file, the machine costs were negligibly lower for SPSS. There was still a slight people cost advantage for SPSS.
----
Especially in the commercial sector, tasks tend to be repeated many times, whereas in research, tasks tend to be redrafted until they are satisfactory. Quite often it has been helpful to show some of the difference in the readability of the syntax. This has implications for auditing, maintaining, and updating processes. Take a few tasks and put the syntax side by side.
----
Anecdotally, several experienced SAS users who later were in an SPSS environment said that it was like moving from a manual transmission to an automatic transmission.
----
Of course, since I majored in philosophy, I have to ask whether it is better to slip between the horns of the dilemma. Why not some of each? In FORTRAN, wouldn't the logical operator be OR rather than XOR?
The people who are going to be doing the work deserve to have their preferences taken into account. I usually recommend that SPSS be the most widely used package, but that SAS be available for those occasions where there is an esoteric procedure that needs SAS OR where a professional employee has a strong preference for doing his analysis in SAS.
Overall, in my experience, the big advantage of SPSS is in its human factors. This is particularly so in the parts of the work that take the most people time: getting the data ready for the final analysis. YMMV, but subjectively the people time to get to the final system file can easily exceed 90% of the total people time.
Total cost in purchase money + staff money + calendar time + morale should be taken into account.
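As a rough illustration of the pass-counting argument in the record-selection example, here is a small Python sketch. It does not reproduce how either package is implemented, then or now; the function names, the in-memory CSV, and the use of pickle as the "binary file" are all assumptions made for the illustration. It only shows why converting everything to an intermediate file before filtering touches each record twice.

```python
import csv
import io
import pickle


def make_raw_file(n_rows: int) -> str:
    """Build a small CSV in memory; a hypothetical stand-in for the big raw file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for i in range(n_rows):
        writer.writerow([i, i % 10])  # columns: record id, group code
    return buffer.getvalue()


def select_two_pass(raw_text: str, wanted_group: int):
    """Convert every record to a binary intermediate first, then filter it."""
    rows_touched = 0
    intermediate = io.BytesIO()
    # Pass 1: read every raw record and write it to the intermediate file.
    for record in csv.reader(io.StringIO(raw_text)):
        pickle.dump((int(record[0]), int(record[1])), intermediate)
        rows_touched += 1
    # Pass 2: read the intermediate file back and keep the selected records.
    intermediate.seek(0)
    selected = []
    while True:
        try:
            row = pickle.load(intermediate)
        except EOFError:
            break
        rows_touched += 1
        if row[1] == wanted_group:
            selected.append(row)
    return selected, rows_touched


def select_one_pass(raw_text: str, wanted_group: int):
    """Filter while reading the raw data; no intermediate file is written."""
    rows_touched = 0
    selected = []
    for record in csv.reader(io.StringIO(raw_text)):
        rows_touched += 1
        row = (int(record[0]), int(record[1]))
        if row[1] == wanted_group:
            selected.append(row)
    return selected, rows_touched


if __name__ == "__main__":
    raw = make_raw_file(1000)
    two_pass_rows, two_pass_count = select_two_pass(raw, 3)
    one_pass_rows, one_pass_count = select_one_pass(raw, 3)
    print(two_pass_rows == one_pass_rows, two_pass_count, one_pass_count)
```

Both functions return the same selection; the two-pass version simply handles every record twice, which is consistent with the "just over double" charges mentioned for that task.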
Art Kendall
Social Research Consultants |
In reply to this post by drfg2008
There are some Server configuration and
benchmark documents available. You should be able to get them through
the local SPSS office.
As for the programming language(s), that does not really tell you anything about speed or resource usage, but it's no secret. The main languages of the SPSS Statistics code are C, C++, Fortran, and Java.
Regards,
Jon Peck
Senior Software Engineer, IBM
|
C, C++: that can be important information when you argue with companies. The telecom and financial sectors in particular have huge data files, and they always argue with speed and mass data. We had discussions comparing SAS and SPSS. Being able to point to C can be an asset (whatever the actual advantages might be). Thanks for that.
After the integration of R in SPSS, I wonder what “esoteric procedure still needs SAS”?
Dr. Frank Gaeth
|
> After integration of R in SPSS, I wonder what “esoteric procedure still
> needs SAS”?
Who knows what the future may bring? One operational definition of "esoteric" or "pioneering" or "forefront" might be "not in both SPSS and SAS". Over the years I have gotten rid of literally hundreds of ad hoc pieces of software as the capabilities of packages became more extensive.
Art Kendall
Social Research Consultants |
In reply to this post by drfg2008
SPSS Modeler compares well with SPSS Statistics and SAS: it is very fast, and a drag-and-drop facility is available for all functions. However, you may need to upgrade your computer: RAM (minimum 3 GB), processor (the latest available on the market), and hard disk (15-20 GB for SPSS Modeler).
|