In another thread, Jon informed us that the old SPSS Community at www.ibm.com/developerworks/spssdevcentral has been migrated to the new IBM SPSS Predictive Analytics Community at https://developer.ibm.com/predictiveanalytics/. I took a quick look, and was a bit surprised by what I read in this post:
https://developer.ibm.com/predictiveanalytics/2015/09/30/spss-statistics-data-files-can-be-smaller-and-faster-if-saved-as-uncompressed/
The counter-intuitive bit (for me) was that an uncompressed file could actually be smaller. Figuring that many members of this group might not frequent the Predictive Analytics Community forum, I thought I should share the link. HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
|
The case where the uncompressed file could be slightly smaller is when the compression algorithm used in sav files is completely ineffective. If none of the data consists of small integer values (and no string variables have trailing blanks or are empty), then no fields can be compressed, but the compression flags must still be written when compression is on. The overhead of this futile compression is small, however.
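To make the point above concrete, here is a rough size model in Python. It is only a sketch under assumptions that are not stated in this thread: it follows the bytecode scheme described in the PSPP developer documentation for the system file format (one command byte per 8-byte value, command bytes packed in blocks of 8, a bias of 100, and codes 1-251 standing for small integers). It ignores headers, strings, and missing values, and the function names are made up for illustration.

```python
import math

BIAS = 100  # assumed compression bias (the usual value per the PSPP docs)


def is_compressible(value: float) -> bool:
    """True if the value is an integer that fits in one command byte (codes 1..251)."""
    return float(value).is_integer() and 1 <= value + BIAS <= 251


def uncompressed_bytes(values) -> int:
    """Every number is stored as an 8-byte floating point value."""
    return 8 * len(values)


def compressed_bytes(values) -> int:
    """One command byte per value (in 8-byte blocks), plus 8 raw bytes
    for each value that cannot be represented by a command byte alone."""
    command_bytes = 8 * math.ceil(len(values) / 8)
    raw_bytes = 8 * sum(1 for v in values if not is_compressible(v))
    return command_bytes + raw_bytes


if __name__ == "__main__":
    small_ints = [1, 2, 3, 4, 5, 6, 7, 8]   # compress well
    fractions = [0.5] * 8                    # nothing compresses
    for name, data in [("small ints", small_ints), ("fractions", fractions)]:
        print(name, uncompressed_bytes(data), compressed_bytes(data))
```

In this model, eight non-integer values take 64 bytes uncompressed and 72 bytes "compressed", because the command bytes are written even though they save nothing, while eight small integers shrink from 64 bytes to 8.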
Zsav compression, introduced in V21, uses a general-purpose algorithm that is broadly effective. For example, one year of the famous airline data is 853MB in sav format with compression but only 275MB with zsav. Zsav files may process slightly slower, depending on a number of factors, but the compression works with all sorts of content.
The csv version of that airline data is 689MB: most of the values are 1-4 bytes as text, while the sav format stores all numbers (before compression) as 8-byte floating point numbers.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
From: Bruce Weaver <[hidden email]>
To: [hidden email]
Date: 10/07/2015 09:07 AM
Subject: [SPSSX-L] Counter-intuitive statement about uncompressed vs compressed data files
Sent by: "SPSSX(r) Discussion" <[hidden email]>
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
