A colleague and I both have experienced this problem. When saving data files, string variables' length changes to three times what it was originally designated to be. Why? How can we prevent this?
We use SPSS 19 on a Mac. Thanks.
You're opening code page files in Unicode mode, which triples the width of string variables to make sure values don't get truncated: string width is measured in bytes, and a character that takes one byte in a code page can take up to three bytes in UTF-8. Technically, the strings aren't tripled on save; they're tripled on open. Two workarounds:

1. Switch to code page mode before opening files:
SET UNICODE OFF.

2. Alternatively, automatically reset the width of each string variable to the length of its longest value:
ALTER TYPE ALL (A=AMIN).

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: J Otterstedt <[hidden email]>
Date: 12/01/2011 12:17 PM
Subject: String length changes when saving
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/String-length-changes-when-saving-tp5039393p5039393.html
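The byte arithmetic behind the tripling can be checked outside SPSS. The following is a minimal Python sketch, not SPSS syntax; the use of Windows-1252 (`cp1252`) as the code page and the sample characters are assumptions chosen for illustration, and the code page SPSS actually uses depends on the locale.

```python
# Compare how many bytes one character needs in a code page vs. UTF-8.
# cp1252 is an assumed example code page, not necessarily the one in use.
samples = ["a", "é", "€"]

def byte_widths(ch, codepage="cp1252"):
    """Return (code page bytes, UTF-8 bytes) for a single character."""
    return len(ch.encode(codepage)), len(ch.encode("utf-8"))

for ch in samples:
    cp_bytes, utf8_bytes = byte_widths(ch)
    print(f"{ch!r}: {cp_bytes} byte(s) in cp1252, {utf8_bytes} byte(s) in UTF-8")
```

Plain ASCII letters stay at one byte, accented Latin letters grow to two, and a character such as the euro sign grows to three, which is the worst case the widening is meant to cover.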
In reply to this post by J Otterstedt
The string length is probably not changing when you save the data but when you read it in. This will happen if you open a code page sav file while Statistics is in Unicode mode. The expansion is necessary to guarantee that no string data are lost when the code page characters are converted to Unicode, because in the worst case a one-byte character in code page takes three bytes in Unicode.

Usually, though, this is far too conservative. In fact, if your data contain only plain roman characters, you don't need any extra space. If you want to reclaim the extra space, you can use the ALTER TYPE command to shrink the fields down to the minimum required to hold your actual text. If you open a dataset interactively, you can do this by checking the Minimize string widths box on the Open dialog.

Once you save the data file in Unicode mode, no further string expansion will occur.

HTH,
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
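To see how much of the widened field is really needed, the idea of shrinking to the minimum width can be sketched in Python. This only illustrates the concept behind ALTER TYPE and the Minimize string widths option, not how SPSS implements them; the sample values and the declared width of 24 bytes are hypothetical.

```python
# Hypothetical values of one string variable whose width was tripled
# from 8 to 24 bytes when the file was opened in Unicode mode.
values = ["Smith", "Müller", "O'Brien"]
declared_width = 24

def min_width(strings, encoding="utf-8"):
    """Smallest byte width that holds every value without truncation."""
    # Trailing blanks are treated as padding and ignored.
    return max(len(s.rstrip().encode(encoding)) for s in strings)

needed = min_width(values)
print(f"declared: {declared_width} bytes, needed: {needed} bytes")
```

With mostly plain roman text, the needed width stays close to the original code page width, which is why the tripled width is usually far more than required.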