String length changes when saving

String length changes when saving

J Otterstedt
A colleague and I have both experienced this problem. When we save data files, the length of string variables changes to three times the length originally defined. Why? How can we prevent this?

We use SPSS 19 on a Mac.

Thanks.

Re: String length changes when saving

Rick Oliver-3
You're opening code page files in Unicode mode, which triples the defined length of string variables so that no values get truncated. String length is measured in bytes, and a character that takes one byte in a code page can take up to three bytes in UTF-8. Technically, the strings aren't tripled on save; they're tripled on open. Two workarounds:

1. Switch to code page mode before opening files: SET UNICODE OFF.

or alternatively:

2. Automatically reset the width of each string variable to the length of its longest value: ALTER TYPE ALL (A=AMIN).
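
The byte arithmetic behind the tripling can be checked outside of SPSS. The following is a minimal Python sketch, not anything SPSS itself runs. It assumes Windows-1252 (`cp1252`) as the example code page, and the helper names `byte_widths` and `worst_case_utf8_bytes` are made up for illustration.

```python
# Compare the bytes one character needs in a single-byte code page
# (Windows-1252 is an assumption, picked only as an example) and in UTF-8.

def byte_widths(ch, code_page="cp1252"):
    """Return (bytes in the code page, bytes in UTF-8) for one character."""
    return len(ch.encode(code_page)), len(ch.encode("utf-8"))


def worst_case_utf8_bytes(code_page="cp1252"):
    """Largest UTF-8 size of any character a single-byte code page can encode."""
    worst = 0
    for b in range(256):
        try:
            ch = bytes([b]).decode(code_page)
        except UnicodeDecodeError:
            continue  # this byte value is undefined in the code page
        worst = max(worst, len(ch.encode("utf-8")))
    return worst


if __name__ == "__main__":
    for ch in ["a", "\u00e9", "\u20ac"]:  # a, e with acute accent, euro sign
        cp_len, utf8_len = byte_widths(ch)
        print(f"{ch!r}: {cp_len} byte(s) in cp1252, {utf8_len} byte(s) in UTF-8")
    print("worst case:", worst_case_utf8_bytes("cp1252"))
```

Plain ASCII letters stay at one byte, accented Latin letters grow to two, and a few characters such as the euro sign grow to three, which is why a factor of three is the safe upper bound for a one-byte code page.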


Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        J Otterstedt <[hidden email]>
To:        [hidden email]
Date:        12/01/2011 12:17 PM
Subject:        String length changes when saving
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/String-length-changes-when-saving-tp5039393p5039393.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: String length changes when saving

Jon K Peck
In reply to this post by J Otterstedt
The string length is probably not changing when you save the data but when you read it in. This happens if you open a code page sav file while Statistics is in Unicode mode. The expansion guarantees that no string data are lost when the code page characters are converted to Unicode, because in the worst case a one-byte code page character takes three bytes in UTF-8.

Usually, though, this is far too conservative. In fact, if your data contain only plain roman characters, you don't need any extra space at all. If you want to reclaim the extra space, you can use the ALTER TYPE command to shrink the fields down to the minimum required to hold your actual text. When you open a dataset interactively, you can get the same result by checking the Minimize string widths box on the Open dialog.
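
To make "the minimum required to hold your actual text" concrete, here is a small Python sketch of the idea. It is only an illustration of the arithmetic, not how ALTER TYPE is implemented. The function name `min_byte_width`, the sample values, and the original width of 10 are all hypothetical.

```python
def min_byte_width(values, encoding="utf-8"):
    """Smallest width in bytes that holds every value without truncation.

    Trailing spaces are ignored on the assumption that they are padding;
    the result is never less than 1.
    """
    longest = max((len(v.rstrip(" ").encode(encoding)) for v in values), default=0)
    return max(1, longest)


if __name__ == "__main__":
    original_width = 10                  # hypothetical width in the code page file
    tripled_width = original_width * 3   # width after opening in Unicode mode

    values = ["Smith", "M\u00fcller", "Ng"]   # hypothetical data; u-umlaut is 2 bytes
    needed = min_byte_width(values)

    print("width after opening in Unicode mode:", tripled_width)
    print("width actually needed in UTF-8:", needed)
```

With these sample values the tripled width is 30 bytes while the longest value needs only 7, which shows how much of the expansion is usually unused.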

Once you save the data file in Unicode mode, no further string expansion will occur.

HTH,

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621



