I got this message upon opening a file that was written in Unicode and opened on my computer with Unicode turned off (locale=local).

Warning. Command name: GET FILE
SPSS Statistics data file "U:\DAP\ERIN\RESPONSES_CLEANED.SAV" is written in a character encoding (ISO_8859-1:1987) incompatible with the current LOCALE setting. It may not be readable. Consider changing LOCALE or setting UNICODE on. (DATA 1721)

I understand that to get rid of the warning all I need to do is to change the character encoding to Unicode in the Edit->Options menu. What I'm curious about is whether any statements can be made about where read errors will occur given that the file almost certainly contains only floating point numbers and US standard English characters and numerals.

Thanks,
Gene Maguin
The warning is based on the encoding in
the file vs the Statistics setting. In many cases, everything will
be fine. When there is a problem, say a Japanese source with a western
locale, you will see all sorts of garbage in variable names, labels, string
data etc when there are Japanese characters (there might not be any). The
Japanese have a name for this - mojibake - "ghost characters".
More subtle problems might occur with particular characters such as the Euro sign, whose encoding varies from one code page to another. For a few characters, there are differences between the Windows and Mac versions of the same code page.

One thing you can count on, though, is that if all the text is plain 7-bit ASCII, it will work regardless of any locale or Unicode settings. And once you have converted to Unicode, you can forget about all these annoying encoding problems forever after.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM

(In reply to the message from "Maguin, Eugene", 07/31/2013 07:44 AM, Subject: [SPSSX-L] unicode vs local locale. where would read errors be found? Sent by: "SPSSX(r) Discussion")
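The points above about plain 7-bit ASCII, the Euro sign, and mojibake can be checked outside of Statistics. The following is only a minimal sketch in Python, which is not part of the original exchange; the helper name `decodes_identically` and the list of encodings are assumptions chosen for illustration, not anything SPSS uses internally.

```python
# Illustrative list of encodings; an assumption, not taken from SPSS Statistics.
ENCODINGS = ["ascii", "latin-1", "cp1252", "mac_roman", "utf-8"]


def decodes_identically(raw, encodings=ENCODINGS):
    """Return True if every encoding decodes `raw` to the same text."""
    decoded = set()
    for name in encodings:
        try:
            decoded.add(raw.decode(name))
        except UnicodeDecodeError:
            # At least one encoding cannot read these bytes at all.
            return False
    return len(decoded) == 1


# Plain 7-bit ASCII: the same bytes mean the same text in all of these encodings.
ascii_sample = "RESPONSES_CLEANED 3.14159".encode("ascii")
print(decodes_identically(ascii_sample))

# The Euro sign (U+20AC) is stored differently depending on the code page.
for name in ["cp1252", "iso8859_15", "utf-8"]:
    print(name, "\u20ac".encode(name))

# Mojibake: UTF-8 bytes for e-acute (U+00E9) read back as Latin-1.
garbled = "\u00e9".encode("utf-8").decode("latin-1")
print(garbled)
```

If the first check prints True for the text in a file, the choice of locale or Unicode mode should not change how that text is read.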
In reply to this post by Maguin, Eugene
If it contains nothing but numbers and
7-bit ASCII characters, there shouldn't be a problem.
According to Wikipedia, ISO-8859-1 "is generally intended for “Western European” languages (see below for a list). It is the basis for most popular 8-bit character sets, including Windows-1252." So if you're running in the Windows-1252 code page, I wouldn't think there would be any data loss.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
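To see how closely ISO-8859-1 and Windows-1252 agree, one can compare how each decodes every possible byte value. This is a rough sketch using Python's built-in `latin-1` and `cp1252` codecs; the function name `differing_bytes` is hypothetical, and Python's codec tables are assumed to match what Statistics uses.

```python
def differing_bytes(enc_a="latin-1", enc_b="cp1252"):
    """List the byte values that the two encodings decode to different text."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        text_a = raw.decode(enc_a, errors="replace")
        text_b = raw.decode(enc_b, errors="replace")
        if text_a != text_b:
            diffs.append(value)
    return diffs


diffs = differing_bytes()
# Show the differing range in hexadecimal.
print([hex(value) for value in diffs])
```

With these codecs the only disagreements fall in the range 0x80 to 0x9F, which ISO-8859-1 reserves for control codes and Windows-1252 uses for characters such as the Euro sign and curly quotes. Everything from 0x00 to 0x7F and from 0xA0 to 0xFF decodes the same way.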