UTF-8 Warning

UTF-8 Warning

Salbod
I was completely caught off guard today when I was trying to port text files from SPSS into Mplus. I didn't know that the default encoding in the Save As dialog is Unicode (UTF-8) rather than Local Encoding. Is there any way I can set the encoding in the Save As dialog to Local Encoding?

Any suggestions will be most welcome.

Thank you, Stephen Salbod, Pace University, NYC

Re: UTF-8 Warning

Jon K Peck
I would first suggest that you see whether Mplus will accept UTF-8 files; even Notepad handles them properly. But if you need to save text files in the old code page encoding, you can run Statistics in code page mode, and then syntax windows will default to code page encoding. You can set the mode via Edit > Options, on the General tab prior to V22 and on the Language tab in V22.

Bear in mind that if your text consists solely of 7-bit characters, utf-8 is the same as code page except that Statistics includes a 3-byte-long byte order mark (BOM) at the start of a Unicode file.
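
To make the point above concrete, here is a minimal Python sketch (not from the thread; the sample string is made up). It shows that 7-bit text has the same bytes in UTF-8 and in a Windows code page such as cp1252, and that Python's `utf-8-sig` codec prepends the 3-byte BOM (EF BB BF) of the kind described above.

```python
import codecs

# Hypothetical sample line of 7-bit (ASCII-only) data.
text = "1.5 2.0 3.25\n"

as_utf8 = text.encode("utf-8")          # UTF-8 without a BOM
as_cp1252 = text.encode("cp1252")       # a common Windows code page
as_utf8_bom = text.encode("utf-8-sig")  # UTF-8 with a leading BOM

# For 7-bit characters the two encodings produce identical bytes.
print(as_utf8 == as_cp1252)

# The BOM variant differs only by the 3 leading bytes EF BB BF.
print(as_utf8_bom[:3] == codecs.BOM_UTF8)
print(len(as_utf8_bom) - len(as_utf8))
```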


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Salbod <[hidden email]>
To:        [hidden email],
Date:        11/12/2013 04:39 PM
Subject:        [SPSSX-L] UTF-8 Warning
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/UTF-8-Warning-tp5723015.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: UTF-8 Warning

Maguin, Eugene
In reply to this post by Salbod
I've been through this before, with a helpful explanation from Jon as to what the characters were. What you can do is open the data file in the Mplus editor, which will show the BOM, and delete the characters.
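
If there are many files, deleting the BOM by hand gets tedious. The following is a minimal Python sketch of the same fix done programmatically; it is not from the original post, and the function name `strip_utf8_bom` and the file name in the usage note are made up for illustration. It rewrites the file in place, so try it on a copy first.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, if present.

    Returns True if a BOM was found and removed, False otherwise.
    """
    p = Path(path)
    raw = p.read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        # Write back everything after the 3 BOM bytes; the data is unchanged.
        p.write_bytes(raw[len(codecs.BOM_UTF8):])
        return True
    return False
```

Usage would look like `strip_utf8_bom("mydata.dat")`, where `mydata.dat` stands for whatever text file was saved from SPSS.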
Gene Maguin


Re: UTF-8 Warning

Maguin, Eugene
In reply to this post by Salbod
Mplus reads the BOM as a data value.
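
As a rough illustration of that symptom, here is a short Python sketch. It does not reproduce how Mplus actually parses files (that is an assumption on my part); it only shows what any reader that assumes a single-byte code page would see at the start of a UTF-8 file with a BOM. The sample bytes are made up.

```python
# Bytes of a hypothetical one-line data file saved as UTF-8 with a BOM.
raw = b"\xef\xbb\xbf1.5 2.0 3.25\n"

# A reader that assumes a single-byte code page (cp1252 here) sees the
# three BOM bytes as three ordinary characters glued to the first value.
fields = raw.decode("cp1252").split()
first = fields[0]
print(repr(first))

# The first field is no longer a valid number; the other fields are fine.
try:
    float(first)
    first_is_numeric = True
except ValueError:
    first_is_numeric = False
print(first_is_numeric, float(fields[1]))
```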
Gene Maguin


Re: UTF-8 Warning

Albert-Jan Roskam
In reply to this post by Salbod
Maybe you could use:
set unicode = off locale = "en_US.UTF8".
That way you use a local codepage encoding (so no BOM) and still use utf-8. Never tried this, however.

What I don't understand is why SHOW LOCALE then shows this output:
en_US.windows-1252 (en_US.UTF8)

Regards,

Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Re: UTF-8 Warning

Salbod
In reply to this post by Maguin, Eugene
Thanks, Gene, you nailed it. When I opened a saved .dat file (UTF-8) in Mplus, the character reared its ugly head.

Case closed.

--Steve


Re: UTF-8 Warning

Jon K Peck
In reply to this post by Albert-Jan Roskam
You should just say locale=english. If you specify the locale as below, you are asking for an English locale with utf-8, but I suppose that with Unicode off it would then convert utf-8 input into cp1252.
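
For readers who want to see what a utf-8 to cp1252 conversion amounts to outside of Statistics, here is a minimal Python sketch. It is not a description of what Statistics does internally; the function name `utf8_to_cp1252` and the sample text are made up for illustration.

```python
def utf8_to_cp1252(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes (with or without a BOM) as cp1252 bytes."""
    # "utf-8-sig" drops a leading BOM if there is one.
    text = data.decode("utf-8-sig")
    # Raises UnicodeEncodeError if a character has no cp1252 equivalent.
    return text.encode("cp1252")


# Hypothetical input: one accented character, saved as UTF-8 with a BOM.
sample = "caf\u00e9 1.5\n".encode("utf-8-sig")
converted = utf8_to_cp1252(sample)
print(converted)
```

Note that the conversion can fail for characters outside cp1252, which is one reason to check the data before switching away from Unicode.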


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: UTF-8 Warning

Salbod

Albert-Jan and Jon, I will follow your prescriptions. Thanks, Steve

 

PS

In my travels I came across this most enlightening article on UTF-8:

http://www.joelonsoftware.com/articles/Unicode.html

 

 
