Problem displaying Japanese characters, already set locale and Unicode.

jbloomfield1
Hello,

I've searched this newsgroup and the web but have not found a solution to my problem.

System: Windows 7 Professional and SPSS 18.0.0.

I have a data set that includes verbatim responses in several languages: English, French, German, Spanish, and Japanese.  The problem is with the open-ended responses containing Japanese characters. While some Japanese characters show up, non-Japanese characters are mixed into these responses as well (e.g. question marks, circles, check marks, arrows).

I've tried suggestions from other threads: I've set Unicode on and set the locale to Japanese in both SPSS and my Windows settings, and the problem persists. I tried the Shift_JIS and windows-932 code pages, changed the fonts in SPSS to both Arial Unicode MS and MS Mincho, and set the language in SPSS to Japanese.  I had a native Japanese speaker look at the characters; they confirmed that some of the characters are Japanese but that the text is gibberish, and suggested that this kind of problem usually has to do with the encoding.

The file was pulled as a .sav in UTF-8 (Unicode) and I've already run the syntax to confirm that Unicode mode is on and the Japanese encoding is being used.  I've found Japanese text online and copy/pasted it into Notepad, so I don't think it's a language pack issue.  I've even tried partitioning the Japanese respondents into a separate file, to no avail.

Is there anything that I'm leaving out?  Has anyone experienced similar issues?  I'm at my wits' end.

Many thanks in advance,

Jason

Re: Problem displaying Japanese characters, already set locale and Unicode.

Jon K Peck
In reply to this post by jbloomfield1
This is not a font problem.  You might as well change those back to the default fonts, which can handle Japanese just fine.

You have an encoding problem.  With the mix of languages you have, you need to use Unicode: no single code page would work for both Japanese and European languages, since the European languages have accented characters, such as e with an acute accent (é), that are not present in the Japanese code page.  You don't need any special Windows language packs.
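
As a rough illustration of that point, here is a small Python sketch (not SPSS syntax; the sample strings and the helper `can_encode` are made up for the example). It checks whether each string can be represented in the Japanese code page, a Western European code page, and UTF-8:

```python
# Illustrative only: the sample strings are invented, not taken from the data set.
french = "café"
japanese = "日本語"


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp932 is the Windows Japanese code page, cp1252 the Western European one.
for enc in ("cp932", "cp1252", "utf-8"):
    print(enc, can_encode(french, enc), can_encode(japanese, enc))
```

Only UTF-8 should accept both strings, which is why a mixed-language file has to be handled in Unicode mode.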

It isn't clear, though, what you are starting with.  You say "The file was pulled as a .sav in UTF-8".  What does that mean?  If this sav file was generated by some other application, it could be that the encoding is marked incorrectly in the sav file.  What was the actual input to creating the sav file?  If that was text or some other non-SPSS source, there would have been some assumption made about the encoding at that point.  The file might have been read as a code page when it was actually UTF-8 or some other Unicode encoding such as UTF-16.  Or it might have actually been in a code page but was read as if it were UTF-8.  In most cases there would have been a conversion based on the assumed or declared input encoding.  If that was wrong, which sounds likely, it can't be fixed without going back to that source and reading it again with the correct input encoding declaration.
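
The failure mode described above can be sketched in Python (again not SPSS syntax; the sample string is invented and cp932 is only one possible wrong assumption). The same UTF-8 bytes are decoded once with the wrong declaration and once with the right one:

```python
# Hypothetical sample response; stands in for one open-ended answer.
original = "日本語のテキスト"

# What the source actually contains: UTF-8 bytes.
raw = original.encode("utf-8")

# Wrong assumption: treat the UTF-8 bytes as the Japanese code page.
# Any byte sequences that are invalid in cp932 are replaced with U+FFFD,
# which is one way such a conversion becomes irreversible.
misread = raw.decode("cp932", errors="replace")

# Right assumption: decode with the encoding the bytes were written in.
reread = raw.decode("utf-8")

print(repr(misread))       # gibberish, possibly mixed with real Japanese characters
print(reread == original)  # the correct declaration recovers the text
```

The misread string tends to contain some genuine Japanese characters mixed with unrelated symbols, which resembles the symptom reported. Once text like that has been saved, going back to the original bytes and decoding them correctly is the reliable fix.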

If you want to send me some of the inputs and the sav file, I can probably figure out what went wrong.

Regards,

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




