Unicode Checking


Unicode Checking

Georg Maubach-2

Dear Listers,

I would like to check whether a dataset is stored in Unicode encoding. Is there a way to check that using SPSS syntax?

I know how to check whether the processor is running in Unicode mode, but I am interested in the storage format of the data.

Best regards

Georg


Re: Unicode Checking

Rick Oliver-3

sysfile info file="filename.sav".


From: Georg Maubach <[hidden email]>
To: [hidden email]
Date: 07/02/2010 11:16 AM
Subject: Unicode Checking
Sent by: "SPSSX(r) Discussion" <[hidden email]>







Re: Unicode Checking

Jon K Peck
In reply to this post by Georg Maubach-2

The output from the SYSFILE INFO command includes a character encoding field. For a Unicode file, its value will be UTF-8.

Of course, any file that is open in Statistics will be in Unicode or in a code page, consistent with the Statistics setting.

If you want to do this with Python, here is some code.

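begin program.
import spss, spssaux
# Run SYSFILE INFO and capture its output in the XML workspace.
tag, errlevel = spssaux.createXmlOutput(
    "sysfile info 'c:/spss18/samples/english/cars.sav'",
    omsid='Sysfile Info')
# Read the "Character Encoding" row of the File Information table.
charencoding = spssaux.getValuesFromXmlWorkspace(
    tag, "File Information", rowCategory="Character Encoding")
# Parentheses keep the print valid under both Python 2 and Python 3.
print(charencoding[0])
end program.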

If your output language is German, change rowCategory to "Zeichenkodierung".
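
If you need the result as a yes/no answer, or want the lookup to work for both English and German output, something along these lines might help. This is only a sketch: find_encoding and is_unicode_encoding are hypothetical helper names, not part of spssaux. It assumes the lookup returns an empty list when the row label is not found, and that the field reads UTF-8 for a Unicode file as described above. The lookup is passed in as a function, so the helpers can be tried outside Statistics.

```python
def find_encoding(lookup, labels=("Character Encoding", "Zeichenkodierung")):
    """Return the first value found under any of the given row labels.

    lookup is any function that takes a row label and returns a list of
    values (assumed to be empty when the label is not present).
    """
    for label in labels:
        values = lookup(label)
        if values:
            return values[0]
    return None


def is_unicode_encoding(value):
    """True if the encoding string names UTF-8 (ignoring case and spaces)."""
    if value is None:
        return False
    return value.strip().upper() in ("UTF-8", "UTF8")


# Inside BEGIN PROGRAM, with tag obtained as in the code above, the
# lookup could be wired up like this (not run here):
#   lookup = lambda label: spssaux.getValuesFromXmlWorkspace(
#       tag, "File Information", rowCategory=label)
#   print(is_unicode_encoding(find_encoding(lookup)))
```

For other output languages, add the localized row label to the labels tuple.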

HTH,
Jon
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435


