Conversion of foreign dataset

Jeff A

…odd situation I’m trying to figure out.

I have a student from South Korea who has data from Korea in an Excel file; presumably the original was in Korean. We can successfully open the data in SPSS, but in both Excel and SPSS the variable names come through like this: “≥≤º∫¿Ã¿¸æ˜¡÷∫Œ¿Œ∞ÊøÏ∫Œ≤Ù∑¥¥Ÿ”

Clearly something’s not being translated properly when fonts are changed, but I’m unsure how to get proper variable names (and labels) into the SPSS dataset. I can see two possibilities:

1) Use some method to convert the nonsense above into actual Korean that the student can translate into English for me.

2) Somehow erase all variable names (and variable labels) and manually type in meaningful ones, since we have what’s essentially a codebook as a separate file. But since each of the variable names is currently in the above nonsense, I’m unsure how I can actually delete these names and reassign new ones, since I have no meaningful way to refer to them. There has to be a way to write syntax saying the equivalent of “make the name of the first variable a and the second b” (or similar) so we can alter them later, or alternatively “rename variable 1 to Gender” or similar. I don’t want to have to do this manually since there are several hundred variables.

Can someone point me in the right direction?

Thanks in advance,

Jeff

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Conversion of foreign dataset

PRogman
Just some thoughts; I have never met this problem before.
If you open the character formatting window in Excel, you may see a line
stating something like 'this font is not installed. A similar font is used'.
Do you have a Korean font (Malgun Gothic) installed?
Have you tried Google Translate? Do you know whether the file is Unicode, or whether a
certain national code page was used to generate the Excel file? Did you get
a .CSV, which may have the original encoding (and which can be opened in an
editor)?
/PR





Re: Conversion of foreign dataset

Jon Peck
In reply to this post by Jeff A
Unless the Excel file is really ancient, it would be in Unicode.  Make sure that Statistics is running in Unicode (Edit > Options > Language).  Otherwise, in code page mode, it would attempt to convert the text according to the current code page, which is probably not Korean.  The default Statistics font should handle Korean text, but if it has been changed, that might be an issue as well.

If this doesn’t solve the problem, I can take a look at the Excel file if you send it to me at [hidden email].
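
For what it is worth, the garbled sample in the original post looks like the pattern produced when Korean EUC-KR (code page 949) bytes are displayed as Mac OS Roman. That is only a guess from the characters involved and is not confirmed anywhere in this thread. Assuming it is right, a minimal Python sketch of undoing the damage is to re-encode the text with the wrong codec to recover the bytes, then decode them with the right one. The function name `repair_mojibake` is made up for illustration.

```python
# Sketch only: ASSUMES EUC-KR bytes were mis-read as Mac OS Roman.
def repair_mojibake(text, wrong="mac_roman", right="euc_kr"):
    """Recover the original bytes via the wrong codec, then decode them correctly."""
    return text.encode(wrong).decode(right)

# Variable name exactly as quoted in the first post of this thread.
garbled = "≥≤º∫¿Ã¿¸æ˜¡÷∫Œ¿Œ∞ÊøÏ∫Œ≤Ù∑¥¥Ÿ"
repaired = repair_mojibake(garbled)
print(repaired)
```

If the encoding guess is wrong, the call raises a `UnicodeError` rather than returning silently damaged text, so it is cheap to try on one name and have a Korean reader check the result before applying it to all the names.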

--
Jon K Peck
[hidden email]


Re: Conversion of foreign dataset

Jeff A

 

We were finally able to handle the problem by having someone in Korea download the data there and email it to us.

 

Now I’m trying to determine whether we can rename variables by referring to their position in the dataset rather than by the original name, since the names are in a foreign language. The same goes for variable labels and values.

 

E.g.,

Rename variables NewVar = 1st variable in the dataset.

 

Can this be done?

 

Jeff

 

 

 


Re: Conversion of foreign dataset

Jon Peck
There is no way in standard syntax to do the renaming based on number, but the code below will help.  It renames the variables V1 ... Vn so you can then use standard syntax to do the rest.  It also creates a custom attribute named OriginalName that contains, well, the original name.  You can display this attribute in the Data Editor Variable View by setting View > Customize Variable View.

If you wanted to supply a list of new names in file order, the code could be modified to use that instead.

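begin program.
import spss, spssaux

# Dictionary of all variables in the active dataset.
vardict = spssaux.VariableDict()
# Count from 1 so the new names run V1 ... Vn, as described above.
for i, v in enumerate(vardict, start=1):
    # This print form works under both Python 2 and Python 3.
    print("%s %s" % (v.VariableName, v.VariableIndex))
    # Record the original name as a custom attribute before renaming.
    attrdict = {"OriginalName" : v.VariableName}
    v.Attributes = attrdict
    spss.Submit("""rename variables (%s = %s)""" % (v.VariableName, "V" + str(i)))
end program.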
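
If the codebook already lists the intended names in file order, one way to act on the suggestion above is to build a single RENAME VARIABLES command from two parallel lists. The sketch below is plain Python with no SPSS dependency; `build_rename_syntax` and the sample names are hypothetical. Inside SPSS the old names would come from the dictionary (for example via `spssaux.VariableDict()`, as in the program above) and the resulting text would be passed to `spss.Submit`.

```python
def build_rename_syntax(old_names, new_names):
    """Return one RENAME VARIABLES command pairing names by position."""
    if len(old_names) != len(new_names):
        raise ValueError("need exactly one new name per existing variable")
    # SPSS variable names are case-insensitive, so compare in lower case.
    if len(set(n.lower() for n in new_names)) != len(new_names):
        raise ValueError("new names must be unique")
    pairs = ["(%s = %s)" % (old, new) for old, new in zip(old_names, new_names)]
    return "RENAME VARIABLES\n  " + "\n  ".join(pairs) + "."

# Hypothetical example: names as they stand in the file, and names from the codebook.
old = ["V1", "V2", "V3"]
new = ["Gender", "Age", "Region"]
syntax = build_rename_syntax(old, new)
print(syntax)
```

Checking the list lengths and uniqueness up front means a typo in the codebook list is reported before any command reaches SPSS, rather than leaving the dataset half renamed.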


Re: Conversion of foreign dataset

Jeff A

I think you mean Ctrl+C, Ctrl+F is for searching. I actually tried this before, but did it wrong. As you’ve mentioned, you have to highlight the entire column and not go to the first and last variable with shift+click like I was trying to do.

 

Jon’s short program did the trick – I’ll keep all of these in my notes in case I run into it again.

 

Thanks

 

Jeff

 

 

From: [hidden email] <[hidden email]>
Sent: Wednesday, March 4, 2020 6:50 PM
To: [hidden email]; [hidden email]
Subject: RE: Conversion of foreign dataset

 

I’ve done this before using:

 

RENAME VARIABLES
    (<first varname> to <last varname> = V1 to Vn).

E.g.:

*brit Soc Att 2014.
RENAME VARIABLES
    (sserial to year = v1 to v650).

 

You can also highlight the Names column and use Ctrl+F to copy the list of names to edit at will.

 

John F Hall MA (Cantab) Dip Ed (Dunelm)

IBM-SPSS Academic Author 9900074



Re: Conversion of foreign dataset

Jon Peck
Another trick is to use the Variables dialog (via the Data Editor toolbar) to select and paste all the variables, but you wouldn't get the custom attribute recording the original name that way. I added that attribute since there might be a need to reference the original data source or talk to the provider about the variables.
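
If the paste-and-rename route is used anyway, the original names can still be kept by generating VARIABLE ATTRIBUTE commands alongside the rename. The following is a rough sketch in plain Python that only builds the syntax text; the function name `attribute_syntax` is hypothetical, and the sample names are borrowed from the RENAME example quoted earlier in the thread.

```python
def attribute_syntax(old_names, new_names, attr="OriginalName"):
    """Return one VARIABLE ATTRIBUTE command per variable, storing its old name."""
    lines = []
    for old, new in zip(old_names, new_names):
        # Double any single quotes so the old name is a valid SPSS string literal.
        quoted = old.replace("'", "''")
        lines.append(
            "VARIABLE ATTRIBUTE VARIABLES=%s ATTRIBUTE=%s('%s')." % (new, attr, quoted)
        )
    return "\n".join(lines)

# Hypothetical pairing of old and new names.
cmds = attribute_syntax(["sserial", "year"], ["v1", "v650"])
print(cmds)
```

The commands refer to the new names, so they are meant to be run after the rename has been applied.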


Re: Conversion of foreign dataset

John F Hall
No, I meant put the cursor on Name and press Ctrl+F to highlight all variables; Ctrl+C then copies the entire variable list if you need it. Alternatively, you can just highlight a few variables and press Ctrl+C to copy them.

1) Open a new syntax file.
2) Write in: rename variables (
3) Press Ctrl+V to paste the list.
4) Modify the pasted list, then add = and the closing ).
5) Add a period, giving for example:

rename variables (SSerial to Country = v1 to v10).

6) Run the syntax. All variable properties are preserved.

DO NOT SAVE THE CHANGES. Best to make a copy of your original *.sav file and work on that.

Email: [hidden email]
Website: Journeys in Survey Research
Course: Survey Analysis Workshop (SPSS)
