SPSSX Discussion

automatic correction of label errors

Mario Giesel

automatic correction of label errors

Hello, SPSS friends,
 
Now and then I get databases where variable labels and value labels have to be corrected because umlauts were escaped.
E.g., instead of the letter "ä" there is "&auml;".
 
So I have to change "&auml;" into "ä" wherever it occurs (variable label or value label):
Gef&auml;llt mir voll und ganz (100%) => Gefällt mir voll und ganz (100%)
 
What is the best approach using Python for doing that?
 
Thanks for any help.
 Mario

Mario Giesel
Munich, Germany

Jon K Peck

Re: automatic correction of label errors

For data, of course, you can just use the replace transformation function. For the metadata, you can use the spssaux.VariableDict class or the spss.Dataset class and iterate through all the labels replacing whatever strings have been escaped. Make a table of strings and replacements, search each variable and value label for entries, and just assign the replaced values.
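
As a rough sketch of the table-driven approach described above, the replacement step can be written in plain Python and tried outside SPSS. The names REPLACEMENTS, fix_label, and fix_labels are illustrative assumptions, not part of the spss or spssaux modules, and the second sample label is made up. In a real job the label dictionaries would come from something like spssaux.VariableDict.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper names; nothing here calls the SPSS API.

# Table of escaped strings and their replacements
REPLACEMENTS = {
    "&auml;": "ä", "&ouml;": "ö", "&uuml;": "ü",
    "&Auml;": "Ä", "&Ouml;": "Ö", "&Uuml;": "Ü",
    "&szlig;": "ß",
}

def fix_label(label, table=REPLACEMENTS):
    """Return label with every escaped string in table replaced."""
    for wrong, correct in table.items():
        label = label.replace(wrong, correct)
    return label

def fix_labels(labels, table=REPLACEMENTS):
    """Return only the entries of a {value: label} dict that changed."""
    fixed = {}
    for value, label in labels.items():
        new_label = fix_label(label, table)
        if new_label != label:
            fixed[value] = new_label
    return fixed

# Sample value labels (the second one is invented for illustration)
value_labels = {"1": "Gef&auml;llt mir voll und ganz (100%)",
                "2": "Gefällt mir gar nicht (0%)"}
changed = fix_labels(value_labels)
```

Returning only the changed entries keeps the later reassignment small: labels that were already correct are left untouched.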

Jon Peck (no "h")
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621


Mario Giesel

Re: automatic correction of label errors

Thanks, Jon, I found a solution that corrects value labels using Python.
Maybe others are interested in it as well:
 
BEGIN PROGRAM PYTHON.
import spss, spssaux
# List of escaped strings (HTML entities) and the characters they stand for
umlaute = [["&auml;","ä"],["&uuml;","ü"],["&ouml;","ö"],["&Auml;","Ä"],["&Uuml;","Ü"],["&Ouml;","Ö"],["&szlig;","ß"]]
syntax = ""
for i in range(spss.GetVariableCount()): # check all SPSS variables
  vallabel = spssaux.GetValueLabels(i) # value labels of variable i as a dictionary
  for key in vallabel:
    label = vallabel[key]
    for wrong, correct in umlaute:
      # Exchange wrong string with correct string
      label = label.replace(wrong, correct)
    if label != vallabel[key]: # only relabel values whose label actually changed
      # Create SPSS syntax; apostrophes inside the label are doubled.
      # Note: values of string variables would also need quotes around key.
      syntax = syntax + "ADD VALUE LABELS " + spss.GetVariableName(i) + " " + key + " '" + label.replace("'", "''") + "'.\n"
print(syntax)
if syntax: # submit only if there is something to correct
  spss.Submit(syntax)
END PROGRAM.
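
A label that itself contains an apostrophe, or a value that belongs to a string variable, needs some quoting care when ADD VALUE LABELS syntax is generated. Below is a minimal sketch of that step in plain Python, so it can be tried outside SPSS. The helper name add_value_label_syntax and its is_string flag are assumptions for illustration, not part of the spss module, and the variable names and labels are made up.

```python
# -*- coding: utf-8 -*-

def add_value_label_syntax(varname, value, label, is_string=False):
    """Build one ADD VALUE LABELS command (hypothetical helper)."""
    # Double any apostrophes so the label survives single quoting
    quoted_label = "'" + label.replace("'", "''") + "'"
    if is_string:
        # Values of string variables must be quoted as well
        value = "'" + value.replace("'", "''") + "'"
    return "ADD VALUE LABELS %s %s %s." % (varname, value, quoted_label)

# Invented examples: a numeric variable and a string variable
cmd_numeric = add_value_label_syntax("q1", "9", "Don't know")
cmd_string = add_value_label_syntax("sex", "m", "männlich", is_string=True)
```

Inside a BEGIN PROGRAM block, the resulting strings could be joined with newlines and passed to spss.Submit.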
 


Albert-Jan Roskam

Re: automatic correction of label errors

Shouldn't you have the following encoding declaration as the first line of your code? # -*- coding: utf-8 -*-

Btw, I'd use a Python dictionary to store the umlaute; that makes the notation easier, and dictionaries really are perfect for translation tables. I'd try something like this (untested!):

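# -*- coding: utf-8 -*-
import spssaux
# Translation table: escaped string (HTML entity) -> umlaut
umlaute = {"&auml;": "ä",
           "&uuml;": "ü",
           "&ouml;": "ö",
           "&Auml;": "Ä",
           "&Uuml;": "Ü",
           "&Ouml;": "Ö",
           "&szlig;": "ß"}
for v in spssaux.VariableDict():
    newvallabels = {}
    for val, lab in v.ValueLabels.items():
        # apply every replacement in turn to the same label
        for from_, to in umlaute.items():
            lab = lab.replace(from_, to)
        newvallabels[val] = lab
    v.ValueLabels = newvallabels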
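
If the damaged labels may contain entities beyond the ones in the dictionary above, one alternative is the standard library's html.unescape, which decodes all named and numeric HTML entities. This assumes Python 3.4 or later, not the Python 2 used elsewhere in this thread. It is a different technique from the hand-written table, and it will also turn entities such as &amp; into &, which may or may not be wanted. The helper name unescape_labels is an assumption for illustration.

```python
# -*- coding: utf-8 -*-
import html

def unescape_labels(labels):
    """Decode HTML entities in a {value: label} dict (hypothetical helper)."""
    return {value: html.unescape(label) for value, label in labels.items()}

fixed = unescape_labels({"1": "Gef&auml;llt mir voll und ganz (100%)"})
```

The hand-written table stays the safer choice when only a known set of strings should be touched.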

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


E. Bernardo

Articles on Mediation analysis using SEM

Dear all,

We hope someone can suggest articles that report mediation analysis using an SEM approach. We would like to get some ideas from them on how to present the results of our studies.

Thank you in advance.

Eins

Matthias Spörrle

Re: Articles on Mediation analysis using SEM

Eins,

you might want to look at:

Cheung, G. W., & Lau, R. S. (2008). Testing mediation and suppression
effects of latent variables: Bootstrapping with structural equation
models. Organizational Research Methods, 11(2), 296-325. doi:
10.1177/1094428107300343

Frese, M., Garst, H., & Fay, D. (2007). Making things happen:
Reciprocal relationships between work characteristics and personal
initiative in a four-wave longitudinal structural equation model.
Journal of Applied Psychology, 92(4), 1084-1102. doi:
10.1037/0021-9010.92.4.1084

James, L. R., Mulaik, S. A., & Brett, J. M. (2006). A tale of two
methods. Organizational Research Methods, 9(2), 233-244. doi:
10.1177/1094428105285144

Ledermann, T., & Macho, S. (2009). Mediation in dyadic data at the
level of the dyads: A structural equation modeling approach. Journal
of Family Psychology, 23(5), 661-670. doi: 10.1037/a0016197

Leiter, M. P., Gascón, S., & Martínez-Jarreta, B. (2010). Making sense
of work life: A structural model of burnout. Journal of Applied Social
Psychology, 40(1), 57-75. doi: 10.1111/j.1559-1816.2009.00563.x

Nes, L. S., Evans, D. R., & Segerstrom, S. C. (2009). Optimism and
college retention: Mediation by motivation, performance, and
adjustment. Journal of Applied Social Psychology, 39(8), 1887-1912.
doi: 10.1111/j.1559-1816.2009.00508.x

HTH
Matthias
