automatic correction of label errors

automatic correction of label errors

Mario Giesel
Hello, SPSS friends,
 
Now and then I get databases where labels of variables have to be changed due to errors in umlauts.
E.g. instead of the letter "ä" there is "&auml;".
 
So I have to change "&auml;" into "ä" wherever it occurs (variable label or value label):
Gef&auml;llt mir voll und ganz (100%) => Gefällt mir voll und ganz (100%)
 
What is the best approach using Python for doing that?
 
Thanks for any help.
 Mario
Mario Giesel
Munich, Germany

Re: automatic correction of label errors

Jon K Peck
For data, of course, you can just use the replace transformation function.  For the metadata, you can use the spssaux.VariableDict class or the spss.Dataset class and iterate through all the labels replacing whatever strings have been escaped.  Make a table of strings and replacements, search each variable and value label for entries, and just assign the replaced values.
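
A minimal sketch of the table-driven replacement described above, written in plain Python so it can be tried outside SPSS. The names `REPLACEMENTS`, `fix_label` and `fix_value_labels` are hypothetical, and the dictionary of value labels only stands in for whatever spssaux.VariableDict or spss.Dataset returns for a variable; writing the corrected labels back to the dataset is not shown.

```python
# -*- coding: utf-8 -*-
# Hypothetical table of escaped strings and their replacements.
REPLACEMENTS = {
    "&auml;": "ä", "&ouml;": "ö", "&uuml;": "ü",
    "&Auml;": "Ä", "&Ouml;": "Ö", "&Uuml;": "Ü",
    "&szlig;": "ß",
}

def fix_label(label, table=REPLACEMENTS):
    """Return label with every escaped string in table replaced."""
    for wrong, right in table.items():
        label = label.replace(wrong, right)
    return label

def fix_value_labels(value_labels, table=REPLACEMENTS):
    """Return only the entries of a {value: label} dict whose label changed."""
    changed = {}
    for value, label in value_labels.items():
        new_label = fix_label(label, table)
        if new_label != label:
            changed[value] = new_label
    return changed

# Stand-in for one variable's value labels (the second entry is made up).
value_labels = {"1": "Gef&auml;llt mir voll und ganz (100%)", "2": "Teils, teils"}

print(fix_label("Gef&auml;llt mir voll und ganz (100%)"))
print(fix_value_labels(value_labels))
```

Returning only the changed entries keeps the amount of metadata that has to be reassigned small, which matters when a file has thousands of variables.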

Jon Peck (no "h")
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621




From:        Mario Giesel <[hidden email]>
To:        [hidden email]
Date:        08/31/2011 03:32 AM
Subject:        [SPSSX-L] automatic correction of label errors
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




Hello, SPSS friends,
 
Now and then I get databases where labels of variables have to be changed due to errors in umlauts.
E.G. instead of letter "ä" there is "&auml;"

 
So I have to change "&auml;" into "ä" wherever it occurs (Variable label or value label):
Gef&auml;llt mir voll und ganz (100%) => Gefällt mir voll und ganz (100%)

 
What is the best approach using Python for doing that?
 
Thanks for any help.
Mario


Re: automatic correction of label errors

Mario Giesel
Thanks, Jon, I found a solution that corrects value labels using Python.
Maybe others are interested in it as well:
 
BEGIN PROGRAM PYTHON.
import spss, spssaux
# List of umlaut entities and their corrections
umlaute = [["&auml;","ä"],["&uuml;","ü"],["&ouml;","ö"],["&Auml;","Ä"],["&Uuml;","Ü"],["&Ouml;","Ö"],["&szlig;","ß"]]
syntax = ""
for i in range(spss.GetVariableCount()): # check all SPSS variables
  vallabel = spssaux.GetValueLabels(i) # value labels of variable i
  for key in vallabel:
    newlabel = vallabel[key]
    for wrong, right in umlaute:
      # Exchange wrong string with correct string
      newlabel = newlabel.replace(wrong, right)
    if newlabel != vallabel[key]: # only labels that actually changed
      # Create SPSS syntax (apostrophes inside the label are doubled)
      syntax = syntax + "ADD VALUE LABELS " + spss.GetVariableName(i) + " " + key + " '" + newlabel.replace("'", "''") + "'.\n"
print(syntax)
if syntax: # submit only if there is something to correct
  spss.Submit(syntax)
END PROGRAM.
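
The program above builds SPSS syntax by string concatenation, so the label text has to be quoted safely. The following is only a sketch of that one step in plain Python; `quote_spss` and `add_value_label_syntax` are hypothetical helper names, the variable names `q1` and `q2` and the second label are made up, and it assumes numeric values, since string values would need quoting as well.

```python
# -*- coding: utf-8 -*-

def quote_spss(text):
    """Wrap text in apostrophes, doubling any apostrophe inside it."""
    return "'" + text.replace("'", "''") + "'"

def add_value_label_syntax(varname, value, label):
    """Build one ADD VALUE LABELS command for a numeric value given as text."""
    return "ADD VALUE LABELS %s %s %s." % (varname, value, quote_spss(label))

# Hypothetical variable names and values, for illustration only.
print(add_value_label_syntax("q1", "1", "Gefällt mir voll und ganz (100%)"))
print(add_value_label_syntax("q2", "3", "Don't know"))
```

Without the doubling, a label containing an apostrophe would end the string early and the generated command would fail.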
 

From: Jon K Peck <[hidden email]>
To: Mario Giesel <[hidden email]>
Cc: [hidden email]
Sent: 14:37 Wednesday, 31 August 2011
Subject: Re: [SPSSX-L] automatic correction of label errors

For data, of course, you can just use the replace transformation function.  For the metadata, you can use the spssaux.VariableDict class or the spss.Dataset class and iterate through all the labels replacing whatever strings have been escaped.  Make a table of strings and replacements, search each variable and value label for entries, and just assign the replaced values.

Jon Peck (no "h")
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621







Mario Giesel
Munich, Germany

Re: automatic correction of label errors

Albert-Jan Roskam
Shouldn't you have the following encoding declaration as the first line of your code? # -*- coding: utf-8 -*-
Btw, I'd use a Python dictionary to store the umlaute; that makes the notation easier, and dictionaries
really are perfect for translation tables. I'd try something like this (untested!)
# -*- coding: utf-8 -*-
import spssaux
umlaute = {"&auml;":"ä",
           "&uuml;":"ü",
           "&ouml;":"ö",
           "&Auml;":"Ä",
           "&Uuml;":"Ü",
           "&Ouml;":"Ö",
           "&szlig;":"ß"}
for v in spssaux.VariableDict():
    newvallabels = {}
    for val, lab in v.ValueLabels.items():
        for from_, to in umlaute.items():
            lab = lab.replace(from_, to)  # apply every replacement in turn
        newvallabels[val] = lab
    if newvallabels != v.ValueLabels:  # only touch variables whose labels changed
        v.ValueLabels = newvallabels
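
As an alternative to a hand-made translation table, the standard library can decode such entities directly. This is only a sketch and assumes Python 3.4 or later, where `html.unescape` is available; `decode_entities` is a hypothetical wrapper name. Note that it decodes every HTML character reference, including `&amp;` and numeric ones, which may or may not be wanted for a given file.

```python
import html

def decode_entities(label):
    """Decode HTML character references such as &auml; in a label."""
    return html.unescape(label)

print(decode_entities("Gef&auml;llt mir voll und ganz (100%)"))
```

The wrapper could be used in place of the replacement loop for each variable label and value label.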
   


 
Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Mario Giesel <[hidden email]>
To: [hidden email]
Sent: Friday, September 2, 2011 6:44 PM
Subject: Re: [SPSSX-L] automatic correction of label errors

Thanks, Jon, I found a solution that corrects value labels using Python.
Maybe others are interested in it as well:
 
BEGIN PROGRAM PYTHON.
import spss, spssaux, re
vallabel = []
#List of umlauts and corrections
umlaute = [["&auml;","ä"],["&uuml;","ü"],["&ouml;","ö"],["&Auml;","Ä"],["&Uuml;","Ü"],["&Ouml;","Ö"],["&szlig","ß"]]
syntax = ""
for i in xrange(spss.GetVariableCount()): # check all SPSS variables
  vallabel.append(spssaux.GetValueLabels(i)) # write Value Labels of variable
  for key in vallabel[i]:
    for k in xrange(len(umlaute)):
      if re.search(umlaute[k][0], vallabel[i][key]): # If first string in list of umlauts shows up
        #Exchange wrong string with correct string
        vallabel[i][key] = vallabel[i][key].replace(umlaute[k][0],umlaute[k][1])
        # Create SPSS syntax
        syntax = syntax + "ADD VALUE LABEL " + spss.GetVariableName(i) + " " + key + "'" + vallabel[i][key] + "'.\n"
print syntax
spss.Submit(syntax)
END PROGRAM.
 







Articles on Mediation analysis using SEM

E. Bernardo
Dear all,

We hope someone can suggest articles that apply mediation analysis using an SEM approach.  We would like to draw on them for ideas on how to present the results of our studies.

Thank you in advance.

Eins


Re: Articles on Mediation analysis using SEM

Matthias Spörrle
Eins,

you might want to look at:

Cheung, G. W., & Lau, R. S. (2008). Testing mediation and suppression
effects of latent variables: Bootstrapping with structural equation
models. Organizational Research Methods, 11(2), 296-325. doi:
10.1177/1094428107300343

Frese, M., Garst, H., & Fay, D. (2007). Making things happen:
Reciprocal relationships between work characteristics and personal
initiative in a four-wave longitudinal structural equation model.
Journal of Applied Psychology, 92(4), 1084-1102. doi:
10.1037/0021-9010.92.4.1084

James, L. R., Mulaik, S. A., & Brett, J. M. (2006). A tale of two
methods. Organizational Research Methods, 9(2), 233-244. doi:
10.1177/1094428105285144

Ledermann, T., & Macho, S. (2009). Mediation in dyadic data at the
level of the dyads: A structural equation modeling approach. Journal
of Family Psychology, 23(5), 661-670. doi: 10.1037/a0016197

Leiter, M. P., Gascón, S., & Martínez-Jarreta, B. (2010). Making sense
of work life: A structural model of burnout. Journal of Applied Social
Psychology, 40(1), 57-75. doi: 10.1111/j.1559-1816.2009.00563.x

Nes, L. S., Evans, D. R., & Segerstrom, S. C. (2009). Optimism and
college retention: Mediation by motivation, performance, and
adjustment. Journal of Applied Social Psychology, 39(8), 1887-1912.
doi: 10.1111/j.1559-1816.2009.00508.x

HTH
Matthias





On Sun, Sep 4, 2011 at 5:05 PM, Eins Bernardo <[hidden email]> wrote:
> Dear all,
> We hope someone would suggest articles that used mediation analysis using
> SEM approach.  We will get from them some ideas on how to present the
> results of our studies.
> Thank you in advance.
> Eins
>
