Creating new variables based on variable label

Creating new variables based on variable label

emma78
Hi,

I have a dataset with nine variables that have the following variable labels:
Q1
Q2
Q3
Q3
Q4
Q5
Q6
Q6
Q7


Is it possible to rename the variables according to those labels?
Because some of the labels are identical, suffixes would be necessary.
As a final result I would like to have these variable names:
Q1
Q2
Q3_1
Q3_2
Q4
Q5
Q6_1
Q6_2
Q7

Do you know a command which can be helpful here?

Thank you!

Re: Creating new variables based on variable label

emma78
Just one additional remark:
I know the RENAME syntax, but I was wondering whether there is a more automatic way: a command that extracts the names from the variable labels.

Re: Creating new variables based on variable label

David Marso
Administrator
In reply to this post by emma78
This thread might be useful for locating the appropriate Python method.
http://spssx-discussion.1045642.n5.nabble.com/Python-Getting-variable-labels-from-spssaux-VariableDict-td5720662.html#a5720674

You will have to write code to generate the appropriate RENAME command after resolving the duplicates.
This is a rather unusual request. Usually people create VARIABLE LABELS that give a fuller description of the variable's intent. Are you sure all of your variable labels are valid SPSS variable names, aside from the duplications? If they are not, the task becomes much more difficult.
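
The validity question above can be checked before any renaming. The following is a minimal sketch in plain Python (no SPSS needed), assuming ASCII-only labels; the function name `is_valid_spss_name` and the sample labels are hypothetical, and the rules encoded here are a simplified reading of the SPSS naming rules (64-byte limit, reserved keywords, allowed characters), not a complete implementation.

```python
import re

# Reserved keywords that cannot be used as variable names.
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

# Simplified, ASCII-only assumption: the first character is a letter or
# one of @ # $; the rest are letters, digits, period, underscore, @ # $.
NAME_PATTERN = re.compile(r"[A-Za-z@#$][A-Za-z0-9._@#$]*\Z")

def is_valid_spss_name(candidate):
    """Return True if candidate passes this simplified name check."""
    if not candidate or len(candidate.encode("utf-8")) > 64:
        return False
    if candidate.upper() in RESERVED:
        return False
    if candidate.endswith("."):
        return False
    return NAME_PATTERN.match(candidate) is not None

# Hypothetical labels, for illustration only.
labels = ["Q1", "Q3", "Q1 Age", "to", "3rd"]
invalid = [lab for lab in labels if not is_valid_spss_name(lab)]
print(invalid)  # ['Q1 Age', 'to', '3rd']
```

Any label that lands in `invalid` would need to be cleaned up by hand or by code before it could be used in a RENAME command.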
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Creating new variables based on variable label

Maguin, Eugene
In reply to this post by emma78
Why aren't you using the RENAME VARIABLES command? You've got just 9 variables. Alternatively, you could edit the names in the data window.
Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of emma78
Sent: Wednesday, September 23, 2015 10:24 AM
To: [hidden email]
Subject: Creating new variables based on variable label

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Creating-new-variables-based-on-variable-label-tp5730637.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Re: Creating new variables based on variable label

Jon K Peck
In reply to this post by emma78
I don't know why you would do this, but a few lines of Python code will handle it (hoping the indentation survives the list).

This code assumes that all the labels are in fact valid variable names, that all the variables have labels, and that there are no case clashes, e.g., labels like Q3 and q3.  All those situations could be handled with a bit more code.


data list free /X1 to X9(9f1.0).
begin data
1 2 3 4 5 6 7 8 9
end data.
dataset name nine.
variable label X1 'Q1'/X2 'Q2'/X3 'Q3' /X4 'Q3'/X5 'Q3'/X6 'Q4'/X7 'Q7'/X8 'Q7'/X9 'Q8'.
begin program.
import spss, spssaux

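vardict = spssaux.VariableDict()
# remap: new name (taken from the label) -> current variable name
remap = {}
for v in vardict:
    varlabel = v.VariableLabel
    suffix = 1
    originallabel = varlabel
    # The first variable with a given label keeps the bare label;
    # later duplicates get .1, .2, ... until the name is unused.
    while True:
        if varlabel not in remap:
            remap[varlabel] = v.VariableName
            break
        varlabel = originallabel + "." + str(suffix)
        suffix += 1
# Submit one RENAME VARIABLES command per variable.
for label, name in remap.items():
    spss.Submit("""RENAME VARIABLES (%s=%s)""" % (name, label))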
end program.
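
The assumptions listed above (every variable has a label, no labels differ only by case) can be tested before any RENAME is submitted. Here is a minimal sketch in plain Python, not part of the posted solution: `preflight` is a hypothetical helper that takes (name, label) pairs, which in a real session could be collected from `spssaux.VariableDict()`, and it assumes a missing label arrives as an empty string or `None`.

```python
from collections import defaultdict

def preflight(name_label_pairs):
    """Return (unlabeled, case_clashes) for a list of (name, label) pairs."""
    unlabeled = []
    by_folded = defaultdict(set)
    for name, label in name_label_pairs:
        label = (label or "").strip()
        if not label:
            unlabeled.append(name)
        else:
            # Group labels that are identical once case is ignored.
            by_folded[label.upper()].add(label)
    case_clashes = sorted(sorted(group) for group in by_folded.values()
                          if len(group) > 1)
    return unlabeled, case_clashes

# Hypothetical input, for illustration only.
pairs = [("X1", "Q1"), ("X2", "Q3"), ("X3", "q3"), ("X4", ""), ("X5", "Q3")]
unlabeled, clashes = preflight(pairs)
print(unlabeled)  # ['X4']
print(clashes)    # [['Q3', 'q3']]
```

If either list is non-empty, the rename loop would produce errors or colliding names, so those variables need attention first.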


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        emma78 <[hidden email]>
To:        [hidden email]
Date:        09/23/2015 09:14 AM
Subject:        Re: [SPSSX-L] Creating new variables based on variable label
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





Re: Creating new variables based on variable label

emma78
Hi,
thanks for your answers.
Jon, your syntax works pretty well. Is it possible for the first of the identical variables to get the suffix _1 as well?
So that I have
Q1
Q2
Q3_1
Q3_2
Q3_3
Q4
Q7_1
Q7_2
Q8


Nine variables are just an example; in real datasets we have several hundred of them :-)

Re: Creating new variables based on variable label

John F Hall
Emma

What are your variables actually called in the Name column in the Data
Editor?  Seems a lot of messing about with so few variables.  Much easier to
edit everything in the Data Editor, and you really ought to add some short
descriptions in the Label column.

See 1.4: Completing your data dictionary on my site:
(http://surveyresearch.weebly.com/block-1-from-questionnaire-to-spss-saved-file.html)


John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]  
Website: www.surveyresearch.weebly.com  
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop



See tutorial
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
emma78
Sent: 23 September 2015 18:08
To: [hidden email]
Subject: Re: Creating new variables based on variable label


Re: Creating new variables based on variable label

John F Hall
In reply to this post by emma78
Just seen that you have hundreds of labels/names, so comment withdrawn.  As
usual Jon Peck has the best solution in Python.  However, I'm intrigued as to
how you got into this situation in the first place: are you using some
off-the-shelf data collection software?

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]  
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop






Re: Creating new variables based on variable label

Jon K Peck
In reply to this post by emma78
That requires a quite different approach.  Does it matter what order the suffixes are in, or does this need to be related to the file order?


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: Creating new variables based on variable label

emma78
In reply to this post by John F Hall
Hi John,
Actually the datasets are quite normal 😉: variable names v_1, v_2, etc. and longer variable labels (Q1 Age, Q2 Gender, ...).
The problem is that we need the names to match the question numbers: Q1, Q2 and so on.
Therefore I made some adjustments so that the dataset looks like the one I described.
If there is an easier approach to this, I would be glad to know 😊
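
One way to get from labels such as "Q1 Age" to the question number without adjusting the dataset by hand is to take the first word of each label. This is only a sketch in plain Python: it assumes every label starts with the question number followed by a space, `question_code` is a hypothetical helper, and the "Q3 ..." labels are invented for illustration.

```python
def question_code(label):
    """Return the first whitespace-delimited token of a label, or '' if empty."""
    parts = label.split()
    return parts[0] if parts else ""

# The first two labels follow the pattern described above;
# the Q3 labels are made up.
labels = ["Q1 Age", "Q2 Gender", "Q3 Brands known", "Q3 Brands used"]
codes = [question_code(lab) for lab in labels]
print(codes)  # ['Q1', 'Q2', 'Q3', 'Q3']
```

The resulting codes still contain duplicates, so they would need the same suffix handling discussed elsewhere in this thread.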

Re: Creating new variables based on variable label

emma78
In reply to this post by Jon K Peck
They have to be in the order of the variable labels.
So if we have four variables labeled Q3,
the order should look like:
Q3_1
Q3_2
Q3_3
Q3_4

Thank You!

Re: Creating new variables based on variable label

Mike
In reply to this post by Jon K Peck
Let me suggest a simple procedure to make the variable names
a combination of variable name plus variable label:
 
(1) Make sure that in "Options" (click on "Edit" on the top line
menu of the system data window; options is at the bottom of
the drop down list) in the "Output" tab, one has "names and
labels" selected for pivot tables.  This causes the combined
variable name-variable label to be printed in a pivot table.
 
(2) Run descriptives, for example.
 
desc var=all/stat=sum.
 
The resulting table will have three columns (a) the combined
variable name-variable label, (b) the sample size N, and (c) the
sum.
 
(3) One can edit the pivot table to copy the contents of
column (a) (one can do this either in SPSS or copy the
table into Excel and copy the first column from there).
 
(4) Go to the Variable View of the SPSS data window,
click on the first column (variable names), which should
highlight it. Right click, paste the contents.
 
Now, the non-obvious problem is that variable labels typically
have stuff like blanks that are not allowed in variable names.
Also, variable names are limited in length (64 bytes; I had
to cut long labels to about 60 characters). I tried
this on some questionnaire data that I have and I realized
that I had to remove the blank spaces, dashes, commas,
colons, etc. in the variable labels.
 
Moral: Make sure that the variable labels conform to variable
name requirements.
 
-Mike Palij
New York University
 
 
----- Original Message -----
Sent: Wednesday, September 23, 2015 1:11 PM
Subject: Re: Creating new variables based on variable label


Re: Creating new variables based on variable label

Jon K Peck
In reply to this post by emma78
For closure on this, after some offline discussion, here is the final Python code.

* Encoding: UTF-8.
data list free /X1 to X9(9f1.0).
begin data
1 2 3 4 5 6 7 8 9
end data.
dataset name nine.
variable label X1 'Q1'/X2 'Q2'/X3 'Q3' /X4 'Q3'/X5 'Q3'/X6 'Q4'/X7 'Q7'/X8 'Q7'/X9 'Q8'.

begin program.
# assumes all variables have labels and these are legal as variable names.
# also that there are no casing variations in labels that would result in duplicate
# variable names.

import spss, spssaux

vardict = spssaux.VariableDict()
labels = [(v.VariableLabel, v.VariableName, v.VariableIndex) for v in vardict]
labels.sort(key=lambda x: x[2])
counts = {}

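# First pass: flag each label as unique (0) or duplicated (1).
for item in labels:
    if item[0] in counts:
        counts[item[0]] = 1
    else:
        counts[item[0]] = 0
# Second pass, in file order: a unique label becomes the name as is;
# duplicated labels get _1, _2, ... in order of appearance.
for label, name, order in labels:
    if counts[label] == 0:
        spss.Submit(r"""RENAME VARIABLES (%s=%s)""" % (name, label))
    else:
        spss.Submit(r"""RENAME VARIABLES (%s=%s_%s)""" % (name, label, counts[label]))
        counts[label] += 1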
end program.
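
The numbering logic can also be tried outside SPSS. The sketch below is a plain-Python variant, not the code posted above: `numbered_names` and `rename_command` are hypothetical helpers, and instead of one command per variable it builds a single RENAME VARIABLES command with an old-name list and a new-name list. The resulting string could then be passed to `spss.Submit` inside a program block.

```python
from collections import Counter

def numbered_names(labels):
    """Map labels (in file order) to names; duplicated labels get _1, _2, ..."""
    totals = Counter(labels)   # how often each label occurs overall
    seen = Counter()           # how often each label has occurred so far
    result = []
    for label in labels:
        if totals[label] == 1:
            result.append(label)
        else:
            seen[label] += 1
            result.append("%s_%d" % (label, seen[label]))
    return result

def rename_command(old_names, labels):
    """Build one RENAME VARIABLES command as a string."""
    new_names = numbered_names(labels)
    return "RENAME VARIABLES (%s = %s)." % (" ".join(old_names),
                                            " ".join(new_names))

old = ["X%d" % i for i in range(1, 10)]
labels = ["Q1", "Q2", "Q3", "Q3", "Q3", "Q4", "Q7", "Q7", "Q8"]
print(rename_command(old, labels))
# RENAME VARIABLES (X1 X2 X3 X4 X5 X6 X7 X8 X9 = Q1 Q2 Q3_1 Q3_2 Q3_3 Q4 Q7_1 Q7_2 Q8).
```

Building the command as a string first makes it easy to inspect the planned renames before anything in the dataset changes.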




Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        emma78 <[hidden email]>
To:        [hidden email]
Date:        09/23/2015 12:00 PM
Subject:        Re: [SPSSX-L] Creating new variables based on variable label
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





Re: Creating new variables based on variable label

Mike
Jon uses a "toy" example in his final post.  To show an example of a real
world problem consider the following situation:  Bob Altemeyer's
Right Wing Authoritarianism (RWA) scale is a very popular instrument
in social and political psychology (Google Scholar indicates that the
source I cite below has been cited 1,185 times).  Here is some SPSS
syntax describing the items, where RWA1 to RWA32 are the variable
names and the variable labels are the actual items (more or less).

title "Authoritarianism (RWA) & Social Dominance Orientation (SDO) Analyses".
** NOTE: Auth & SDO scale come from Altemeyer (1998).
** "The Other Auth Personality".
subtitle "Transforming Original Data Into Analyzable Data".

** Below is additional syntax to identify each RWA item.
** Var Labels for Authoritarianism Scales Items".

var label RWA1 "A01. Established Auths Are Right"
RWA2 "A02. Women Should Obey Their Husbands"
RWA3 "A03. Mightly Leader Needed-Destroy Radicals"
RWA4 "A04. Gays Are Healthy And Moral"
RWA5 "A05. Trust Authorities"
RWA6 "A06. Atheists As Good As Religious Folk"
RWA7 "A07. Back To Traditional Values & Tough Leaders"
RWA8 "A08. Nothing Wrong With Nudist Camps"
RWA9 "A09. We Need Free-Thinkers"
RWA10 "A10. Must Smash Perversions Destroying Our Country"
RWA11 "A11. Everyone Has Own Lifestyle & Values"
RWA12 "A12. Old-Fashioned Values Are Best"
RWA13 "A13. Admire Pro-Choice, Animal Rights, Abolish School Prayer"
RWA14 "A14. We Need Strong Determined Leader-Crush Evil"
RWA15 "A15. Best People Challenge Gov, Crit Religion"
RWA16 "A16. Strictly Follow Gods Law-Punish Violators"
RWA17 "A17. Best To Censor Trashy Magazine"
RWA18 "A18. Premarital Sex Is Okay"
RWA19 "A19. Great If We Do What Authorities Tell Us-Rid Rotten Apples"
RWA20 "A20. No ONE Right Way To Live For Everyone"
RWA21 "A21. Praise for Homosexuals & Feminists"
RWA22 "A22. Toublemakers Should Shut-Up"
RWA23 "A23. Auhtorities Should Put Out Radicals & Immoral Folks"
RWA24 "A24. Less Attention to Bible-More To Personal Standards"
RWA25 "A25. We Need Discipline-Everyone Follow The Leader"
RWA26 "A26. Better To Have Trashy Mags Than Censorship"
RWA27 "A27. Must Crack Down On Deviant Groups & Troublemakers"
RWA28 "A28. Our Rules For Modesty & Sex No Better Than Other Folks"
RWA29 "A29. Strongest Methods Should Be Used To Eliminate Troublemakers"
RWA30 "A30. Womens Place is Awhere She Wants"
RWA31 "A31. Good That Young People Have Freedom To Protest"
RWA32 "A32. Good Citizens Will Stomp Out The Rot Poisoning Our Country".

I would never substitute the variable labels for the variable names in
this situation because it would make specifying analyses very
difficult.  Consider the syntax needed to calculate Cronbach's
Alpha for the RWA scale:

reliability var=RWA3 to RWA32/
 scale(Authoritarianism)=RWA3 to RWA32/
 model=alpha/
 stat=desc,corr,scale/
 summary=all/
 icc=model(oneway).

Why would one use long variable names in such a situation
or in any other procedure?

The real point I want to make is that Jon's Python code would
not work with real data such as this because the variable labels
would have to be changed to eliminate the violations that the
labels would have as variable names, such as:
(1) Eliminate the blank spaces (using "_" would maintain readability)
(2) Eliminate commas and other punctuation (periods are okay)
(3) Truncate the label to 60 characters.
The procedure I suggested in an earlier post (run descriptives
with the names and labels printed, edit the name+label string
in either SPSS or Excel, and then paste the edited strings
into the names column of the variable view of the data
window) will produce the desired result.  What would have
to be added to Jon's Python code to deal with a
situation using RWA items?
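
As a partial answer to that question, the three cleanup steps listed above can be sketched in plain Python. This is an illustration under assumptions, not a tested SPSS solution: `label_to_name` is a hypothetical helper, it handles ASCII labels only, it uses the 60-character cutoff from step (3) (the actual SPSS limit is 64 bytes), and it does not deal with labels that start with a digit or with duplicates created by truncation.

```python
import re

def label_to_name(label, max_len=60):
    """Turn a variable label into a candidate variable name."""
    name = re.sub(r"\s+", "_", label.strip())    # (1) blanks -> underscore
    name = re.sub(r"[^A-Za-z0-9_.]", "", name)   # (2) drop other punctuation
    name = re.sub(r"_+", "_", name)              # tidy doubled underscores
    name = name[:max_len]                        # (3) truncate
    return name.rstrip("._")                     # avoid a trailing . or _

print(label_to_name("A05. Trust Authorities"))
# A05._Trust_Authorities
print(label_to_name("A07. Back To Traditional Values & Tough Leaders"))
# A07._Back_To_Traditional_Values_Tough_Leaders
```

The cleaned strings could then replace the raw labels in the rename loop, although names this long illustrate why keeping RWA1 to RWA32 is more practical.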

My own reading of the original post was that an instrument
was used where the variables were named Q1 etc to reflect
the ordinal position of the question in the questionnaire or
interview and the label represented the content of the question.
The output from procedures may have only provided the variable
name while both the name and label were desired. Setting the
options to print both names and labels takes care of this problem.

Perhaps the OP was unaware that one could get both the
variable name and variable label printed by setting the
appropriate option.  The simple solution then is not Python
code but setting the correct options.

I suppose that Jon got more info from the OP and decided that
the Python solution is what the OP really wanted.  Perhaps.
This works only in very simple situations (i.e., labels are
single words or symbols) and should fail in more realistic situations.

-Mike Palij
New York University
[hidden email]



----- Original Message -----
From: Jon K Peck
To: [hidden email]
Sent: Thursday, September 24, 2015 10:07 AM
Subject: Re: Creating new variables based on variable label



Re: Creating new variables based on variable label

Jon K Peck
The Python code did the job for this specific problem.  It's certainly not a general solution, nor is converting labels into names a good idea in general.  There is a reason that Statistics supports both names and labels.  However, one could generalize the code to make the labels into legal names, albeit with possible truncation, if the situation warranted.
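
To illustrate the truncation point: cutting names to a length limit can create new duplicates, so a generalized version has to leave room for a suffix. The sketch below is plain Python under stated assumptions: `unique_truncated` is a hypothetical helper, names are ASCII (so characters equal bytes), clashes are detected case-insensitively, and the first occurrence keeps the bare name, which differs from the _1-first numbering used earlier in the thread.

```python
def unique_truncated(names, limit=64):
    """Truncate names to `limit` characters, adding _1, _2, ... on clashes."""
    used = set()
    result = []
    for name in names:
        candidate = name[:limit]
        n = 0
        # Compare in upper case so names differing only by case still clash.
        while candidate.upper() in used:
            n += 1
            suffix = "_%d" % n
            candidate = name[:limit - len(suffix)] + suffix
        used.add(candidate.upper())
        result.append(candidate)
    return result

# A small limit is used here only to make the clash visible.
print(unique_truncated(["Trust_Authorities", "Trust_Authority", "Trust"], limit=10))
# ['Trust_Auth', 'Trust_Au_1', 'Trust']
```

With the default limit of 64 the same function would leave short names such as Q3_1 untouched.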


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        "Mike Palij" <[hidden email]>
To:        Jon K Peck/Chicago/IBM@IBMUS, <[hidden email]>
Cc:        "Michael Palij" <[hidden email]>
Date:        09/24/2015 10:54 AM
Subject:        Re: Creating new variables based on variable label




Jon uses a "toy" example below.  To show an example of a real
world problem consider the following situation:  Bob Altemeyer's
Right Wing Authoritarianism (RWA) scale is a very popular instrument
in social and political psychology (Google scholar indicates that
source I cite below.has been cited 1,185 times)  Here is some SPSS
syntax describing the items where RWA1 to RWA32 are the Variable
Names and the Variable Labels are the actual items (more or less).

title "Authoritarianism (RWA) & Social Dominance Orientation
(SDO)Analyses".
** NOTE: Auth & SDO scale come from Altemeyer (1998).
** "The Other Auth Personality".
subtitle "Transforming Original Data Into Analyzable Data".

** Below is additional syntax to identify each RWA item.
** Var Labels for Authoritarianism Scales Items".

var label RWA1 "A01. Established Auths Are Right"
RWA2 "A02. Women Should Obey Their Husbands"
RWA3 "A03. Mightly Leader Needed-Destroy Radicals"
RWA4 "A04. Gays Are Healthy And Moral"
RWA5 "A05. Trust Authorities"
RWA6 "A06. Atheists As Good As Religious Folk"
RWA7 "A07. Back To Traditional Values & Tough Leaders"
RWA8 "A08. Nothing Wrong With Nudist Camps"
RWA9 "A09. We Need Free-Thinkers"
RWA10 "A10. Must Smash Perversions Destroying Our Country"
RWA11 "A11. Everyone Has Own Lifestyle & Values"
RWA12 "A12. Old-Fashioned Values Are Best"
RWA13 "A13. Admire Pro-Choice, Animal Rights, Abolish School Prayer"
RWA14 "A14. We Need Strong Determined Leader-Crush Evil"
RWA15 "A15. Best People Challenge Gov, Crit Religion"
RWA16 "A16. Strictly Follow Gods Law-Punish Violators"
RWA17 "A17. Best To Censor Trashy Magazine"
RWA18 "A18. Premarital Sex Is Okay"
RWA19 "A19. Great If We Do What Authorities Tell Us-Rid Rotten Apples"
RWA20 "A20. No ONE Right Way To Live For Everyone"
RWA21 "A21. Praise for Homosexuals & Feminists"
RWA22 "A22. Toublemakers Should Shut-Up"
RWA23 "A23. Auhtorities Should Put Out Radicals & Immoral Folks"
RWA24 "A24. Less Attention to Bible-More To Personal Standards"
RWA25 "A25. We Need Discipline-Everyone Follow The Leader"
RWA26 "A26. Better To Have Trashy Mags Than Censorship"
RWA27 "A27. Must Crack Down On Deviant Groups & Troublemakers"
RWA28 "A28. Our Rules For Modesty & Sex No Better Than Other Folks"
RWA29 "A29. Strongest Methods Should Be Used To Eliminate Troublemakers"
RWA30 "A30. Womens Place is Awhere She Wants"
RWA31 "A31. Good That Young People Have Freedom To Protest"
RWA32 "A32. Good Citizens Will Stomp Out The Rot Poisoning Our Country".

I would never substitute the variable labels for the variable names in
this situation because it would make specifying analyses very
difficult.  Consider the syntax needed to calculate Cronbach's
Alpha for the RWA scale:

reliability var=RWA3 to RWA32/
scale(Authoritarianism)=RWA3 to RWA32/
model=alpha/
stat=desc,corr,scale/
summary=all/
icc=model(oneway).

Why would one use long variable names in such a situation
or any other prodecude?

The real point I want to make is that the python code below would
not work with real data such as this because the variable labels
would have to be changed to eliminate the violations that the
labsls would have as variable names, such as:
(1) Eliminate the blank spaces (using "_" would maintin readability)
(2) Eliminate commas and other punctuation (periods are okay)
(3) Truncate the label to 60 characters.
The procedure I suggested in an earlier post (run descriptives
with the name and labels printed, edit the name+label string
in either in SPSS or Excel, and then paste the edited strings
into the names column of the variable view of the data
window) will produce the desired result.  What would have
to be added to the python code below to deal with a
situation using RWA items?

My own reading of the original post was that an instrument
was used where the variables were named Q1 etc to reflect
the ordinal position of the question in the questionnaire or
interview and the label represented the content of the question.
The output from procedures may have only provided the variable
name while both the name and label was desired. Setting the
options to print both names and labels takes care of this problem.

Perhaps the OP was unaware that one could get both the
variable name and variable label printed by setting the
appropriate option.  The simple solution then is not python
code but setting the correct options.

I suppose that Jon got more info from the OP and decided that
the Python solution is what the OP really wanted.  Perhaps.
This works only in very simple situations (i.e., labels are
single words or symbols) and should fail in more realistic situations.

-Mike Palij
New York University
[hidden email]



----- Original Message -----
From: Jon K Peck
To: [hidden email]
Sent: Thursday, September 24, 2015 10:07 AM
Subject: Re: Creating new variables based on variable label


For closure on this, after some offline discussion, here is the final
Python code.

* Encoding: UTF-8.
data list free /X1 to X9(9f1.0).
begin data
1 2 3 4 5 6 7 8 9
end data.
dataset name nine.
variable label X1 'Q1'/X2 'Q2'/X3 'Q3' /X4 'Q3'/X5 'Q3'/X6 'Q4'/X7
'Q7'/X8 'Q7'/X9 'Q8'.

begin program.
# Assumes all variables have labels and that these are legal as variable names.
# Also assumes there are no casing variations in labels that would result
# in duplicate variable names.

import spss, spssaux

vardict = spssaux.VariableDict()
labels = [(v.VariableLabel, v.VariableName, v.VariableIndex) for v in
vardict]
labels.sort(key=lambda x: x[2])
counts = {}

for item in labels:
   if item[0] in counts:
       counts[item[0]] = 1
   else:
       counts[item[0]] = 0
for i in range(len(labels)):
   label, name, order = labels[i]
   if counts[label] == 0:
       spss.Submit(r"""RENAME VARIABLES (%s=%s)""" % (name, label))
   else:
       spss.Submit(r"""RENAME VARIABLES (%s=%s_%s)""" % (name, label, counts[label]))
       counts[label] += 1
end program.
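
The numbering logic can be tried outside SPSS before touching a real file. The sketch below is plain Python with no `spss` calls; `suffixed_names` is a hypothetical helper, not part of the program above, and it only computes the new names in file order. It assumes, as the program does, that the labels are already legal variable names.

```python
from collections import Counter

def suffixed_names(labels):
    """Return new names in file order, numbering labels that occur more than once."""
    totals = Counter(labels)   # how often each label occurs overall
    seen = Counter()           # how many times each repeated label has been used so far
    names = []
    for label in labels:
        if totals[label] == 1:
            names.append(label)            # unique label: use it unchanged
        else:
            seen[label] += 1
            names.append("%s_%d" % (label, seen[label]))
    return names

print(suffixed_names(["Q1", "Q2", "Q3", "Q3", "Q4", "Q5", "Q6", "Q6", "Q7"]))
```

Each old/new pair could then be passed to RENAME VARIABLES as in the program above.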




Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        emma78 <[hidden email]>
To:        [hidden email]
Date:        09/23/2015 12:00 PM
Subject:        Re: [SPSSX-L] Creating new variables based on variable
label
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





They have to be in the Order of the var Labels ,
So if we have 4 Q3
The Order  should Look like
Q3_1
Q3_2
Q3_3
Q3_4

Thank You!



--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Creating-new-variables-based-on-variable-label-tp5730637p5730650.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Reply | Threaded
Open this post in threaded view
|

Re: Creating new variables based on variable label

John F Hall
In reply to this post by Mike

For questionnaire surveys I thoroughly agree about the importance of having meaningful var labels.  Most important is that the question number appears at the beginning of the label.  This enables researchers to find their way round the data set using the questionnaire as a guide.  It also enables researchers to find and use the variables they need, especially when using dialog boxes from the GUI.  Variable names can be confusing, especially mnemonics, but labels with the question number at the beginning enable quick scrolling to find what you want.  I use syntax for most things, but the GUI does have some useful facilities, the best of which for me when exploring data sets is Data >> Define Variable Properties.  

 

There is some coverage of this problem in the 2002 British Social Attitudes survey in my 2006 presentation to ASSESS (SPSS users in Europe)

http://surveyresearch.weebly.com/old-dog-old-tricks-using-spss-syntax-to-beat-the-mouse-trap.html especially in the third slide-show :

http://surveyresearch.weebly.com/uploads/2/9/9/8/2998485/slides_3_-_european_social_survey_2002.ppt  

 

A specific problem with the 2011 British Social Attitudes Survey (on which I wanted to base some tutorials) was that the question numbers appeared at the end of sometimes very long labels (basically almost the full question text).  To save a great deal of manual editing, Jon Peck supplied many a Python routine to move question numbers to the beginning (and a few other very complex things besides).  I used this in my 2014 presentation to ASSESS:

http://surveyresearch.weebly.com/uploads/2/9/9/8/2998485/comments_on_the_distributed_spss_file_for_british_social_attitudes_2011.pdf  and the accompanying slide-show:

http://surveyresearch.weebly.com/uploads/2/9/9/8/2998485/4_british_social_attitudes.pptx  

 

The off-list correspondence with Jon was under subject:

Modifying British Social Attitudes variable labels [with syntax or Python]

. . and I'm not sure if any of it went on Nabble.

 

Jon's final version, for which I shall be eternally grateful, is below:


title "Jon Peck's Python code for BSA 2011".
begin program.
import spss, re
from spssaux import _smartquote

for v in range(spss.GetVariableCount()):
    vname = spss.GetVariableName(v)
    vlabel = spss.GetVariableLabel(v)
    vl = []
    # Find the question number and move to front
    mo = re.match(r"(.*)(:Q)(\d+).*", vlabel)
    if mo is not None:
        vl.append("Q." + mo.group(3) + ":  ")
        vl.append(mo.group(1))
        hasq = True
    else:  # no Q-style question number.  Check for multiple questions
        hasq = False
        mo = re.match(r"(.*)(a2\..*)", vlabel, flags=re.I)
        if mo is not None:   # multiple q's
            vl.append(mo.group(2) + ":  ")
            vl.append(mo.group(1))
        mo = re.match(r"(.*)(b2\..*)", vlabel, flags=re.I)
        if mo is not None:   # multiple q's
            vl.append(mo.group(2) + ":  ")
            vl.append(mo.group(1))
        if len(vl) == 0:
            vl.append("")
            vl.append(vlabel)
    # capitalize first letter of label excluding the Q number
    vl[-1] = vl[-1][0].upper() + vl[-1][1:]
    # find freestanding "dv"
    mo = re.search(r"(.*)(\bdv\b)(.*)", vl[1], flags=re.I)
    if mo is not None:
        if hasq:
            vlabel = vl[0] + "(dv) " + mo.group(1)
        else:
            if vl[0] != "":
                vl[0] = "(dv) " + vl[0]
                vlabel = vl[0] + mo.group(1) + mo.group(3)
            else:
                vlabel = "(dv) " + mo.group(1) + mo.group(3)
    else:
        vlabel = vl[0] + vl[1]
    spss.Submit("""variable label %s %s.""" % (vname, _smartquote(vlabel)))
end program.
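
For anyone wanting to see what the first regular expression does without running SPSS, here is a minimal sketch of just that step in plain Python. `move_question_number` is a hypothetical helper, and the sample label is invented for illustration (the real BSA labels may be laid out differently); unlike the program above, it strips the trailing blank left behind when the number is moved.

```python
import re

def move_question_number(label):
    """Move a trailing ':Q<number>' to the front of the label as 'Q.<number>:  '."""
    mo = re.match(r"(.*)(:Q)(\d+).*", label)   # same pattern as in the program above
    if mo is None:
        return label                           # no question number found: leave as is
    return "Q.%s:  %s" % (mo.group(3), mo.group(1).strip())

print(move_question_number("how satisfied with the NHS :Q215"))
```

The "dv" handling and the capitalisation step are left out here to keep the example short.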


 

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

 

 

Reply | Threaded
Open this post in threaded view
|

Re: Creating new variables based on variable label

Mike
John,
I don't doubt that Python can be used to correct the problems
you describe below or more complex situations including the
now infamous "variable labels to variable names affair".  But
just look at the Python code you used.  If Jon were not available,
how long would it take for you to learn Python well enough to
be able to write something equivalent? What if you had to pay
a Python programmer to do this for you -- could you afford
to do so?
 
Jon has been very generous with his time and knowledge
and providing Python solutions pro bono. If not for this list, where
would you find someone with these skills and how would you
pay for them?
 
All this points to asking how can you use the native abilities
of SPSS to solve these types of problems. In some cases,
Python might be the only solution but what if Jon decides to
quit SPSS tomorrow to pursue other activities (maybe he
wants to explore becoming a smooth jazz performer) who
is going to provide the Python code for free?  And how
many new computer languages do you want to learn before
you go off to the great computer lab in the sky?
 
-Mike Palij
New York University
 
Reply | Threaded
Open this post in threaded view
|

Re: Creating new variables based on variable label

John F Hall

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mike Palij
Sent: 24 September 2015 22:17
To: [hidden email]
Subject: Re: Creating new variables based on variable label

 

Mike

 

Replies embedded.

 

John

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

John,

I don't doubt that Python can be used to correct the problems

you describe below or more complex situations including the

now infamous "variable labels to variable names affair".  But

just look at the Python code you used.  If Jon were not available,

how long would it take for you to learn Python well enough to

be able to write something equivalent?

 

>> Forever.  I do actually have R: I took one look and ran, but may one day check it out with the new GUI. It’s not a priority to learn it, let alone master it, until my SPSS tutorials are complete.  However, it’s not entirely unlike Algol and I did actually manage to tweak a bit of Jon’s code myself.

 

What if you had to pay a Python programmer to do this for you -- could you afford

to do so?

 

>> No

 

Jon has been very generous with his time and knowledge

and providing Python solutions pro bono. If not for this list, where

would you find someone with these skills . .

 

>> Nowhere else, but there are other listers from SPSS-X and ASSESS who occasionally help with queries.

 

. . and how would you pay for them?

>> I’m on a fractional early retirement pension: I couldn’t.

 

All this points to asking how can you use the native abilities of SPSS to solve these types of problems.

 

>> By designing and writing decent *.sps and *.sav files in the first place.  I also use standard SPSS procedures to clean up after other people who create, and deposit for archiving, messy (and illiterate) *.sav files, often accompanied by poor or not easily usable documentation, if any.

 

In some cases, Python might be the only solution but what if Jon decides to

quit SPSS tomorrow to pursue other activities (maybe he wants to explore becoming a smooth jazz performer) who is going to provide the Python code for free?

 

>>  This was a one off, and of great value for a number of major surveys used in teaching and (secondary) research.  Jon actually volunteered the original code in response to a list request from me.  There followed some tweaking to take account of some complex question structures in the BSA ( basically how to deal with derived variables and with different question numbers depending on which questionnaire version was used on overlapping subsamples) but there are also versions which cope with other major surveys such as NORC GSS, which for some reason used UPPER CASE for all labels, and others which start all labels with a lower case letter.  If Jon is anything like me, he probably enjoys tackling new problems which he knows much better than others how to solve.

 

Since setting up my website, I have provided hundreds of hours of advice and support to students and researchers, plus months and months of research and development for tutorials on my website and for presentations to ASSESS.  I have never either requested or received any payment for this service, and do not intend to.  (I’d lose my free licence as an academic author!).  I do it because I love it, as I hope, does Jon.

 

And how many new computer languages do you want to learn . .

 

>>  Think I’ve had my fill with Algol and Atlas Autocode (Salford Survey Suite: 1964 – 1970) which was just like writing Latin prose.   Intense programming experience very useful for getting to grips with later off-the-shelf software.  SPSS, SDTAB, MUTOS  and JCL for CDC2900 (1970 – 76) , SPSS and George (ICL1900 and 2900 series: 1976 - ??) SPSS-X, VMS, EDT (Dec-10 , Vax Cluster: 1987 – 1993)  WordStar (PC: 1987 – 2001) SPSSPC+ (short-lived as useless for teaching) SPSS 11 – 23 for Windows, Windows, MS Office (Word, Powerpoint, Excel) on PC (2001 to date). 

 

. . before you go off to the great computer lab in the sky?

 

> > I’ll be 75 in December: after 23 years of (early) retirement, this keeps my gray cells active.  No plans to pop my clogs just yet, but you never know.

 

Rugby World Cup recordings beckon.

 

-Mike Palij

New York University

 

-----Original Message-----

From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Mike Palij

Sent: 24 September 2015 18:55

To: [hidden email]

Subject: Re: Creating new variables based on variable label

 

Jon uses a "toy" example below.  To show an example of a real-world problem, consider the following situation:  Bob Altemeyer's Right Wing Authoritarianism (RWA) scale is a very popular instrument in social and political psychology (Google Scholar indicates that the source I cite below has been cited 1,185 times).  Here is some SPSS syntax describing the items, where RWA1 to RWA32 are the variable names and the variable labels are the actual items (more or less).

 

title "Authoritarianism (RWA) & Social Dominance Orientation (SDO) Analyses".
** NOTE: Auth & SDO scale come from Altemeyer (1998).
** "The Other Auth Personality".
subtitle "Transforming Original Data Into Analyzable Data".

** Below is additional syntax to identify each RWA item.
** Var Labels for Authoritarianism Scales Items.

var label RWA1 "A01. Established Auths Are Right"
RWA2 "A02. Women Should Obey Their Husbands"
RWA3 "A03. Mighty Leader Needed-Destroy Radicals"
RWA4 "A04. Gays Are Healthy And Moral"
RWA5 "A05. Trust Authorities"
RWA6 "A06. Atheists As Good As Religious Folk"
RWA7 "A07. Back To Traditional Values & Tough Leaders"
RWA8 "A08. Nothing Wrong With Nudist Camps"
RWA9 "A09. We Need Free-Thinkers"
RWA10 "A10. Must Smash Perversions Destroying Our Country"
RWA11 "A11. Everyone Has Own Lifestyle & Values"
RWA12 "A12. Old-Fashioned Values Are Best"
RWA13 "A13. Admire Pro-Choice, Animal Rights, Abolish School Prayer"
RWA14 "A14. We Need Strong Determined Leader-Crush Evil"
RWA15 "A15. Best People Challenge Gov, Crit Religion"
RWA16 "A16. Strictly Follow Gods Law-Punish Violators"
RWA17 "A17. Best To Censor Trashy Magazine"
RWA18 "A18. Premarital Sex Is Okay"
RWA19 "A19. Great If We Do What Authorities Tell Us-Rid Rotten Apples"
RWA20 "A20. No ONE Right Way To Live For Everyone"
RWA21 "A21. Praise for Homosexuals & Feminists"
RWA22 "A22. Troublemakers Should Shut-Up"
RWA23 "A23. Authorities Should Put Out Radicals & Immoral Folks"
RWA24 "A24. Less Attention to Bible-More To Personal Standards"
RWA25 "A25. We Need Discipline-Everyone Follow The Leader"
RWA26 "A26. Better To Have Trashy Mags Than Censorship"
RWA27 "A27. Must Crack Down On Deviant Groups & Troublemakers"
RWA28 "A28. Our Rules For Modesty & Sex No Better Than Other Folks"
RWA29 "A29. Strongest Methods Should Be Used To Eliminate Troublemakers"
RWA30 "A30. Womens Place is Where She Wants"
RWA31 "A31. Good That Young People Have Freedom To Protest"
RWA32 "A32. Good Citizens Will Stomp Out The Rot Poisoning Our Country".

 

I would never substitute the variable labels for the variable names in this situation because it would make specifying analyses very difficult.  Consider the syntax needed to calculate Cronbach's Alpha for the RWA scale:

 

reliability var=RWA3 to RWA32/
scale(Authoritarianism)=RWA3 to RWA32/
model=alpha/
stat=desc,corr,scale/
summary=all/
icc=model(oneway).

 

Why would one use long variable names in such a situation, or in any other procedure?

 

The real point I want to make is that the Python code below would not work with real data such as this, because the labels would first have to be edited to remove everything that makes them invalid as variable names:

(1) Eliminate the blank spaces (using "_" would maintain readability)
(2) Eliminate commas and other punctuation (periods are okay)
(3) Truncate the label to 60 characters.

The procedure I suggested in an earlier post (run descriptives with the names and labels printed, edit the name+label string in either SPSS or Excel, and then paste the edited strings into the names column of the variable view of the data window) will produce the desired result.  What would have to be added to the Python code below to deal with a situation using RWA items?
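
One possible answer to that question, as a sketch only: the three steps listed above can be applied to each label before the names are made unique with numeric suffixes and a RENAME VARIABLES command is built. The helper names (`label_to_name`, `unique_names`, `rename_command`) are invented for this illustration and are not part of the `spss` or `spssaux` modules. The sketch assumes plain ASCII labels, treats hyphens like blanks, and does not check for reserved words (ALL, BY, TO, WITH and so on) or for a suffixed name colliding with an existing one.

```python
import re
from collections import Counter

MAX_LEN = 60  # limit suggested in the post; SPSS itself allows names up to 64 bytes


def label_to_name(label, max_len=MAX_LEN):
    """Turn a variable label into a candidate variable name (illustrative)."""
    # (2) drop commas and other punctuation; keep letters, digits, "_", "." and,
    #     for the moment, blanks and hyphens
    cleaned = re.sub(r"[^A-Za-z0-9_.\s-]", "", label)
    # (1) replace runs of blanks or hyphens with a single underscore
    name = re.sub(r"[\s-]+", "_", cleaned.strip())
    # a name has to start with a letter, so prefix "V" when it does not
    if not name or not name[0].isalpha():
        name = "V" + name
    # (3) truncate, then avoid ending the name with "." or "_"
    return name[:max_len].rstrip("._")


def unique_names(labels, max_len=MAX_LEN):
    """Add _1, _2, ... to names that occur more than once (case-insensitive)."""
    names = [label_to_name(lab, max_len) for lab in labels]
    counts = Counter(n.upper() for n in names)
    seen = Counter()
    result = []
    for n in names:
        key = n.upper()
        if counts[key] > 1:
            seen[key] += 1
            n = "%s_%d" % (n, seen[key])
        result.append(n)
    return result


def rename_command(old_names, labels):
    """Build one RENAME VARIABLES command from current names and labels."""
    new_names = unique_names(labels)
    return "RENAME VARIABLES (%s = %s)." % (" ".join(old_names), " ".join(new_names))


# The labels from the original question in this thread
print(unique_names(["Q1", "Q2", "Q3", "Q3", "Q4", "Q5", "Q6", "Q6", "Q7"]))
# One of the RWA labels
print(label_to_name("A07. Back To Traditional Values & Tough Leaders"))
```

Inside a BEGIN PROGRAM block, the current names and labels could be collected with `spss.GetVariableName` and `spss.GetVariableLabel`, as in the program posted earlier, and the resulting command string passed to `spss.Submit`.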

 

My own reading of the original post was that an instrument was used where the variables were named Q1 etc. to reflect the ordinal position of the question in the questionnaire or interview, and the label represented the content of the question.  The output from procedures may have provided only the variable name, while both the name and the label were desired.  Setting the options to print both names and labels takes care of this problem.

 

Perhaps the OP was unaware that one could get both the variable name and variable label printed by setting the appropriate option.  The simple solution then is not python code but setting the correct options.

 

I suppose that Jon got more info from the OP and decided that the Python solution is what the OP really wanted.  Perhaps.

This works only in very simple situations (i.e., labels are single words or symbols) and should fail in more realistic situations.

 

-Mike Palij

New York University

[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Re: Creating new variables based on variable label

Bruce Weaver
Administrator
OT:

John, be sure to cheer for Canada tomorrow (vs the Italians).  We need all the help we can get, after that first match against the Irish!  


John F Hall wrote
--- snip ---
 
Rugby World Cup recordings beckon.

--- snip ---
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Creating new variables based on variable label

John F Hall
Bruce

As long as I don't make too big a hole in my supplies of Hen and Abbott
before the following South Africa - Samoa and England - Wales matches.
The French government has banned sales of Hen and Abbott in cans: something
to do with carcinogens in the ink used to print the outsides.  Luckily Tesco
had both on special offer in UK last week at £1 a can: I've got 2.5 cases of
Abbott and a case of Hen to last the whole series (one case = 24 x 500 ml
cans).  At this rate it will be a while before I finish the fun/horror
SPSSpuds-U-Like tutorial based on my 2015 potato crop (six varieties grown
on three different plots, yield from each plant photographed and weighed,
well mostly).  I'll send you a draft when it's ready for public consumption,
but not under this subject header.

John

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 25 September 2015 17:39
To: [hidden email]
Subject: Re: Creating new variables based on variable label

OT:

John, be sure to cheer for Canada tomorrow (vs the Italians).  We need all
the help we can get, after that first match against the Irish!  



John F Hall wrote
> --- snip ---
>  
> Rugby World Cup recordings beckon.
>
> --- snip ---





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Creating-new-variables-based-on-variable-label-tp5730637p5730672.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
