I just ran Jon’s previous Python code, written to move question numbers to the beginning of var labels in the 2011 British Social Attitudes survey, on a subset from the 2004 survey. This was on a teaching file with only 49 variables as used in Marsh and Elliott “Exploring Data” (2nd edition, Polity, 2008).

title 'Python code to modify BSA variable labels (Jon Peck, IBM/SPSS, 2013)'.
begin program.
import spss,re
from spssaux import _smartquote

for v in range(spss.GetVariableCount()):
    vname = spss.GetVariableName(v)
    vlabel = spss.GetVariableLabel(v)
    vl = []
    # Find the question number and move to front
    mo = re.match(r"(.*)(:Q)(\d+).*", vlabel)
    if not mo is None:
        vl.append("Q." + mo.group(3) + ": ")
        vl.append(mo.group(1))
        hasq = True
    else:
        # no Q-style question number. Check for multiple questions
        hasq = False
        mo = re.match(r"(.*)(a2\..*)", vlabel, flags=re.I)
        if not mo is None:  # multiple q's
            vl.append(mo.group(2) + ": ")
            vl.append(mo.group(1))
        mo = re.match(r"(.*)(b2\..*)", vlabel, flags=re.I)
        if not mo is None:  # multiple q's
            vl.append(mo.group(2) + ": ")
            vl.append(mo.group(1))
    if len(vl) == 0:
        vl.append("")
        vl.append(vlabel)
    # capitalize first letter of label excluding the Q number
    vl[-1] = vl[-1][0].upper() + vl[-1][1:]
    # find freestanding "dv"
    mo = re.search(r"(.*)(\bdv\b)(.*)", vl[1], flags=re.I)
    if not mo is None:
        if hasq:
            vlabel = vl[0] + "(dv) " + mo.group(1)
        else:
            if vl[0] != "":
                vl[0] = "(dv) " + vl[0]
                vlabel = vl[0] + mo.group(1) + mo.group(3)
            else:
                vlabel = "(dv) " + mo.group(1) + mo.group(3)
    else:
        vlabel = vl[0] + vl[1]
    spss.Submit("""variable label %s %s.""" % (vname, _smartquote(vlabel)))
end program.

The initial file has:

Country: England, Scotland or Wales? Q28
Sex of Respondent Q39
Respondent's age in years Q40
People can be trusted/can't be too careful?A2.13
NS-SEC - long version Q519
Respondent's main economic activity last week? Q539
Terminal education age<categorised> Q766

The Python worked on some:

Q.28: Country: England, Scotland or Wales?
Q.39: Sex of Respondent
Q.40: Respondent's age in years
A2.13: People can be trusted/can't be too careful?

but not others, e.g.:

NS-SEC - long version Q519
Respondent's main economic activity last week? Q539
Terminal education age<categorised> Q766
Respondent give money to charity how often? B619
Respondent gives how much to charity per year B620
Party political identification (compressed) dv Q211

I can modify the subset by hand, but the main file has over 800 variables. I’ve tried some clumsy modifications to the Python, but none of them seem to work.

Help!

John F Hall (Mr) [Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/1-survey-analysis-workshop
|
Administrator
|
John,
Maybe time for you to do a crash course on regular expressions so you at least understand the code before doing clumsy modifications? D
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by John F Hall
John,
The specifications for that code said that the question number was preceded by a colon. So the matching expression includes (:Q). These labels don't have the colon, so remove it from the search expression here:

mo = re.match(r"(.*)(:Q)(\d+).*", vlabel)
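The effect of that colon can be checked outside SPSS with nothing but the standard `re` module. The following is only a sketch: the helper name `move_q_number` and the pattern `OPTIONAL_COLON` are hypothetical, not part of the original program, and the sketch assumes the question number always sits at the very end of the label. It makes the colon optional (`:?`) rather than removing it, so labels with and without the colon are both handled.

```python
import re

# Pattern from the original program: requires a colon directly before the Q.
WITH_COLON = re.compile(r"(.*)(:Q)(\d+).*")

# Hypothetical variant: the colon is optional (":?") and the number is
# assumed to be the last thing in the label (anchored with "$").
OPTIONAL_COLON = re.compile(r"(.*?)\s*:?Q(\d+)\s*$")


def move_q_number(label):
    """Return the label with a trailing Q number moved to the front."""
    mo = OPTIONAL_COLON.match(label)
    if mo is None:
        return label  # no trailing Q number: leave the label unchanged
    return "Q.%s: %s" % (mo.group(2), mo.group(1))


# Labels copied from the post above.
for label in ["NS-SEC - long version Q519",
              "Terminal education age<categorised> Q766",
              "Respondent gives how much to charity per year B620"]:
    print("%-52s -> %s" % (label, move_q_number(label)))
```

The first two labels come back with the number in front, while the `B620` label is returned unchanged: it contains no `Q` number, and the `a2\.` and `b2\.` patterns in the program do not match it either, so codes of that kind would need a pattern of their own.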
|
I did that, but it still didn’t work. Leave it for tonight and I’ll try again in the morning.
|
In reply to this post by Jon K Peck
John,
You might want to consider replacing the last line:

spss.Submit("""variable label %s %s.""" % (vname, _smartquote(vlabel)))

with this (indented to the same level as the line it replaces):

with open(outfile, "ab") as f:
    f.write("variable label %s %s.%s" % (vname, _smartquote(vlabel), os.linesep))

but put this at the beginning of the BEGIN PROGRAM block first, so that the file name is set once, before the loop starts:

import os, tempfile, time
outfile = os.path.join(tempfile.gettempdir(), "syntax_" + time.strftime("%Y-%m-%d_%Hh%Mm%Ss") + ".sps")

and this after the loop:

print " ---> Done! Syntax written: '%s'" % outfile

That way, the generated syntax is written to the computer's temporary directory, e.g. as 'syntax_2014-02-15_14h18m20s.sps'. You can open it and, if necessary, fine-tune it manually. You can/should of course keep a copy of the syntax for future reference. That way, the Python program will do most of the work, but all the rare exceptions that would make the program overly complicated are handled by you.

Regards,
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
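The same write-to-file idea can be tried without SPSS. This is a minimal sketch, not the code from the post: `quote_label` is a hypothetical stand-in for `spssaux._smartquote` (it simply doubles embedded single quotes), `write_label_syntax` is a made-up helper name, and the variable names `rsex` and `rage` are invented for illustration. All statements are collected first and written in one go, so the file is opened only once.

```python
import os
import tempfile
import time


def quote_label(text):
    # Hypothetical stand-in for spssaux._smartquote: wrap the label in
    # single quotes and double any single quotes inside it.
    return "'" + text.replace("'", "''") + "'"


def write_label_syntax(labels, directory=None):
    """Write one VARIABLE LABEL statement per (name, label) pair.

    Returns the path of the generated .sps file.
    """
    if directory is None:
        directory = tempfile.gettempdir()
    stamp = time.strftime("%Y-%m-%d_%Hh%Mm%Ss")
    outfile = os.path.join(directory, "syntax_" + stamp + ".sps")
    with open(outfile, "w") as f:
        for vname, vlabel in labels:
            f.write("variable label %s %s.\n" % (vname, quote_label(vlabel)))
    return outfile


# Invented variable names; the labels are taken from the thread.
path = write_label_syntax([
    ("rsex", "Q.39: Sex of Respondent"),
    ("rage", "Q.40: Respondent's age in years"),
])
print("Syntax written to: %s" % path)
```

The generated file can then be opened in the syntax editor, corrected by hand for the awkward labels, and run as ordinary syntax.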
|
In reply to this post by David Marso
I concur. Especially in regular expressions, even one character can make a big difference.
These two resources are very good, even though the first one is about Python 3. It is a sample chapter from a book by Mark Summerfield:
http://ptgmedia.pearsoncmg.com/images/9780321680563/samplepages/0321680561_Sample.pdf
http://docs.python.org/2/howto/regex.html
Regards,
Albert-Jan
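Two small illustrations of how much one character matters, using only the standard `re` module. The test strings `"See Q12 and Q34"` and `"data2x"` are invented for the purpose and do not come from the survey file.

```python
import re

label = "See Q12 and Q34"  # invented label containing two Q numbers

# Greedy "(.*)" swallows as much as it can, so the LAST Q number is captured.
greedy = re.match(r"(.*)Q(\d+)", label)
# Adding a single "?" makes it lazy, so the FIRST Q number is captured.
lazy = re.match(r"(.*?)Q(\d+)", label)
print(greedy.group(2))  # 34
print(lazy.group(2))    # 12

# An escaped dot matches only a literal full stop ...
escaped = re.search(r"a2\.", "data2x", flags=re.I)
# ... while an unescaped dot matches any character.
unescaped = re.search(r"a2.", "data2x", flags=re.I)
print(escaped)            # None
print(unescaped.group())  # a2x
```

The second pair is the reason the program writes `a2\.` rather than `a2.`: without the backslash, any label that happens to contain "a2" followed by another character would be treated as a question number.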