There has been a lot of activity off-list since I first posted this problem. Jon Peck is an absolute star, patiently bearing with my stream of label modification requests.

All the variables in the SPSS file for BSA 2011 distributed by UKDA have the question number at the end of the label. Many of the labels are inordinately long, and users cannot see the question number without vastly widening the Label column.

BEFORE:

Serial Number :Q1
Sample point :Q9
Stratification ID
Person 2 relationship to Respondent <8 categories> :Q57
Person 2 relationship to Respondent <7 categories> DV :Q58
Consider your life in general these days how happy or unhappy you are A2.1.
How much confidence in the Educational system in Britain A2.2a.
How much confidence in the Health care system in Britain A2.2b.
height in centimeters A2.27a.
weight in kilograms. dv A2.27b.
How comfortable having close relative in a relationship with someone who grew up in a Muslim country C2.8.
Censorship of films and magazines is necessary to uphold moral standards A2.49fB2.26fC2.25f.

It’s much easier to navigate SPSS files from questionnaire surveys when the question number is at the beginning of a label, so that it can clearly be seen in the default Variable View. Jon’s eventual version works a treat and is worth sharing:

begin program.
import spss,re
from spssaux import _smartquote

for v in range(spss.GetVariableCount()):
    vname = spss.GetVariableName(v)
    vlabel = spss.GetVariableLabel(v)
    vl = []
    # Find the question number and move to front
    mo = re.match(r"(.*)(:Q)(\d+).*", vlabel)
    if not mo is None:
        vl.append("Q." + mo.group(3) + ": ")
        vl.append(mo.group(1))
        hasq = True
    else:
        # no Q-style question number. Check for multiple questions
        hasq = False
        mo = re.match(r"(.*)(a2\..*)", vlabel, flags=re.I)
        if not mo is None:  # multiple q's
            vl.append(mo.group(2) + ": ")
            vl.append(mo.group(1))
        mo = re.match(r"(.*)(b2\..*)", vlabel, flags=re.I)
        if not mo is None:  # multiple q's
            vl.append(mo.group(2) + ": ")
            vl.append(mo.group(1))
        mo = re.match(r"(.*)(c2\..*)", vlabel, flags=re.I)
        if not mo is None:  # multiple q's
            vl.append(mo.group(2) + ": ")
            vl.append(mo.group(1))
    if len(vl) == 0:
        vl.append("")
        vl.append(vlabel)
    # capitalize first letter of label excluding the Q number
    vl[-1] = vl[-1][0].upper() + vl[-1][1:]
    # find freestanding "dv"
    mo = re.search(r"(.*)(\bdv\b)(.*)", vl[1], flags=re.I)
    if not mo is None:
        if hasq:
            vlabel = vl[0] + "(dv) " + mo.group(1)
        else:
            if vl[0] != "":
                vl[0] = "(dv) " + vl[0]
                vlabel = vl[0] + mo.group(1) + mo.group(3)
            else:
                vlabel = "(dv) " + mo.group(1) + mo.group(3)
    else:
        vlabel = vl[0] + vl[1]
    spss.Submit("""variable label %s %s.""" % (vname, _smartquote(vlabel)))
end program.

AFTER:

Q.1: Serial Number
Q.9: Sample point
Stratification ID
Q.14: (dv) Population Density Quartiles
Q.57: Person 2 relationship to Respondent <8 categories>
Q.58: (dv) Person 2 relationship to Respondent <7 categories>
A2.1.: Consider your life in general these days how happy or unhappy you are
A2.2a.: How much confidence in the Educational system in Britain
A2.2b.: How much confidence in the Health care system in Britain
A2.27a.: Height in centimeters
(dv) A2.27b.: Weight in kilograms.
A2.49fB2.26fC2.25f.: Censorship of films and magazines is necessary to uphold moral standards
C2.8.: How comfortable having close relative in a relationship with someone who grew up in a Muslim country

All question numbers (where they exist) have been moved to the beginning of the labels, with a stop inserted after Q and a colon and space after the number. All original upper case letters are retained, and a lower case letter at the beginning of the label text (after the question number) is converted to upper case. Any free-standing “dv” or “DV” is removed from its original position, enclosed in brackets, and placed just after the question number in Q-format labels or at the very beginning of labels in other formats.

Truly a silk purse out of a pig’s ear!

There are 30 annual surveys in the series and they all have the same file structure. The Python code can now hopefully be applied to all of them as well.

Thanks a million, Jon

John F Hall (Mr)
[Retired academic survey researcher]

Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/spss-without-tears.html

PS I’ve finished correcting the measurement levels and am now modifying the missing values in good ol’ syntax, with a bit of help from Data > Define Variable Properties for a quick check on value labels (so far I’ve found 58 unique combinations of values, many with 4 or more per variable). At least I can do something by myself.
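For anyone who wants to try the basic idea on a few label strings before running anything inside SPSS, here is a minimal sketch in plain Python. It is not Jon's program: the helper name move_q_to_front and the pattern Q_SUFFIX are made up for illustration, it only handles the ":Qnn" style of question number (not the A2./B2./C2. styles or the "dv" marker), and unlike the pattern in the program above it anchors the question number to the end of the label and trims the surrounding spaces.

```python
import re

# Hypothetical pattern (an assumption for this sketch): label text,
# optional spaces, then ":Q" plus digits at the very end of the label.
Q_SUFFIX = re.compile(r"^(.*?)\s*:Q(\d+)\s*$")


def move_q_to_front(label):
    """Move a trailing ':Qnn' to the front of the label as 'Q.nn: '."""
    mo = Q_SUFFIX.match(label)
    if mo is None:
        # No Q-style question number: leave the label unchanged.
        return label
    return "Q.%s: %s" % (mo.group(2), mo.group(1))


if __name__ == "__main__":
    samples = [
        "Serial Number :Q1",
        "Sample point :Q9",
        "Stratification ID",
    ]
    for label in samples:
        print(move_q_to_front(label))
```

Because the function works on ordinary strings, it can be checked against a handful of labels copied from Variable View before any VARIABLE LABEL commands are submitted to the real file.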