It is commendable that data and documentation for major surveys are eventually placed in the public domain in archives such as UKDS, the Roper Center and ICPSR: it should be, and mostly is, mandatory for those conducted using public funds. For the last few weeks I have been examining the published SPSS *.sav files for survey series such as the NORC General Social Survey (GSS), British Social Attitudes (BSA), Understanding Society (previously British Household Panel Survey) and the European Social Survey (ESS), plus one or two single surveys such as Quality of American Life (Campbell et al, ISR Michigan, 1971), Quality of Life in Britain (Abrams and Hall 1971-75) and the recent ONS National Well-being survey.

All these surveys have different conventions for naming and labelling variables, for labelling values and for specifying missing values. Most use (conventions based on) older versions of SPSS, including upper case (or lower case) throughout and (now obsolete) limitations on the number of characters used. Some use inordinately long labels with all the crucial information at the end, or use the full question text “to make tables easier to read”. Some are internally inconsistent in missing value specifications (if indeed there are any) and most have some quite bizarre text in the labelling. Some seem to be the result of (not very successful) automatic conversion from other software (Stata? OSIRIS?); some have clearly been constructed by (perhaps several) inexperienced people, possibly without adequate supervision.

Whilst this may have been acceptable during time-constrained writing up by the original investigators of research reports, journal articles and books, it is in my opinion unprofessional to release such files into the public domain to be used for teaching and/or secondary analysis by people unconnected with the original investigation.
This means that I have to spend many hours scrutinising the files and making necessary amendments, such as specifying measurement levels, checking missing values and much else, to bring them up to the stringent (and possibly pedantic) professional standards I require even for publication, let alone for teaching and/or secondary analysis.

I have now had a chance to look at the (very satisfactory) results of Jon’s various Python codes on all these surveys, but wonder if there’s an easy way to specify in advance, depending on the survey, a list of standard abbreviations which should remain in, or be converted to, upper case (e.g. U.S., USA, U.S.A, SMSA, GSS, UK, U.K., PSU, DK, IAP etc.) before being fed into Python (it’s probably too much to ask for a spellcheck on words such as england). Variable labels can be dealt with by copy-pasting the Label column from Variable View to Word and using CTRL+H to change strings such as 1St, 2Nd., but this doesn’t leave an audit trail or any syntax. Value labels are a different proposition altogether, unless the Define Variable Properties (DVP) facility is used to modify them and other properties (but have you seen the syntax that generates from PASTE?).

However, even when all that work has been done, there’s no guarantee that the improved files will be approved by the original investigators, or adopted and made available by the distributors. In order to use data from British Social Attitudes 2011 or Understanding Society 2012 in current and planned new tutorials, I am now completely at the mercy of the respective original investigators, UKDS licensing rules and UKDS response times, even for quite small extracts, examples and exercises, before I can upload them to my site. At 73 in a few days, a month is a very long time in (what’s left of) my life.
John F Hall (Mr) [Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/spss-without-tears.html
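The question above about a per-survey list of abbreviations could be approached along the following lines. This is only a minimal sketch using the Python standard library: the dictionary `ABBREVIATIONS`, its survey keys and the function names `build_pattern` and `fix_label` are hypothetical, and the code assumes the labels have already been extracted from the file as plain strings (it does not read or write a *.sav file itself).

```python
import re

# Hypothetical per-survey lists; the entries are the examples given in the post.
ABBREVIATIONS = {
    "GSS": ["U.S.A.", "U.S.A", "U.S.", "USA", "SMSA", "GSS", "PSU", "DK", "IAP"],
    "BSA": ["U.K.", "UK", "PSU", "DK"],
}

# Ordinals such as 1St or 2Nd, which title-casing tends to produce.
ORDINAL = re.compile(r"\b(\d+)(st|nd|rd|th)\b", re.IGNORECASE)


def build_pattern(abbrevs):
    """Compile one case-insensitive pattern matching any listed abbreviation."""
    # Longest first, so that "U.S.A." is tried before "U.S."
    ordered = sorted(abbrevs, key=len, reverse=True)
    alternatives = "|".join(re.escape(a) for a in ordered)
    # The lookarounds stop matches inside longer words (e.g. "du" in "dusk").
    return re.compile(
        r"(?<![A-Za-z0-9])(?:" + alternatives + r")(?![A-Za-z0-9])",
        re.IGNORECASE,
    )


def fix_label(label, abbrevs):
    """Upper-case listed abbreviations and lower-case ordinal suffixes."""
    label = build_pattern(abbrevs).sub(lambda m: m.group(0).upper(), label)
    label = ORDINAL.sub(lambda m: m.group(1) + m.group(2).lower(), label)
    return label


print(fix_label("Size Of Smsa In U.s. - 1St Mention (Dk/Iap)", ABBREVIATIONS["GSS"]))
# prints: Size Of SMSA In U.S. - 1st Mention (DK/IAP)
```

Keeping the list in one place per survey means the same corrections can be re-run on every wave, and the list itself serves as a record of what was changed.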
"but have you seen the syntax that generates from PASTE?)."
Yes. AFAICT it is simply normal SPSS syntax for VARIABLE LABELS, VALUE LABELS, VARIABLE TYPE etc. Is that a problem? Why do you feel compelled to do mop up on these poorly done surveys? I would think you have better things to do with your retirement than salvage OPs work. Pick up that cross and keep marching up the hill. Left right left right.... Alternatively, learn enough Python or Basic scripting to be able to modify the code yourself?
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
David,

The surveys I am dealing with are used extensively in teaching and for secondary analysis, not only in the UK, but also in Europe and the USA. They are not "poorly done surveys": they are models of research design, substantive content and methodological rigour. It's only the SPSS files that are not always produced to the same high professional standards. I always taught my students that producing a well designed and documented SPSS saved file should be second nature and a required professional standard, both for their own further use and for use by others.

Yes, of course the DVP produces normal syntax: the only trouble is it produces a separate set of commands for each variable, repeating:

    VARIABLE LEVEL
    MISSING VALUES
    VARIABLE LABELS
    VALUE LABELS

for each of the variables modified. I sometimes use DVP to do this, but always tidy up the syntax later, including abbreviating commands.

In fact, curiosity led me yesterday to experiment with CTRL+H inside the highlighted Labels and Values columns in Variable View. In a few minutes I was able to restore abbreviations (smsa, du, psu, u.s. etc) to upper case throughout both the Labels and Values columns. The conversion was also applied to "hidden" value labels: brilliant! Worth a tutorial in its own right. Never too old to learn something new!

Jon's Python codes can be applied to all surveys in a series, not just a single wave. I looked at Python with a view to learning it, but decided that it would take me far too long. I'm not a statistical computing expert, just a “sociologist of sorts” with a background in survey research and with an enviable record of research and teaching, including producing a generation of researchers at a time when empirical and quantitative research was under sustained and misguided attack in the UK, on one side from the Thatcher government, on the other from academic sociologists, including some of my own colleagues.
My time is better spent producing high quality learning materials for the next such generation. If you want to know how serious the problem has been in the UK, have a look at a new book:

Geoff Payne and Malcolm Williams [Eds] Teaching Quantitative Methods: getting the basics RIGHT (Sage, 2011)

Malcolm is Director of the School of Social Sciences at Cardiff: in the late 1980s he learned SPSS and survey analysis (and did a spell as a temp checking and coding questionnaires on a real survey) as one of my undergraduate students. The book is a compilation of specially commissioned chapters reporting on undergraduate QM teaching initiatives across a range of disciplines and represents an important development in remedying the lack of QM qualified social scientists in the UK. These initiatives are part of a programme funded by the (UK) Economic and Social Research Council, the Joseph Rowntree Foundation and the Higher Education Funding Council, with Prof John Macinnes (Sociology, Edinburgh) as ESRC Strategic Advisor on undergraduate QM teaching.

I like to think that my efforts are not to "salvage OPs work", but to make SPSS files based on important survey data (and heavily used in teaching) more accessible, understandable and attractive and much easier to use. As such these represent my own small contribution to QM teaching initiatives in particular and to social science research in general.

Now what should I do: learn Python, go and pick the rest of my apples, catch up on my huge backlog of films recorded from TV, or turn a few more pigs' ears into silk purses?

John (aka Sisyphus)
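The manual tidy-up of DVP output described in the reply above (one set of commands per variable, consolidated afterwards) can be sketched in Python as a small generator of SPSS syntax, which also leaves the audit trail that editing in Word does not. This is an illustrative sketch only: the variable names, labels and the layout of the `props` dictionary are hypothetical, value codes are assumed to be numeric, and the output is ordinary VARIABLE LABELS, VARIABLE LEVEL, MISSING VALUES and VALUE LABELS commands to be pasted into a syntax window.

```python
def quote(text):
    """Quote a string for SPSS syntax, doubling any embedded apostrophes."""
    return "'" + text.replace("'", "''") + "'"


def grouped(props, key, freeze=lambda value: value):
    """Map each distinct setting to the list of variables that share it."""
    groups = {}
    for name, settings in props.items():
        if key in settings:
            groups.setdefault(freeze(settings[key]), []).append(name)
    return groups


def command(keyword, clauses):
    """Build one command: clauses separated by slashes, ended by a period."""
    if not clauses:
        return []
    return [keyword + "\n  " + "\n  /".join(clauses) + "."]


def consolidated_syntax(props):
    """Emit one command of each kind instead of one set per variable."""
    lines = []
    lines += command("VARIABLE LABELS", [
        name + " " + quote(s["label"])
        for name, s in props.items() if "label" in s
    ])
    lines += command("VARIABLE LEVEL", [
        " ".join(names) + " (" + level.upper() + ")"
        for level, names in grouped(props, "level").items()
    ])
    lines += command("MISSING VALUES", [
        " ".join(names) + " (" + ", ".join(str(v) for v in values) + ")"
        for values, names in grouped(props, "missing", tuple).items()
    ])
    lines += command("VALUE LABELS", [
        " ".join(names) + " " + " ".join(f"{v} {quote(t)}" for v, t in pairs)
        for pairs, names in grouped(props, "values", lambda d: tuple(d.items())).items()
    ])
    return "\n".join(lines)


# Hypothetical example: three variables, two of which share level and missing values.
props = {
    "sex": {"label": "Respondent's sex", "level": "nominal", "missing": [9],
            "values": {1: "Male", 2: "Female"}},
    "happy": {"label": "General happiness", "level": "ordinal", "missing": [8, 9],
              "values": {1: "Very happy", 2: "Pretty happy", 3: "Not too happy"}},
    "satjob": {"label": "Job satisfaction", "level": "ordinal", "missing": [8, 9]},
}
print(consolidated_syntax(props))
```

Variables that share a measurement level, missing values or an identical set of value labels are listed together in a single clause, which is roughly the consolidation otherwise done by hand on pasted DVP syntax.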