I am preparing exercises based on data sets from separate waves (1983 – 2014) of the British Social Attitudes survey. Some preliminary comments on the structure and content of the files are on my page Exploring British Social Attitudes (http://surveyresearch.weebly.com/exploring-british-social-attitudes.html). For instance, some variables do not have missing values correctly specified; consequently, scales derived from them have incorrect values.

Some waves have a variable [year] and some have [date] in what appears to be numeric n4. How can I turn values for [date] of 331 and 1028 into month-day in mmm-dd format (Mar-31, Oct-28) or dd-mmm (31-Mar, 28-Oct)?

I have listed all the files in a match files command (as yet incomplete and untried).

MATCH FILES
  file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1983.sav'
  /file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1984.sav'
  ~~ ~ ~ ~
  file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2013.sav'
  /file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2014.sav'
  /keep <varlist to be decided>.

Is there a quick way to compute variable [year] for each wave using something like . .

DO REPEAT X = 1989 to 2014.
COMPUTE year = x.
END REPEAT

. . such that [year] will pick up the value and add it to each file?

. . or do I have to open each file one at a time and add [year] separately?

John F Hall (Mr)
[Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/1-survey-analysis-workshop
I have added a variable [year] to each of the files by hand and checked that the variable is there and that the year value has been added correctly. During this process there were messages for some files about Unicode and strings with tables like this:

Altered Types
  Date of interview by interviewer Q36    A24    AMIN
  Computer Interview date Q37             A24    AMIN
  Start time HH:MM:SS Q38                 A24    AMIN
  Interviewer Number Q1412                A12    AMIN

However, when I tried to run:

match files
  file 'C:\Users\John\Desktop\SPSS files\bsa1983.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1984.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1985.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1986.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1987.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1989.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1990.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1991.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1993.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1994.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1995.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1996au.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1996bu.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1997a.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1998a.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa1999a.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2000.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2001.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2001soc.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2002.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2003.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2004.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2005.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2006.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2007.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2008.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2009.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2010.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2011.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2012.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2013.sav'
  /file 'C:\Users\John\Desktop\SPSS files\bsa2014.sav'
  /keep year rsex.

Year and rsex seem to have been saved to an Untitled.sav file.

freq year rsex.

All I got was:

year  Year of Interview
                     Frequency   Percent   Valid Percent   Cumulative Percent
  Valid    1983         1761       39.3        56.8              56.8
           1985           43        1.0         1.4              58.2
           1986         1296       28.9        41.8             100.0
           Total        3100       69.1       100.0
  Missing  System       1386       30.9
  Total                 4486      100.0

rsex  Q91A RESPONDENTS SEX
                     Frequency   Percent   Valid Percent   Cumulative Percent
  Valid    1 MALE       2051       45.7        45.7              45.7
           2 FEMALE     2435       54.3        54.3             100.0
           Total        4486      100.0       100.0

Any idea what happened to all the other datasets and cases?

John F Hall
In reply to this post by John F Hall
Thought I’d got it by using ADD FILES instead, but still only got:

[output table not preserved]
John, it sounds like one problem may be that you have string variables common to more than one file that are not formatted the same in all files. I'd check that very carefully. Here is a nice way to do that (based on Andy W's post here: http://spssx-discussion.1045642.n5.nabble.com/a-useful-improvement-td5721327.html#a5721333):
* Run the following ALTER TYPE command on all files to be merged.
* Replace 255 with string width known to be large enough in all files.

ALTER TYPE ALL (A = A255).

* ADD FILES command here.
* Run the following ALTER TYPE command on the merged file.

ALTER TYPE ALL (A = AMIN).

But never mind all that. Surely you must want ADD FILES here, given the number of files. You do want to stack them vertically, don't you?
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
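Bruce's question about stacking can be illustrated with a toy sketch in plain Python. It is not SPSS, and the two tiny "files" below are made up; the precedence rule in the parallel match (the earliest file that supplies a case in a given position wins for same-named variables) is an assumption of the sketch about how a match without BY behaves.

```python
from itertools import zip_longest

# Hypothetical miniature "files": each is a list of cases (dicts).
wave_1983 = [{"year": 1983, "rsex": 1}, {"year": 1983, "rsex": 2}]
wave_1984 = [
    {"year": 1984, "rsex": 2},
    {"year": 1984, "rsex": 1},
    {"year": 1984, "rsex": 2},
]


def stack(files):
    """Append cases file after file (the idea behind ADD FILES)."""
    return [dict(case) for cases in files for case in cases]


def match_in_parallel(files):
    """Pair case 1 with case 1, case 2 with case 2, and so on.

    This mimics a side-by-side match with no key variable. For a variable
    name found in several files, the value from the earliest file that
    still has a case in that position is kept (assumption of this sketch).
    """
    merged = []
    for row in zip_longest(*files):
        case = {}
        for part in row:
            if part is not None:
                for name, value in part.items():
                    case.setdefault(name, value)
        merged.append(case)
    return merged


stacked = stack([wave_1983, wave_1984])
matched = match_in_parallel([wave_1983, wave_1984])
print(len(stacked), len(matched))
```

Stacking keeps every case from every wave, while the parallel match never produces more cases than the longest input file, which is why a pooled file of many waves normally calls for the stacking approach.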
In addition to Bruce's ADD FILES and string mop up theory.
ADD FILES
  / FILE = blahblah2004 / IN=y2004
  / FILE = blahblah2005 / IN=y2005
  .......
  / FILE=blahblah2015 ... / IN=y2015.
DO REPEAT y=y2004 TO y2015 / value=2004 TO 2015.
IF (Y EQ 1) Year = Value.
END REPEAT.
EXECUTE.
DELETE VARIABLES y2004 TO y2015 .

Other than that? Dates represented as you had them are crap and are indicative that utter rookies were involved in the design and creation of this mess.

'Some waves have a variable [year] and some have [date] in what appears to be numeric n4. How can I turn values for [date] of 331 and 1028 into month-day in mmm-dd format (Mar-31, Oct-28) or dd-mmm (31-Mar, 28-Oct)?'

John, please see the basic functions STRING, CHAR.SUBSTR, DATE.MDY, etc., or MOD and TRUNC. You have been doing this for a very long time and it is a shock that you have never mastered these things ;-(

COMPUTE month=TRUNC(date/100).
COMPUTE day=MOD(date,100).
COMPUTE Date=DATE.MDY(month, day,year).
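The TRUNC/MOD arithmetic above can be checked outside SPSS with a small Python sketch. The function names are hypothetical, the zero-padded day is a choice of the sketch, and the range check is deliberately rough (it does not know how many days each month has).

```python
# Month abbreviations written out so the result does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def split_mmdd(date_value):
    """Split a numeric mmdd value (e.g. 331 or 1028) into (month, day).

    Same idea as TRUNC(date/100) and MOD(date,100).
    """
    month, day = divmod(int(date_value), 100)
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError("not a plausible mmdd value: %r" % (date_value,))
    return month, day


def format_mmm_dd(date_value):
    """Return the value as mmm-dd, e.g. Mar-31."""
    month, day = split_mmdd(date_value)
    return "%s-%02d" % (MONTHS[month - 1], day)


def format_dd_mmm(date_value):
    """Return the value as dd-mmm, e.g. 31-Mar."""
    month, day = split_mmdd(date_value)
    return "%02d-%s" % (day, MONTHS[month - 1])


for value in (331, 1028):
    print(value, format_mmm_dd(value), format_dd_mmm(value))
```

For the two values in the question this gives Mar-31 / 31-Mar and Oct-28 / 28-Oct. A true date still needs the year, which is why the DATE.MDY call takes three arguments.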
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
In reply to this post by John F Hall
Thought I’d found the solution using ADD FILES instead, but still only got:
Will persevere and post results, but I’d love to know why 1995 onwards aren’t included. Some data sets may already have had a year variable in a different format: will check back to original downloads.

John F Hall
In reply to this post by David Marso
Apologies for replying to David, not list. Easy mistake to make. Perhaps default should be Reply to list, not writer?

-----Original Message-----
From: John F Hall [mailto:[hidden email]]
Sent: 18 March 2016 06:50
To: 'David Marso' <[hidden email]>
Cc: 'Bruce Weaver' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

I had already made the switch to ADD FILES, but sent the mail to myself instead of the list. That has now been sent.

Most of the surveys I have dealt with have been single time snapshots. ADD FILES is a command I may have used only three times in 50 years: MERGE FILES a few times when students were building up their own files from raw data using BSA1989. I have only rarely used date functions, and scouring the FM doesn't always yield solutions. Even then I doubt if I would have come up with David's neat IN = <year> device.

Bruce is correct about the strings: that would explain all the warning messages: ALTER TYPE ALL (<varlist> = AMIN) when first opening the files. String variables with the same name should have the same format in all waves, but there are different string variables in some: I'll have to check.

I agree with David about "rookies": I would have included "dyslexics" as well, but I can't go on the public record with comments like that: a positive and helpful approach is needed if I wish to maintain good working relationships with colleagues elsewhere. Working through the files I get a definite feeling that the writer(s) are not completely versed in good SPSS practice, particularly when the files are to be used by others.

To be fair, the surveys were originally intended to measure trends across time, and not used as teaching aids. The series is stuck with 1983 mnemonic 8-character variable names (some diabolical inventions here) which remain constant across all waves, but make for tricky navigation. Early waves used printed questionnaires with indications for data-prep: this made secondary analysis quite easy using the facsimile questionnaires as navigation aids. Several years ago they switched to CAPI: the (annotated) BLAISE questionnaires are awkward and cumbersome to use as navigation aids.

Report writing is farmed out to outside gurus, but some chapters are produced in-house. It is not clear who does the analysis for these, but authors doing their own may not always spot possible errors in the data they use.

So, back to the syntax file to insert David's year-yyyy, then combing through 32 *.sav files looking for the strings (Highlight Type column, CTRL+F string) accompanied by dawn birdsong from thick mist outside.

John F Hall
It may be overkill since this particular problem is mostly solved, but I would like to point out a few tools that could be useful in a similar situation.

1. Setting the year: It is easy to generate the year variable from the file name. Here is an example using one of the functions in the spssaux module installed with the Python materials. This example also calculates the date assuming that there is a variable named mday that has the format John described. spssaux.getDatasetInfo returns by default the filespec for the active dataset. The basename function extracts the file name itself; split breaks out the root from which the year digits are extracted.

begin program.
import spss, spssaux, os
# File name such as bsa1983.sav -> "bsa1983" -> the four year digits "1983".
# The slice stops after four digits so that names like bsa1996au.sav also work.
root = os.path.basename(spssaux.getDatasetInfo()).split(".")[0][3:7]
spss.Submit("""compute root = %s.""" % root)
spss.Submit("""compute thedate = yrmoda(root, trunc(mday/100), mday - trunc(mday/100) * 100).""")
end program.

2. The STATS ADJUST WIDTHS extension command takes a batch of sav files and checks for type and width consistency of selected variables. It corrects unequal widths and produces lists of situations where the types are inconsistent. With a lot of files to merge, this can be a huge timesaver.

3. SPSSINC PROCESS FILES takes a batch of syntax and applies it to each of a set of files specified typically by a wildcard expression such as bsa*.sav. In this example, it could be used to construct the year variable and rationalize the types and then iteratively do the ADD FILES command all without needing to enumerate all the files explicitly. (This is often combined with SPSSINC SPLIT DATASET in order to generalize SPLIT FILES so that the splits can operate, in effect, over a whole set of commands rather than within individual procedures.)
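The file-name idea in point 1 can be tried on its own, without SPSS, using only the Python standard library. This is a sketch under the assumption that every file follows the bsaYYYY pattern seen in the thread (optionally followed by letters, as in bsa1996au.sav); the function name is hypothetical.

```python
import ntpath  # Windows-style path handling, available on every platform
import re


def year_from_filename(path):
    """Return the four-digit year embedded in a name such as bsa1996au.sav."""
    name = ntpath.basename(path)
    found = re.match(r"bsa(\d{4})", name, re.IGNORECASE)
    if found is None:
        raise ValueError("no year found in file name: %r" % (name,))
    return int(found.group(1))


paths = [
    r"C:\Users\John\Desktop\SPSS files\bsa1983.sav",
    r"C:\Users\John\Desktop\SPSS files\bsa1996au.sav",
    r"C:\Users\John\Desktop\SPSS files\bsa2001soc.sav",
]
for path in paths:
    print(ntpath.basename(path), year_from_filename(path))
```

A regular expression is used instead of a fixed slice so that a file whose name does not start with bsa plus four digits raises an error rather than silently yielding a wrong year.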
In reply to this post by John F Hall
Jon British Social Attitudes 1983 – 2014 There’s been quite a few haystacks torn apart today and a few useful exchanges with Bruce and David. Too detailed to share on the list, but I’ll post something later. Basically I’ve had to deal with the same variable name [year] being used in some files but with different WIDTHS. In one file it was only a single digit, in another it was 4 and in yet another 5. Another variable [strttime] was NUMERIC 4 or 5 in different files and in one case STRING with data clearly a time in hh:mm:ss. Of course not all the above variables are repeated for each wave. I’ve made the necessary changes to keep everything consistent (and kept a detailed log). Part of the problem is that the BSA series was intended to measure change over time and is funded by clients who need the data often for policy purposes. Some of it has academic roots. Reports are written by gurus who may or may not have expertise in SPSS, others are written in-house. Authors using analysis without first checking the data run a serious risk of error. Metadata are written with reports in mind, not later users struggling to find their way round the files for teaching or secondary analysis. Although the data were not originally intended for teaching, they are an incredibly valuable resource for such (I have used them since 1983 when Roger Jowell, several months before publication of reports, gave me early access to the raw data so that I had time to prepare materials before courses started. The SPSS files generated can even today be used as models of file construction. Another problem is archiving software which often strips off things like measurement levels and other metadata from even the most carefully crafted SPSS files. 
There has been advice from Bruce and David to use macros and, from you, Python, but I’m not a programmer. I’m a (sort of) sociologist dealing with the substance of dozens of surveys, and trying to keep things simple for beginners and non-numerate students and clients.

Progress so far has resulted in a combined file for 1983 to 1991 inclusive and a large Excel table showing all relevant information for all waves (I’ll send you a copy off-list). Now for the hairy bits: dealing with inconsistent strings and other problems in the remaining waves.

John F Hall (Mr)
[Retired academic survey researcher]
Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/1-survey-analysis-workshop

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
In reply to this post by John F Hall
By the way, you do not want to use MATCH FILES. In scanning this thread, with its very long quotations, I did not see a mention of why MATCH FILES is wrong here; the reason is exactly what it does.

MATCH FILES ordinarily is used with an ID variable to specify the matching. When there is no ID to match on, it provides a 1-to-1 match, effectively matching by case number. IMO, the procedure should give a notification or warning when there is no matching variable, because the sequential 1-to-1 match is less likely to be "intentional" than it is "error".

The result (before using KEEP) is a file with the number of cases of the longest file. Each successive file in the set of FILE= adds its new variables to the variable list, and (I think) inserts its values in place of whatever values were there for the duplicated names. I never tried it, because that sort of use would be terrible practice (UPDATE, using ID, handles updating).

--
Rich Ulrich
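To make the case-count point about MATCH FILES concrete, here is a small plain-Python sketch (not SPSS syntax) using made-up rows. match_by_position is a hypothetical stand-in for a match with no ID variable, and stack_cases is a hypothetical stand-in for appending the cases of each file, which is what ADD FILES does. How duplicated variable names are resolved is a simplifying assumption in this sketch (the first file that supplies a value wins) and is not a statement of SPSS behaviour.

```python
from itertools import zip_longest


def match_by_position(*files):
    """Pair rows by case number, as a match with no ID variable would.

    Assumption for illustration only: when a variable name occurs in
    more than one file, the value from the first file is kept.
    """
    merged = []
    for rows in zip_longest(*files, fillvalue={}):
        case = {}
        for row in rows:
            for name, value in row.items():
                case.setdefault(name, value)
        merged.append(case)
    return merged


def stack_cases(*files):
    """Append the cases of each file one after another."""
    return [dict(row) for rows in files for row in rows]


# Made-up rows for two waves, each with the variables year and rsex.
wave_1983 = [
    {"year": 1983, "rsex": 1},
    {"year": 1983, "rsex": 2},
    {"year": 1983, "rsex": 2},
]
wave_1984 = [
    {"year": 1984, "rsex": 2},
    {"year": 1984, "rsex": 1},
]

matched = match_by_position(wave_1983, wave_1984)
stacked = stack_cases(wave_1983, wave_1984)

print(len(matched))  # 3: the number of cases in the longest file
print(len(stacked))  # 5: every case from every wave
print(sorted({row["year"] for row in stacked}))  # [1983, 1984]
```

With positional matching the second wave's respondents are laid over the first wave's rows, so the combined file never grows beyond the longest input. Stacking keeps every respondent from every wave, which is what a pooled file across years needs.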