Syntax to process several *.sav files

John F Hall

I am preparing exercises based on data sets from separate waves (1983 – 2014) of the British Social Attitudes survey. Some preliminary comments on the structure and content of the files are on my page Exploring British Social Attitudes (http://surveyresearch.weebly.com/exploring-british-social-attitudes.html). For instance, some variables do not have missing values correctly specified; consequently, scales derived from them have incorrect values.

Some waves have a variable [year] and some have [date] in what appears to be numeric n4. How can I turn values for [date] of 331 and 1028 into month-day in mmm-dd format (Mar-31, Oct-28) or dd-mmm (31-Mar, 28-Oct)?

I have listed all the files in a match files command (as yet incomplete and untried).

MATCH FILES
file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1983.sav'
/file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1984.sav'
~~ ~ ~ ~
/file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2013.sav'
/file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2014.sav'
/keep <varlist to be decided>.

Is there a quick way to compute variable [year] for each wave using something like:

DO REPEAT X = 1989 to 2014.
COMPUTE year = x.
END REPEAT.

such that [year] will pick up the value and add it to each file? Or do I have to open each file one at a time and add [year] separately?

John F Hall (Mr)
[Retired academic survey researcher]

Email: [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page: www.surveyresearch.weebly.com/1-survey-analysis-workshop

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Syntax to process several *.sav files

John F Hall

I have added a variable [year] to each of the files by hand and checked that the variable is there and that the year value has been added correctly. During this process there were messages for some files about Unicode and strings, with tables like this:

Altered Types

Date of interview by interviewer Q36    A24    AMIN
Computer Interview date Q37             A24    AMIN
Start time HH:MM:SS Q38                 A24    AMIN
Interviewer Number Q1412                A12    AMIN

However, when I tried to run:

match files
 file 'C:\Users\John\Desktop\SPSS files\bsa1983.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1984.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1985.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1986.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1987.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1989.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1990.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1991.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1993.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1994.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1995.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1996au.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1996bu.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1997a.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1998a.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa1999a.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2000.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2001.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2001soc.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2002.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2003.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2004.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2005.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2006.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2007.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2008.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2009.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2010.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2011.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2012.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2013.sav'
/file 'C:\Users\John\Desktop\SPSS files\bsa2014.sav'
/keep year rsex.

 

Year and rsex seem to have been saved to an Untitled.sav file.

freq year rsex.

All I got was:

year  Year of Interview

                 Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1983         1761     39.3           56.8                56.8
         1985           43      1.0            1.4                58.2
         1986         1296     28.9           41.8               100.0
         Total        3100     69.1          100.0
Missing  System       1386     30.9
Total                 4486    100.0

rsex  Q91A RESPONDENTS SEX

                   Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1 MALE         2051     45.7           45.7                45.7
         2 FEMALE       2435     54.3           54.3               100.0
         Total          4486    100.0          100.0

Any idea what happened to all the other datasets and cases?

 

From: John F Hall [mailto:[hidden email]]
Sent: 17 March 2016 17:43
To: '[hidden email]' <[hidden email]>
Subject: Syntax to process several *.sav files

Re: Syntax to process several *.sav files

John F Hall
In reply to this post by John F Hall

Thought I’d got it by using ADD FILES instead, but still only got:

 

 

 

From: John F Hall [mailto:[hidden email]]
Sent: 17 March 2016 19:21
To: '[hidden email]' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

Re: Syntax to process several *.sav files

Bruce Weaver
Administrator
John, it sounds like one problem may be that you have string variables common to more than one file that are not formatted the same in all files.  I'd check that very carefully.  Here is a nice way to do that (based on Andy W's post here:  http://spssx-discussion.1045642.n5.nabble.com/a-useful-improvement-td5721327.html#a5721333):

* Run the following ALTER TYPE command on all files to be merged.
* Replace 255 with string width known to be large enough in all files.

ALTER TYPE ALL (A = A255).

* ADD FILES command here.
* Run the following ALTER TYPE command on the merged file.

ALTER TYPE ALL (A = AMIN).
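
As a rough sketch of the first step above for a single wave: the input path is the one from the original post, while the output name bsa1983_wide.sav is a hypothetical example and not something from the thread. The widening could be run and saved like this before merging:

```spss
* Sketch only: repeat for each wave to be merged.
* The output file name bsa1983_wide.sav is a hypothetical example.
GET FILE='C:\Users\John\Desktop\SPSS files\bsa1983.sav'.
* Widen every string variable to a common width (255 is a placeholder).
ALTER TYPE ALL (A = A255).
SAVE OUTFILE='C:\Users\John\Desktop\SPSS files\bsa1983_wide.sav'.
```

Saving under a new name leaves the downloaded file untouched, so the original string widths can still be inspected if the merge misbehaves.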


But never mind all that.  Surely you must want ADD FILES here, given the number of files.  You do want to stack them vertically, don't you?  



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Syntax to process several *.sav files

David Marso
Administrator
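In addition to Bruce's ADD FILES and string mop-up theory, the IN subcommand can flag which file each case came from, and the year can then be derived from the flags: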

ADD FILES / FILE = blahblah2004 / IN=y2004 / FILE = blahblah2005 /IN=y2005 ......./FILE=blahblah2015 .../IN=y2015.


DO REPEAT y=y2004 TO y2015 / value=2004 TO 2015.
IF (Y EQ 1) Year = Value.
END REPEAT.
EXECUTE.
DELETE VARIABLES y2004 TO y2015 .
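
A slightly more concrete sketch of the same idea, using three of the file paths from the original post. The flag names in1983 to in1985 and the variable name wave are hypothetical. wave is used instead of year on the assumption that some files already carry their own year variable, which this avoids overwriting.

```spss
* Sketch with three waves only.
* The flag names in1983 to in1985 and the variable wave are hypothetical.
ADD FILES
  /FILE='C:\Users\John\Desktop\SPSS files\bsa1983.sav' /IN=in1983
  /FILE='C:\Users\John\Desktop\SPSS files\bsa1984.sav' /IN=in1984
  /FILE='C:\Users\John\Desktop\SPSS files\bsa1985.sav' /IN=in1985.
* Each case has exactly one flag equal to 1, so it gets exactly one wave value.
DO REPEAT flag = in1983 in1984 in1985 / value = 1983 1984 1985.
IF (flag EQ 1) wave = value.
END REPEAT.
* One row per input file is expected here, with no system-missing cases.
FREQUENCIES VARIABLES=wave.
```

For the full list the values would have to be typed out one per file rather than generated with TO, because 1988 and 1992 are absent from the file list and 1996 and 2001 each have two files.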

Other than that?  Dates represented as you had them are crap and are indicative that utter rookies were involved in the design and creation of this mess.

'Some waves have a variable [year] and some have [date] in what appears to be
numeric n4.  How can I turn values for [date] of 331 and 1028 into month-day
in mmm-dd format (Mar-31, Oct-28)or dd-mmm (31-Mar, 28_Oct)? '

John, Please see the basic functions STRING, CHAR.SUBSTR, DATE.MDY, etc or MOD and TRUNC.  
You have been doing this for a very long time and it is a shock that you have never mastered these things ;-(

COMPUTE month=TRUNC(date/100).
COMPUTE day=MOD(date,100).
COMPUTE Date=DATE.MDY(month, day,year).
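
To get the dd-mmm display asked about, one possibility (a sketch, not tested against the BSA files) is to build a proper SPSS date and take the first six characters of its DATE11 text form. It assumes numeric date (mmdd) and year variables as in the question. The names intdate and ddmmm are hypothetical, chosen so that the original date variable is not overwritten.

```spss
* Sketch: assumes numeric variables date (mmdd, e.g. 331 or 1028) and year.
* The names intdate and ddmmm are hypothetical.
COMPUTE intdate = DATE.MDY(TRUNC(date/100), MOD(date,100), year).
* Show the new variable as dd-mmm-yyyy in the Data Editor.
FORMATS intdate (DATE11).
* Keep only the dd-mmm part as a short string.
STRING ddmmm (A6).
COMPUTE ddmmm = CHAR.SUBSTR(STRING(intdate, DATE11), 1, 6).
EXECUTE.
```

For a date value of 331 this should give a string starting with the day 31, followed by a hyphen and the month abbreviation in whatever case the DATE format prints.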



Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

FW: Syntax to process several *.sav files

John F Hall
In reply to this post by John F Hall

Thought I’d found the solution using ADD FILES instead, but still only got:

 

 

year  Year of Interview

                 Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1983         1761      1.8            7.7                 7.7
         1984         1675      1.7            7.3                15.0
         1985         1804      1.8            7.9                22.9
         1986         3100      3.1           13.6                36.5
         1987         2847      2.9           12.4                48.9
         1989         3029      3.1           13.2                62.1
         1990         2797      2.8           12.2                74.4
         1991         2918      2.9           12.8                87.1
         1993         2945      3.0           12.9               100.0
         Total       22876     23.1          100.0
Missing  System      76041     76.9
Total                98917    100.0

rsex  Q91A RESPONDENTS SEX

                   Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1 MALE        42138     42.6           44.1                44.1
         2 FEMALE      53492     54.1           55.9               100.0
         Total         95630     96.7          100.0
Missing  System         3287      3.3
Total                  98917    100.0

 

 

 

Will persevere and post results, but I’d love to know why 1995 onwards aren’t included.  Some data sets may already have had a year variable in a different format: will check back to original downloads.

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

 

 

From: John F Hall [[hidden email]]
Sent: 17 March 2016 19:26
To: '[hidden email]' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

 

Thought I’d got using ADD FILES instead, but still only got:

 

 

 

From: John F Hall [[hidden email]]
Sent: 17 March 2016 19:21
To: '[hidden email]' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

 

I have added a variable [year] to each of the files by hand and checked that the variable is there and that the year value has been added correctly.  During this process there were messages for some files about Unicode and strings with tables like this:

 

 

Altered Types

Date of interview by interviewer Q36

A24

AMIN

Computer Interview date Q37

A24

AMIN

Start time  HH:MM:SS Q38

A24

AMIN

Interviewer Number Q1412

A12

AMIN

 

 

 

However, when I tried to run:

 

match files

 file  'C:\Users\John\Desktop\SPSS files\bsa1983.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1984.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1985.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1986.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1987.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1989.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1990.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1991.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1993.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1994.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1995.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1996au.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1996bu.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1997a.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1998a.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa1999a.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2000.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2001.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2001soc.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2002.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2003.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2004.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2005.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2006.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2007.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2008.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2009.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2010.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2011.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2012.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2013.sav'

/file 'C:\Users\John\Desktop\SPSS files\bsa2014.sav'

/keep year rsex.

 

Year and rsex seem to have been saved to an Untitled.sav file.

 

freq year rsex.

 

All I got was:

 

year Year of Interview

                   Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1983           1761     39.3           56.8                56.8
         1985             43      1.0            1.4                58.2
         1986           1296     28.9           41.8               100.0
         Total          3100     69.1          100.0
Missing  System         1386     30.9
Total                   4486    100.0

rsex Q91A RESPONDENTS SEX

                   Frequency  Percent  Valid Percent  Cumulative Percent
Valid    1 MALE         2051     45.7           45.7                45.7
         2 FEMALE       2435     54.3           54.3               100.0
         Total          4486    100.0          100.0

 

Any idea what happened to all the other datasets and cases?

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

 

 

From: John F Hall [[hidden email]]
Sent: 17 March 2016 17:43
To: '[hidden email]' <[hidden email]>
Subject: Syntax to process several *.sav files

 


FW: Syntax to process several *.sav files

John F Hall
In reply to this post by David Marso
Apologies for replying to David, not list.  Easy mistake to make.  Perhaps
default should be Reply to list, not writer?

-----Original Message-----
From: John F Hall [mailto:[hidden email]]
Sent: 18 March 2016 06:50
To: 'David Marso' <[hidden email]>
Cc: 'Bruce Weaver' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

I had already made the switch to ADD FILES, but sent the mail to myself
instead of the list.  That has now been sent.

Most of the surveys I have dealt with have been single-time snapshots.  ADD
FILES is a command I may have used only three times in 50 years; MERGE FILES
a few times when students were building up their own files from raw data
using BSA1989.  I have only rarely used date functions, and scouring the FM
doesn't always yield solutions.  Even then I doubt I would have come up
with David's neat IN = <year> device.

Bruce is correct about the strings: that would explain all the warning
messages: ALTER TYPE ALL (<varlist> = AMIN) when first opening the files.
String variables with the same name should have the same format in all
waves, but there are different string variables in some: I'll have to check.

I agree with David about "rookies": I would have included "dyslexics" as
well, but I can't go on the public record  with comments like that: a
positive and helpful approach is needed if I wish to maintain good working
relationships with colleagues elsewhere.  Working through the files I get a
definite feeling that the writer(s) are not completely versed in good SPSS
practice, particularly when the files are to be used by others.  

To be fair, the surveys were originally intended to measure trends across
time, and not used as teaching aids.  The series is stuck with 1983 mnemonic
8-character variable names (some diabolical inventions here) which remain
constant across all waves, but make for tricky navigation.  Early waves used
printed questionnaires with indications for data-prep: this made secondary
analysis quite easy using the facsimile questionnaires as navigation aids.
Several years ago they switched to  CAPI: the (annotated) BLAISE
questionnaires are awkward and cumbersome to use as navigation aids.  Report
writing is farmed out to outside gurus, but some chapters are produced
in-house.  It is not clear who does the analysis for these, but authors
doing their own may not always spot possible errors in the data they use.

So, back to the syntax file to insert David's year-yyyy, then combing
through 32 *.sav files looking for the strings (Highlight Type column,
CTRL+F string) accompanied by dawn birdsong from thick mist outside.


John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]  
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop









-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: 17 March 2016 23:25
To: [hidden email]
Subject: Re: Syntax to process several *.sav files

In addition to Bruce's ADD FILES and string mop up theory.

ADD FILES / FILE = blahblah2004 / IN=y2004 / FILE = blahblah2005 /IN=y2005
......./FILE=blahblah2015 .../IN=y2015.


DO REPEAT y=y2004 TO y2015 / value=2004 TO 2015.
IF (Y EQ 1) Year = Value.
END REPEAT.
EXECUTE.
DELETE VARIABLES y2004 TO y2015 .
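
For the full BSA file list the years are not consecutive (the list has no 1988 or 1992 file, and 1996 and 2001 each have two files), so a y2004 TO y2015 style range would not line up one-to-one. One possible workaround is to generate the syntax instead of typing it. The sketch below is plain Python that only builds text to paste into a syntax window. The function name add_files_syntax and the flag names in1, in2, ... are made up for illustration, and it assumes that every file name contains its four-digit year and that the IN= flags sit next to each other in the merged file.

```python
import ntpath
import re


def add_files_syntax(paths):
    """Return SPSS syntax text: ADD FILES with one IN= flag per file, then year set from the flags."""
    if not paths:
        raise ValueError("need at least one file")
    file_lines = []
    year_lines = []
    for number, path in enumerate(paths, start=1):
        # Look for the year in the file name only, not in folder names.
        found = re.search(r"\d{4}", ntpath.basename(path))
        if found is None:
            raise ValueError("no four-digit year in " + path)
        flag = "in%d" % number  # hypothetical flag name
        file_lines.append("  /FILE='%s' /IN=%s" % (path, flag))
        year_lines.append("IF (%s EQ 1) year = %s." % (flag, found.group(0)))
    file_lines[-1] += "."  # command terminator after the last subcommand
    tail = ["EXECUTE.", "DELETE VARIABLES in1 TO in%d." % len(paths)]
    return "\n".join(["ADD FILES"] + file_lines + year_lines + tail)


print(add_files_syntax([
    r"C:\Users\John\Desktop\SPSS files\bsa1995.sav",
    r"C:\Users\John\Desktop\SPSS files\bsa1996au.sav",
]))
```

Writing one explicit IF line per file avoids relying on a value list that has to match the flag list position by position.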

Other than that?  Dates represented as you had them are crap and are
indicative that utter rookies were involved in the design and creation of
this mess.

'Some waves have a variable [year] and some have [date] in what appears to
be numeric n4.  How can I turn values for [date] of 331 and 1028 into
month-day in mmm-dd format (Mar-31, Oct-28)or dd-mmm (31-Mar, 28_Oct)? '

John, Please see the basic functions STRING, CHAR.SUBSTR, DATE.MDY, etc or
MOD and TRUNC.  
You have been doing this for a very long time and it is a shock that you
have never mastered these things ;-(

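* Split a number such as 331 or 1028 into month and day, then build a date.
COMPUTE month=TRUNC(date/100).
COMPUTE day=MOD(date,100).
COMPUTE Date=DATE.MDY(month, day, year).
FORMATS Date (DATE11).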
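
The same TRUNC/MOD arithmetic can be checked outside SPSS. The following is only a sketch in plain Python, not part of David's answer: the function name month_day_label is hypothetical, and it produces the Mar-31 and 31-Mar style labels asked about in the original question, without a year.

```python
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
# Longest possible length of each month (29 allows for leap years).
MAX_DAYS = (31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def month_day_label(value, day_first=False):
    """Turn a number such as 331 or 1028 into 'Mar-31' or, with day_first, '31-Mar'."""
    # Same split as TRUNC(date/100) and MOD(date,100).
    month, day = divmod(int(value), 100)
    if not 1 <= month <= 12 or not 1 <= day <= MAX_DAYS[month - 1]:
        raise ValueError("not a valid month-day number: %r" % (value,))
    if day_first:
        return "%02d-%s" % (day, MONTHS[month - 1])
    return "%s-%02d" % (MONTHS[month - 1], day)


print(month_day_label(331))                    # Mar-31
print(month_day_label(1028, day_first=True))   # 28-Oct
```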




Bruce Weaver wrote
> John, it sounds like one problem may be that you have string variables
> common to more than one file that are not formatted the same in all files.
> I'd check that very carefully.  Here is a nice way to do that (based
> on Andy W's post here:
> http://spssx-discussion.1045642.n5.nabble.com/a-useful-improvement-td5721327.html#a5721333):
>
> * Run the following ALTER TYPE command on all files to be merged.
> * Replace 255 with string width known to be large enough in all files.
>
> ALTER TYPE ALL (A = A255).
>
> * ADD FILES command here.
> * Run the following ALTER TYPE command on the merged file.
>
> ALTER TYPE ALL (A = AMIN).
>
> But never mind all that.  *Surely* you must want ADD FILES here, given
> the number of files.  You do want to stack them vertically, don't you?





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email
me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos
ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in
abyssum?"

Re: FW: Syntax to process several *.sav files

Jon Peck
It may be overkill since this particular problem is mostly solved, but I would like to point out a few tools that could be useful in a similar situation.

1. Setting the year: It is easy to generate the year variable from the file name.  Here is an example using one of the functions in the spssaux module installed with the Python materials.  This example also calculates the date assuming that there is a variable named mday that has the format John described.  spssaux.getDatasetInfo returns by default the filespec for the active dataset.  The basename function extracts the file name itself; split breaks out the root from which the year digits are extracted.

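begin program.
import spss, spssaux, os

# File name of the active dataset, e.g. bsa1983.sav; characters 3 to 6 hold the year.
# Slicing [3:7] rather than [3:] also copes with names such as bsa1996au.sav.
root = os.path.basename(spssaux.getDatasetInfo()).split(".")[0][3:7]
spss.Submit("""compute root = %s.""" % root)

# mday holds month*100 + day (331 is 31 March).
spss.Submit("""compute thedate = yrmoda(root, trunc(mday/100), mday - trunc(mday/100) * 100).""")

end program.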

2.  The STATS ADJUST WIDTHS extension command takes a batch of sav files and checks for type and width consistency of selected variables.  It corrects unequal widths and produces lists of situations where the  types are inconsistent.  With a lot of files to merge, this can be a huge timesaver.

3.  SPSSINC PROCESS FILES takes a batch of syntax and applies it to each of a set of files specified typically by a wildcard expression such as bsa*.sav.  In this example, it could be used to construct the year variable and rationalize the types and then iteratively do the ADD FILES command all without needing to enumerate all the files explicitly.  (This is often combined with SPSSINC SPLIT DATASET in order to generalize SPLIT FILES so that the splits can operate, in effect, over a whole set of commands rather than within individual procedures.)
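
To illustrate the "apply the same syntax to every file matching a wildcard" idea without the extension command, here is a rough sketch in plain Python. It does not use SPSSINC PROCESS FILES or the spss module; it only builds syntax text. The function name per_file_year_syntax and the _year.sav output names are hypothetical, and it assumes the year is the first four-digit run in each file name.

```python
import fnmatch
import ntpath
import re


def per_file_year_syntax(paths, pattern="bsa*.sav"):
    """Return SPSS syntax text that opens each matching file, sets year and saves a copy."""
    blocks = []
    for path in paths:
        name = ntpath.basename(path).lower()
        if not fnmatch.fnmatchcase(name, pattern):
            continue  # not one of the wave files
        found = re.search(r"\d{4}", name)
        if found is None:
            continue  # no year in the name, leave it for manual handling
        copy = path[:-len(".sav")] + "_year.sav"  # hypothetical output name
        blocks.append("GET FILE='%s'.\nCOMPUTE year = %s.\nSAVE OUTFILE='%s'."
                      % (path, found.group(0), copy))
    return "\n\n".join(blocks)


print(per_file_year_syntax([
    r"C:\Users\John\Desktop\SPSS files\bsa1983.sav",
    r"C:\Users\John\Desktop\SPSS files\bsa2001soc.sav",
]))
```

Saving to a new name rather than over the original keeps the archive copies untouched while the year values are being checked.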




--
Jon K Peck
[hidden email]


Re: Syntax to process several *.sav files

John F Hall
In reply to this post by John F Hall

Jon

 

British Social Attitudes 1983 – 2014

 

There have been quite a few haystacks torn apart today, and a few useful exchanges with Bruce and David.  They are too detailed to share on the list, but I’ll post something later.

 

Basically I’ve had to deal with the same variable name [year] being used in some files but with different WIDTHS.  In one file it was only a single digit, in another it was 4 and in yet another 5.  Another variable [strttime] was NUMERIC 4 or 5 in different files and in one case STRING with data clearly a time in hh:mm:ss.  Of course not all the above variables are repeated for each wave.  I’ve made the necessary changes to keep everything consistent (and kept a detailed log).
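
A check of this kind can be sketched in a few lines of plain Python once the variable formats of each file have been listed (by hand or from a dictionary listing). This is only an illustration: the function name inconsistent_variables is hypothetical, and the file names and formats in the example are made up to echo the mismatches described above, not taken from the real BSA files.

```python
def inconsistent_variables(dictionaries):
    """dictionaries: {file name: {variable name: format such as 'F4' or 'A8'}}.

    Returns {variable: {format: [files]}} for variables that have more than one format.
    """
    seen = {}
    for file_name, variables in dictionaries.items():
        for variable, fmt in variables.items():
            by_format = seen.setdefault(variable.lower(), {})
            by_format.setdefault(fmt.upper(), []).append(file_name)
    return {v: f for v, f in seen.items() if len(f) > 1}


# Made-up example: year with three widths, strttime numeric in two files and string in one.
example = {
    "wave_a.sav": {"year": "F1", "strttime": "F4", "rsex": "F1"},
    "wave_b.sav": {"year": "F4", "strttime": "F5", "rsex": "F1"},
    "wave_c.sav": {"year": "F5", "strttime": "A8", "rsex": "F1"},
}
for variable, formats in sorted(inconsistent_variables(example).items()):
    print(variable, formats)
```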

 

Part of the problem is that the BSA series was intended to measure change over time and is funded by clients who need the data, often for policy purposes.  Some of it has academic roots.  Some reports are written by gurus who may or may not have expertise in SPSS; others are written in-house.  Authors using analysis without first checking the data run a serious risk of error.

Metadata are written with reports in mind, not later users struggling to find their way round the files for teaching or secondary analysis.

 

Although the data were not originally intended for teaching, they are an incredibly valuable resource for it.  I have used them since 1983, when Roger Jowell gave me early access to the raw data several months before publication of the reports, so that I had time to prepare materials before courses started.  The SPSS files generated can even today be used as models of file construction.

 

Another problem is archiving software which often strips off things like measurement levels and other metadata from even the most carefully crafted SPSS files.

 

There has been advice from Bruce and David to use macros and, from you, Python, but I’m not a programmer, I’m a (sort of) sociologist dealing with the substance of dozens of surveys, and trying to keep things simple for beginners and non-numerate students and clients. 

 

Progress so far has resulted in a combined file for 1983 to 1991 inclusive and a large Excel table showing all relevant information for all waves (I’ll send you a copy off-list) but now for the hairy bits dealing with inconsistent strings and other problems in the remaining waves.

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop

 

 

 

 

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jon Peck
Sent: 18 March 2016 15:32
To: [hidden email]
Subject: Re: FW: Syntax to process several *.sav files

 

It may be overkill since this particular problem is mostly solved, but I would like to point out a few tools that could be useful in a similar situation.

 

1. Setting the year: It is easy to generate the year variable from the file name.  Here is an example using one of the functions in the spssaux module installed with the Python materials.  This example also calculates the date assuming that there is a variable named mday that has the format John described.  spssaux.getDatasetInfo returns by default the filespec for the active dataset.  The basename function extracts the file name itself; split breaks out the root from which the year digits are extracted.

 

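begin program.
import spss, spssaux, os

# File name of the active dataset without its extension, e.g. bsa1983.
stem = os.path.basename(spssaux.getDatasetInfo()).split(".")[0]
# Take the four digits after the "bsa" prefix, so bsa1996au also gives 1996.
root = stem[3:7]

spss.Submit("""compute root = %s.""" % root)
# mday holds month*100 + day, e.g. 331 or 1028.
spss.Submit("""compute thedate = yrmoda(root, trunc(mday/100), mday - trunc(mday/100) * 100).""")
end program.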

 

2.  The STATS ADJUST WIDTHS extension command takes a batch of sav files and checks for type and width consistency of selected variables.  It corrects unequal widths and produces lists of situations where the  types are inconsistent.  With a lot of files to merge, this can be a huge timesaver.

 

3.  SPSSINC PROCESS FILES takes a batch of syntax and applies it to each of a set of files, typically specified by a wildcard expression such as bsa*.sav.  In this example, it could be used to construct the year variable and rationalize the types and then iteratively do the ADD FILES command, all without needing to enumerate the files explicitly.  (This is often combined with SPSSINC SPLIT DATASET in order to generalize SPLIT FILE so that the splits can operate, in effect, over a whole set of commands rather than within individual procedures.)
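
Purely as an illustration of the file-name idea in item 1, here is a sketch in plain Python that only builds syntax text and does not call SPSS or any of the extension commands. The function names and the in1, in2, ... flag names are hypothetical, and the folder is the one quoted elsewhere in this thread; the generated commands would still need checking before being run.

```python
import re
from pathlib import PureWindowsPath

def wave_year(filespec):
    """Return the four-digit year embedded in a name such as bsa1996au.sav."""
    stem = PureWindowsPath(filespec).stem
    match = re.search(r"(?:19|20)\d{2}", stem)
    if match is None:
        raise ValueError("no year found in %s" % filespec)
    return int(match.group(0))

def build_syntax(filespecs):
    """Build ADD FILES syntax with one /IN= flag per file, then set year."""
    flags = ["in%d" % i for i in range(1, len(filespecs) + 1)]
    lines = ["ADD FILES"]
    for spec, flag in zip(filespecs, flags):
        lines.append("  /FILE='%s' /IN=%s" % (spec, flag))
    lines[-1] += "."  # command terminator on the last subcommand
    for spec, flag in zip(filespecs, flags):
        lines.append("IF (%s EQ 1) year = %d." % (flag, wave_year(spec)))
    lines.append("EXECUTE.")
    lines.append("DELETE VARIABLES %s." % " ".join(flags))
    return "\n".join(lines)

# Three of the file names quoted in this thread; in practice the list could
# come from glob.glob() on a pattern such as bsa*.sav.
folder = "C:\\Users\\John\\Desktop\\SPSS files\\"
files = [folder + name for name in ("bsa1983.sav", "bsa1996au.sav", "bsa1996bu.sav")]
syntax = build_syntax(files)
print(syntax)
```

The flags are numbered rather than named after the year because two files can share a year (bsa1996au and bsa1996bu), and flag names have to be unique.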

 

On Thu, Mar 17, 2016 at 11:53 PM, John F Hall <[hidden email]> wrote:

Apologies for replying to David, not list.  Easy mistake to make.  Perhaps
default should be Reply to list, not writer?

-----Original Message-----
From: John F Hall [mailto:[hidden email]]
Sent: 18 March 2016 06:50
To: 'David Marso' <[hidden email]>
Cc: 'Bruce Weaver' <[hidden email]>
Subject: RE: Syntax to process several *.sav files

I had already made the switch to ADD FILES, but sent the mail to myself
instead of the list.  That has now been sent.

Most of the surveys I have dealt with have been single time  snapshots.  ADD
FILES is a command I may have used only three times in 50 years: MERGE FILES
a few times when students were building up their own files from raw data
using BSA1989.   I have only rarely used date functions, and scouring the FM
doesn't always yield solutions.  Even then I doubt if I would have come up
with David's neat IN = <year> device.

Bruce is correct about the strings: that would explain all the warning
messages: ALTER TYPE ALL (<varlist> = AMIN) when first opening the files.
String variables with the same name should have the same format in all
waves, but there are different string variables in some: I'll have to check.

I agree with David about "rookies": I would have included "dyslexics" as
well, but I can't go on the public record  with comments like that: a
positive and helpful approach is needed if I wish to maintain good working
relationships with colleagues elsewhere.  Working through the files I get a
definite feeling that the writer(s) are not completely versed in good SPSS
practice, particularly when the files are to be used by others.

To be fair, the surveys were originally intended to measure trends across
time, and not used as teaching aids.  The series is stuck with 1983 mnemonic
8-character variable names (some diabolical inventions here) which remain
constant across all waves, but make for tricky navigation.  Early waves used
printed questionnaires with indications for data-prep: this made secondary
analysis quite easy using the facsimile questionnaires as navigation aids.
Several years ago they switched to  CAPI: the (annotated) BLAISE
questionnaires are awkward and cumbersome to use as navigation aids.  Report
writing is farmed out to outside gurus, but some chapters are produced
in-house.  It is not clear who does the analysis for these, but authors
doing their own may not always spot possible errors in the data they use.

So, back to the syntax file to insert David's year-yyyy, then combing
through 32 *.sav files looking for the strings (Highlight Type column,
CTRL+F string) accompanied by dawn birdsong from thick mist outside.


John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/1-survey-analysis-workshop








-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: 17 March 2016 23:25
To: [hidden email]
Subject: Re: Syntax to process several *.sav files

In addition to Bruce's ADD FILES and string mop up theory.

ADD FILES / FILE = blahblah2004 / IN=y2004 / FILE = blahblah2005 /IN=y2005
......./FILE=blahblah2015 .../IN=y2015.


DO REPEAT y=y2004 TO y2015 / value=2004 TO 2015.
IF (Y EQ 1) Year = Value.
END REPEAT.
EXECUTE.
DELETE VARIABLES y2004 TO y2015 .

Other than that?  Dates represented as you had them are crap and are
indicative that utter rookies were involved in the design and creation of
this mess.

'Some waves have a variable [year] and some have [date] in what appears to
be numeric n4.  How can I turn values for [date] of 331 and 1028 into
month-day in mmm-dd format (Mar-31, Oct-28)or dd-mmm (31-Mar, 28_Oct)? '

John, Please see the basic functions STRING, CHAR.SUBSTR, DATE.MDY, etc or
MOD and TRUNC.
You have been doing this for a very long time and it is a shock that you
have never mastered these things ;-(

COMPUTE month=TRUNC(date/100).
COMPUTE day=MOD(date,100).
COMPUTE Date=DATE.MDY(month, day,year).
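
For anyone who wants to check the arithmetic outside SPSS, here is a small worked sketch in Python of the same TRUNC/MOD split, using the two values from the question. The function names are made up for the example, and the month abbreviations are hard-coded in English.

```python
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def split_mmdd(value):
    """Split a numeric month-day such as 331 or 1028 into (month, day)."""
    # divmod by 100 is the same split as TRUNC(date/100) and MOD(date,100).
    month, day = divmod(int(value), 100)
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError("not a month-day value: %r" % (value,))
    return month, day

def as_mmm_dd(value):
    """Format as mmm-dd, e.g. Mar-31."""
    month, day = split_mmdd(value)
    return "%s-%02d" % (MONTHS[month - 1], day)

def as_dd_mmm(value):
    """Format as dd-mmm, e.g. 31-Mar."""
    month, day = split_mmdd(value)
    return "%02d-%s" % (day, MONTHS[month - 1])

for raw in (331, 1028):
    print(raw, as_mmm_dd(raw), as_dd_mmm(raw))
# 331 Mar-31 31-Mar
# 1028 Oct-28 28-Oct
```

The range check is only a sanity test on the month and day parts; it does not know how many days each month has.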




Bruce Weaver wrote
> John, it sounds like one problem may be that you have string variables
> common to more than one file that are not formatted the same in all files.
> I'd check that very carefully.  Here is a nice way to do that (based
> on Andy W's post here:
> http://spssx-discussion.1045642.n5.nabble.com/a-useful-improvement-td5721327.html#a5721333):
>
> * Run the following ALTER TYPE command on all files to be merged.
> * Replace 255 with string width known to be large enough in all files.
>
> ALTER TYPE ALL (A = A255).
>
> * ADD FILES command here.
> * Run the following ALTER TYPE command on the merged file.
>
> ALTER TYPE ALL (A = AMIN).
>
>
> But never mind all that.  *Surely* you must want ADD FILES here, given the
> number of files.  You do want to stack them vertically, don't you?
>
>
> John F Hall wrote
>> Thought I'd got using ADD FILES instead, but still only got:
>>
>>
>>
>> From: John F Hall [mailto:johnfhall@]
>> Sent: 17 March 2016 19:21
>> To: '[hidden email]' <[hidden email]>
>> Subject: RE: Syntax to process several *.sav files
>>
>> I have added a variable [year] to each of the files by hand and
>> checked that the variable is there and that the year value has been
>> added correctly.
>> During this process there were messages for some files about Unicode
>> and strings with tables like this:
>>
>>
>>
>> Altered Types
>>
>> Date of interview by interviewer Q36    A24    AMIN
>> Computer Interview date Q37             A24    AMIN
>> Start time  HH:MM:SS Q38                A24    AMIN
>> Interviewer Number Q1412                A12    AMIN
>>
>>
>>
>> However, when I tried to run:
>>
>> match files
>>  file  'C:\Users\John\Desktop\SPSS files\bsa1983.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1984.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1985.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1986.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1987.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1989.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1990.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1991.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1993.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1994.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1995.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1996au.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1996bu.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1997a.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1998a.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa1999a.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2000.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2001.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2001soc.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2002.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2003.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2004.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2005.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2006.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2007.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2008.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2009.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2010.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2011.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2012.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2013.sav'
>> /file 'C:\Users\John\Desktop\SPSS files\bsa2014.sav'
>> /keep year rsex.
>>
>> Year and rsex seem to have been saved to an Untitled.sav file.
>>
>> freq year rsex.
>>
>> All I got was:
>>
>>
>> year Year of Interview
>>
>>                   Frequency   Percent   Valid Percent   Cumulative Percent
>> Valid    1983          1761      39.3            56.8                 56.8
>>          1985            43       1.0             1.4                 58.2
>>          1986          1296      28.9            41.8                100.0
>>          Total         3100      69.1           100.0
>> Missing  System        1386      30.9
>> Total                  4486     100.0
>>
>>
>>
>> rsex Q91A RESPONDENTS SEX
>>
>>                   Frequency   Percent   Valid Percent   Cumulative Percent
>> Valid    1 MALE        2051      45.7            45.7                 45.7
>>          2 FEMALE      2435      54.3            54.3                100.0
>>          Total         4486     100.0           100.0
>>
>>
>> Any idea what happened to all the other datasets and cases?
>>
>> John F Hall (Mr)
>> [Retired academic survey researcher]
>>
>> Email:   johnfhall@
>> www.surveyresearch.weebly.com/1-survey-analysis-workshop
>>
>>
>>
>>
>> From: John F Hall [mailto:johnfhall@]
>> Sent: 17 March 2016 17:43
>> To: '[hidden email]' <[hidden email]>
>> Subject: Syntax to process several *.sav files
>>
>> I am preparing exercises based on data sets from separate waves (1983 -
>> 2014) of the British Social Attitudes survey.  Some preliminary
>> comments on the structure and content of the files are on my page
>> Exploring British Social Attitudes
>> (http://surveyresearch.weebly.com/exploring-british-social-attitudes.html)
>> For instance some variables do not have missing values correctly
>> specified: consequently scales derived from them have incorrect values.
>>
>> Some waves have a variable [year] and some have [date] in what
>> appears to be numeric n4.  How can I turn values for [date] of 331
>> and 1028 into month-day in mmm-dd format (Mar-31, Oct-28)or dd-mmm
>> (31-Mar, 28_Oct)?
>>
>> I have listed all the files in a match files command (as yet
>> incomplete and untried).
>>
>> MATCH FILES
>> file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1983.sav'
>> /file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa1984.sav'
>> ~~ ~ ~ ~
>> file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2013.sav'
>> /file 'C:\Users\John\Desktop\British Social Attitudes\BSA 1983-2014\SPSS files\bsa2014.sav'
>> /keep <varlist to be decided>.
>>
>> Is there a quick way to compute variable [year] for each wave using
>> something like . .
>>
>> DO REPEAT
>> X = 1989 to 2014.
>> COMPUTE year = x.
>> END REPEAT
>>
>> . . such that [year] will pick up the value and add it to each file?
>>
>> . . or do I have to open each file one at a time  and add [year]
>> separately?
>>
>> John F Hall (Mr)
>> [Retired academic survey researcher]
>>
>> Email:   johnfhall@
>> www.surveyresearch.weebly.com/1-survey-analysis-workshop

>>
>>
>>
>>





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email
me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos
ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in
abyssum?"




 

--

Jon K Peck
[hidden email]

Re: Syntax to process several *.sav files

Rich Ulrich
In reply to this post by John F Hall
By the way, you do not want to use MATCH FILES.  In scanning this thread, with
its very long quotations, I did not see a mention of why MATCH FILES is wrong here.
The reason is exactly what it does:

MATCH FILES ordinarily is used with an ID variable to specify the matching.  When there is
no ID to match on, it provides a 1-to-1 match, effectively matching by Case number.  IMO, the
procedure should give a notification or warning when there is no matching variable,
because the 1-to-1 match, sequentially, is less likely to be "intentional" than it is "error".

The result (before using KEEP) is a file with the number of cases of the longest file.
Each successive file in the set of FILE=   adds its new variables to the Variable list, and (I think)
inserts its values in place of whatever values were there for the duplicated names -- I never
tried it, because that sort of use would be terrible practice (UPDATE, using ID, handles updating).
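
To make the difference concrete, here is a toy model in Python with three tiny made-up "waves". It is a sketch of the idea only, not of SPSS internals. In particular, which file wins for a duplicated variable name is assumed here to be the first one listed, which is the opposite of the guess above, so treat that detail as unverified.

```python
def parallel_match(files):
    """Toy 1-to-1 match: case i of the result is built from case i of each file."""
    n_cases = max(len(rows) for rows in files)
    result = []
    for i in range(n_cases):
        case = {}
        for rows in files:
            if i < len(rows):
                for name, value in rows[i].items():
                    # Assumption: the first file that has this case wins.
                    case.setdefault(name, value)
        result.append(case)
    return result

def stack(files):
    """Toy ADD FILES: every case from every file is kept, one after another."""
    return [dict(row) for rows in files for row in rows]

# Hypothetical waves with 3, 2 and 5 cases.
wave_a = [{"year": 1983} for _ in range(3)]
wave_b = [{"year": 1984} for _ in range(2)]
wave_c = [{"year": 1985} for _ in range(5)]

matched = parallel_match([wave_a, wave_b, wave_c])
stacked = stack([wave_a, wave_b, wave_c])

print(len(matched), [case["year"] for case in matched])
print(len(stacked))
```

Under that assumption the matched file has only as many cases as the longest wave, and the middle wave vanishes from [year] altogether, while stacking keeps all ten cases. That is at least the same shape as the frequency table posted earlier, where 1984 never appears.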

--
Rich Ulrich



