Macro to run syntax on all datasets in a folder

Macro to run syntax on all datasets in a folder

Asbury
Hi all,

I have been programming with SPSS for over 15 years and I still cannot
figure out how to use macros, even after reading through postings here, the
SPSS manual and website. I routinely run the same syntax on batches of
DHS country files by using the INSERT command. Often these programs save
.sav files for each country. I know that there is a more efficient way to do
this using macros. A former researcher here used a macro that would open
each dataset in a folder, run all variable constructions and
computations, and then save an individual data file for each country file that
was read. Her syntax no longer works and I can't figure out the logic. So
instead my programming looks like what is presented below.

Basically I'd like to know how to write a macro that would
1. open each country data file (e.g. Kenya 2005) in the
'D:\Datasets\DHS\SSAfrica\' directory
2. run the recode program
3. save a new dataset for each country (e.g. Kenya 2005 births).

Thanks in advance for any guidance.
Asb



*Program 1- opens each country data file, executes recodes via Insert
command and saves new data file for each country.

cd  'I:\Datasets\External\DHS\Birth level\temp'.

Get file='D:\Datasets\DHS\SSAfrica\Kenya 2005.sav'.
Insert file='D:\Datasets\DHS\programs\recodes.sps'.
save outfile='Kenya 2005 births.sav'.

Get file='D:\Datasets\DHS\SSAfrica\Mali 2006.sav'.
Insert file='D:\Datasets\DHS\programs\recodes.sps'.
save outfile='Mali 2006 births.sav'.

Get file='D:\Datasets\DHS\SSAfrica\Ghana 2000.sav'.
Insert file='D:\Datasets\DHS\programs\recodes.sps'.
save outfile='Ghana 2000 births.sav'.

*********************end here.




*Program 2- contains code for variable constructions.

*******.
*Weight.
********.
compute wgt=v005/1000000.


*Create 3 category wealth variable---MUST BE BASED ON WEIGHTED FILE.
*descriptives vars=v191/stat min max mean.
*USING RANK COMMAND.
WEIGHT BY WGT.
rank variables=v191
/ntiles(3) into wlth3/print yes.
variable label wlth3 'Wealth 3-cat recode[v191]'.
value label wlth3 1 'lowest'  2 'middle' 3 'highest'.
freq vars=wlth3.



freq vars=m10$1 m10$2 m10$3 m10$4 m10$5 m10$6.
*initialize each birth flag to 0.
compute birth1a=0.
compute birth2a=0.
compute birth3a=0.
compute birth4a=0.
compute birth5a=0.
compute birth6a=0.


if m10$1 ge 1 birth1a=1.
if m10$2 ge 1 birth2a=1.
if m10$3 ge 1 birth3a=1.
if m10$4 ge 1 birth4a=1.
if m10$5 ge 1 birth5a=1.
if m10$6 ge 1 birth6a=1.
*currently pregnant.
if v213=1 curpreg=1.
variable label curpreg 'current pregnancy'.

freq vars=birth1a to birth6a.

if birth1a=1 and m10$1=2  mistim1a=1.
if birth1a=1 and m10$1 ne 2  mistim1a=0.

if birth2a=1 and m10$2=2 mistim2a=1.
if birth2a=1 and m10$2 ne 2  mistim2a=0.

if birth3a=1 and m10$3=2 mistim3a=1.
if birth3a=1 and m10$3 ne 2 mistim3a=0.

if birth4a=1 and m10$4=2 mistim4a=1.
if birth4a=1 and m10$4 ne 2  mistim4a=0.

*initialize for births 5 and 6.
*fyi Guinea 2005 doesn't have m10$5-6; initializing will create mistim5a-6a
with missing values and prevent error message.
compute mistim5a=$sysmis.
if birth5a=1 and m10$5=2 mistim5a=1.
if birth5a=1 and m10$5 ne 2 mistim5a=0.

compute mistim6a=$sysmis.
if birth6a=1 and m10$6=2 mistim6a=1.
if birth6a=1 and m10$6 ne 2 mistim6a=0.


if curpreg=1 and  v225=2 mistimc=1.
if curpreg=1 and v225 ne 2 mistimc=0.
variable label mistimc 'current pregnancy mistimed'.

***********.

if birth1a=1 and  m10$1=3 unwant1a=1.
if birth1a=1 and  m10$1 ne 3 unwant1a=0.

if birth2a=1 and  m10$2=3 unwant2a=1.
if birth2a=1 and  m10$2 ne 3 unwant2a=0.

if birth3a=1 and  m10$3=3 unwant3a=1.
if birth3a=1 and  m10$3 ne 3 unwant3a=0.

if birth4a=1 and  m10$4=3 unwant4a=1.
if birth4a=1 and  m10$4 ne 3 unwant4a=0.


*initialize for births 5 and 6.
*fyi Guinea 2005 doesn't have m10$5-6; initializing will create unwant5a-6a
with missing values and prevent error message.
compute unwant5a=$sysmis.
if birth5a=1 and  m10$5=3 unwant5a=1.
if birth5a=1 and  m10$5 ne 3 unwant5a=0.

compute unwant6a=$sysmis.
if birth6a=1 and  m10$6=3 unwant6a=1.
if birth6a=1 and  m10$6 ne 3 unwant6a=0.


if curpreg=1 and  v225=3 unwantc=1.
if curpreg=1 and v225 ne 3 unwantc=0.
variable label unwantc 'current pregnancy unwanted'.


*currently pregnant with no info on wantedness of pregnancy.
if curpreg=1 and missing(v225) unwantc=9.


SORT CASES BY caseid v001 v002 v003.
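
The per-birth mistimed/unwanted blocks in Program 2 all follow one pattern, so they could be generated rather than typed out. The following is only a minimal sketch in Python that prints equivalent SPSS statements; the function name `birth_recode_syntax` and its parameters are hypothetical, and it assumes the codes used above (m10 = 2 for mistimed, m10 = 3 for unwanted) and that only births 5 and 6 need initializing to system-missing. The current-pregnancy lines (mistimc, unwantc) are not covered.

```python
def birth_recode_syntax(births=range(1, 7), init_missing=(5, 6)):
    """Return SPSS IF statements for the mistimed/unwanted birth flags.

    Hypothetical helper: names and defaults are assumptions taken from
    the syntax in the post, not from any SPSS library.
    """
    lines = []
    # (output variable stem, m10 code that sets the flag to 1)
    for outcome, code in (("mistim", 2), ("unwant", 3)):
        for i in births:
            target = "%s%da" % (outcome, i)
            if i in init_missing:
                # Create the variable up front, as the post does for births 5 and 6.
                lines.append("compute %s=$sysmis." % target)
            lines.append("if birth%da=1 and m10$%d=%d %s=1." % (i, i, code, target))
            lines.append("if birth%da=1 and m10$%d ne %d %s=0." % (i, i, code, target))
    return "\n".join(lines)


if __name__ == "__main__":
    print(birth_recode_syntax())
```

The printed text can be pasted into recodes.sps in place of the hand-written blocks; changing `births` or `init_missing` adapts it to surveys with a different number of birth records.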

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Macro to run syntax on all datasets in a folder

Peck, Jon
If you can use Python, you will find in the spssaux2 module a ready-made function that will do all this.

applySyntaxToFiles is a function that takes a wildcard input sav file specification and a syntax file.  It applies the syntax to each matching sav file and, if desired, writes the transformed file to another directory.  It can also produce a separate Viewer file for each or collect them all together.
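
To illustrate the general idea (this is not the spssaux2 code, and it does not use the applySyntaxToFiles signature), here is a minimal sketch in plain Python that builds the GET / INSERT / SAVE block for every .sav file in a folder. The function name `build_batch_syntax`, its parameters, and the " births" suffix default are assumptions made for the example; the paths in the usage section are the ones from the original post.

```python
import glob
import os


def build_batch_syntax(input_dir, syntax_file, output_dir, suffix=" births"):
    """Return SPSS syntax that opens, recodes, and saves each .sav in input_dir.

    Hypothetical helper for illustration only; it produces text and does
    not itself run anything in SPSS.
    """
    blocks = []
    # sorted() keeps the run order stable from one session to the next.
    for sav_path in sorted(glob.glob(os.path.join(input_dir, "*.sav"))):
        stem = os.path.splitext(os.path.basename(sav_path))[0]
        out_path = os.path.join(output_dir, stem + suffix + ".sav")
        blocks.append(
            "GET FILE='%s'.\n"
            "INSERT FILE='%s'.\n"
            "SAVE OUTFILE='%s'.\n" % (sav_path, syntax_file, out_path)
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Paths taken from the original post; adjust to your own layout.
    print(build_batch_syntax(
        r"D:\Datasets\DHS\SSAfrica",
        r"D:\Datasets\DHS\programs\recodes.sps",
        r"I:\Datasets\External\DHS\Birth level\temp",
    ))
```

Run outside SPSS, this just prints syntax that can be saved and executed as usual. Inside a BEGIN PROGRAM block, the generated text could instead be handed to spss.Submit. Two caveats: paths containing an apostrophe would break the quoting, and if the output directory is the same as the input directory, a second run would also pick up the previously saved "births" files.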

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Asbury
Sent: Tuesday, September 16, 2008 12:51 PM
To: [hidden email]
Subject: [SPSSX-L] Macro to run syntax on all datasets in a folder
