SPSSX Discussion

Merging numerous files via macro

Jim Van Overschelde

Merging numerous files via macro

I am new to the group so I hope I post this correctly.

I am processing lots of files with student assessment data. For one project I have to merge cases from ~120 different files.

Each file covers a different grade, test language, and administration (e.g., Math3Eng1_2011, Math3Sp1_2011, Math4Eng1_2011, Math5-1_2011, Math6_2011), and each school year has 26-34 files. I want to merge the cases from all files for a particular school year easily and efficiently. To complicate matters, the file naming structure changes from year to year. I don't want to merge the files one at a time because there are over 5M records in total and 25+ data passes take forever. I don't know how to merge multiple files in one data pass without hard-coding the number of files to merge, and that number varies.

I could take a directory listing from the Windows command prompt and then use an editor to add the prefixes and suffixes to each line so it can be pasted into a merge statement, but I would like a more automated process.

Does anybody have any good suggestions?

Kreischer,Resha M

Automatic reply: Merging numerous files via macro

I will be out of the office until the afternoon of July 30th. I will respond to your email upon my return.

Sincerely,

Resha

Jon K Peck

Re: Merging numerous files via macro

In reply to this post by Jim Van Overschelde

Easy to do with a few lines of code using Python programmability. You would need to install the Python Essentials from the SPSS Community site (www.ibm.com/developerworks/spssdevcentral) if you haven't already done that.

Then run this from a syntax window. Change the filespec line below to select the files you want, e.g.,
filespec = r"c:/data/*2011.sav"
I've assumed that they are all in the same directory.

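begin program.
import spss, glob

# Wildcard pattern selecting the files to merge; adjust as needed.
filespec = r"c:/temp/parts/e*.sav"

# Sort the matches so the merge order is reproducible.
files = sorted(glob.glob(filespec))
if not files:
    raise ValueError("No files match " + filespec)

# Build a single ADD FILES command with one /FILE subcommand per file,
# so that all files are combined in one data pass.
file_specs = " ".join('/FILE="%s"' % f for f in files)
cmd = "ADD FILES " + file_specs
spss.Submit(cmd)
end program.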

dataset name merged.
exec.
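
Since the goal is one merge per school year, here is a minimal sketch of how the file list could be split by year before the ADD FILES command is built. It is plain Python and runs outside SPSS. The helper names (group_by_year, build_add_files) and the example paths are made up for illustration, and the sketch assumes the year appears as a four-digit number somewhere in each file name, as in the examples from the question.

```python
import os
import re
from collections import defaultdict


def group_by_year(paths):
    """Group file paths by the last four-digit year (19xx or 20xx) in the file name.

    Assumption: each file name contains its school year as four digits.
    Files without a recognizable year are skipped.
    """
    groups = defaultdict(list)
    for path in paths:
        name = os.path.basename(path)
        years = re.findall(r"(?:19|20)\d{2}", name)
        if years:
            groups[years[-1]].append(path)
    return dict(groups)


def build_add_files(paths):
    """Return one ADD FILES command string with a /FILE subcommand per path."""
    if not paths:
        raise ValueError("no files to merge")
    subcommands = " ".join('/FILE="%s"' % p for p in sorted(paths))
    return "ADD FILES " + subcommands + "."


# Hypothetical file names modeled on the examples in the question.
example = [
    "c:/data/Math3Eng1_2011.sav",
    "c:/data/Math5-1_2011.sav",
    "c:/data/Math3Eng1_2012.sav",
]

for year, paths in sorted(group_by_year(example).items()):
    print("%s: %s" % (year, build_add_files(paths)))
```

Inside a begin program block, the list would come from glob.glob(filespec), and the string returned for the chosen year would be passed to spss.Submit, followed by the dataset name command as above.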

HTH,

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Jim Van Overschelde <[hidden email]>
To: [hidden email]
Date: 07/25/2012 01:33 PM
Subject: [SPSSX-L] Merging numerous files via macro
Sent by: "SPSSX(r) Discussion" <[hidden email]>

Jim Van Overschelde

Re: Merging numerous files via macro

I installed the Python Essentials and ran the code, but got no errors, no output, and no action.

I then removed the Python Essentials, reinstalled them as administrator, and it worked great!

Thanks Jon.

From: Jon K Peck [mailto:[hidden email]]
Sent: Wednesday, July 25, 2012 3:06 PM
To: Jim Van Overschelde
Cc: [hidden email]
Subject: Re: [SPSSX-L] Merging numerous files via macro