converting/appending a list of csv files to spss


converting/appending a list of csv files to spss

Bruce Colton

I have daily csv files that must be converted and then added or stacked together at the end of the month to create a single monthly spss dataset. The daily csv files all have the same variables, but the number of cases may vary by day.

 

My thought on this was to create an empty .sav file with variables, for example, Jan_09_day1, Jan_09_day2, ..., Jan_09_day28, with these variable names being the names of the 28 csv datasets. Then create a varlist within Python, using varlist.append to collect these 28 csv file names. Follow this in Python with:

for var in varlist:
    get data /type=txt .........
    save outfile= .....

 

I think this brief sketch would create the individual 28 .sav files I need.
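
A minimal sketch of one way to collect the file names, assuming the daily csv files all sit in one folder: a plain Python list built with glob could stand in for the empty .sav file whose variable names hold the file names. The helper name csv_to_sav_pairs and the idea of writing each .sav next to its csv are assumptions for illustration, not part of the original plan.

```python
import glob
import os


def csv_to_sav_pairs(folder):
    """Return (csv_path, sav_path) pairs for every .csv file in folder.

    Hypothetical helper: each .sav path is simply the csv path with the
    extension swapped, so the converted file lands next to its source.
    """
    pairs = []
    for csv_path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        sav_path = os.path.splitext(csv_path)[0] + ".sav"
        pairs.append((csv_path, sav_path))
    return pairs
```

Each pair could then be dropped into the GET DATA ... SAVE OUTFILE syntax and passed to spss.Submit inside the loop, one file at a time.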

 

To add all 28 .sav files:

 

within Python:

for i in range(spss.GetVariableCount()):
    if i > 0:
        add files
            /file=file <with GetVariableName(i-1)>
            /file=file <with GetVariableName(i)>.
exe.
save outfile=file <with GetVariableName(i)>.
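
As an alternative to adding the files pairwise, here is a hedged sketch that builds a single ADD FILES command naming every daily .sav file, followed by SAVE OUTFILE. The function name add_files_syntax and the example paths are hypothetical. To my understanding ADD FILES accepts up to 50 files per command, which would cover one month of daily files, but that limit is worth checking against the syntax reference for your version.

```python
def add_files_syntax(sav_paths, out_path, max_files=50):
    """Build SPSS syntax that stacks all sav_paths and saves the result.

    max_files=50 reflects my understanding of the ADD FILES limit;
    treat it as an assumption, not a documented guarantee.
    """
    if not sav_paths:
        raise ValueError("need at least one .sav file")
    if len(sav_paths) > max_files:
        raise ValueError("too many files for one ADD FILES command")
    lines = ["ADD FILES"]
    for sav_path in sav_paths:
        lines.append('  /FILE="%s"' % sav_path)
    lines[-1] += "."  # the command terminator goes on the last subcommand
    lines.append('SAVE OUTFILE="%s".' % out_path)
    return "\n".join(lines)


# Hypothetical file names, following the Jan_09_dayN pattern above.
syntax = add_files_syntax(
    ["d:/temp/Jan_09_day1.sav", "d:/temp/Jan_09_day2.sav"],
    "d:/temp/Jan_09.sav",
)
print(syntax)
```

Inside an SPSS session the resulting string could be handed to spss.Submit(syntax), which avoids overwriting the daily files with running totals.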

 

I can probably fill in most of the Python details myself, but I haven't done much work in Python, so I'm trying to get some feedback to see if I'm even on the right track with this approach. If there is a more efficient way to do it, and I'm sure there is, I would certainly appreciate hearing about it. Thanks.


 


Re: converting/appending a list of csv files to spss

Albert-Jan Roskam
hi!

Maybe not quite what you want, but perhaps still of use:

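import csv, glob, os.path

path = "d:/temp"
outname = os.path.join(path, "merged.csv")
# Skip the output file itself so a rerun does not read merged.csv back in.
csvs = sorted(f for f in glob.glob(os.path.join(path, "*.csv"))
              if os.path.abspath(f) != os.path.abspath(outname))
# Python 3: open csv files in text mode with newline="" (Python 2 used "ab"/"rb").
with open(outname, "a", newline="") as outfile:
    writer = csv.writer(outfile)
    for day, csvx in enumerate(csvs):
        # glob already returns the full path, so no second os.path.join is needed.
        with open(csvx, newline="") as infile:
            for row in csv.reader(infile):
                # Prefix each row with the 1-based day number.
                writer.writerow([day + 1] + row)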

It appends each individual csv to one big csv. The individual csvs are sorted by name, and the first one in sorted order is assumed to be day #1. The code assumes that the files have no header row; with headers it would still run, but each header line would be copied into the merged file as a data row.
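
One caveat on the sorting, sketched under the assumption that the files are named like Jan_09_day1.csv through Jan_09_day28.csv (names taken from the question, with the .csv extension assumed): a plain sorted() compares strings character by character, so day10 sorts before day2. A natural-sort key along these lines would keep the day numbers in order. The name natural_key is hypothetical.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so that 2 sorts before 10."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


names = ["Jan_09_day10.csv", "Jan_09_day2.csv", "Jan_09_day1.csv"]

print(sorted(names))
# ['Jan_09_day1.csv', 'Jan_09_day10.csv', 'Jan_09_day2.csv']

print(sorted(names, key=natural_key))
# ['Jan_09_day1.csv', 'Jan_09_day2.csv', 'Jan_09_day10.csv']
```

Passing key=natural_key to the sorted() call in the merge script should then make the day counter match the day in the file name.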

You could use one GET DATA statement to import it into spss and save it to .sav.
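
A rough sketch of what that single import might look like, written as a Python function that only builds the syntax string. The variable names and formats (day, customer_id, amount) are made up for illustration, since the real layout of the csv files is not given in this thread; the subcommands follow the usual pattern for a comma-delimited text file with no header row.

```python
def get_data_syntax(csv_path, sav_path, variables):
    """Build GET DATA + SAVE OUTFILE syntax for a comma-delimited file.

    variables is a list of (name, format) tuples; the ones used below
    are hypothetical placeholders, not the real variables.
    """
    var_spec = " ".join("%s %s" % (name, fmt) for name, fmt in variables)
    return "\n".join([
        "GET DATA /TYPE=TXT",
        '  /FILE="%s"' % csv_path,
        "  /DELCASE=LINE",
        '  /DELIMITERS=","',
        "  /QUALIFIER='\"'",
        "  /ARRANGEMENT=DELIMITED",
        "  /FIRSTCASE=1",  # no header row assumed, so data starts on line 1
        "  /VARIABLES=%s." % var_spec,
        'SAVE OUTFILE="%s".' % sav_path,
    ])


syntax = get_data_syntax(
    "d:/temp/merged.csv",
    "d:/temp/merged.sav",
    [("day", "F2.0"), ("customer_id", "F8.0"), ("amount", "F8.2")],
)
print(syntax)
```

From within SPSS the string could be run with spss.Submit(syntax). The day column comes first because the merge script writes the day number at the start of every row.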

Cheers!!
Albert-Jan

--- On Thu, 9/17/09, Bruce Colton <[hidden email]> wrote:

> From: Bruce Colton <[hidden email]>
> Subject: [SPSSX-L] converting/appending a list of csv files to spss
> To: [hidden email]
> Date: Thursday, September 17, 2009, 4:57 PM
