SPSSX Discussion

file name and time stamp as variables. help on SPSSAUX (?) requested

Maurice Vergeer

file name and time stamp as variables. help on SPSSAUX (?) requested

Dear all,

This is the problem: I have many (approx. 600) CSV files that I need to
import and finally merge into a single SPSS system file.
- The name of each CSV file needs to become a value of a variable.
- The date/time stamp of each file needs to become a value of a time variable.

Some time ago (two years) I asked for assistance with a very
similar problem, for which Albert-Jan Roskam provided me with the
Python code below. Back then it worked like clockwork, but it no
longer does.
I also understand there is a new package, SPSSAUX, which should do the
trick. However, going through the programming PDF and looking at
examples was not very helpful (mind you, I am not a real programmer).
So any hints on where to look, or a generic example of
SPSSAUX in the context of this problem, would be greatly appreciated.

Best regards,
Maurice

*** merge all separate files in one single text file.
BEGIN PROGRAM.
import glob, spss, csv, os
fs = sorted(glob.glob("V:/19082012/*.csv"))
merged = "d:/temp/merged.txt"
m = open(merged, "wb")
writer = csv.writer(m, delimiter="\t")
for fno, f in enumerate(fs):
    # progress message every 50 files
    if fno % 50 == 0:
        print "--> Processed file %s\n" % fno
    reader = csv.reader(open(f, "rU"), delimiter=",")
    # keep the header of the first file only
    if fno > 0:
        skipheader = reader.next()
    for lino, line in enumerate(reader):
        if fno == 0 and lino == 0:
            # header row: add a column for the source file name ("bestand" = file)
            writer.writerow(line + ["bestand"])
        else:
            writer.writerow(line + [os.path.basename(f)])
m.close()
cmd = r"""
new file.
get data /type = txt /file = '%s' /delcase = line /delimiters = "\t"
/arrangement = delimited /firstcase = 2 /importcase = all /variables =
x f18.2
source f1.0
bestand a40
.
cache.
fre source.
"""
print cmd % (merged)
spss.Submit(cmd % (merged))
spss.Submit("save outfile = '%s.sav'." % (merged[:-4]))
END PROGRAM.

--
___________________________________________________________________
Maurice Vergeer
To contact me, see http://mauricevergeer.nl/node/5
To see my publications, see http://mauricevergeer.nl/node/1
___________________________________________________________________

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Albert-Jan Roskam

Re: file name and time stamp as variables. help on SPSSAUX (?) requested

Hi Maurice,

I wonder why my code no longer works ;-) What errors are you getting?

Below are two ways to compute a date stamp, both based on the file's modification time. For a time stamp, use a format such as "%H:%M:%S" instead. The "time" method may be slightly faster, but the datetime method is really neat when you'd like to do arithmetic with dates/times.

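>>> import time, datetime, os
>>> f = "d:/temp/somefile.csv"
>>> # method 1 (time): ISO date of the file's last modification
>>> iso_mdate = time.strftime("%Y-%m-%d", time.localtime(os.path.getmtime(f)))

>>> # method 2 (datetime): days between the modification time and a reference moment ("peilmoment")
>>> mtime = datetime.datetime.fromtimestamp(os.path.getmtime(f))
>>> peilmoment = datetime.datetime(2012, 8, 14, 15, 0, 0, 0)
>>> print (mtime - peilmoment).days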
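
Combining the file name and a full date/time stamp, as asked in the original question, could look like the minimal sketch below. It uses Python 3 syntax and the standard library only; the helper name file_stamp and the default format string are assumptions for illustration, not something from the code in this thread.

```python
import datetime
import os


def file_stamp(path, fmt="%Y-%m-%d %H:%M:%S"):
    """Return (base name, formatted modification time) for one file."""
    # os.path.getmtime gives seconds since the epoch; convert to local time
    mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
    return os.path.basename(path), mtime.strftime(fmt)
```

Calling file_stamp("d:/temp/somefile.csv") would give the two values that could be appended to every row read from that file.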

Regards,
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Maurice Vergeer <[hidden email]>
To: [hidden email]
Sent: Tuesday, August 21, 2012 8:57 PM
Subject: [SPSSX-L] file name and time stamp as variables. help on SPSSAUX (?) requested

Albert-Jan Roskam

Re: file name and time stamp as variables. help on SPSSAUX (?) requested

In reply to this post by Maurice Vergeer

Hi again,

The following method does not need SPSS (http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/)

Note that it sometimes gives errors with e.g. Chinese variable names, so YMMV. The code below is untested.

import os, sys, time, glob, csv
# placeholder path: this is where the next .py file + spssio32.dll live
sys.path.append(r"\\server\share\folder\subfolder")
from SavReaderWriter import * # http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/

tempdir = os.getenv("temp")
header = ["x", "iso_mdate", "filename"]
# 0 = numeric variable, n > 0 = string variable of n bytes
varTypes = {'x': 0, 'iso_mdate': 10, 'filename': 200}
savFileName = os.path.join(tempdir, "combined.sav")
with SavWriter(savFileName, header, varTypes) as sav:
    for n, csvfile in enumerate(sorted(glob.glob(os.path.join(tempdir, "*.csv")))):
        with open(csvfile, "rb") as f:
            reader = csv.reader(f, delimiter=";")
            skipheader = reader.next()
            # modification date of the current csv file, as yyyy-mm-dd
            iso_mdate = time.strftime("%Y-%m-%d", time.localtime(os.path.getmtime(f.name)))
            for line in reader:
                sav.writerow(line + [iso_mdate, f.name])

Regards,
Albert-Jan

Maurice Vergeer

Re: file name and time stamp as variables. help on SPSSAUX (?) requested

Hi Albert-Jan,

Thanks. I tinkered with your initial code from two years back; it didn't
work because the CSV files are structured differently this time. The
fields are separated by commas, but text fields are also enclosed in
double quotes ("). The text fields contain commas themselves, so the
quote character has to be taken into account, and I do not yet see how
to implement this.
I will try to test your code. One thing though: I already have trouble
finding the suggested spssio32.dll file on the IBM site. Also, the 32
suggests this is a 32-bit DLL? I run 64-bit Windows, so would I need a
64-bit version?
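
For the quoting problem described above, Python's csv module can treat the double quote as a quote character rather than as a delimiter, so commas inside quoted text fields stay part of the field. A minimal sketch in Python 3 syntax; the helper name rows_with_source and the sample data are made up for illustration:

```python
import csv
import io


def rows_with_source(handle, source_name):
    """Yield each data row of a comma-delimited file plus the source name.

    Fields wrapped in double quotes may contain commas; the header row is skipped.
    """
    reader = csv.reader(handle, delimiter=",", quotechar='"')
    next(reader, None)  # skip the header row, if there is one
    for row in reader:
        yield row + [source_name]


# in-memory stand-in for one csv file with a quoted text field
sample = io.StringIO('x,text\n1,"hello, world"\n2,plain\n')
rows = list(rows_with_source(sample, "sample.csv"))
print(rows)  # [['1', 'hello, world', 'sample.csv'], ['2', 'plain', 'sample.csv']]
```

The double quote is already the csv module's default quotechar; it is spelled out here only to make the intent visible. With real files in Python 3, each file would be opened with open(path, newline="") and the handle passed in.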

I'll get back when I have more.

Thanks
Maurice

On Wed, Aug 22, 2012 at 1:24 PM, Albert-Jan Roskam <[hidden email]> wrote:
