file name and time stamp as variables. help on SPSSAUX (?) requested

file name and time stamp as variables. help on SPSSAUX (?) requested

Maurice Vergeer
Dear all,

This is the problem: I have many (approx. 600) CSV files that I need to
import and finally merge into a single SPSS system file.
- the name of each CSV file needs to become a value of a variable
- the date/time stamp of each file needs to become a value of a time variable

Some time ago (two years) I asked for assistance with a very
similar problem, for which Albert-Jan Roskam provided me with this
Python code (see below), which back then worked like clockwork. However,
it no longer does.
I also understand there is a new package, SPSSAUX, which should do the
trick. However, going through the programming PDF and looking at
examples was not very helpful (mind you, I am not a real programmer).
So any hints on where to look, or a generic example of
SPSSAUX in the context of this problem, would be greatly appreciated.

best regards
Maurice

*** merge all separate files in one single text file.
BEGIN PROGRAM.
import glob, spss, csv, os
fs = sorted(glob.glob("V:/19082012/*.csv"))
merged = "d:/temp/merged.txt"
m = open(merged, "wb")
writer = csv.writer(m, delimiter="\t")
for fno, f in enumerate(fs):
  if fno % 50 == 0:
    print "--> Verwerkt file %s\n" % fno
  reader = csv.reader(open(f, "rU"), delimiter=",")
  if fno > 0:
    skipheader = reader.next()
  for lino, line in enumerate(reader):
    if fno == 0 and lino == 0:
      header = writer.writerow(line+ ["bestand"])
    else:
      writer.writerow(line + [os.path.basename(f)])
m.close()
cmd = r"""
new file.
get data /type = txt /file = '%s' /delcase = line /delimiters = "\t"
/arrangement = delimited /firstcase = 2 /importcase = all /variables =
x f18.2
source f1.0
bestand a40
.
cache.
fre source.
"""
print cmd % (merged)
spss.Submit(cmd % (merged))
spss.Submit("save outfile = '%s.sav'." % (merged[:-4]))
END PROGRAM.



--
___________________________________________________________________
Maurice Vergeer
To contact me, see http://mauricevergeer.nl/node/5
To see my publications, see http://mauricevergeer.nl/node/1
___________________________________________________________________

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: file name and time stamp as variables. help on SPSSAUX (?) requested

Albert-Jan Roskam
Hi Maurice,
 
Wonder why my code no longer works ;-) What errors are you getting?
Below are two ways to compute a date stamp; both use the file's modification date. To get a time stamp instead, use a format such as "%H:%M:%S". The "time" method may be slightly faster, but the datetime method is really neat when you'd like to do arithmetic with dates/times.
 
>>> import time, datetime, os
>>> f = "d:/temp/somefile.csv"
>>> iso_mdate = time.strftime("%Y-%m-%d", time.localtime(os.path.getmtime(f)))
>>> mtime = datetime.datetime.fromtimestamp(os.path.getmtime(f))
>>> peilmoment = datetime.datetime(2012, 8, 14, 15, 0, 0, 0)
>>> print (mtime - peilmoment).days
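
The two approaches above can be wrapped into small helpers. This is only a sketch, assuming Python 3 (the session above is Python 2); the function names `file_stamps` and `days_since` are made up for illustration, and the demo file is a throwaway created just so the snippet runs anywhere.

```python
import datetime
import os
import tempfile


def file_stamps(path):
    """Return (date, time) strings for a file's last-modification moment."""
    mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
    return mtime.strftime("%Y-%m-%d"), mtime.strftime("%H:%M:%S")


def days_since(path, reference):
    """Whole days between a reference datetime and the file's modification time."""
    mtime = datetime.datetime.fromtimestamp(os.path.getmtime(path))
    return (mtime - reference).days


# Demonstration on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"x\n1\n")
demo_path = tmp.name

# Pin the modification time to a known local moment so the result is predictable.
known = datetime.datetime(2012, 8, 21, 20, 57, 0)
os.utime(demo_path, (known.timestamp(), known.timestamp()))

iso_date, iso_time = file_stamps(demo_path)
elapsed_days = days_since(demo_path, datetime.datetime(2012, 8, 14, 15, 0, 0))
print(iso_date, iso_time, elapsed_days)
os.remove(demo_path)
```

Both strings can be appended to every row of a file before it is written out, which gives a date variable and a time variable per source file.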
 
Regards,
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
From: Maurice Vergeer <[hidden email]>
To: [hidden email]
Sent: Tuesday, August 21, 2012 8:57 PM
Subject: [SPSSX-L] file name and time stamp as variables. help on SPSSAUX (?) requested


Re: file name and time stamp as variables. help on SPSSAUX (?) requested

Albert-Jan Roskam
In reply to this post by Maurice Vergeer
Hi again,
 
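The following method does not need SPSS (http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/). Note that it sometimes gives errors with e.g. Chinese variable names, so YMMV. The code below is untested.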
 
import os, sys, time, glob, csv
sys.path.append(r"\\server\share\folder\subfolder") # placeholder path: this is where the next .py file + spssio32.dll live.
from SavReaderWriter import * # http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/
tempdir = os.getenv("temp")
header = ["x", "iso_mdate", "filename"]
varTypes = {'x': 0, 'iso_mdate': 10, 'filename': 200} # 0 = numeric, n > 0 = string of n bytes
savFileName = os.path.join(tempdir, "combined.sav")
with SavWriter(savFileName, header, varTypes) as sav:
  for n, csvfile in enumerate(sorted(glob.glob(os.path.join(tempdir, "*.csv")))):
    with open(csvfile, "rb") as f:
      reader = csv.reader(f, delimiter=";")
      skipheader = reader.next() # skip each file's header row (Python 2 syntax)
      iso_mdate = time.strftime("%Y-%m-%d", time.localtime(os.path.getmtime(f.name)))
      for line in reader:
        sav.writerow(line + [iso_mdate, f.name])
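
For readers who do not have the SavReaderWriter recipe at hand, the same idea (append the file name and modification stamp to every row) can be sketched with the standard library only, producing one tab-delimited file that could then be read with GET DATA as in the original post. This assumes Python 3; the function name `merge_csv_files` and the column names `filename` and `mdatetime` are made up for illustration, and the two demo files are throwaway data.

```python
import csv
import glob
import os
import tempfile
import time


def merge_csv_files(pattern, merged_path, delimiter=","):
    """Merge all CSV files matching pattern into one tab-delimited file.

    Every data row gets two extra columns: the source file name and the
    file's modification date/time. Returns the number of data rows written.
    """
    rows_written = 0
    header_written = False
    with open(merged_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for path in sorted(glob.glob(pattern)):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S",
                                  time.localtime(os.path.getmtime(path)))
            with open(path, newline="") as f:
                reader = csv.reader(f, delimiter=delimiter)
                header = next(reader, None)
                if header is None:
                    continue  # skip completely empty files
                if not header_written:
                    # Only the first header is kept; later ones are dropped.
                    writer.writerow(header + ["filename", "mdatetime"])
                    header_written = True
                for row in reader:
                    writer.writerow(row + [os.path.basename(path), stamp])
                    rows_written += 1
    return rows_written


# Demonstration with two tiny throwaway files.
workdir = tempfile.mkdtemp()
for name, value in (("a.csv", "1.5"), ("b.csv", "2.5")):
    with open(os.path.join(workdir, name), "w", newline="") as f:
        f.write("x\n%s\n" % value)

# The merged file gets a .txt extension so the *.csv pattern cannot pick it up.
merged = os.path.join(workdir, "merged.txt")
n_rows = merge_csv_files(os.path.join(workdir, "*.csv"), merged)
with open(merged, newline="") as f:
    merged_rows = list(csv.reader(f, delimiter="\t"))
print(n_rows, merged_rows[0])
```

The sketch assumes all files share the same column layout; files with a different structure would need their own handling.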
 
Regards,
Albert-Jan



Re: file name and time stamp as variables. help on SPSSAUX (?) requested

Maurice Vergeer
Hi Albert-Jan,

thanks. Tinkering with your initial code from two years back, I found it didn't
work because the csv files are structured differently. For instance,
fields are separated by commas, but text fields are also enclosed in
double quotes ("). I do not yet see how to handle this. The text field
contains commas itself, so the " is needed as a quote character.
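
A minimal sketch of reading such data, assuming Python 3: the csv module's `quotechar` argument (which already defaults to a double quote) keeps commas inside quoted text fields from splitting the field. The sample data below is invented for illustration.

```python
import csv
import io

# Hypothetical sample: the second field is quoted text containing commas,
# and a doubled "" inside quotes stands for one literal quote character.
sample = 'id,text,score\n1,"hello, world, again",3.5\n2,"he said ""hi""",4\n'

reader = csv.reader(io.StringIO(sample), delimiter=",", quotechar='"')
header = next(reader)   # first row holds the column names
records = list(reader)  # remaining rows, one list of strings per row

for record in records:
    print(record)
```

Each record comes back with exactly three fields, so the rows can be written to a tab-delimited file without the embedded commas shifting the columns.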
I will try to test your code. One thing, though: I already have trouble
finding the suggested spssio32.dll file on the IBM site. Also, the 32
suggests this is a 32-bit DLL? I run 64-bit Windows, so would I need a
64-bit version?
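
On the 32- versus 64-bit question: what decides which DLL can be loaded is the bitness of the Python interpreter rather than of Windows itself, since a 32-bit process cannot load a 64-bit DLL and vice versa. A small sketch to check this follows; the helper `io_library_name` is hypothetical and simply assumes that a 64-bit build follows the same naming scheme as spssio32.dll.

```python
import platform
import struct

# Size of a pointer in bytes, times 8, gives the bitness of the running interpreter.
python_bits = struct.calcsize("P") * 8
print("Python interpreter: %d-bit" % python_bits)
print("Machine type reported by the OS:", platform.machine())


def io_library_name(bits):
    """Hypothetical helper: guess the I/O library file name for a given bitness.

    Assumes the naming scheme spssio<bits>.dll; check the actual file names
    shipped with the I/O module before relying on this.
    """
    return "spssio%d.dll" % bits


print("Library to look for:", io_library_name(python_bits))
```

A 32-bit Python installed on 64-bit Windows would still need the 32-bit library.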

I'll get back when I have more.

Thanks
Maurice

On Wed, Aug 22, 2012 at 1:24 PM, Albert-Jan Roskam <[hidden email]> wrote:

> Hi again,
>
> The following method does not need spss
> (http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/)
> Note that it sometimes gives errors with e.g. chinese variabele names, so
> YMMV. The code below is untested.