|
Hi to you all,
How can I find the date a excel file has been changed and use this to create value that I can save for each case? I would love to fine a way to make the exportdate automatic and dependent of excel-file change date, so that if there is a excel file that has been exported on a different day, it can be detected in SPSS data. I am importing regularly 17 excel files that I first convert to 17 SPSS files, which I then match to one large SPSS file. Some of the files provide cases, others provide variables. Having the dates of export, would allowe me to ensure that my data is consistent. There can be a timelag between export and import, and for me the exportdate is actually the most important date. These excel files are created manually, and things can go wrong. Now I make the assumption that all my files are exported on the same day, which of course does not need to be true if something has gone wrong. Thus, I could easily end up having one old excel file among several new ones. Now I use a pseudo code like for each of my 17 files. GET FILE ... * manuell update of exportdate. COMPUTE exportdate= DATE.DMY(12,06,2008) . VARIABLE LABELS exportdate 'Date for exporting information from database'. * automatic update of importdate. COMPUTE importdate =XDATE.DATE($time). VARIABLE LABELS importdate 'Date for importing information to SPSS'. EXECUTE. SAVE FILE... I suspect that this problem is a common one and a great solution exists allready. I am willing to use Python, as I cannot see how this can be done within SPSS, only. I am running my syntax files through the SPSS Production Mode Facility, if that would make any difference. It would be best if the syntax runs on both v15 and v16. Any suggestions? Sincerely, Eero Olli ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ====================To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Hi,
Based on a file I downloaded somehwere I used the python code below. Is this what you're looking for? Cheers!! Albert-Jan begin program. """ open all .sav files in a given dir and create a variable lastmod_date that indicates that file's last file modification date. save as originalfilename_modified.sav """ import os, glob, time, spss mydir = 'd:/temp/' for folder in glob.glob(mydir): for file in glob.glob(folder + '/*.sav'): stats = os.stat(file) lastmod_date = time.localtime(stats[8]) year = str(lastmod_date[0]) month = str(lastmod_date[1]).zfill(2) day = str(lastmod_date[2]).zfill(2) timestamp = year + month + day spss.Submit("""get file = '%s'. string lastmod_date (a8). compute lastmod_date = '%s'. variable label lastmod_date 'file modification date (year, month, day)'). save outfile = '%s_modified.sav'. """ %(file, timestamp, file)) end program. set mprint = on. --- On Fri, 6/27/08, Eero Olli <[hidden email]> wrote: > From: Eero Olli <[hidden email]> > Subject: How to find date of excel file change and save that as a value for a variable? > To: [hidden email] > Date: Friday, June 27, 2008, 3:26 PM > Hi to you all, > > How can I find the date a excel file has been changed and > use this to > create value that I can save for each case? > > I would love to fine a way to make the exportdate automatic > and > dependent of excel-file change date, so that if there is a > excel file > that has been exported on a different day, it can be > detected in SPSS > data. > > I am importing regularly 17 excel files that I first > convert to 17 SPSS > files, which I then match to one large SPSS file. Some of > the files > provide cases, others provide variables. Having the dates > of export, > would allowe me to ensure that my data is consistent. > There can be a > timelag between export and import, and for me the > exportdate is actually > the most important date. > > These excel files are created manually, and things can go > wrong. Now I > make the assumption that all my files are exported on the > same day, > which of course does not need to be true if something has > gone wrong. > Thus, I could easily end up having one old excel file among > several new > ones. > > Now I use a pseudo code like for each of my 17 files. > > GET FILE ... > > * manuell update of exportdate. > COMPUTE exportdate= DATE.DMY(12,06,2008) . > VARIABLE LABELS exportdate 'Date for exporting > information from > database'. > > * automatic update of importdate. > COMPUTE importdate =XDATE.DATE($time). > VARIABLE LABELS importdate 'Date for importing > information to SPSS'. > EXECUTE. > > SAVE FILE... > > > I suspect that this problem is a common one and a great > solution exists > allready. I am willing to use Python, as I cannot see how > this can be > done within SPSS, only. I am running my syntax files > through the SPSS > Production Mode Facility, if that would make any > difference. It would > be best if the syntax runs on both v15 and v16. > > Any suggestions? > > Sincerely, > Eero Olli > ________________________________________ > Eero Olli > Advisor > the Equality and Anti-discrimination Ombud > [hidden email] +47 2405 5951 > POB 8048 Dep, N-0031 Oslo, Norway > > > ====================To manage your subscription to SPSSX-L, > send a message to > [hidden email] (not to SPSSX-L), with no body > text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the > command > INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by Eero Olli
Here is another take on a solution. It takes a file wildcard specification and creates an SPSS dataset of names, dates in seconds, and readable dates.
Suppose the code below is saved in a file named inventory.py that is on the Python search path, e.g., in c:/python25/lib/site-packages (for 16) or c:/python24/lib/site-packages (for 15). Then you could use it as begin program. import spss, inventory count = inventory.createfilechangedataset("c:/temp/*.xls", "Excelfiles") print "File count:", count end program. This will work in 15 or 16 and requires the spssaux, spssdata, and namedtuple modules from SPSS Developer Central (www.spss.com/devcentral). The first date variable is the file date in seconds as returned by the os. The second one is an SPSS date variable holding day, month, year. HTH, Jon Peck (If the mailer mangles the line structure too much, I can send this offline as an attachment.) # create dataset of change times for all files of selected pattern import spss, os, glob, time from spssdata import vdef, Spssdata, yrmodasec def createfilechangedataset(filespec, dsname): """Create an SPSS dataset named dsname of all files matching filespec. Return count Contents are variables filepath and change time (in seconds).""" filelist = glob.glob(filespec) if not filelist: return 0 curs = Spssdata(accessType='n') curs.append(vdef('filespec', vfmt=("A", 100))) curs.append(vdef("changedata", vlabel="Date of most recent change in seconds")) curs.append(vdef("readabledate", vlabel="Date string of most recent change", vfmt=("DATE",12))) curs.commitdict() for f in filelist: changedate = os.stat(f).st_ctime readabledate = yrmodasec(time.localtime(changedate)[:3]) curs.appendvalue(0, f) curs.appendvalue(1, changedate) curs.appendvalue(2, readabledate) curs.CommitCase() curs.CClose() return len(filelist) -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Eero Olli Sent: Friday, June 27, 2008 7:27 AM To: [hidden email] Subject: [SPSSX-L] How to find date of excel file change and save that as a value for a variable? Hi to you all, How can I find the date a excel file has been changed and use this to create value that I can save for each case? I would love to fine a way to make the exportdate automatic and dependent of excel-file change date, so that if there is a excel file that has been exported on a different day, it can be detected in SPSS data. I am importing regularly 17 excel files that I first convert to 17 SPSS files, which I then match to one large SPSS file. Some of the files provide cases, others provide variables. Having the dates of export, would allowe me to ensure that my data is consistent. There can be a timelag between export and import, and for me the exportdate is actually the most important date. These excel files are created manually, and things can go wrong. Now I make the assumption that all my files are exported on the same day, which of course does not need to be true if something has gone wrong. Thus, I could easily end up having one old excel file among several new ones. Now I use a pseudo code like for each of my 17 files. GET FILE ... * manuell update of exportdate. COMPUTE exportdate= DATE.DMY(12,06,2008) . VARIABLE LABELS exportdate 'Date for exporting information from database'. * automatic update of importdate. COMPUTE importdate =XDATE.DATE($time). VARIABLE LABELS importdate 'Date for importing information to SPSS'. EXECUTE. SAVE FILE... I suspect that this problem is a common one and a great solution exists allready. I am willing to use Python, as I cannot see how this can be done within SPSS, only. I am running my syntax files through the SPSS Production Mode Facility, if that would make any difference. It would be best if the syntax runs on both v15 and v16. Any suggestions? Sincerely, Eero Olli ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ======= To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Thanks to all who have helped!
Jon's python script took me a big leap forward. Now I can simply include this in my Production job and get a summary of the filedate everytime I run the script. And it is easy to save the dates to existing files as new variables, when needed. Another alternative, which I realised first today, would be to insert the filenames and dates into a DOCUMENT statement to make sure that the datafile does allways contain information about date for exporting the data. *Show file dates.sps begin program. import spss, inventory count = inventory.createfilechangedataset("c:/temp/*.xls", "Excelfiles") print "File count:", count end program. DELETE VARIABLE changedata. RENAME VARIABLE readabledate=exportdate. SUMMARIZE /TABLES=filespec exportdate /FORMAT=VALIDLIST NOCASENUM TOTAL LIMIT=100 /TITLE='Export dates for all files' /MISSING=VARIABLE /CELLS=COUNT . I did find a "bug" in the inventory.py script. The date used was st_ctime, which shows creation date. I changed this to st_mtime, which shows modified time, inorder to get the date I was looking for. The modifed python file is below. # create dataset of change times for all files of selected pattern # Writen by Jon Peck # Modified by Eero Olli 30.06.2008 - changed from st_ctime to st_mtime import spss, os, glob, time from spssdata import vdef, Spssdata, yrmodasec def createfilechangedataset(filespec, dsname): """Create an SPSS dataset named dsname of all files matching filespec. Return count Contents are variables filepath and change time (in seconds).""" filelist = glob.glob(filespec) if not filelist: return 0 curs = Spssdata(accessType='n') curs.append(vdef('filespec', vfmt=("A", 100))) curs.append(vdef("changedata", vlabel="Date of most recent change in seconds")) curs.append(vdef("readabledate", vlabel="Date string of most recent change", vfmt=("DATE",12))) curs.commitdict() for f in filelist: changedate = os.stat(f).st_mtime readabledate = yrmodasec(time.localtime(changedate)[:3]) curs.appendvalue(0, f) curs.appendvalue(1, changedate) curs.appendvalue(2, readabledate) curs.CommitCase() curs.CClose() return len(filelist) Many thanks to all who have helped! Best, Eero ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Sorry about the st_ctime.
The stat structure provides st_atime for last access time st_mtime for last modification, and st_ctime, which on Windows is the creation time but is a little different on *nix platforms. BTW, many people don't know that if you look at the Windows Explorer details view, you see the modification date, but under View/Choose Details you can opt to see the other timestamps (as well as lots of other interesting and sometimes bizarre properties). Regards, Jon Peck -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Eero Olli Sent: Monday, June 30, 2008 3:37 AM To: [hidden email] Subject: [SPSSX-L] SV: [SPSSX-L] How to find date of excel file change and save that as a value for a variable? Thanks to all who have helped! Jon's python script took me a big leap forward. Now I can simply include this in my Production job and get a summary of the filedate everytime I run the script. And it is easy to save the dates to existing files as new variables, when needed. Another alternative, which I realised first today, would be to insert the filenames and dates into a DOCUMENT statement to make sure that the datafile does allways contain information about date for exporting the data. *Show file dates.sps begin program. import spss, inventory count = inventory.createfilechangedataset("c:/temp/*.xls", "Excelfiles") print "File count:", count end program. DELETE VARIABLE changedata. RENAME VARIABLE readabledate=exportdate. SUMMARIZE /TABLES=filespec exportdate /FORMAT=VALIDLIST NOCASENUM TOTAL LIMIT=100 /TITLE='Export dates for all files' /MISSING=VARIABLE /CELLS=COUNT . I did find a "bug" in the inventory.py script. The date used was st_ctime, which shows creation date. I changed this to st_mtime, which shows modified time, inorder to get the date I was looking for. The modifed python file is below. # create dataset of change times for all files of selected pattern # Writen by Jon Peck # Modified by Eero Olli 30.06.2008 - changed from st_ctime to st_mtime import spss, os, glob, time from spssdata import vdef, Spssdata, yrmodasec def createfilechangedataset(filespec, dsname): """Create an SPSS dataset named dsname of all files matching filespec. Return count Contents are variables filepath and change time (in seconds).""" filelist = glob.glob(filespec) if not filelist: return 0 curs = Spssdata(accessType='n') curs.append(vdef('filespec', vfmt=("A", 100))) curs.append(vdef("changedata", vlabel="Date of most recent change in seconds")) curs.append(vdef("readabledate", vlabel="Date string of most recent change", vfmt=("DATE",12))) curs.commitdict() for f in filelist: changedate = os.stat(f).st_mtime readabledate = yrmodasec(time.localtime(changedate)[:3]) curs.appendvalue(0, f) curs.appendvalue(1, changedate) curs.appendvalue(2, readabledate) curs.CommitCase() curs.CClose() return len(filelist) Many thanks to all who have helped! Best, Eero ________________________________________ Eero Olli Advisor the Equality and Anti-discrimination Ombud [hidden email] +47 2405 5951 POB 8048 Dep, N-0031 Oslo, Norway ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
