I have a list of excel files to be read and saved and other than the just writing a lot of syntax, is there a compact way to do this. I know (even) I can write a macro to read and save a file but how to feed a list of file names through
the macro, I don’t know. I suspect this is a perfect job for a python snippet and unlike in the past I’ll use it but there’s a learning curve for me and a question of whether the default install has all the needed prerequisites. Gene Maguin |
The SPSSINC PROCESS FILES extension command (Utilites > Process Data Files) can apply a file of syntax to a set of files selected by a wildcard expression. It defines file handles and macros for each selected file and then invokes the specified syntax file. In that file you would have syntax to open a file referring to the predefined file handle, do any necessary work, and then save the file. If you are saving the Excel file as a sav file, the predefined macros can be used to generate a sav file name using the base name of the Excel file. Here is an example from the syntax help for the command of what the syntax file would look like to open a batch of Excel files and save them as sav files using the predefined macros. define !out () !quote(!concat(!unquote(!eval(!JOB_OUTPUTDATADIR)),
!unquote(!eval(!JOB_DATAFILEROOT)),
".sav")) !enddefine.
GET DATA /TYPE=XLSX
/FILE=JOB_INPUTFILE
/SHEET=name 'Sheet1'
/CELLRANGE=full /READNAMES=on
/ASSUMEDSTRWIDTH=100.
SAVE OUTFILE=!out. The PROCESS FILES command might look like this, where the code above is in the specified syntax file. The command also specifies where the Viewer output goes and can write a log file if desired. SPSSINC PROCESS FILES INPUTDATA="c:\inputdata\*.xlsx" SYNTAX="c:\mycode\convertExcel.sps" OUTPUTDATADIR="c:\outputdata" CONTINUEONERROR=YES VIEWERDIR= "c:\outputdata" CLOSEDATA=YES MACRONAME="!JOB" LOGFILEMODE=APPEND. You can alternatively specify a file that lists all the input files to process instead of using the wildcard. SPSSINC PROCESS FILES may already be installed depending on your Statistics version, but if it isn't (check the Utilities menu), you can install it from the Utilities (V22-23) or Extensions (V24+) menu. You can get the full syntax help by typing the command name in the SE and pressing F1. On Fri, Dec 29, 2017 at 7:31 AM, Maguin, Eugene <[hidden email]> wrote:
|
Jon, thank you. I’d like to talk through how I’d implement this in the larger flow. (specifics: 43 excel files, each with 6 named sheets. Files are distinguished from one another by a two characters ID substring (first two characters) in the
filename string. The filename strings are composed of the ID substring, filler substring 1, contents type substring, filler substring 2, with all substring types having the same length and contents across all files in the set). So. My impression is that the spssinc process files command is just another command that can be pasted into a syntax file and run from there.
Easier to name the exel files in a text file rather than a macro. One line per filename-sheet name combination. In the example you have a get file command. GET DATA /TYPE=XLSX /FILE=JOB_INPUTFILE /SHEET=name 'Sheet1' /CELLRANGE=full /READNAMES=on /ASSUMEDSTRWIDTH=100. SAVE OUTFILE=!out. What does the !out reference on the same outfile command. I’ve never seen this before. My file names are going to need to have a name consisting of the ID substring
plus the contents type string. Is that manageable? How is save file naming controlled? Before the file is saved, I need to add a variable whose contents are the ID substring. Is this possible? It looks like doing some things to the file is possible
but it looks like the python programmability plug-in is required and since I have never used python, I doubt that I have this. Would this be true? How would I check for this and if not present, what is the sequence to get the correct version, install, and
operate it? Once each file is read and saved, the next step will be to aggregate the files and process contents further, which seems pretty easy and simple compared to the
read/save step. Thanks. Gene Maguin From: Jon Peck [mailto:[hidden email]]
The SPSSINC PROCESS FILES extension command (Utilites > Process Data Files) can apply a file of syntax to a set of files selected by a wildcard expression. It defines file handles and macros for each selected
file and then invokes the specified syntax file. In that file you would have syntax to open a file referring to the predefined file handle, do any necessary work, and then save the file. If you are saving the Excel file as a sav file, the predefined macros
can be used to generate a sav file name using the base name of the Excel file. Here is an example from the syntax help for the command of what the syntax file would look like to open a batch of Excel files and save them as sav files using the predefined macros. define !out () !quote(!concat(!unquote(!eval(!JOB_OUTPUTDATADIR)), !unquote(!eval(!JOB_DATAFILEROOT)), ".sav")) !enddefine. GET DATA /TYPE=XLSX
/FILE=JOB_INPUTFILE /SHEET=name 'Sheet1' /CELLRANGE=full /READNAMES=on /ASSUMEDSTRWIDTH=100. SAVE OUTFILE=!out. The PROCESS FILES command might look like this, where the code above is in the specified syntax file. The command also specifies where the Viewer output goes and can write a log file if desired. SPSSINC PROCESS FILES INPUTDATA="c:\inputdata\*.xlsx" SYNTAX="c:\mycode\convertExcel.sps" OUTPUTDATADIR="c:\outputdata" CONTINUEONERROR=YES VIEWERDIR= "c:\outputdata" CLOSEDATA=YES MACRONAME="!JOB" LOGFILEMODE=APPEND. You can alternatively specify a file that lists all the input files to process instead of using the wildcard. SPSSINC PROCESS FILES may already be installed depending on your Statistics version, but if it isn't (check the Utilities menu), you can install it from the Utilities (V22-23) or Extensions (V24+) menu. You can get the full syntax help by typing the command name in the SE and pressing F1. On Fri, Dec 29, 2017 at 7:31 AM, Maguin, Eugene <[hidden email]> wrote:
-- Jon K Peck |
"So. My impression is that the spssinc process files command is just another command that can be pasted into a syntax file and run from there. Easier to name the exel files in a text file rather than a macro. One line per filename-sheet name combination." Yes, extension commands are just like regular commands, once installed. "each with 6 named sheets" The example I posted assumed one sheet named sheet1, but you can write any appropriate GET DATA command or any other ordinary syntax. Presumably each sheet should be a separate file, so you would have six GET DATA and SAVE commands in the syntax file that would be executed for each Excel file. You would only list the file names in the input specification, not the sheet names, and refer to the input specification as FILELIST=“input data file specification” in the SPSSINC PROCESS FILES command. The format for that file is one name per line including the path to the file. The names in the file must be enclosed in double quotes (“). Anything following on the line is ignored. Blank lines and lines starting with # are ignored. I hope that the sheet names are all the same across files. If they are all different, we would need to build appropriate tables of the names. "What does the !out reference on the same outfile command. " That is on the SAVE command and refers to the macro defined earlier in the file that uses the predefined macro definitions generated by the PROCESS FILES command. I have copied the details on the automatic file handles and macros below from the syntax help. In the example, the output file name is the same as the input file name except with a different extension. In your case, you would need to construct the output name macro differently. "Before the file is saved, I need to add a variable" You can write any ordinary syntax (or Python code) in the syntax file. So, similarly to the !out macro defined in the example, you can extract any part of the filename using ordinary macro syntax exploiting the !JOB_DATAFILEROOT predefined macro. "it looks like the python programmability plug-in is required and since I have never used python, I doubt that I have this." This whole command is implemented in Python. The Python plugin is installed by default since around V22 of Statistics, although you can refuse it at installation time. One way to check whether it is installed is to run this code from the syntax window (be sure to select all three lines). begin program. print "Python is here" end program. I suggest that if you have further questions, we take this offline. From the syntax help... The file handles are as follows.
For a SAV file you could read the data with the command Macros are defined with these same names except starting with ”!“. Two additional macros are defined.
On Fri, Dec 29, 2017 at 9:04 AM, Maguin, Eugene <[hidden email]> wrote:
|
Free forum by Nabble | Edit this page |