Dear friends list
A total of 1000 interviewers typed their data, and each generated a file in txt format. All files are tab-separated, with the variable names in the first row. The problem is that there was no rule for naming the files, so each interviewer chose a different name. For example:

Clark.txt
Myllena.txt
interview.txt
w.txt

My problem is to import all these files and merge them automatically via a macro. Any suggestions? I'm working on it this Friday.

If the file names were standardized there would be no problem; the macro below handles that case. For example, suppose I want to merge 1000 files that share part of their name, located in c:\spss:

file1.sav
file2.sav
file3.sav
. . .
file1000.sav

/* Merge any number of files as long as they have the same variables.
DEFINE MacroMergeAllFiles (DiretorioComum=!TOKENS(1)/
 NomeComum=!TOKENS(1)/
 QtdeArquivos=!TOKENS(1))
!DO !FILE= 1 !TO !QtdeArquivos.
!IF (!FILE=1) !THEN
GET FILE= !DiretorioComum+!NomeComum+"1.sav".
!ELSE
ADD FILES /FILE=*
 /FILE=!DiretorioComum+!NomeComum+!QUOTE(!FILE)+".sav".
EXECUTE.
!IFEND.
!DOEND.
!ENDDEFINE.

/*MacroMergeAllFiles DiretorioComum='C:\spss\' NomeComum='file' QtdeArquivos=1000.

The problem is to perform the same procedure but with different file names.

Thanks to all.

Carlos Renato
Statistician - Brazil |
If you are on a Windows system, do something like this untested procedure. It is what we did before there was Python.

Put all the .txt files in a folder, e.g. c:\temptxt. Open the command window and run:

md c:\alltxt
cd c:\temptxt
copy *.txt c:\alltxt\combo.txt

Read combo.txt into SPSS with all fields as tab-separated strings. Each file's header row then arrives as a data row; "whatever" below stands for the variable name that appears in the first field of those header rows.

do if $casenum eq 1.
compute source = 1.
else if varname1 eq "whatever".
compute source = lag(source) + 1.
else.
compute source = lag(source).
compute keepit = 1.
end if.
select if keepit eq 1.
* this may be a rare instance where you might need this execute.
execute.
* before there was alter type we wrote the data out as text and read it in in a new format.
* see the help for alter type, you might be able to use 'all' as the variable list.
alter type id (f8) v1 v5 (f2) v2 to v4 v6 to v99 (f1) v100 (adate10).
* I have not tested what happens if a string variable is altered to a string variable to make your variable list more complete.
* you can use several alter type commands, one kind of specification shortens strings to the longest string necessary for that variable.

This way you can eliminate many warnings.

HTH

On 11/21/2011 8:38 AM, Carlos Renato wrote:
My problem is to import all these files and perform the merge of them automatically via a macro.

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Complex-merge-files-tp5010533p5010533.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants |
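For readers who do have Python available, the same idea (stack every file, drop the repeated header rows, keep a source id for each case) might be sketched as below. This is only an illustration under stated assumptions, not the procedure above: the function names `merge_tab_files` and `write_tab_file`, the added columns `source` and `source_file`, and the UTF-8 encoding are all choices made for this sketch.

```python
import csv
from pathlib import Path


def merge_tab_files(folder, pattern="*.txt"):
    """Stack every tab-delimited file in `folder`, tagging rows with their source.

    Assumes the first row of each file holds the variable names and that all
    files share the same variables, as stated in the original question.
    Returns (fieldnames, rows); each row is a dict of strings plus the tags.
    """
    fieldnames = None
    rows = []
    # sorted() keeps the source numbering reproducible between runs
    for source_id, path in enumerate(sorted(Path(folder).glob(pattern)), start=1):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            if fieldnames is None:
                fieldnames = ["source", "source_file"] + list(reader.fieldnames or [])
            for record in reader:
                # DictReader consumes the header row, so only data rows arrive here
                record["source"] = source_id
                record["source_file"] = path.name
                rows.append(record)
    return fieldnames or [], rows


def write_tab_file(out_path, fieldnames, rows):
    """Write the stacked rows as one tab-delimited file with a single header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
```

Write the combined file to a different folder (or with a different extension) than the inputs, so that a second run does not pick it up as one more interviewer file. The combined file can then be read into SPSS in a single pass.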
Since you are doing these one at a time,
you will make a lot of data passes, but if the files are not huge this
will not matter.
The easiest way to do this task would be to use the SPSSINC PROCESS FILES extension command from the SPSS Community site (www.ibm.com/developerworks/spssdevcentral). This command processes all the files in a given location, optionally matching a wildcard expression. It runs a batch of syntax that you specify over each file. It defines file handles and macros that you can use in the syntax job. The command appears on the Utilities menu as Process Data Files.

HTH,
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Art Kendall <[hidden email]>
To: [hidden email]
Date: 11/21/2011 07:30 AM
Subject: Re: [SPSSX-L] Complex merge files
Sent by: "SPSSX(r) Discussion" <[hidden email]>

If you are on a Windows system, do something like this untested procedure. |
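The general pattern described here (find every file that matches a wildcard, then run the same batch of syntax over each one) can be illustrated in plain Python. The sketch below is not the SPSSINC PROCESS FILES command and does not use its file handles or macros; the placeholders `${input_file}`, `${output_file}` and `${source_id}`, the function `build_jobs`, and the short variable list are all hypothetical. The GET DATA subcommands are taken from the macro posted later in this thread.

```python
from pathlib import Path
from string import Template

# Hypothetical per-file job. The ${...} placeholders are invented for this
# sketch; the /VARIABLES list must be adapted to the real data.
JOB_TEMPLATE = r"""GET DATA /TYPE=TXT /FILE="${input_file}"
  /DELCASE=LINE /DELIMITERS="\t" /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2 /IMPORTCASE=ALL
  /VARIABLES=ID F3.0 V1 A19.
COMPUTE source = ${source_id}.
SAVE OUTFILE="${output_file}".
"""


def build_jobs(folder, wildcard="*.txt", template=JOB_TEMPLATE):
    """Return one (file name, syntax text) pair per file matching `wildcard`."""
    jobs = []
    for source_id, path in enumerate(sorted(Path(folder).glob(wildcard)), start=1):
        syntax = Template(template).substitute(
            input_file=str(path),
            # Each input gets a .sav file with the same base name next to it
            output_file=str(path.with_suffix(".sav")),
            source_id=source_id,
        )
        jobs.append((path.name, syntax))
    return jobs
```

Each generated job reads one text file and saves it as a .sav file, so the per-file pass count noted above still applies; the .sav files can be combined afterwards with ADD FILES.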
In reply to this post by Carlos Renato
Hi Carlos,
If you wish to use traditional macros, you can go this way:

1. Create the list of all file names. Under Windows, you can e.g. create a text file named "list.bat" containing this line:

dir *.txt > list_of_files.dat

By running the bat file (double-click), you get the list in the file list_of_files.dat. Open it in MS Word and extract the names of the files (use a fixed-width font, so that the columns line up, and select rectangular blocks by holding the Alt key while dragging with the mouse).

2. Then process the list as an argument of the macro loop

!DO !varname !IN (list)

(Otherwise use Python, see Jon Peck's post.)

Best regards,
Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Carlos Renato
Sent: Monday, November 21, 2011 2:38 PM
To: [hidden email]
Subject: Complex merge files

My problem is to import all these files and perform the merge of them automatically via a macro.

_____________
This message and any attached files are confidential and intended solely for the addressee(s). Any publication, transmission or other use of the information by a person or entity other than the intended addressee is prohibited. If you receive this in error please contact the sender and delete the message as well as all attached documents. The sender does not accept liability for any errors or omissions as a result of the transmission. Are you sure that you really need a print version of this message and/or its attachments? Think about nature. |
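Step 1 above (getting a clean list of file names) could also be done without MS Word. The following is a minimal sketch that lists the .txt files in a folder and formats them as a quoted list that can be pasted into a macro call. The macro name `MergeByName` and its keyword `files` are hypothetical; they stand for whatever macro loops over the list with `!DO !varname !IN (list)`.

```python
import os


def list_data_files(folder, extension=".txt"):
    """Return the bare names of the files in `folder` with this extension, sorted."""
    names = [
        name for name in os.listdir(folder)
        if name.lower().endswith(extension)
        and os.path.isfile(os.path.join(folder, name))
    ]
    return sorted(names)


def format_macro_call(macro_name, names):
    """Build a macro call whose list argument holds one quoted token per file.

    Quoting keeps a name that contains spaces together as a single token.
    Names that contain an apostrophe are not handled by this sketch.
    """
    quoted = " ".join("'" + name + "'" for name in names)
    return macro_name + " files=" + quoted + "."
```

For the four example files in the question, `format_macro_call("MergeByName", names)` would produce a single line listing the four quoted names, ready to paste into a syntax window.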
Friends
Thank you so much for all the answers. I will address all suggestions. All are great. Carlos Renato Statistician |
In reply to this post by Jon K Peck
Jon,
Does that extension command include a way to add a variable that specifies the source file for a case? Identifying the source helps track down data anomalies. Also, unless the data files are proofread or double-entered and compared at the source, QA needs to be done by the receiver. The source id needs to be assigned in a consistent way so that keyings/enterings can be paired.

Art

On 11/21/2011 9:50 AM, Jon K Peck wrote:
Since you are doing these one at a time, you will make a lot of data passes, but if the files are not huge this will not matter.
Art Kendall
Social Research Consultants |
The procedure runs one file at a time.
The generated macros and file handles do identify the input file,
mainly so that the code can generate an output file based on the input
name in typical usage, but the syntax being run is anything you want.
Here is an extract from the help. The syntax file will be invoked for each input dataset. It should read the file and carry out any desired operations. File handles and macros are defined to refer to the input file and various output locations. The file handles are as follows.
GET FILE="JOB_INPUTFILE".
• Macros are defined with these same names except starting with "!". Two additional macros are defined.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Art Kendall <[hidden email]>
To: Jon K Peck/Chicago/IBM@IBMUS
Cc: [hidden email]
Date: 11/21/2011 11:40 AM
Subject: Re: [SPSSX-L] Complex merge files

Jon, Does that extension command include a way to include a variable that specifies the source file for a case? |
In reply to this post by Carlos Renato
Dear friends
In Windows:
- Select all files in the folder using CTRL+A.
- Press F2 and rename one file, for example to DATA.
- Windows applies the new name to all the selected files, adding an index in parentheses:

DATA (1).DAT
DATA (2).DAT
. . .
DATA (100).DAT

Using the macro below, we can merge all files in this format.

DEFINE MacroMergeAllTXT (!POSITIONAL !TOKENS(1)/
 !POSITIONAL !TOKENS(1)/
 !POSITIONAL !TOKENS(1))
!LET !Dir = !1.
!LET !NomeComum = !2.
!LET !QtdeArq = !3.
!DO !FILE= 1 !TO !QtdeArq.
GET DATA /TYPE=TXT
 /FILE=!QUOTE(!CONCAT(!UNQUOTE(!1),"\",!UNQUOTE(!2),"(",!File,")",".DAT"))
 /DELCASE=LINE
 /DELIMITERS="\t"
 /ARRANGEMENT=DELIMITED
 /FIRSTCASE=2
 /IMPORTCASE=ALL
 /VARIABLES= ID F3.0 V1 A19 V2 A19 V3 A18 V4 A19 V5 A18.
CACHE.
EXECUTE.
SAVE OUTFILE=!QUOTE(!CONCAT(!UNQUOTE(!Dir),"\",!UNQUOTE(!NomeComum),!File,"_TEMP.SAV")) /COMPRESSED.
NEW FILE.
!DOEND.
!DO !FILE= 1 !TO !QtdeArq.
!IF (!FILE=1) !THEN
GET FILE= !QUOTE(!CONCAT(!UNQUOTE(!Dir),"\",!UNQUOTE(!NomeComum),!File,"_TEMP.SAV")).
!ELSE
ADD FILES /FILE=*
 /FILE= !QUOTE(!CONCAT(!UNQUOTE(!Dir),"\",!UNQUOTE(!NomeComum),!File,"_TEMP.SAV")).
EXECUTE.
!IFEND.
!DOEND.
SAVE OUTFILE=!QUOTE(!CONCAT(!UNQUOTE(!Dir),"\",!UNQUOTE(!NomeComum),"_ALL_CASES.SAV")) /COMPRESSED.
NEW FILE.
!DO !FILE= 1 !TO !QtdeArq.
ERASE FILE = !QUOTE(!CONCAT(!UNQUOTE(!Dir),"\",!UNQUOTE(!NomeComum),!File,"_TEMP.SAV")).
!DOEND.
!ENDDEFINE.

MacroMergeAllTXT 'D:\Minicurso Teresina\Parte II - SPSS Macros\Exemplo Motivacional I\Data' 'data ' 20.

Carlos Renato
Statistician - Brazil |
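One side effect of renaming in Explorer is that the original file names, and with them the link to each interviewer, are lost. As a hedged alternative, the renaming step could be scripted so that the originals stay untouched and a mapping is kept. The sketch below copies each file to the indexed pattern that the macro expects; the function name `copy_with_index`, its parameters, and the choice to copy rather than rename are assumptions made for this example.

```python
import shutil
from pathlib import Path


def copy_with_index(src_folder, dst_folder, common_name="data",
                    pattern="*.txt", suffix=".DAT"):
    """Copy each matching file to '<common_name> (<n>)<suffix>' in dst_folder.

    Returns a list of (index, original name, new name) tuples so that every
    merged case can later be traced back to its original file.
    """
    dst = Path(dst_folder)
    dst.mkdir(parents=True, exist_ok=True)
    mapping = []
    # sorted() makes the index assignment reproducible between runs
    for index, path in enumerate(sorted(Path(src_folder).glob(pattern)), start=1):
        new_name = f"{common_name} ({index}){suffix}"
        shutil.copyfile(path, dst / new_name)
        mapping.append((index, path.name, new_name))
    return mapping
```

The length of the returned list is the file count to pass as the third macro argument, and the mapping can be saved alongside the merged data for QA.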