|
Dear All,
I'm working on a project to import a batch of CSV files whose variables may differ slightly from file to file. For example, one file has variables A B C D and another has A C D E; I will analyze A, C and D, and in the end need to save the CSV files as .sav files.
I have tried many approaches, including Python and R within SPSS, but all failed. I'm new to Python and R. :P
Does anyone know how to use syntax to import CSV files into SPSS without fixing the variable format/name/number in advance? Then I could loop over the different CSV files.
Thanks very much!
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Check out the Python CSV reader.
-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Xenia Sent: Wednesday, February 25, 2009 3:10 PM To: [hidden email] Subject: How to import CSV(Dynamic variable number) to SPSS |
|
Thank you, Richard. But how do I save to SPSS files? I tried the csv reader in SPSS; it seems the data is in memory, but not in the active dataset. Do you have an example? :)
On Wed, Feb 25, 2009 at 5:26 PM, Oliver, Richard <[hidden email]> wrote: Check out the Python CSV reader. |
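On the "in memory, but not in the active dataset" point: csv.reader only hands the rows to Python as lists of strings, so nothing reaches SPSS until syntax is submitted. Below is a minimal sketch, assuming comma-separated files with the variable names in the first row. The helper names shared_variables and build_save are made up for illustration and are not part of the spss module. It finds the variables present in every file (A C D in the example from the question) and builds SAVE syntax that keeps only those.

```python
import csv


def shared_variables(csv_paths):
    """Return the names found in the header row of every file, in first-file order."""
    headers = []
    for path in csv_paths:
        with open(path, newline="") as handle:
            # Only the first row is read; an empty file yields an empty header.
            headers.append(next(csv.reader(handle), []))
    if not headers:
        return []
    return [name for name in headers[0] if all(name in h for h in headers[1:])]


def build_save(sav_path, keep):
    """Build SAVE syntax that writes the active dataset, keeping only `keep`.

    Assumes sav_path contains no single quotes.
    """
    return "save outfile = '%s'\n  /keep = %s.\n" % (sav_path, " ".join(keep))
```

Inside BEGIN PROGRAM, once a file has been read into the active dataset, the resulting string could be handed to spss.Submit, for example spss.Submit(build_save("d:/temp/file1.sav", keep)), where the path is only an example.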
|
There are probably better ways, but you only bring the data into Python to get the variable list (the contents of the first row of the CSV file). I haven't looked at it in a while, but I'm reasonably sure you can, in fact, just read in the first row of the CSV file. In Python you then construct GET DATA or DATA LIST syntax to actually read the data into SPSS. It's not a particularly robust solution, since it doesn't contain any information on data types; so unless the data are all numbers or all strings, it won't work very well. You could of course read the first few rows of data and try to evaluate data type. The simplest work around for the unknown data type problem is to read them all as very long strings, and then do the conversion in SPSS. Of course, if you know the data types in advance (for example A is always numeric, B is always string, C is always a date, etc.), this problem goes away. |
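The "read them all as very long strings" workaround can be sketched roughly as follows. This is an illustration under assumptions, not tested against SPSS here: the files are taken to be comma-separated with variable names in the first row, 255 is an arbitrary width, and read_header and build_get_data are hypothetical helper names. Only the first row is read in Python; the GET DATA subcommands mirror the ones used elsewhere in this thread.

```python
import csv


def read_header(csv_path):
    """Return the variable names from the first row of a comma-separated file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def build_get_data(csv_path, names, width=255):
    """Build GET DATA syntax that reads every column as a string of `width` bytes.

    Assumes csv_path contains no single quotes and that the names are
    already valid SPSS variable names.
    """
    lines = [
        "get data /type = txt",
        "  /file = '%s'" % csv_path,
        "  /delcase = line",
        "  /delimiters = ','",
        "  /arrangement = delimited",
        "  /firstcase = 2",
        "  /importcase = all",
        "  /variables =",
    ]
    # One "name format" pair per column, all strings; conversion happens later in SPSS.
    lines += ["    %s a%d" % (name, width) for name in names]
    return "\n".join(lines) + ".\n"
```

Within BEGIN PROGRAM the string would be passed to spss.Submit, after which the numeric columns can be converted in SPSS. Reading everything as a string avoids guessing types, at the cost of that extra conversion step.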
|
Hi!
I did something very similar a while ago and just modified the code I used. It assumes that the first row contains the variable names, that the file is tab-separated (it should be easy to use another delimiter), and that you have a directory called d:/temp. It reads the first two rows of each csv file in that directory and creates GET DATA syntax based on them, which is then applied using INSERT. Date variables are treated as strings. Fairly straightforward (but the devil was in the details ;-)

Cheers!!
Albert-Jan

* sample data.
set rng = mt mtindex = 12345.
input program.
loop #case = 1 to 100.
+compute n1 = rv.normal(0,1).
+compute n2 = rnd(rv.uniform(0,1)).
+compute n3 = rnd(rv.uniform(0,1)).
+string #alpha (a82) s1 to s3 (a4).
+compute #alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890~@#$%^&*()_+?><{}[]|'.
+compute s1 = substr(#alpha, trunc(rv.uniform(1, 82)), 1).
+compute s2 = substr(#alpha, trunc(rv.uniform(40, 82)), 1).
+compute s3 = substr(#alpha, trunc(rv.uniform(1, 40)), 1).
+end case.
+end loop.
+end file.
end input program.
save translate outfile='d:\temp\file1.csv' / type=tab /replace /fieldnames / keep = n1 to n3.
save translate outfile='d:\temp\file2.csv' / type=tab /replace /fieldnames / keep = s1 to s3.
save translate outfile='d:\temp\file3.csv' / type=tab /replace /fieldnames / keep = all.
save translate outfile='d:\temp\file4.csv' / type=tab /replace /fieldnames / keep = n1 s2 n2 n3 s3.

* actual code.
begin program.
# Written for Python 2 (binary file modes "rb"/"wb" with the csv module).
import glob, csv, os.path, spss
in_files = glob.glob("d:/temp/*.csv")
sps_file = open("d:/temp/mysps.sps", "wb")
for in_file in in_files:
    reader = csv.reader(open(in_file, "rb"), delimiter="\t")
    fmts = []
    for k, row in enumerate(reader):
        if k == 0:
            # first row: variable names
            fieldnames = row
        elif k == 1:
            # second row: guess a format per column from the first data value
            for col in row:
                if col.isdigit() or col[0] == "-":
                    fmts.append("f" + str(len(col)) + ".2")
                else:
                    fmts.append("a" + str(len(col)))
    # write one GET DATA command per csv file into the syntax file
    title = "\r\n** Syntax for file: %s." % os.path.basename(in_file)
    sps_file.write(title)
    header = r"""
get data
 /type = txt
 /file = '%s'
 /delcase = line
 /delimiters = '\t'
 /arrangement = delimited
 /firstcase = 2
 /importcase = all
 /variables = """ % in_file + "\r\n"
    sps_file.write(header)
    for var, fmt in zip (fieldnames, fmts):
        body = "\t" + var + " " + fmt + "\r\n"
        sps_file.write(body)
    footer = ".\r\ncache.\r\nexe.\r\ndataset name %s window=asis." \
             "\r\nvariable width all (12).\r\n" % os.path.basename(in_file)
    sps_file.write(footer)
sps_file.close()
# run the generated syntax
spss.Submit("insert file = 'd:/temp/mysps.sps'.")
end program.

--- On Thu, 2/26/09, Oliver, Richard <[hidden email]> wrote:
> From: Oliver, Richard <[hidden email]>
> Subject: Re: How to import CSV(Dynamic variable number) to SPSS
> To: [hidden email]
> Date: Thursday, February 26, 2009, 3:08 AM
>
> -----Original Message-----
> From: SPSSX(r) Discussion on behalf of Qunqun Xu
> Sent: Wed 2/25/2009 5:48 PM
> To: [hidden email]
> Subject: Re: How to import CSV(Dynamic variable number) to SPSS |
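The format guess in the program above looks only at the second row, and its isdigit test classifies a positive decimal such as 0.25 as a string. A possible variation, sketched here in plain Python 3 outside SPSS, samples several rows and treats a column as numeric only if every non-blank sampled value parses with float(). The name guess_formats, the 20-row sample and the minimum numeric width of 8 are assumptions made for illustration.

```python
import csv
from itertools import islice


def guess_formats(csv_path, delimiter="\t", sample_rows=20):
    """Guess a (name, format) pair per column from the first `sample_rows` data rows."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        names = next(reader, [])          # first row: variable names
        numeric = [True] * len(names)     # columns start out as numeric
        width = [1] * len(names)          # widest sampled value per column
        for row in islice(reader, sample_rows):
            for i, cell in enumerate(row[:len(names)]):
                cell = cell.strip()
                width[i] = max(width[i], len(cell))
                if cell:                  # blank cells do not decide the type
                    try:
                        float(cell)       # note: also accepts "1e5", "nan", "inf"
                    except ValueError:
                        numeric[i] = False
    formats = []
    for name, is_num, w in zip(names, numeric, width):
        # String widths come from the sample only, so longer values further
        # down the file would be truncated.
        fmt = "f%d.2" % max(w, 8) if is_num else "a%d" % w
        formats.append((name, fmt))
    return formats
```

The returned pairs can be written out as the /variables list in the same way as the fieldnames and fmts lists above. A column that is blank in every sampled row stays numeric in this sketch, and date columns still come out as strings.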
