How to import CSV(Dynamic variable number) to SPSS

Xenia-3
Dear All,

I'm working on a project to import a batch of CSV files whose variables
may differ slightly from file to file. For example, one file has variables
A B C D and another has A C D E. I will analyze only A C D, and in the end
I need to save the CSV files as .sav files.

I have tried many approaches, including Python and R within SPSS, but all
of them failed. I'm new to Python and R. :P

Does anyone know how to use syntax to import CSV files into SPSS without
fixing the variable formats, names, or number in advance? Then I could
loop over the different CSV files.

Thanks very much!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: How to import CSV(Dynamic variable number) to SPSS

Oliver, Richard
Check out the Python CSV reader.

Re: How to import CSV(Dynamic variable number) to SPSS

Xenia-3
Thank you, Richard. But how do I save the data to SPSS files? I tried the csv reader in SPSS; it seems the data is in memory but not in the active dataset. Do you have an example? :)

On Wed, Feb 25, 2009 at 5:26 PM, Oliver, Richard <[hidden email]> wrote:
Check out the Python CSV reader.

Re: How to import CSV(Dynamic variable number) to SPSS

Oliver, Richard
There are probably better ways, but you only need to bring the data into Python to get the variable list (the contents of the first row of the CSV file). I haven't looked at it in a while, but I'm reasonably sure you can, in fact, read in just the first row of the CSV file. In Python you then construct GET DATA or DATA LIST syntax to actually read the data into SPSS. It's not a particularly robust solution, since the header row doesn't contain any information on data types; unless the data are all numbers or all strings, it won't work very well. You could of course read the first few rows of data and try to evaluate the data type. The simplest workaround for the unknown data type problem is to read all the variables as very long strings and then do the conversion in SPSS. Of course, if you know the data types in advance (for example, A is always numeric, B is always string, C is always a date, etc.), this problem goes away.
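
As a rough illustration of this idea (not code from the thread), the sketch below reads the header row plus a few sample rows and builds GET DATA syntax as a string. It assumes comma-delimited files and Python 3. The function names `guess_format` and `build_get_data`, the `F8.2` and `A255` formats, and the 20-row sample size are all illustrative choices. A column is read as numeric only if every non-blank sampled value parses as a number; everything else, dates included, falls back to a long string.

```python
import csv
import io
import itertools


def guess_format(values):
    """Return a numeric format if every non-blank sample parses as a number."""
    non_blank = [v.strip() for v in values if v.strip()]
    if not non_blank:
        return "A255"  # nothing to go on: fall back to a long string
    try:
        for v in non_blank:
            float(v)
    except ValueError:
        return "A255"  # at least one non-numeric value: read as a long string
    return "F8.2"


def build_get_data(path, lines, sample_rows=20):
    """Build GET DATA syntax from a CSV header row plus a few sample rows."""
    reader = csv.reader(lines)
    names = next(reader)  # first row holds the variable names
    samples = list(itertools.islice(reader, sample_rows))
    specs = []
    for i, name in enumerate(names):
        column = [row[i] for row in samples if i < len(row)]
        specs.append("    %s %s" % (name, guess_format(column)))
    return (
        "GET DATA /TYPE=TXT\n"
        "  /FILE='%s'\n"
        "  /DELCASE=LINE\n"
        "  /DELIMITERS=\",\"\n"
        "  /QUALIFIER='\"'\n"
        "  /ARRANGEMENT=DELIMITED\n"
        "  /FIRSTCASE=2\n"
        "  /VARIABLES=\n%s." % (path, "\n".join(specs))
    )


# Hypothetical in-memory file standing in for a real CSV.
sample = io.StringIO("A,B,C\n1,x,2.5\n2,y,\n")
print(build_get_data("d:/temp/file1.csv", sample))
```

With a real file, `open(path, newline="")` can be passed as `lines`, and the returned string can be handed to `spss.Submit` inside a BEGIN PROGRAM block, which is the step that actually puts the data into the active dataset.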


Re: How to import CSV(Dynamic variable number) to SPSS

Albert-Jan Roskam
Hi!

I did something very similar a while ago, so I just modified the code I used. It assumes that the first row contains the variable names, that the file is tab-separated (it should be easy to use another delimiter), and that you have a directory called d:/temp.

It reads the first two rows of each csv file in that directory and creates GET DATA syntax based on them, which is then run using INSERT. Date variables are treated as strings. Fairly straightforward (but the devil was in the details ;-)

Cheers!!
Albert-Jan

* sample data.
set rng = mt mtindex = 12345.
input program.
loop #case = 1 to 100.
+compute n1 = rv.normal(0,1).
+compute n2 = rnd(rv.uniform(0,1)).
+compute n3 = rnd(rv.uniform(0,1)).
+string #alpha (a82) s1 to s3 (a4).
+compute #alpha  = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890~@#$%^&*()_+?><{}[]|'.
+compute s1 = substr(#alpha, trunc(rv.uniform(1, 82)), 1).
+compute s2 = substr(#alpha, trunc(rv.uniform(40, 82)), 1).
+compute s3 = substr(#alpha, trunc(rv.uniform(1, 40)), 1).
+end case.
+end loop.
+end file.
end input program.

save translate outfile='d:\temp\file1.csv' / type=tab /replace /fieldnames / keep = n1 to n3.
save translate outfile='d:\temp\file2.csv' / type=tab /replace /fieldnames / keep = s1 to s3.
save translate outfile='d:\temp\file3.csv' / type=tab /replace /fieldnames / keep = all.
save translate outfile='d:\temp\file4.csv' / type=tab /replace /fieldnames / keep = n1 s2 n2 n3 s3.

* actual code.
begin program.

import glob, csv, os.path, spss
in_files = glob.glob("d:/temp/*.csv")
sps_file = open("d:/temp/mysps.sps", "wb")
for in_file in in_files:
    reader = csv.reader(open(in_file, "rb"), delimiter="\t")
    fmts = []
    for k, row in enumerate(reader):
        if k == 0:
            fieldnames = row
        elif k == 1:
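            for col in row:
                # Numeric if the first data value parses as a number (handles
                # signs and decimals); otherwise a string. Blanks become a1.
                try:
                    float(col)
                    fmts.append("f" + str(len(col)) + ".2")
                except ValueError:
                    fmts.append("a" + str(max(len(col), 1)))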
    title = "\r\n** Syntax for file: %s." % os.path.basename(in_file)
    sps_file.write(title)
    header = r"""
    get data  /type = txt
     /file = '%s'
     /delcase = line
     /delimiters = '\t'
     /arrangement = delimited
     /firstcase = 2
     /importcase = all
     /variables = """ % in_file + "\r\n"
    sps_file.write(header)
    for var, fmt in zip (fieldnames, fmts):
        body = "\t" + var + " " + fmt + "\r\n"
        sps_file.write(body)
    footer = ".\r\ncache.\r\nexe.\r\ndataset name %s window=asis." \
               "\r\nvariable width all (12).\r\n" % os.path.basename(in_file)
    sps_file.write(footer)
sps_file.close()
spss.Submit("insert file = 'd:/temp/mysps.sps'.")

end program.
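
The original question also asked to keep only the variables shared by all files (A C D in the example) and to save each file as .sav. A minimal sketch of that last step, assuming the header rows have already been collected as lists of names: `common_variables` and `build_save` are hypothetical helper names, and the comparison is a plain exact match, so differently cased names would need normalizing first.

```python
def common_variables(headers):
    """Return the variables present in every header, in first-file order."""
    if not headers:
        return []
    shared = set(headers[0])
    for names in headers[1:]:
        shared &= set(names)  # keep only names seen in every file so far
    return [name for name in headers[0] if name in shared]


def build_save(sav_path, keep):
    """Build SAVE syntax that writes only the listed variables."""
    return "SAVE OUTFILE='%s'\n  /KEEP=%s." % (sav_path, " ".join(keep))


# Headers from the example in the original question.
headers = [["A", "B", "C", "D"], ["A", "C", "D", "E"]]
keep = common_variables(headers)
print(build_save("d:/temp/file1.sav", keep))
```

One way to use this would be to write the SAVE command into the generated syntax file right after each GET DATA command, so that each file is saved while it is still the active dataset.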


