Read in mutliple .csv files

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Read in mutliple .csv files

David Millar
Dear All

I have about 100 data files for 100 stations - all with the same variables (I hope).  I've used a macro to do this sort of thing before and I need to do something similar this time.  My problem is that where I've done it before I've had nice, friendly file names like file1.csv, file2.csv etc.  This time the file names are all strings - some with spaces (and that is really throwing me)  like PARKVIEW LANE.csv, MILLWOOD.csv, NEWCASTLE ON WYE.csv.

So the following works for one file -


GET DATA /TYPE = TXT

 /FILE = 'C:\dmillar\PARKVIEW LANE.csv'

 /DELCASE = LINE

 /DELIMITERS = ","

 /ARRANGEMENT = DELIMITED

 /FIRSTCASE = 2

 /IMPORTCASE = all

 /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE = 'C:\dmillar\PARKVIEW LANE.sav'/

 KEEP=all.


However, I can't work out how to iterate it.  I've tried


DEFINE !bring (!POSITIONAL !CHAREND ('/')).

!DO 1 !TO 112

GET DATA /TYPE = TXT

 /FILE =!QUOTE(!CONCAT("C:\dmillar\",!1,".csv")).

 /DELCASE = LINE

 /DELIMITERS = ","

 /ARRANGEMENT = DELIMITED

 /FIRSTCASE = 2

 /IMPORTCASE = all

 /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE=!QUOTE(!CONCAT("C:\dmillar\",!1,".sav")).

EXECUTE.

!DOEND.

!ENDDEFINE.



!bring 'PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE'/



Any help would be very much appreciated.



Regards



David

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Read in mutliple .csv files

Rick Oliver-3
I'd use Python:

set mprint on.
begin program python.
import spss
filename=('PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE')
for i in filename:
  spss.Submit("""
GET DATA /TYPE = TXT
 /FILE = 'C:\dmillar\%(i)s.csv'
 /DELCASE = LINE
 /DELIMITERS = ","
 /ARRANGEMENT = DELIMITED
 /FIRSTCASE = 2
 /IMPORTCASE = all
 /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
SAVE OUTFILE = 'C:\dmillar\%(i)s.sav'
  """ %locals())
end program.
set mprint off.


From: David Millar <[hidden email]>
To: [hidden email]
Date: 08/27/2010 11:56 AM
Subject: Read in mutliple .csv files
Sent by: "SPSSX(r) Discussion" <[hidden email]>





Dear All

I have about 100 data files for 100 stations - all with the same variables (I hope).  I've used a macro to do this sort of thing before and I need to do something similar this time.  My problem is that where I've done it before I've had nice, friendly file names like file1.csv, file2.csv etc.  This time the file names are all strings - some with spaces (and that is really throwing me)  like PARKVIEW LANE.csv, MILLWOOD.csv, NEWCASTLE ON WYE.csv.

So the following works for one file -


GET DATA /TYPE = TXT

/FILE = 'C:\dmillar\PARKVIEW LANE.csv'

/DELCASE = LINE

/DELIMITERS = ","

/ARRANGEMENT = DELIMITED

/FIRSTCASE = 2

/IMPORTCASE = all

/VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE = 'C:\dmillar\PARKVIEW LANE.sav'/

KEEP=all.


However, I can't work out how to iterate it.  I've tried


DEFINE !bring (!POSITIONAL !CHAREND ('/')).

!DO 1 !TO 112

GET DATA /TYPE = TXT

/FILE =!QUOTE(!CONCAT("C:\dmillar\",!1,".csv")).

/DELCASE = LINE

/DELIMITERS = ","

/ARRANGEMENT = DELIMITED

/FIRSTCASE = 2

/IMPORTCASE = all

/VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE=!QUOTE(!CONCAT("C:\dmillar\",!1,".sav")).

EXECUTE.

!DOEND.

!ENDDEFINE.



!bring 'PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE'/



Any help would be very much appreciated.



Regards



David

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Reply | Threaded
Open this post in threaded view
|

Re: Read in mutliple .csv files

Albert-Jan Roskam
I'd do all the preprocessing in Python:

import csv, glob

def mergeCsv(infile, outfile, skipHeader = True):
    with open(outfile, "ab") as f:
        csv_reader = csv.reader(open(infile, "rb"))
        csv_writer = csv.writer(f)
        if skipHeader:
            csv_reader.next()
        for row in csv_reader:
            csv_writer.writerow(row)
for infile in glob.glob("c:/temp/*.csv"):
    mergeCsv(infile, outfile = "c:/temp/out/outfile.csv")
    print "Processing file %s" % infile
 
Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



From: Rick Oliver <[hidden email]>
To: [hidden email]
Sent: Fri, August 27, 2010 8:04:23 PM
Subject: Re: [SPSSX-L] Read in mutliple .csv files

I'd use Python:

set mprint on.
begin program python.
import spss
filename=('PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE')
for i in filename:
  spss.Submit("""
GET DATA /TYPE = TXT
 /FILE = 'C:\dmillar\%(i)s.csv'
 /DELCASE = LINE
 /DELIMITERS = ","
 /ARRANGEMENT = DELIMITED
 /FIRSTCASE = 2
 /IMPORTCASE = all
 /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
SAVE OUTFILE = 'C:\dmillar\%(i)s.sav'
  """ %locals())
end program.
set mprint off.


From: David Millar <[hidden email]>
To: [hidden email]
Date: 08/27/2010 11:56 AM
Subject: Read in mutliple .csv files
Sent by: "SPSSX(r) Discussion" <[hidden email]>





Dear All

I have about 100 data files for 100 stations - all with the same variables (I hope).  I've used a macro to do this sort of thing before and I need to do something similar this time.  My problem is that where I've done it before I've had nice, friendly file names like file1.csv, file2.csv etc.  This time the file names are all strings - some with spaces (and that is really throwing me)  like PARKVIEW LANE.csv, MILLWOOD.csv, NEWCASTLE ON WYE.csv.

So the following works for one file -


GET DATA /TYPE = TXT

/FILE = 'C:\dmillar\PARKVIEW LANE.csv'

/DELCASE = LINE

/DELIMITERS = ","

/ARRANGEMENT = DELIMITED

/FIRSTCASE = 2

/IMPORTCASE = all

/VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE = 'C:\dmillar\PARKVIEW LANE.sav'/

KEEP=all.


However, I can't work out how to iterate it.  I've tried


DEFINE !bring (!POSITIONAL !CHAREND ('/')).

!DO 1 !TO 112

GET DATA /TYPE = TXT

/FILE =!QUOTE(!CONCAT("C:\dmillar\",!1,".csv")).

/DELCASE = LINE

/DELIMITERS = ","

/ARRANGEMENT = DELIMITED

/FIRSTCASE = 2

/IMPORTCASE = all

/VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.

SAVE OUTFILE=!QUOTE(!CONCAT("C:\dmillar\",!1,".sav")).

EXECUTE.

!DOEND.

!ENDDEFINE.



!bring 'PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE'/



Any help would be very much appreciated.



Regards



David

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Reply | Threaded
Open this post in threaded view
|

Removing duplicate variable names (from Excel import)

Mike Pritchard

Is there a way to remove duplicate variable names (or rename so they aren’t duplicates)?

I imported a file from Excel into SPSS 16 (had to use this approach because the survey tool SPSS export is temporarily broken).  The import worked – which surprised me a bit because I knew the duplicates existed), but I can’t save it.

Manually deleting the variables should work, but there are such a lot that it seems like a better approach makes sense.   I’m going to see what I can do in Excel, but thought it might be worth asking here too.

Thanks!

Mike

 

Reply | Threaded
Open this post in threaded view
|

Re: Read in mutliple .csv files

Art Kendall
In reply to this post by David Millar
  if there is information in each file identifying it, and you are not
using the convention of having the first line be variable names, or if
you feel it is not worthwhile to know which station produced the file
and if you are using windows.

Put all the .csv file in a folder at the root of a device, e.g.,
c:\allcsv or J:\allcsv.
open the command prompt.
cd J:\allcsv
copy *.csv combo.asc
rename combo.asc combo.csv

read in combo.csv to SPSS.

Art Kendall

On 8/27/2010 12:57 PM, David Millar wrote:

> Dear All
>
> I have about 100 data files for 100 stations - all with the same variables (I hope).  I've used a macro to do this sort of thing before and I need to do something similar this time.  My problem is that where I've done it before I've had nice, friendly file names like file1.csv, file2.csv etc.  This time the file names are all strings - some with spaces (and that is really throwing me)  like PARKVIEW LANE.csv, MILLWOOD.csv, NEWCASTLE ON WYE.csv.
>
> So the following works for one file -
>
>
> GET DATA /TYPE = TXT
>
>   /FILE = 'C:\dmillar\PARKVIEW LANE.csv'
>
>   /DELCASE = LINE
>
>   /DELIMITERS = ","
>
>   /ARRANGEMENT = DELIMITED
>
>   /FIRSTCASE = 2
>
>   /IMPORTCASE = all
>
>   /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
>
> SAVE OUTFILE = 'C:\dmillar\PARKVIEW LANE.sav'/
>
>   KEEP=all.
>
>
> However, I can't work out how to iterate it.  I've tried
>
>
> DEFINE !bring (!POSITIONAL !CHAREND ('/')).
>
> !DO 1 !TO 112
>
> GET DATA /TYPE = TXT
>
>   /FILE =!QUOTE(!CONCAT("C:\dmillar\",!1,".csv")).
>
>   /DELCASE = LINE
>
>   /DELIMITERS = ","
>
>   /ARRANGEMENT = DELIMITED
>
>   /FIRSTCASE = 2
>
>   /IMPORTCASE = all
>
>   /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
>
> SAVE OUTFILE=!QUOTE(!CONCAT("C:\dmillar\",!1,".sav")).
>
> EXECUTE.
>
> !DOEND.
>
> !ENDDEFINE.
>
>
>
> !bring 'PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE'/
>
>
>
> Any help would be very much appreciated.
>
>
>
> Regards
>
>
>
> David
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Art Kendall
Social Research Consultants
Reply | Threaded
Open this post in threaded view
|

Re: Read in mutliple .csv files

Bruce Weaver
Administrator
For those who are not comfortable with DOS, you can accomplish the same thing with Alex Fauland's A.F.7 utility, found here:

   http://www.fauland.com/


Art Kendall wrote
  if there is information in each file identifying it, and you are not
using the convention of having the first line be variable names, or if
you feel it is not worthwhile to know which station produced the file
and if you are using windows.

Put all the .csv file in a folder at the root of a device, e.g.,
c:\allcsv or J:\allcsv.
open the command prompt.
cd J:\allcsv
copy *.csv combo.asc
rename combo.asc combo.csv

read in combo.csv to SPSS.

Art Kendall

On 8/27/2010 12:57 PM, David Millar wrote:
> Dear All
>
> I have about 100 data files for 100 stations - all with the same variables (I hope).  I've used a macro to do this sort of thing before and I need to do something similar this time.  My problem is that where I've done it before I've had nice, friendly file names like file1.csv, file2.csv etc.  This time the file names are all strings - some with spaces (and that is really throwing me)  like PARKVIEW LANE.csv, MILLWOOD.csv, NEWCASTLE ON WYE.csv.
>
> So the following works for one file -
>
>
> GET DATA /TYPE = TXT
>
>   /FILE = 'C:\dmillar\PARKVIEW LANE.csv'
>
>   /DELCASE = LINE
>
>   /DELIMITERS = ","
>
>   /ARRANGEMENT = DELIMITED
>
>   /FIRSTCASE = 2
>
>   /IMPORTCASE = all
>
>   /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
>
> SAVE OUTFILE = 'C:\dmillar\PARKVIEW LANE.sav'/
>
>   KEEP=all.
>
>
> However, I can't work out how to iterate it.  I've tried
>
>
> DEFINE !bring (!POSITIONAL !CHAREND ('/')).
>
> !DO 1 !TO 112
>
> GET DATA /TYPE = TXT
>
>   /FILE =!QUOTE(!CONCAT("C:\dmillar\",!1,".csv")).
>
>   /DELCASE = LINE
>
>   /DELIMITERS = ","
>
>   /ARRANGEMENT = DELIMITED
>
>   /FIRSTCASE = 2
>
>   /IMPORTCASE = all
>
>   /VARIABLES = name a20 id_code a8 county a20 province a20 monthly_n  f10.
>
> SAVE OUTFILE=!QUOTE(!CONCAT("C:\dmillar\",!1,".sav")).
>
> EXECUTE.
>
> !DOEND.
>
> !ENDDEFINE.
>
>
>
> !bring 'PARKVIEW LANE','MILLWOOD','NEWCASTLE ON WYE'/
>
>
>
> Any help would be very much appreciated.
>
>
>
> Regards
>
>
>
> David
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).