brief syntax to read CSV data also reading string vars in CSV files


brief syntax to read CSV data also reading string vars in CSV files

Art Kendall
In testing whether SPSS could read a 400-variable CSV file, I used the first snippet of syntax below to generate a file.
However, despite my having requested such a capability for years, SPSS still needs a separate name and format for each variable. The wizard created very long syntax, and it cannot be edited down to something using "TO" like
/variables x1 to x400 (400f3)
Andy W reminded me that DATA LIST can use a comma as a separator.

*create file with 400 variables to see if SPSS can read a 400 variable CSV file.
new file.
input program.
  vector x (400,f3).
  loop id = 1 to 3.
     loop #p = 1 to 400.
        compute x(#p) = rnd(rv.normal(50,10)).
     end loop.
     end case.
  end loop.
  end file.
end input program.
SAVE TRANSLATE OUTFILE='C:\project\test400.csv'
 /TYPE=CSV
 /ENCODING='Locale'
 /MAP
 /REPLACE
 /FIELDNAMES
 /CELLS=VALUES.

This will read the file with short syntax, whereas the wizard would produce over 400 lines:
data list file='c:\project\test400.csv'
 list skip =1/x1 to x400 (400f8), id (f1).
execute.
---------------
As long as the variable widths are long enough, DATA LIST can read CSV files that have string variables, quoted or not quoted.

data list  list (",") skip=1/id(f1) plain(a20), quoted(a20).
begin data
id, plain, quoted
1, some words, 'quoted words'
2, a,b
end data.
list.

-- 
Art Kendall
Social Research Consultants

Re: brief syntax to read CSV data also reading string vars in CSV files

Andy W
What I was trying to say in my prior response was that if you have quotation delimiters for strings WITH COMMAS IN THEM, DATA LIST for CSV files won't work.

Example below:

*****************************************************.
File handle save /name = "C:\Documents and Settings\andrew.wheeler".
data list free / X1 to X3 (3A10).
begin data
willwork willwork 'wont,work'
'wont,work' where didIgo
end data.
dataset name orig.
SAVE TRANSLATE OUTFILE='save\test.csv'
  /TYPE=CSV /MAP /REPLACE /FIELDNAMES
  /CELLS=VALUES.
data list list (",") skip = 1 file = "save\test.csv" / X1 to X3 (3A10).
dataset name csv.
list.
ERASE FILE = "save\test.csv".
*****************************************************.

I guess I have been fortunate enough not to need to worry about this. Perhaps some string manipulation beforehand could replace the commas within quotations, and then you could put them back after the fact if it is a problem.
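
A minimal sketch of that pre-processing idea, written in Python rather than SPSS syntax and not part of the original post. It assumes the CSV uses double quotes as the text qualifier, and the function name mask_embedded_commas and the "|" placeholder are hypothetical choices; pick a placeholder that never occurs in your data.

```python
import csv
import io

PLACEHOLDER = "|"  # hypothetical stand-in; must not occur anywhere in the data


def mask_embedded_commas(csv_text, placeholder=PLACEHOLDER):
    """Return CSV text with commas inside quoted fields replaced.

    The csv module parses the quoted fields correctly, so any comma left
    inside a parsed field must have been an embedded one.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow([field.replace(",", placeholder) for field in row])
    return out.getvalue()


if __name__ == "__main__":
    sample = 'X1,X2,X3\nwillwork,willwork,"wont,work"\n"wont,work",where,didIgo\n'
    print(mask_embedded_commas(sample), end="")
```

The masked text no longer needs quotes, so data list list (",") should split it into the intended three fields. After reading, the commas could be restored in SPSS with something like compute X3 = replace(X3, "|", ","). for each affected string variable.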
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: brief syntax to read CSV data also reading string vars in CSV files

Jon K Peck
In reply to this post by Art Kendall
It would be straightforward to create an extension command equivalent to GET DATA /TYPE=TXT that accepted TO and a compact format specification, but is this really a problem?
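
To make the idea concrete, here is a rough Python sketch of the expansion step such a command would need. It is not an actual extension command: expand_to and build_get_data are hypothetical names, the subcommands are the ones the text wizard typically pastes, and the generated syntax has not been checked against a running SPSS.

```python
def expand_to(stem, first, last, fmt):
    """Expand a compact range such as x1 TO x400 into one spec per variable."""
    return [f"{stem}{i} {fmt}" for i in range(first, last + 1)]


def build_get_data(path, var_specs):
    """Assemble GET DATA /TYPE=TXT syntax for a comma-delimited file.

    Assumes a header row (hence FIRSTCASE=2) and double quotes as the
    text qualifier.
    """
    header = [
        "GET DATA /TYPE=TXT",
        f"  /FILE='{path}'",
        "  /DELIMITERS=','",
        "  /QUALIFIER='\"'",
        "  /ARRANGEMENT=DELIMITED",
        "  /FIRSTCASE=2",
        "  /VARIABLES=",
    ]
    body = ["  " + spec for spec in var_specs]
    body[-1] += "."  # the command terminator goes on the last variable line
    return "\n".join(header + body)


if __name__ == "__main__":
    # Same layout as the generated test file: x1 to x400, then id.
    specs = expand_to("x", 1, 400, "F3.0") + ["id F1.0"]
    print(build_get_data(r"C:\project\test400.csv", specs))
```

The long syntax still exists, but it is generated from a one-line description instead of being pasted and maintained by hand.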


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Art Kendall <[hidden email]>
To:        [hidden email],
Date:        08/14/2013 05:21 AM
Subject:        [SPSSX-L] brief syntax to read CSV data also reading string vars in CSV files
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





Re: brief syntax to read CSV data also reading string vars in CSV files

Art Kendall
It depends on how much emphasis is placed on readability.
If you run the syntax I posted to generate the CSV file and then use the wizard to paste syntax, you will notice that reading 400 variables results in a little over 400 lines of syntax. And you know what a bug I am about pasting syntax.

Compare that to a single line saying something like
 /variables= x1 to x400 (400f2) Id (f1).


However, I often edit the syntax that is pasted. I would want to replace the 400 lines with one line.

IIRC the use of "TO" was there when SPSS became available on non-IBM machines in 1972.
The workarounds I posted using DATA LIST were intended to enhance readability.
As Andy W pointed out, the workarounds will not always work, for example when strings contain embedded commas.

Art Kendall
Social Research Consultants
On 8/14/2013 8:57 AM, Jon K Peck [via SPSSX Discussion] wrote:
It would be straightforward to create an extension command equivalent to GET DATA /TYPE=TXT that accepted TO and a compact format specification, but is this really a problem?

