SPSSX Discussion

import csv-file with GET DATA or something

Moon Kid

import csv-file with GET DATA or something

I create and declare my variables (nearly 300) with all details by script.
Now I want to import the csv-file which contains the data for these
variables.

But using "Read Text-Data" and reading the GET DATA documentation,
this does not seem to be possible.

GET DATA wants me to specify VARIABLES. But they already exist.
I want to import the data into the already existing variables.

How can I solve that?

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

David Marso

Re: import csv-file with GET DATA or something

Administrator

Maybe take this afternoon to familiarize yourself with DATA LIST.
RTFM particularly the Universals section.

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Jon K Peck

Re: import csv-file with GET DATA or something

In reply to this post by Moon Kid

Commands such as GET DATA, DATA LIST, etc. define the variables as part of the process of reading them. Would you explain why you want to separate these tasks? You can, of course, separate the metadata definition from the name and the fundamental type (string or numeric), and apply it after the data are read. Besides using the various syntax commands for labels, missing values, etc., you can create an empty file (all the variables but no cases) and then use it as a template, applying its definitions with APPLY DICTIONARY to the new dataset created by GET etc.
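
A minimal sketch of the APPLY DICTIONARY route, assuming the fully defined but empty dataset was saved beforehand. The paths (template.sav, example.csv) and the three variables are hypothetical stand-ins, not taken from the original post.

[code]
* Read the raw data so that names and basic types are set here.
GET DATA /TYPE=TXT
  /FILE="C:\data\example.csv"
  /DELIMITERS=","
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /VARIABLES=
    v01 A64
    v02 F2.0
    v03 F2.0.
DATASET NAME imported.

* Copy labels, missing values and measurement level from the template.
APPLY DICTIONARY FROM="C:\data\template.sav".
[/code]

With no SOURCE or TARGET subcommand, the properties should be applied to every variable in the active dataset that matches a template variable by name and basic type.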

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

Moon Kid

Re: import csv-file with GET DATA or something

Hi Kim

On 2014-03-22 12:38 Jon K Peck <[hidden email]> wrote:
> Commands such as GET DATA, DATA LIST etc. define the variables as part
> of the process of reading them.

I understand that. So there is no way in SPSS to read data
into existing variables? Reading data from an external source in
SPSS always means creating new variables? Am I right?

> Would you explain why you want to separate these tasks?

It is a simple concept in data processing to separate data from its
definition. Nearly all data processing software (databases, programming
languages, description languages, statistics packages, ...) follows this
simple concept. It makes the design easier. btw: As a database
developer it is my expertise. But maybe this is my problem, too. I try
to handle the task my way - which is of course not the SPSS way.

> You can, of course, separate out the metadata
> definition apart from the name and the fundamental type (string or
> numeric), and apply that after the data are read.
> [...]
> definitions using APPLY DICTIONARY as a template to the new dataset

APPLY DICTIONARY sounds like a nice workaround. But now it is too late
for that. I think I will go the DATA LIST way. The modifications to my
script (SPSS would say "Syntax") are a little bit smaller that way.
But I have to type nearly 300 field names again. ,(

Richard Ristow

Re: import csv-file with GET DATA or something

At 04:17 PM 3/22/2014, Moon Kid wrote:

>It is a simple concept in data processing to separate data from its
>definition. Nearly all data processing software (databases,
>programming languages, description languages, statistics packages,
>...) follows this simple concept. It makes the design easier.
>btw: As a database developer it is my expertise. But maybe this is
>my problem, too. I try to handle the task my way - which is of
>course not the SPSS way.

Of course, most of us on the list are used to working with SPSS, and
'thinking' its way: that declaring the variables is part of reading,
or importing, data.

You're right: most relational database systems, in particular, don't
work that way. They 'think' in terms of your defining a data
structure, then populating it. SPSS (and, last I was aware of, SAS,
and some others) 'think' in terms of files being dynamically defined
to represent their source -- in a way, taking data source as primary,
rather than structure as primary.

You'll run into this in other contexts than reading data. For
example, MATCH FILES, which is distantly analogous to a join, doesn't
read data into a file; it creates the file, structure and all, from
the input files and the parameters to the command.

I'm not surprised that this takes some getting used to; but it is a
workable paradigm, and can accomplish quite a range of data-manipulation tasks.

If I were moving from a relational system to SPSS, I think I'd write
out the design, but not try to realize the design in SPSS; then write
the SPSS syntax (or code) to read whatever data I had, in a way that
matches the design. Basically, ETL taking primacy over design, rather
than the other way around.

Which leads to topics like controlling normalization. A lot of times,
one works with SPSS files that aren't normalized: that have, for
example, patient data in the same records as medical visit data. But
in any case, you'll want to learn tricks for normalizing and
de-normalizing. Relevant commands include,

. MATCH FILES, particularly for building the kind of denormalized
file I've just described
. VARSTOCASES (more commonly now) and XSAVE with LOOP, to put data in
first normal form when it isn't; CASESTOVARS, to get data out of
first normal form
. AGGREGATE, to reduce sets of records to summary values.
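
As a rough illustration of two of the commands listed above, the sketch below restructures a wide file into one row per visit and then summarizes it per patient. All variable names (patient_id, visit1 to visit3, score, mean_score) are hypothetical.

[code]
* Wide to long with one row per patient per visit.
VARSTOCASES
  /MAKE score FROM visit1 visit2 visit3
  /INDEX=visit
  /KEEP=patient_id.

* Add the per-patient mean score to every row.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=patient_id
  /mean_score=MEAN(score).
[/code]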

Best of luck with it!

-Richard Ristow

Jon K Peck

Re: import csv-file with GET DATA or something

In reply to this post by Moon Kid

It's true that some languages are declaration heavy while others are nearly declaration free - typical of "scripting" languages. The closest you could come to a complete separation in SPSS Statistics would be to define your variables and all their metadata but add no cases. Then use ADD FILES to supply the cases - but that dataset would have to have consistent variable declarations, too.
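
A sketch of the ADD FILES idea, under the assumption that two files already exist: template.sav with every variable fully defined and zero cases, and cases.sav holding the data with the same names and types. Both paths are hypothetical.

[code]
* List the template first so its dictionary information is used.
ADD FILES
  /FILE="C:\data\template.sav"
  /FILE="C:\data\cases.sav".
EXECUTE.
[/code]

For variables common to both files, labels and missing-value definitions should come from the first file that has them, which is why the template is listed first.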

BTW, if some of your variable names follow a numerical suffix pattern such as var01, var02 ..., you can in many contexts refer to them as var01 TO var100 etc, including in DATA LIST. And to refer to a bunch of existing variables in file order you can use x TO y.
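
A small sketch of the TO keyword in both roles, assuming a hypothetical comma-delimited file with a header row and 20 numeric fields per line. The path, the names var01 to var20, and the missing-value code 99 are illustrative only.

[code]
* Create var01 through var20 with the default numeric format.
DATA LIST LIST (",") FILE="C:\data\example.csv" SKIP=1
  / var01 TO var20.

* Refer to the existing variables in file order.
VARIABLE LEVEL var01 TO var20 (SCALE).
MISSING VALUES var01 TO var20 (99).
[/code]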

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

Moon Kid

Re: import csv-file with GET DATA or something

In reply to this post by Moon Kid

Ok, I am trying it like this

[code]
GET DATA /TYPE=TXT
/FILE="C:\Users\vuser\vbox_share\Fb.csv"
/DELCASE=LINE
/DELIMITERS="\t"
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/IMPORTCASE=ALL
/VARIABLES=
v01 A64
v02 F2.0
v03 F2.0
* ...CACHE.
EXECUTE.
DATASET NAME Create2 WINDOW=FRONT.

* VAR LABLES
* VAR LEVEL
* MISSING VALUES
* COMPUTE
* ...
[/code]

Bruce Weaver

Re: import csv-file with GET DATA or something

Administrator

The .csv extension suggests that commas are used to delimit the fields in your file, but your GET DATA syntax says it's tab-delimited. From the FM:

"To specify a tab as a delimiter use "\t". This must be the first delimiter specified."

For a CSV file, change it to:

/DELIMITERS=","

Also, you want LABELS, not LABLES. ;-)
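
Putting both corrections together, the posted syntax might look like the sketch below. Only the three variables shown in the post are listed; the remaining ones were elided there and are left out here. The period after the last variable is the command terminator that GET DATA needs. The labels and the missing-value code 99 are hypothetical placeholders.

[code]
GET DATA /TYPE=TXT
  /FILE="C:\Users\vuser\vbox_share\Fb.csv"
  /DELCASE=LINE
  /DELIMITERS=","
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
    v01 A64
    v02 F2.0
    v03 F2.0.
CACHE.
EXECUTE.
DATASET NAME Create2 WINDOW=FRONT.

* Hypothetical metadata applied after the import.
VARIABLE LABELS v01 "Example label 1" /v02 "Example label 2".
VARIABLE LEVEL v02 v03 (SCALE).
MISSING VALUES v02 v03 (99).
[/code]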

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).