I create and declare my variables (nearly 300) with all details by script.
Now I want to import the CSV file which contains the data for these variables. But with "Read Text Data", and after reading the GET DATA documentation, this does not seem possible. GET DATA wants me to specify VARIABLES, but they are already there. I want to import the data into the already existing variables. How can I solve that?
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Maybe take this afternoon to familiarize yourself with DATA LIST.
RTFM particularly the Universals section.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Moon Kid
Commands such as GET DATA and DATA LIST define the variables as part of the
process of reading them. Would you explain why you want to separate these
tasks? You can, of course, separate out the metadata definitions (everything
apart from the name and the fundamental type, string or numeric) and apply
them after the data are read. Besides running the usual syntax for labels,
missing values, etc., you can create an empty file (all the variables but no
cases) and then use APPLY DICTIONARY to apply those definitions, with that
file as a template, to the new dataset created by GET DATA etc.
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621
From: Moon Kid <[hidden email]> To: [hidden email], Date: 03/22/2014 11:16 AM Subject: [SPSSX-L] import csv-file with GET DATA or something Sent by: "SPSSX(r) Discussion" <[hidden email]>
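The APPLY DICTIONARY route described in this reply could look roughly like the sketch below. It is untested and rests on assumptions: the path C:\temp\template.sav, the label texts, and the missing-value code -9 are hypothetical, the file is assumed to be comma-delimited, and only v01 to v03 are shown because the full variable list is not given in the thread.

[code]
* Step 1: build a template holding the variables and their metadata, no cases.
DATA LIST FREE / v01 (A64) v02 v03 (F2.0).
BEGIN DATA
END DATA.
VARIABLE LABELS v02 'Hypothetical label for v02'
  /v03 'Hypothetical label for v03'.
MISSING VALUES v02 v03 (-9).
SAVE OUTFILE='C:\temp\template.sav'.

* Step 2: read the text file. GET DATA still needs names and basic formats.
GET DATA
  /TYPE=TXT
  /FILE='C:\Users\vuser\vbox_share\Fb.csv'
  /ARRANGEMENT=DELIMITED
  /DELIMITERS=','
  /FIRSTCASE=2
  /VARIABLES=v01 A64 v02 F2.0 v03 F2.0.

* Step 3: copy the template metadata onto the variables with matching names.
APPLY DICTIONARY FROM 'C:\temp\template.sav'.
EXECUTE.
[/code]

The names and basic types still have to be listed once in GET DATA; what the template saves is repeating the labels, missing values, and similar properties after every import.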
Hi Kim

On 2014-03-22 12:38 Jon K Peck <[hidden email]> wrote:
> Command such as GET DATA, DATA LIST etc define the variables as part
> of the process of reading them.

I understand that. So there is no way in SPSS to read data "into" existing variables? Reading data from an external source in SPSS always means creating new variables? Am I right?

> Would you explain why you want to separate these tasks?

It is a simple concept in data processing to separate data from its definition. Nearly all data processing software (databases, programming languages, description languages, statistics packages, ...) follows this simple concept. It makes the design easier.

By the way, as a database developer this is my expertise. But maybe this is my problem, too: I try to handle the task my way, which is of course not the SPSS way.

> You can, of course, separate out the metadata
> definition apart from the name and the fundamental type (string or
> numeric), and apply that after the data are read.
> [...]
> definitions using APPLY DICTIONARY as a template to the new dataset

APPLY DICTIONARY sounds like a nice workaround, but now it is too late for that. I think I will go the DATA LIST way. The modifications to my script (SPSS would say "Syntax") are a little smaller that way. But I have to type nearly 300 field names again. ,(
At 04:17 PM 3/22/2014, Moon Kid wrote:
>It is a simple concept in data processing to separate data from its
>definition. Nearly all data processing software (databases,
>programming languages, describing languages, statistic packages,
>...) follow this simple concept. It make the design more easier.
>btw: As a database developer it is my expertise. But maybe this is
>my problem, too. I try to handle the task my way - which is of
>course not the SPSS way.

Of course, most of us on the list are used to working with SPSS, and 'thinking' its way: that declaring the variables is part of reading, or importing, data.

You're right: most relational database systems, in particular, don't work that way. They 'think' in terms of your defining a data structure, then populating it. SPSS (and, last I was aware of, SAS, and some others) 'think' in terms of files being dynamically defined to represent their source -- in a way, taking data source as primary, rather than structure as primary.

You'll run into this in other contexts than reading data. For example, MATCH FILES, which is distantly analogous to a join, doesn't read data into a file; it creates the file, structure and all, from the input files and the parameters to the command.

I'm not surprised that this takes some getting used to; but it is a workable paradigm, and can accomplish quite a range of data-manipulation tasks.

If I were moving from a relational system to SPSS, I think I'd write out the design, but not try to realize the design in SPSS; then write the SPSS syntax (or code) to read whatever data I had, in a way that matches the design. Basically, ETL taking primacy over design, rather than the other way around.

Which leads to topics like controlling normalization. A lot of times, one works with SPSS files that aren't normalized: that have, for example, patient data in the same records as medical visit data. But in any case, you'll want to learn tricks for normalizing and de-normalizing. Relevant commands include,

. MATCH FILES, particularly for building the kind of denormalized file I've just described
. VARSTOCASES (more commonly now) and XSAVE with LOOP, to put data in first normal form when it isn't; CASESTOVARS, to get data out of first normal form
. AGGREGATE, to reduce sets of records to summary values.

Best of luck with it!
-Richard Ristow
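As an illustration of two of the commands listed above, here is a small sketch that restructures a denormalized file with VARSTOCASES and then summarizes it with AGGREGATE. The variable names (patient_id, visit1 to visit3, visit_value, visit_no) and the data values are invented for the example and do not come from the thread.

[code]
* Hypothetical denormalized file: one record per patient, three visit values.
DATA LIST LIST / patient_id visit1 visit2 visit3.
BEGIN DATA
1 120 118 115
2 140 135 138
END DATA.

* Wide to long: one record per patient and visit.
VARSTOCASES
  /MAKE visit_value FROM visit1 TO visit3
  /INDEX=visit_no
  /KEEP=patient_id.

* Reduce the visit records to one summary record per patient.
AGGREGATE
  /OUTFILE=*
  /BREAK=patient_id
  /mean_value=MEAN(visit_value)
  /n_visits=N.
LIST.
[/code]

Note that neither command fills an existing structure: each one replaces the active dataset with a newly built one, which is the paradigm described in the post.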
In reply to this post by Moon Kid
It's true that some languages are declaration
heavy while others are nearly declaration free - typical of "scripting"
languages. The closest you could come to a complete separation in
SPSS Statistics would be to define your variables and all their metadata
but add no cases. Then use ADD FILES to supply the cases - but that
dataset would have to have consistent variable declarations, too.
BTW, if some of your variable names follow a numerical suffix pattern such as var01, var02 ..., you can in many contexts refer to them as var01 TO var100 etc., including in DATA LIST. And to refer to a bunch of existing variables in file order you can use x TO y.
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621
From: Moon Kid <[hidden email]> To: [hidden email], Date: 03/22/2014 02:19 PM Subject: Re: [SPSSX-L] import csv-file with GET DATA or something Sent by: "SPSSX(r) Discussion" <[hidden email]>
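A rough, untested sketch of the two suggestions in this reply, the TO keyword in DATA LIST and ADD FILES onto an empty template, might look as follows. Everything specific here is an assumption: the file paths, the names id and q001 TO q300, the formats, and a template file that was saved beforehand with the same variable names and types but no cases.

[code]
* Hypothetical layout: an id string followed by 300 two-digit items.
* TO generates q001, q002, ... q300 without typing each name.
DATA LIST FILE='C:\temp\survey.csv' LIST (",") SKIP=1
  / id (A64) q001 TO q300 (F2.0).

* Put the empty template first so that its definitions come first.
* Names, types and string widths must agree between the two files.
ADD FILES
  /FILE='C:\temp\template.sav'
  /FILE=*.
EXECUTE.
[/code]

Which dictionary properties ADD FILES carries over from the template should be checked against the ADD FILES documentation before relying on this.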
In reply to this post by Moon Kid
Ok, I am trying it like this
[code]
GET DATA
  /TYPE=TXT
  /FILE="C:\Users\vuser\vbox_share\Fb.csv"
  /DELCASE=LINE
  /DELIMITERS="\t"
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
    v01 A64
    v02 F2.0
    v03 F2.0
    * ...
CACHE.
EXECUTE.
DATASET NAME Create2 WINDOW=FRONT.

* VAR LABLES
* VAR LEVEL
* MISSING VALUES
* COMPUTE
* ...
[/code]
The .csv extension suggests that commas are used to delimit the fields in your file, but your GET DATA syntax says it's tab-delimited. From the FM:

"To specify a tab as a delimiter use "\t". This must be the first delimiter specified."

For a CSV file, change it to:

/DELIMITERS=","

Also, you want LABELS, not LABLES. ;-)
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
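With both corrections from this reply applied, the posted syntax would read something like the sketch below. It assumes the file really is comma-delimited. The variable list is still cut off after the three variables shown in the original post, and the label text is a placeholder.

[code]
GET DATA
  /TYPE=TXT
  /FILE="C:\Users\vuser\vbox_share\Fb.csv"
  /DELCASE=LINE
  /DELIMITERS=","
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
    v01 A64
    v02 F2.0
    v03 F2.0.
CACHE.
EXECUTE.
DATASET NAME Create2 WINDOW=FRONT.

* Metadata follows the import. Note the spelling LABELS.
VARIABLE LABELS v02 'Placeholder label'.
[/code]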