Here's a little data-reading code, from another posting:
DATA LIST LIST /caseID (N) DateEmpl (ADATE) ES ESmean ESDiff (3F).
BEGIN DATA
1220 08/17/05 1 2 .
1220 03/09/06 3 2 2
1390 11/09/05 1 1.67 .
1390 02/08/06 1 1.67 0
END DATA.

Notice the system-missing values for ESDiff. They're read as desired, but by a backwards route: SPSS doesn't recognize "." as a code for "missing", but as an invalid numeric field. Since the field is invalid, SPSS makes the result system-missing. And there are lengthy warnings for every one, until MXWARNS is reached:

DATA LIST LIST SKIP=2 /caseID (N) DateEmpl (ADATE) ES ESmean ESDiff (3F).
BEGIN DATA
caseID date_emplymnt ES ESmean ESDiff

1220 08/17/05 1 2 .

>Warning # 1111
>A numeric field contained no digits. The result has been set to the
>system-missing value.
>Command line: 414 Current case: 1 Current splitfile group: 1
>Field contents: '.'
>Record number: 3 Starting column: 48 Record length: 48

Does anybody have advice how to read the missing values, without all the warning messages?

One solution, often suggested, is to replace the '.' fields by '-1', or some other value that can't occur in real data. When the data has been read, either declare that value user-missing, or recode it to system-missing.

I don't like that, very much. It means an extra data-preparation step preceding SPSS, to change '.' to '-1' globally. (Or rather, ' . ' or ' .<CR>' to '-1', so you won't change legitimate decimal points.) And I think it makes the file less readable. A '.' looks missing. A '-1' stands out less, visually; and unless you know the project well, it's hard to be sure that it isn't a data value.
My apologies in advance if this is not relevant. I'm not sure how this thread started; so I'm not sure how the periods got into the data source in the first place, but Data List is now far more flexible with reading delimited files that contain missing data than in the older versions (prior to SPSS 10, I think), provided there is a consistent delimiter between values:
A simple example:

data list list (",") /var1 var2 var3.
begin data
,12,13
21,,23
31,32,,
,,43
end data.

In this context, spaces as delimiters can be a bit problematic since multiple spaces will be interpreted as multiple missing values.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Tuesday, March 20, 2007 12:15 PM
To: [hidden email]
Subject: [BULK] Missing data, with DATA LIST FREE or LIST
Importance: Low
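The point in the reply above about consecutive delimiters can be illustrated outside SPSS. The Python sketch below is only an analogy, not SPSS's parser: `split_fields` is a hypothetical helper that treats every single delimiter as a field boundary, the way an explicitly declared delimiter is described as behaving. With commas, an empty field marks one missing value; with spaces, the padding used to line up columns produces spurious missing fields.

```python
def split_fields(record: str, delimiter: str) -> list:
    """Split on every single delimiter; an empty field becomes None (missing).

    Hypothetical helper for illustration only; it does not reproduce SPSS.
    """
    return [field if field != "" else None for field in record.split(delimiter)]


# Comma-delimited records from the example: one empty field per missing value.
comma_rows = [split_fields(record, ",") for record in [",12,13", "21,,23"]]
print(comma_rows)

# Space-delimited record padded with three spaces for alignment:
# each extra space is counted as another empty (missing) field.
space_row = split_fields("21   23", " ")
print(space_row)
```

In this sketch the padded record yields four fields instead of two, which is the kind of problem the reply warns about when spaces are the declared delimiter.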
In reply to this post by Richard Ristow
Try this:
set errors = off.
DATA LIST LIST /caseID (N) DateEmpl (ADATE) ES ESmean ESDiff (3F).

meljr
In reply to this post by Oliver, Richard
At 02:06 PM 3/20/2007, Oliver, Richard wrote:
>I'm not sure how this thread started; so I'm not sure how the periods
>got into the data source in the first place,

Ah, I started the thread. And the periods got in there, in test data posted to the list. (This is read with SKIP=2 on the DATA LIST):

>BEGIN DATA
> caseID date_emplymnt ES ESmean ESDiff
>
> 1220 08/17/05 1 2 .
> 1220 03/09/06 3 2 2
> 1390 11/09/05 1 1.67 .
> 1390 02/08/06 1 1.67 0
> 1390 05/19/06 3 1.67 2
> 1445 08/15/05 3 2 .
> 1445 12/13/05 1 2 -2
> 1518 11/09/05 1 1.67 .
> 1518 02/08/06 3 1.67 2
> 1518 05/19/06 1 1.67 -2
>END DATA.

It's not too rare to see data like this. It's what LIST output looks like, so maybe that's how it was generated.

To answer my question, with your answer, there's using an editor that supports regular expressions, to get into a form like this:

>data list list (",") /var1 var2 var3.
>begin data
>,12,13
>end data.

Or, since a comma is a field delimiter by default, if you're using LIST or FREE input, changing periods to commas seems to work. (Use context matching, so as not to change decimal points to commas.) Thank you!

Demo, after the change; SPSS 15 draft output:

DATA LIST LIST SKIP=2 /caseID (N) DateEmpl (ADATE) ES ESmean ESDiff (3F).
BEGIN DATA
caseID date_emplymnt ES ESmean ESDiff

1220 08/17/05 1 2 ,
1220 03/09/06 3 2 2
1390 11/09/05 1 1.67 ,
1390 02/08/06 1 1.67 0
1390 05/19/06 3 1.67 2
1445 08/15/05 3 2 ,
1445 12/13/05 1 2 -2
1518 11/09/05 1 1.67 ,
1518 02/08/06 3 1.67 2
1518 05/19/06 1 1.67 -2
END DATA.
FORMATS caseID (N4) /DateEmpl (ADATE10) /ES ESmean ESDiff (F3).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |21-MAR-2007 15:33:55       |
|-----------------------------|---------------------------|
caseID   DateEmpl  ES ESmean ESDiff

  1220 08/17/2005   1      2      .
  1220 03/09/2006   3      2      2
  1390 11/09/2005   1      2      .
  1390 02/08/2006   1      2      0
  1390 05/19/2006   3      2      2
  1445 08/15/2005   3      2      .
  1445 12/13/2005   1      2     -2
  1518 11/09/2005   1      2      .
  1518 02/08/2006   3      2      2
  1518 05/19/2006   1      2     -2

Number of cases read: 10 Number of cases listed: 10
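For anyone who would rather script the "context matching" replacement than do it in an editor, here is a minimal Python sketch. It assumes the data are whitespace-separated text and that a missing value is a period standing alone as a whole field; the pattern name and the function `replace_missing_markers` are made up for this illustration. It only rewrites the text. The demo above shows the comma being read as missing when it is the last field on the line; other positions are not shown in this thread and should be checked in SPSS.

```python
import re

# A period that is a whole field: no non-whitespace character on either side.
# Decimal points such as the one in 1.67 do not match, so they are left alone.
STANDALONE_PERIOD = re.compile(r"(?<!\S)\.(?!\S)")


def replace_missing_markers(text: str, replacement: str = ",") -> str:
    """Replace every standalone '.' field in text with the replacement string."""
    return STANDALONE_PERIOD.sub(replacement, text)


# Three of the data lines posted in the thread.
sample = "\n".join([
    "1220 08/17/05 1 2 .",
    "1220 03/09/06 3 2 2",
    "1390 11/09/05 1 1.67 .",
])

converted = replace_missing_markers(sample)
print(converted)
```

Passing `replacement="-1"` instead gives the sentinel-value variant mentioned at the start of the thread, with the same protection for decimal points.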