Hi,
I need to remove a number of cases from a large (well, 3,500 cases) data file. The respondent id variable is entryid, and I want to be able to remove about 100 cases (that have bad data), and they aren't consecutive. So for example, I want to remove case number 1, 6, 20, 28, 48, and so forth. It looks like the filter cases entryid ~= 1 would have to be repeated for each value? The syntax "entryid ~= 1 or 6 or 20" doesn't work.

I'm wondering if $casenum is the right thing, but I don't know how to use it either.

So if anyone can help, I'd be grateful! Either syntax or GUI instructions welcome.

thanks
Leora

Dr. Leora Lawton
TechSociety Research
"Custom Social Science and Consumer Behavior Research"
2342 Shattuck Avenue PMB 362, Berkeley, CA 94704
(510) 548-6174; fax (510) 548-6175; cell (510) 928-7572
[hidden email]
www.techsociety.com

You don't need $CASENUM since you already have entryid. If you did not have entryid, you could create it using COMPUTE entryid=$CASENUM. This will create a variable (entryid) with consecutive numbers, 1 through n, where n is the number of cases in your data file.

But that's not going to help you delete cases with bad data. You need to create a filter variable based on the criteria that constitute bad data. Then you can use the SELECT IF statement to keep only those observations with "good" data.

The complexity of creating a filter depends on the number of variables having bad data.

You might want to check Raynald Levesque's site at spsstools.net. There may just be a solution to your problem there. Another option is to check this list's archives at http://listserv.uga.edu/archives/spssx-l.html.

Good luck.

Dominic Lusinchi
Statistician
Far West Research
Statistical Consulting
San Francisco, California
415-664-3032
www.farwestresearch.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of leora lawton
Sent: Sunday, September 17, 2006 10:06 PM
To: [hidden email]
Subject: filtering out cases

sel if not(any(entryid,1, 6, 20, 28, 48)).
exe.

At 03:32 PM 18/09/2006, Dominic Lusinchi wrote:
>You don't need $CASENUM since you already have entryid. [...]

Research Database Manager and Analyst
Melbourne Institute of Applied Economic and Social Research
The University of Melbourne
Melbourne VIC 3010 Australia
New Tel: (03) 8344 2085 New Fax: (03) 8344 2111
http://www.melbourneinstitute.com/hilda/

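Putting the two replies above together: the filter variable Dominic describes can be built with the ANY function from Simon's answer. The lines below are only a sketch under stated assumptions. The variable name gooddata is made up for illustration, and the five IDs are the examples from the question, so the real list of roughly 100 IDs would have to be typed in their place.

```spss
* Sketch only: gooddata is a hypothetical flag, 1 = keep and 0 = bad case.
* The five IDs are the examples from the question, not the full list.
COMPUTE gooddata = NOT(ANY(entryid, 1, 6, 20, 28, 48)).
EXECUTE.

* The count of 0s should equal the number of IDs listed above.
FREQUENCIES VARIABLES=gooddata.

* Non-destructive option: bad cases stay in the file but are skipped by procedures.
FILTER BY gooddata.
FREQUENCIES VARIABLES=entryid /FORMAT=NOTABLE /STATISTICS=MINIMUM MAXIMUM.
FILTER OFF.

* Once the counts look right, drop the bad cases from the working file.
SELECT IF (gooddata = 1).
EXECUTE.
```

FILTER BY leaves the flagged cases in the file and only excludes them from procedures until FILTER OFF, while SELECT IF removes them from the working file, so it is safest to run the SELECT IF step on a copy of the data.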
Back up your original file first. Then, in your working file, put a TEMPORARY command before any SELECT IF statement you're not certain is set up right, to make sure it's going to do what you want.

temp.
select if (SOME LOGICAL SET OF CONDITIONS).
freq SOME USEFUL VAR FOR SEEING GOOD DATA.

After the frequencies command is run, everything is set back to the pre-select state. A useful technique for those of us who really want to be SURE!

The first part of Raynald's book treats these best-practices axioms well.

Best!
-Gary

On 9/17/06, Simon Freidin <[hidden email]> wrote:
> sel if not(any(entryid,1, 6, 20, 28, 48)).
> exe.
> [...]
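Gary's template, filled in with the selection from earlier in the thread, might look like the sketch below. The ID list is just the five examples from the original question, and entryid is used as the check variable only as an assumption; any variable that makes the bad cases easy to spot would do.

```spss
* Sketch only: the ID list is the example from the question, not the full list.
TEMPORARY.
SELECT IF NOT(ANY(entryid, 1, 6, 20, 28, 48)).
* Valid N in the output should be the original case count minus the cases removed.
FREQUENCIES VARIABLES=entryid /FORMAT=NOTABLE /STATISTICS=MINIMUM MAXIMUM.
```

The temporary selection applies only to the next procedure, here FREQUENCIES. Once that has run, all cases are available again, so nothing is lost if the condition turns out to be wrong.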