> Hi list,
>
> I am wondering whether anyone has an idea as to how I might be able to
> select the first 'n' cases for each change in some value from a
> dataset? For example, if I have data in one file from 1960 through
> 1970, and the year is one of the values, I want to select the first
> ten cases for each year(?)
>
> Mike
Mike,
One way to go at this is to number cases within each level of your 'by' variable, which in your case is year.

Compute rec=1.
If (year eq lag(year)) rec=lag(rec)+1.
Select if (rec lt 11).

Gene Maguin
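The numbering above relies on all cases for a given year sitting next to each other in the file. A minimal sketch of the same idea with that assumption made explicit; the SORT CASES and EXECUTE commands are additions for illustration and not part of the original reply, and sorting means "first" refers to the order after the sort:

```spss
* Assumption: cases for a year may not be adjacent, so group them first.
SORT CASES BY year.

* Number cases within each year: restart at 1 whenever year changes.
COMPUTE rec = 1.
IF (year EQ LAG(year)) rec = LAG(rec) + 1.

* Keep the first 10 cases per year (same test as rec LT 11).
SELECT IF (rec LE 10).
EXECUTE.
```

For the first case in the file LAG(year) is missing, so the IF is skipped and rec stays at 1.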
Gene,
Thank you for the help... The code works perfectly for my purposes.

Regards
Mike

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin
Sent: Wednesday, June 21, 2006 5:15 PM
To: [hidden email]
Subject: Re: Select subset of cases based on change in value
Here is one way:
** Sample data.
DATA LIST FREE /year (f8.0).
BEGIN DATA
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960 1960
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970
END DATA.

*** Create a numeric variable and find the rank within year.
COMPUTE caseord = $CASENUM.
RANK caseord BY year /RANK INTO first_n.

Select or filter as required.

Note: this gives you the first n cases based on the current file order. If you want to randomly select 10 cases per year, compute the caseord variable with this statement instead:

COMPUTE caseord = UNIFORM(1).

--jim
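The "select or filter as required" step is left open in the reply above. A minimal sketch of two ways it might be done, assuming first_n comes from the RANK command and n is 10; the variable name keep10 is made up for illustration:

```spss
* Run one option or the other, not both.

* Option 1: keep all cases in the file but exclude the rest from analysis.
COMPUTE keep10 = (first_n LE 10).
FILTER BY keep10.

* Option 2: permanently drop everything past the first 10 cases per year.
SELECT IF (first_n LE 10).
EXECUTE.
```

FILTER can be undone with FILTER OFF, while SELECT IF removes the cases from the active dataset.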
Thank you to all who responded with advice on doing this. I found it very informative and helpful.

Best Regards
Mike

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Marks, Jim
Sent: Wednesday, June 21, 2006 5:39 PM
To: [hidden email]
Subject: Re: FW: Select subset of cases based on change in value

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Roberts, Michael
Sent: Wednesday, June 21, 2006 3:56 PM
To: [hidden email]
Subject: FW: Select subset of cases based on change in value