FW: Select subset of cases based on change in value

FW: Select subset of cases based on change in value

Roberts, Michael
> Hi list,
>
> I am wondering whether anyone has an idea as to how I might be able to
> select the first 'n' cases for each change in some value from a
> dataset?  For example, if I have data in one file from 1960 through
> 1970, and the year is one of the values, I want to select the first
> ten cases for each year(?)
>
>
> Mike
>
>

Re: Select subset of cases based on change in value

Maguin, Eugene
Mike,

One way to go at this is to number cases within each level of your 'by'
variable, which in your case is year.

Compute rec=1.
If (year eq lag(year)) rec=lag(rec)+1.

Select if (rec lt 11).
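
A slightly fuller sketch of the same idea, in case the file is not already grouped by year. The counter only works when all cases for a year are adjacent, so this version sorts first. The helper variable seq is a hypothetical name introduced here to preserve the current file order within each year; it is not part of the original suggestion.

```spss
* Sketch only, assuming the grouping variable is named year.
* seq is a hypothetical helper that records the current file order.
COMPUTE seq = $CASENUM.
SORT CASES BY year seq.

* Number the cases within each year, restarting at 1 when year changes.
COMPUTE rec = 1.
IF (year EQ LAG(year)) rec = LAG(rec) + 1.

* Keep only the first ten cases of each year.
SELECT IF (rec LE 10).
EXECUTE.
```

If the file is already in year order, the first two commands can be skipped and the result should match the three-line version above.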


Gene Maguin

Re: Select subset of cases based on change in value

Roberts, Michael
In reply to this post by Roberts, Michael
Gene,

Thank you for the help... The code works perfectly for my purposes.

Regards

Mike

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Gene Maguin
Sent: Wednesday, June 21, 2006 5:15 PM
To: [hidden email]
Subject: Re: Select subset of cases based on change in value


Re: FW: Select subset of cases based on change in value

Marks, Jim
In reply to this post by Roberts, Michael
Here is one way:

** Sample data.


DATA LIST FREE /year (f8.0).
BEGIN DATA
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
END DATA.


*** Create a numeric variable and find the rank within year.
COMPUTE caseord = $CASENUM.

RANK caseord  BY YEAR /rank into first_n.

Then select or filter on first_n as required.
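
For that last step, one possible sketch, assuming n is 10. FILTER leaves the other cases in the file but excludes them from procedures, while SELECT IF deletes them. The flag variable keep10 is a hypothetical name used only for illustration.

```spss
* Option 1: temporarily exclude cases past the first 10 in each year.
* keep10 is a hypothetical flag that equals 1 for the cases to keep.
COMPUTE keep10 = (first_n LE 10).
FILTER BY keep10.

* Option 2: permanently drop those cases instead of filtering.
* SELECT IF (first_n LE 10).
* EXECUTE.
```

Option 2 is left commented out so the block does not both filter and delete; uncomment it and remove Option 1 if a permanent subset is wanted.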

Note: this gives you the first n cases based on the current file order.
If you want to randomly select 10 cases per year, compute the caseord
variable with a statement like this instead:

        COMPUTE caseord = UNIFORM(1).
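
Put together, the random variant might look like the sketch below. The seed value is an arbitrary assumption, included only so that the draw can be repeated; the 10 in the selection is the n from the original question.

```spss
* Sketch of the random variant. The seed value is an arbitrary choice.
SET SEED = 20060621.

* Give every case a random number between 0 and 1.
COMPUTE caseord = UNIFORM(1).

* Rank the random numbers within year, then keep ranks 1 through 10.
RANK VARIABLES = caseord BY year /RANK INTO first_n /PRINT = NO.
SELECT IF (first_n LE 10).
EXECUTE.
```

A year with fewer than 10 cases simply keeps all of its cases.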


--jim

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Roberts, Michael
Sent: Wednesday, June 21, 2006 3:56 PM
To: [hidden email]
Subject: FW: Select subset of cases based on change in value


Re: FW: Select subset of cases based on change in value

Roberts, Michael
In reply to this post by Roberts, Michael
Thank you to all who responded with advice on doing this.  I found it
very informative and helpful.

Best Regards

Mike


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Marks, Jim
Sent: Wednesday, June 21, 2006 5:39 PM
To: [hidden email]
Subject: Re: FW: Select subset of cases based on change in value
