SPSSX Discussion

FW: Select subset of cases based on change in value

Roberts, Michael

FW: Select subset of cases based on change in value

> Hi list,
>
> I am wondering whether anyone has an idea as to how I might be able to
> select the first 'n' cases for each change in some value from a
> dataset? For example, if I have data in one file from 1960 through
> 1970, and the year is one of the values, I want to select the first
> ten cases for each year(?)
>
>
> Mike
>
>

Maguin, Eugene

Re: Select subset of cases based on change in value

Mike,

One way to go at this is to number cases within each level of your 'by'
variable, which in your case is year.

Compute rec=1.
If (year eq lag(year)) rec=lag(rec)+1.

Select if (rec lt 11).
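
A slightly fuller sketch of the same idea, for anyone reading this later. The counter depends on all cases for a given year being adjacent, because lag() looks only at the immediately preceding case. The sketch therefore sorts first and uses a helper variable, origord (a name made up here, not part of the code above), to preserve the original order within each year. The cutoff of 10 comes from the original question.

```spss
* Sketch only: assumes the grouping variable is named year.
* origord is a hypothetical helper that records the original file order.
COMPUTE origord = $CASENUM.
SORT CASES BY year origord.
* Number the cases 1, 2, 3 and so on within each year.
COMPUTE rec = 1.
IF (year EQ LAG(year)) rec = LAG(rec) + 1.
* Keep the first 10 cases of each year.
SELECT IF (rec LE 10).
EXECUTE.
```

If the file is already grouped by year, the COMPUTE origord and SORT CASES lines can be dropped.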

Gene Maguin

Roberts, Michael

Re: Select subset of cases based on change in value

Gene,

Thank you for the help... The code works perfectly for my purposes.

Regards

Mike

Marks, Jim

Re: FW: Select subset of cases based on change in value

Here is one way:

** Sample data.

DATA LIST FREE /year (f8.0).
BEGIN DATA
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960
1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960
1960 1960 1960
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970
END DATA.

*** Create a numeric variable and find the rank within year.
COMPUTE caseord = $CASENUM.

RANK caseord BY YEAR /rank into first_n.

Select or filter as required.
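
As a rough sketch of that last step, assuming the rank variable is named first_n as in the RANK command above and that n is 10 as in the original question, the selection could look like the following. The name keep10 is a hypothetical flag variable introduced only for this illustration.

```spss
* Option 1: temporarily hide the other cases without deleting them.
* keep10 is a hypothetical flag, 1 for the first 10 cases in each year.
COMPUTE keep10 = (first_n LE 10).
FILTER BY keep10.

* Option 2: permanently drop everything past the 10th case in each year.
* Leave these two lines out if you only want the filter.
SELECT IF (first_n LE 10).
EXECUTE.
```

FILTER keeps all cases in the file and can be undone with FILTER OFF, while SELECT IF removes the unselected cases from the active dataset.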

Note: this gives you the first n cases based on the current file order.
If you want to randomly select 10 cases per year, use this statement to
compute the caseord variable instead:

COMPUTE caseord = UNIFORM(1).

--jim

Roberts, Michael

Re: FW: Select subset of cases based on change in value

Thank you to all who responded with advice on doing this. I found it
very informative and helpful.

Best Regards

Mike
