Get the n final cases

Carlos Renato (www.estatistico.org)
Dear friends of the list,

      I have a dataset that contains more than a million cases, but
I want to exclude the first cases and use only the final 100 cases,
in dataset order.

      Is there a way to select the final 100 cases of a dataset?
 
Carlos Renato
Statistician
Recife - Brazil

Re: Get the n final cases

Carlos Renato (www.estatistico.org)
Friend Roberts,

     I don't know the exact number of cases, but I need to use
only the final N cases. The total number of cases varies, and
I can't set an upper or lower limit for the analysis.

Regardless of the number of cases, I want to use only the final N
cases.
 
Carlos Renato
Statistician
Recife - Brazil
 

Re: Get the n final cases

Richard Ristow
In reply to this post by Carlos Renato (www.estatistico.org)
At 10:27 AM 6/1/2009, Carlos Renato wrote:

      I have a dataset that contains a million more cases but I want exclude the first cases and to use the 100 final cases in the order of the dataset.
 
      Exists forms of select the 100 final cases in a dataset?

There's no very good way. As an off-list correspondent seems to have written you, you do need the number of cases in the dataset. (You could do a SORT CASES to reverse the order, too, but I wouldn't recommend that with a million cases.)  You can calculate the number of cases, though, and that may be the way to do it:

|-----------------------------|---------------------------|
|Output Created               |01-JUN-2009 14:04:08       |
|-----------------------------|---------------------------|
SERIAL Greek

    1  Alpha
    2  Beta
    3  Gamma
    4  Delta
    5  Epsilon
    6  Zeta
    7  Eta
    8  Theta
    9  Iota
   10  Kappa
   11  Lambda
   12  Mu
   13  Nu
   14  Xi
   15  Omikron
   16  Pi
   17  Rho
   18  Sigma
   19  Tau
   20  Upsilon
   21  Phi
   22  Chi
   23  Psi
   24  Omega

Number of cases read:  24    Number of cases listed:  24

 
*  ......   Select the last five cases:        ..................... .

COMPUTE    SeqNumb = $CASENUM.
COMPUTE    NoBreak = 1.

FORMATS    NoBreak (F2)
           SeqNumb (COMMA11).

VAR LABELS NoBreak  'Constant value 1 throughout the file'
           SeqNumb  'Record number, in the file before selection'.

AGGREGATE OUTFILE = * MODE=ADDVARIABLES
   /BREAK=NoBreak
   /Number 'No. of cases in the file' = NU.

SELECT IF SeqNumb GT Number - 5.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |01-JUN-2009 14:04:09       |
|-----------------------------|---------------------------|
SERIAL Greek        SeqNumb NoBreak  Number

   20  Upsilon           20     1        24
   21  Phi               21     1        24
   22  Chi               22     1        24
   23  Psi               23     1        24
   24  Omega             24     1        24

Number of cases read:  5    Number of cases listed:  5
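
For readers who find the selection rule easier to follow outside SPSS, here is a small Python sketch of the same arithmetic. It is only an illustration, not SPSS code: the function name last_n_cases and the plain-list representation of the file are assumptions made for this example.

```python
def last_n_cases(cases, n):
    """Keep the last n items, mimicking SELECT IF SeqNumb GT Number - n."""
    number = len(cases)  # plays the role of Number, the AGGREGATE result
    kept = []
    # seq_numb plays the role of SeqNumb, i.e. $CASENUM before selection
    for seq_numb, case in enumerate(cases, start=1):
        if seq_numb > number - n:
            kept.append((seq_numb, case))
    return kept


greek = [
    "Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta",
    "Iota", "Kappa", "Lambda", "Mu", "Nu", "Xi", "Omikron", "Pi", "Rho",
    "Sigma", "Tau", "Upsilon", "Phi", "Chi", "Psi", "Omega",
]

for seq_numb, name in last_n_cases(greek, 5):
    print(seq_numb, name)
```

With the 24 test cases this keeps records 20 through 24, matching the listing above. If n is larger than the number of cases, the comparison is true for every record and the whole file is kept.
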
=============================
APPENDIX: Test data, and code
=============================
*  ................................................................. .
*  .................   Test data               ..................... .

*  The most standard test data:  the Greek alphabet  .
DATA LIST FIXED
   /SERIAL  01-03
    Greek   04-11 (A).
BEGIN DATA
1  Alpha
2  Beta
3  Gamma
4  Delta
5  Epsilon
6  Zeta
7  Eta
8  Theta
9  Iota
10 Kappa
11 Lambda
12 Mu
13 Nu
14 Xi
15 Omikron
16 Pi
17 Rho
18 Sigma
19 Tau
20 Upsilon
21 Phi
22 Chi
23 Psi
24 Omega
END DATA.

*  .................   Post after this point   ..................... .
*  ................................................................. .
LIST.

*  ......   Select the last five cases:        ..................... .

COMPUTE    SeqNumb = $CASENUM.
COMPUTE    NoBreak = 1.

FORMATS    NoBreak (F2)
           SeqNumb (COMMA11).

VAR LABELS NoBreak  'Constant value 1 throughout the file'
           SeqNumb  'Record number, in the file before selection'.

AGGREGATE OUTFILE = * MODE=ADDVARIABLES
   /BREAK=NoBreak
   /Number 'No. of cases in the file' = NU.

SELECT IF SeqNumb GT Number - 5.

LIST.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Get the n final cases

Peck, Jon

If the dataset is an SPSS sav file or, I think, if the data have been passed, the number of cases is available via programmability using spss.GetCaseCount.  So the following syntax would set the sample to the last 100.

 

begin program.
import spss
# The last 100 cases start at case number (case count - 99).
spss.Submit("USE %s THRU LAST" % (spss.GetCaseCount() - 99))
end program.
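
The start-case arithmetic can be checked without SPSS. The sketch below builds the same command string from a case count and adds two guards that the one-liner above does not have. The helper name use_last_n_command is hypothetical, and treating a negative count as "not available" is an assumption about how an unknown count would be reported, not something confirmed in this thread.

```python
def use_last_n_command(case_count, n=100):
    """Build a 'USE <first> THRU LAST' string covering the last n cases."""
    if case_count < 0:
        # Assumption: a negative value means the count is not available.
        raise ValueError("case count is not available")
    # Never start before case 1 when the file has fewer than n cases.
    first = max(1, case_count - n + 1)
    return "USE %d THRU LAST" % first


print(use_last_n_command(1000000))
print(use_last_n_command(24, 5))
```

Inside a program block, the returned string could be passed to spss.Submit in place of the inline expression, with spss.GetCaseCount() supplying case_count.
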

 

HTH,

Jon Peck

 


From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Monday, June 01, 2009 12:07 PM
To: [hidden email]
Subject: Re: [SPSSX-L] Get the n final cases

 
