REPORT & SPLIT FILE don't get along very well....

Marta García-Granero
Hi my friends

As part of a never-ending macro development for best subsets in
multiple regression, I need a restricted listing of a very long file (it
can have several thousand rows): the first 3 cases for each value of
another variable (see the sample file below). I have been using SPLIT FILE
with LIST /CASES=FROM 1 TO 3, and it works, but I wanted a nicer-looking
table (easier to modify afterwards in Word), so I tried the same idea
with a report (SUMMARIZE) instead of LIST... It doesn't work :(

Any ideas?

Thanks
Marta

* Small sample of cumbersome dataset *.
DATA LIST LIST/nvars x1 to x5 (6 F3) r2 rho2 res_sd cp aic apc sbc (7 F5.3).
BEGIN DATA
1 0 1 0 0 0 .599  .590 422.6 7.421 546.1  .438 549.8
1 0 0 0 0 1 .390  .376 521.5 32.74 565.1  .667 568.7
1 0 0 0 1 0 .074  .052 642.5 70.91 583.8 1.012 587.5
1 0 0 1 0 0 .042  .020 653.6 74.80 585.4 1.047 589.0
1 1 0 0 0 0 .011 -.012 664.1 78.57 586.8 1.081 590.4
2 0 1 0 1 0 .658  .642 394.9 2.282 541.0  .390 546.4
2 0 1 1 0 0 .648  .631 401.1 3.594 542.4  .403 547.8
2 0 1 0 0 1 .608  .590 422.8 8.343 547.1  .448 552.5
2 1 1 0 0 0 .603  .585 425.5 8.933 547.7  .453 553.1
2 1 0 0 0 1 .553  .531 451.9 15.07 553.1  .511 558.5
2 0 0 0 1 1 .430  .403 510.1 29.89 564.0  .651 569.4
2 0 0 1 0 1 .415  .387 516.7 31.68 565.2  .668 570.6
2 1 0 0 1 0 .078  .034 648.9 72.48 585.7 1.054 591.1
2 0 0 1 1 0 .074  .030 650.0 72.87 585.8 1.058 591.2
2 1 0 1 0 0 .053  .008 657.3 75.40 586.8 1.082 592.3
3 0 1 0 1 1 .662  .638 397.3 3.796 542.4  .403 549.7
3 0 1 1 1 0 .660  .636 398.5 4.049 542.7  .406 549.9
3 1 1 0 1 0 .659  .634 399.3 4.207 542.9  .407 550.1
3 1 1 1 0 0 .652  .627 403.3 5.037 543.8  .416 551.0
3 0 1 1 0 1 .652  .627 403.3 5.048 543.8  .416 551.0
3 1 1 0 0 1 .637  .610 412.2 6.915 545.7  .434 553.0
3 1 0 1 0 1 .576  .546 445.0 14.19 552.6  .506 559.9
3 1 0 0 1 1 .564  .533 451.3 15.64 553.9  .521 561.1
3 0 0 1 1 1 .430  .388 516.2 31.89 566.0  .681 573.2
3 1 0 1 1 0 .078  .010 656.7 74.48 587.7 1.102 594.9
4 1 1 1 0 1 .675  .642 394.8 4.296 542.7  .406 551.8
4 1 1 0 1 1 .672  .639 396.6 4.670 543.2  .410 552.2
4 0 1 1 1 1 .664  .631 401.2 5.588 544.2  .420 553.2
4 1 1 1 1 0 .662  .628 402.7 5.886 544.5  .423 553.6
4 1 0 1 1 1 .577  .535 450.1 16.09 554.6  .528 563.6
5 1 1 1 1 1 .677  .636 398.3 6.000 544.4  .422 555.2
END DATA.

* Listing 3 best models for every number of predictors *.

* This works (but it's ugly) *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars .
LIST /VARS=x1 TO sbc /FORMAT=SINGLE /CASES=FROM 1 TO 3.
SPLIT FILE OFF.

* This doesn't work *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars . /* I've also tried with SEPARATE *.
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL LIMIT=3
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.

Re: REPORT & SPLIT FILE don't get along very well....

Marta García-Granero
Hi Simon,

Simon (Freidin) says ;)

SF> Sort then add a cumulative counter of number of cases in each
SF> group. Filter by counter < 4, then split and summarize.

It worked goooody good! It looks like my MACRO is finished at last :)

RANK  VARIABLES=Cp (A) BY nvars /RANK /PRINT=NO. /* See note *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars .
TEMPORARY.
SELECT IF (RCp LE 3).
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.

Note: There is no risk of tied ranks here, because the Cp values carry
many decimal places and no two of them are equal.
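
For comparison, here is a minimal sketch of Simon's suggestion taken literally: a running counter within each nvars group instead of RANK. The variable name counter is made up for illustration, and the LAG idiom assumes the file has just been sorted by nvars and Cp. Because the counter numbers cases by position, it keeps at most 3 rows per group even if some Cp values tie.

```
* Sketch only, counter is a hypothetical variable name not used in the thread.
SORT CASES BY nvars(A) cp(A).
* Number the cases 1, 2, 3 and so on within each value of nvars.
COMPUTE counter = 1.
IF (nvars EQ LAG(nvars)) counter = LAG(counter) + 1.
* Run the pending transformations before the split and the selection.
EXECUTE.
SPLIT FILE LAYERED BY nvars.
TEMPORARY.
SELECT IF (counter LT 4).
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.
```

The counter stays in the file after SUMMARIZE, since TEMPORARY only makes the selection temporary; DELETE VARIABLES counter removes it if it is not wanted.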

Thanks a lot and happy weekend!

Marta.


(no subject)

Frederic Villamayor Forcada
In reply to this post by Marta García-Granero
Hi, Marta,

I think this works...

SORT CASES BY nvars(A) Cp(A) .
RANK
  VARIABLES=cp  (A) BY nvars  /RANK /PRINT=YES
  /TIES=CONDENSE .
SPLIT FILE LAYERED BY nvars .
TEMPORARY.
SELECT IF rcp LE 3.
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.



Greetings


Frederic

%%%%%%%%%%%%%%%%%%%%%%%
Frederic Villamayor
Unitat de Bioestadística
Àrea de Desenvolupament Preclínic
CIDF Ferrer Grupo
Juan de Sada, 32
08028 - Barcelona
Espanya

E-mail: [hidden email]
Tel: +34 935093236
Fax: +34 934112764
WWW: www.ferrergrupo.com

%%%%%%%%%%%%%%%%%%%%%%%
"Sanity is not statistical"
1984 (George Orwell)


