SPSSX Discussion

REPORT & SPLIT FILE don't get along very well....

Marta García-Granero

REPORT & SPLIT FILE don't get along very well....

Hi my friends

As part of a never-ending macro development for best subsets in
multiple regression, I need a restricted listing of a very long file (it
can have several thousand rows): the first 3 cases for each value of
another variable (see file below). I have been using SPLIT FILE with
LIST /CASES=FROM 1 TO 3, and it works, but I wanted a nicer-looking
table (easier to modify afterwards in Word), so I tried the same idea
with REPORT (SUMMARIZE, in the syntax below) instead of LIST... It doesn't work :(

Any ideas?

Thanks
Marta

* Small sample of cumbersome dataset *.
DATA LIST LIST/nvars x1 to x5 (6 F3) r2 rho2 res_sd cp aic apc sbc (7 F5.3).
BEGIN DATA
1 0 1 0 0 0 .599 .590 422.6 7.421 546.1 .438 549.8
1 0 0 0 0 1 .390 .376 521.5 32.74 565.1 .667 568.7
1 0 0 0 1 0 .074 .052 642.5 70.91 583.8 1.012 587.5
1 0 0 1 0 0 .042 .020 653.6 74.80 585.4 1.047 589.0
1 1 0 0 0 0 .011 -.012 664.1 78.57 586.8 1.081 590.4
2 0 1 0 1 0 .658 .642 394.9 2.282 541.0 .390 546.4
2 0 1 1 0 0 .648 .631 401.1 3.594 542.4 .403 547.8
2 0 1 0 0 1 .608 .590 422.8 8.343 547.1 .448 552.5
2 1 1 0 0 0 .603 .585 425.5 8.933 547.7 .453 553.1
2 1 0 0 0 1 .553 .531 451.9 15.07 553.1 .511 558.5
2 0 0 0 1 1 .430 .403 510.1 29.89 564.0 .651 569.4
2 0 0 1 0 1 .415 .387 516.7 31.68 565.2 .668 570.6
2 1 0 0 1 0 .078 .034 648.9 72.48 585.7 1.054 591.1
2 0 0 1 1 0 .074 .030 650.0 72.87 585.8 1.058 591.2
2 1 0 1 0 0 .053 .008 657.3 75.40 586.8 1.082 592.3
3 0 1 0 1 1 .662 .638 397.3 3.796 542.4 .403 549.7
3 0 1 1 1 0 .660 .636 398.5 4.049 542.7 .406 549.9
3 1 1 0 1 0 .659 .634 399.3 4.207 542.9 .407 550.1
3 1 1 1 0 0 .652 .627 403.3 5.037 543.8 .416 551.0
3 0 1 1 0 1 .652 .627 403.3 5.048 543.8 .416 551.0
3 1 1 0 0 1 .637 .610 412.2 6.915 545.7 .434 553.0
3 1 0 1 0 1 .576 .546 445.0 14.19 552.6 .506 559.9
3 1 0 0 1 1 .564 .533 451.3 15.64 553.9 .521 561.1
3 0 0 1 1 1 .430 .388 516.2 31.89 566.0 .681 573.2
3 1 0 1 1 0 .078 .010 656.7 74.48 587.7 1.102 594.9
4 1 1 1 0 1 .675 .642 394.8 4.296 542.7 .406 551.8
4 1 1 0 1 1 .672 .639 396.6 4.670 543.2 .410 552.2
4 0 1 1 1 1 .664 .631 401.2 5.588 544.2 .420 553.2
4 1 1 1 1 0 .662 .628 402.7 5.886 544.5 .423 553.6
4 1 0 1 1 1 .577 .535 450.1 16.09 554.6 .528 563.6
5 1 1 1 1 1 .677 .636 398.3 6.000 544.4 .422 555.2
END DATA.

* Listing 3 best models for every number of predictors *.

* This works (but it's ugly) *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars .
LIST /VARS=x1 TO sbc /FORMAT=SINGLE /CASES=FROM 1 TO 3.
SPLIT FILE OFF.

* This doesn't work *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars . /* I've also tried with SEPARATE *.
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL LIMIT=3
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.

Marta García-Granero

Re: REPORT & SPLIT FILE don't get along very well....

Hi Simon,

Simon (Freidin) says ;)

SF> Sort then add a cumulative counter of number of cases in each
SF> group. Filter by counter < 4, then split and summarize.

It worked goooody good! It looks like my MACRO is finished at last :)

RANK VARIABLES=Cp (A) BY nvars /RANK /PRINT=NO. /* See note *.
SORT CASES BY nvars(A) Cp(A) .
SPLIT FILE LAYERED BY nvars .
TEMPORARY.
SELECT IF (RCp LE 3).
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.

Note: There is no risk of tied ranks, because all Cp values have many
decimal places and no two of them are equal.
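
Simon's suggestion can also be followed literally, with a running counter instead of RANK. The lines below are only a minimal sketch, assuming the sample dataset from the first post is the active file; the variable name counter is made up for illustration. The EXECUTE is there so that the LAG-based counter is computed before SPLIT FILE and the temporary selection come into play.

```spss
* Sketch only: number the cases within each nvars group after sorting.
* The variable name counter is hypothetical.
SORT CASES BY nvars (A) cp (A).
COMPUTE counter = 1.
* For the first case LAG is missing, so counter stays at 1.
IF (nvars EQ LAG(nvars)) counter = LAG(counter) + 1.
EXECUTE.
SPLIT FILE LAYERED BY nvars.
TEMPORARY.
SELECT IF (counter LT 4).
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.
```

Since the counter just numbers the sorted cases, this variant keeps at most three cases per group even if two Cp values were equal.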

Thanks a lot and happy weekend!

Marta.

Frederic Villamayor Forcada

(no subject)

In reply to this post by Marta García-Granero

Hi, Marta,

I think this works...

SORT CASES BY nvars(A) Cp(A) .
RANK
  VARIABLES=cp (A) BY nvars /RANK /PRINT=YES
  /TIES=CONDENSE .
SPLIT FILE LAYERED BY nvars .
TEMPORARY.
SELECT IF rcp LE 3.
SUMMARIZE
  /TABLES=x1 TO sbc
  /FORMAT=LIST NOCASENUM NOTOTAL
  /TITLE='Best subsets models'
  /MISSING=VARIABLE
  /CELLS=NONE.
SPLIT FILE OFF.
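
One thing to keep in mind with /TIES=CONDENSE: tied cases share the same rank and the following ranks stay consecutive, so a filter on rank LE 3 can let through more than three cases per group when ties exist. The toy example below is a hedged illustration with made-up data (grp, score and rk are hypothetical names, not part of the thread's dataset).

```spss
* Hypothetical toy data: two cases tied for the lowest score.
DATA LIST LIST /grp score.
BEGIN DATA
1 2.0
1 2.0
1 3.0
1 4.0
END DATA.
* CONDENSE gives tied cases the same rank and keeps the ranks consecutive.
RANK VARIABLES=score (A) BY grp /RANK INTO rk /TIES=CONDENSE /PRINT=NO.
* With the tie above, the rank filter keeps all four cases, not three.
TEMPORARY.
SELECT IF (rk LE 3).
LIST.
```

With data that has no ties, as in the Cp values discussed in this thread, the filter keeps exactly the first three cases per group.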

Greetings

Frederic

%%%%%%%%%%%%%%%%%%%%%%%
Frederic Villamayor
Unitat de Bioestadística
Àrea de Desenvolupament Preclínic
CIDF Ferrer Grupo
Juan de Sada, 32
08028 - Barcelona
Espanya

E-mail: [hidden email]
Tel: +34 935093236
Fax: +34 934112764
WWW: www.ferrergrupo.com

%%%%%%%%%%%%%%%%%%%%%%%
"Sanity is not statistical"
1984 (George Orwell)

Marta García-Granero <[hidden email]>
Sent by: "SPSSX(r) Discussion" <[hidden email]>
21/07/2006 18:18
Please reply to
Marta García-Granero <[hidden email]>

To
[hidden email]
cc

Subject
