SPSSX Discussion

Re: selecting lowest scores if missing data or ties


hillel vardi

Shalom

Using restructure, it is very easy to find the location of the minimum or
maximum in a file.
Here is an example of finding the minimum of 5 quizzes and deleting the
lowest one.
If you need more control over the deleting process, you can add more
rules, such as: if quiz3 and quiz5 are tied for lowest, select quiz5.

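* Create test data: 10 students with 5 random integer quiz scores each .
input program .
loop ii=1 to 10 .
compute quiz1= trunc(uniform(6)).
compute quiz2= trunc(uniform(6)).
compute quiz3= trunc(uniform(6)).
compute quiz4= trunc(uniform(6)).
compute quiz5= trunc(uniform(6)).
end case .
end loop .
end file .
end input program .

* Wide to long: one case per student per quiz, keeping missing scores .
VARSTOCASES /MAKE quiz FROM quiz1 quiz2 quiz3 quiz4 quiz5
 /INDEX = Index1(5)
 /KEEP = ii
 /NULL = KEEP.
* Sort so that the lowest score of each student comes first .
sort cases by ii quiz .
* Flag the first case of each student and number the cases within student .
add files file=* /by ii/ first=start .
if start eq 1 #seq=0 .
compute #seq=sum(#seq,1).
compute seq=#seq.
* Drop the first case of each student, which holds the lowest score .
select if seq gt 1 .
* Long back to wide .
sort cases by ii index1 .
CASESTOVARS
 /ID = ii
 /INDEX = Index1
 /drop=seq start
 /GROUPBY = VARIABLE .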
list .
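
The extra tie-breaking rule mentioned above can probably be expressed through the sort order rather than through additional IF statements. A minimal sketch, assuming the goal is to drop the later quiz when two quizzes tie for lowest (quiz5 rather than quiz3): replace the first SORT CASES with one that also orders tied scores by descending quiz number, so that the later quiz becomes the case with seq = 1.

```spss
* Replaces "sort cases by ii quiz" in the syntax above .
* Within each student: lowest score first, and among tied scores the later quiz first .
SORT CASES BY ii (A) quiz (A) Index1 (D).
```

The rest of the syntax stays as it is. Using Index1 (A) instead would drop the earlier of the tied quizzes.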

Hillel Vardi
Ben Gurion U
Israel

Dale Glaser wrote:

> Hi all. Based on Levesque's syntax for selecting the maximum score for each case, I was trying the same for a case with 5 quizzes, isolating (and deleting) the lowest score. Assuming one has sorted by ID, I used the syntax appended below. I then flagged the cases with the lowest scores (which will be deleted in the total summed scoring of the quizzes) in an incredibly inelegant way: I recreated the initial raw score, mapped the created ranked variable (where a ranked value of 5 is the lowest score) onto the raw score, and then assigned an implausible integer (e.g., -1) and coded it as missing. Though a little cumbersome, this works fine.
>
> However, if there is missing data, say a student takes only four of the quizzes, and given they get to drop one quiz, that student will just get a summed score for all four quizzes; the lowest score for that student will not be deleted. So, any suggestions for not coding for the lowest score if there is any missing data (akin to using a compute statement such as 'sum.4' when at least four scores must be answered to compute a score)?
>
> Also, what if there is the full complement of data for the five quizzes, but there are ties
> for the lowest scores:
>
> 1
> 1
> 4
> 6
> 7
> When I construct the vector for the ranked variable, as you would guess, it will show up as (for now, the value of 1 being the lowest score):
>
> 1
> 1
> 3
> 4
> 5
>
> What I would like to do is somehow have a unique number and delete only one of the lowest numbers. Any suggestions?
>
> thank you very much for your time.....dale
>
>
>
> *********SYNTAX***
>
> ***five quizzes****
> vector quiz = q1 to q5 .
> loop quizvar = 1 to 5.
> compute quizrate = quiz(quizvar) .
> xsave outfile = 'C:\temp1.sav'
> / keep = id quizvar quizrate.
> end loop.
> execute.
>
> ***get the temp file****
>
> rank variables = quizrate (d) by id / ties = low / rank into quizrank .
> numeric RANKq1 RANKq2 RANKq3 RANKq4 rankq5 (f4.1).
> vector quizr = rankq1 to rankq5 .
> compute quizr(quizvar) = quizrank.
> execute.
> aggregate outfile = *
> /presorted / break = id /RANKq1 RANKq2 RANKq3 RANKq4 rankq5 = min(rankq1 to rankq5).
> execute.
> MATCH FILES /FILE = 'C:\Documents\lowscore.sav'
> /FILE = * /BY id .
> execute.
>
> **converts lowest score (with value of '5')***and can do this for each variable**
>
> ****best to autorecode or rename so don't write over old variables....****
>
> compute q1new=q1.
> compute q2new=q2.
> compute q3new=q3.
> compute q4new=q4.
> compute q5new=q5.
> execute.
> if (rankq1 eq 5) q1new=-1.
> if (rankq2 eq 5) q2new=-1.
> if (rankq3 eq 5) q3new=-1.
> if (rankq4 eq 5) q4new=-1.
> if (rankq5 eq 5) q5new=-1.
> execute.
> missing values q1new to q5new (-1).
> execute.
> freq var=q1new to q5new.
> compute totquiz=sum(q1new to q5new).
> list var=q1new to q5new totquiz.
>
>
>
>
>
> Dale Glaser, Ph.D.
> Principal--Glaser Consulting
> Lecturer--SDSU/USD/CSUSM/AIU
> 4003 Goldfinch St, Suite G
> San Diego, CA 92103
> phone: 619-220-0602
> fax: 619-220-0412
> email: [hidden email]
> website: www.glaserconsult.com
>
>

Richard Ristow

At 07:22 PM 6/14/2006, Dale Glaser asked, but it's hard to quote. Let
me see if I understand:

* Students are given 5 quizzes, and a score on each. (In the test data,
the scores are from 1 to 9.) Quizzes may be missed, in which case the
corresponding score is missing.

* You want to know
- The lowest score each student received, counting 'missing' as the
lowest possible - if the student missed any quiz, the 'lowest' score is
"missing".
- The first quiz on which that student received that lowest score
- The student's mean score, after (one instance of) the lowest score
has been dropped. (This is simply the mean score, if any quiz has been
skipped.)

Hillel Vardi posted a neat wide-> long-> wide solution. (That is, from
each student record, it creates a separate record for each quiz, drops
the one with the lowest score, and reassembles the student record.)

It replaces the lowest score with system-missing. That may be what you
want, but I'm not sure it'll handle missing quiz scores the way you
want to. And it loses information: you no longer know what was the
lowest score, only that (one instance of it) is no longer in the list.
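
One way to keep that information in the restructuring approach, sketched here under the assumption that the variables are named ii and quiz1 to quiz5 as in Hillel's example: compute the minimum before VARSTOCASES and carry it along with /KEEP. MinScore is a hypothetical name. Because it is constant within each student, CASESTOVARS should treat it as a fixed variable and return it unchanged.

```spss
* Record the lowest observed score before restructuring (MIN skips missing values) .
COMPUTE MinScore = MIN(quiz1 TO quiz5).
VARSTOCASES /MAKE quiz FROM quiz1 quiz2 quiz3 quiz4 quiz5
 /INDEX = Index1(5)
 /KEEP = ii MinScore
 /NULL = KEEP.
```

Note that MinScore is the lowest score actually observed; for a student who missed a quiz, the case removed by the restructuring syntax may be the missing one instead.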

Anyway, here's a 'wide' solution, processing within each student's
record. (It uses VECTOR/LOOP logic, which is a common alternative to
VARSTOCASES/CASESTOVARS logic.) It does not eliminate or change the
lowest score. However, it calculates,
- On which quiz the lowest (or missing) score first occurred
- What that lowest score was - missing, if that's what it was
- The mean of all quizzes the student took (variable MEAN.5)
- The mean of four quizzes, dropping one instance of the lowest score,
if the student took all five (variable MEAN.4). (If the student missed
any quiz, the two means are the same.)

This is SPSS draft output:

* ....................................................... .
LIST.

List
|-------------------------|------------------------|
|Output Created |16-JUN-2006 13:40:22 |
|-------------------------|------------------------|
KID QUIZ1 QUIZ2 QUIZ3 QUIZ4 QUIZ5

 01     7     5     6     5     5
 02     7     7     5     6     6
 03     4     5     5     3     .
 04     4     7     6     5     7
 05     2     2     2     2     3
 06     4     2     .     4     2
 07     6     6     6     3     4
 08     6     4     5     3     2
 09     7     5     8     4     4
 10     6     8     7     7     4

Number of cases read: 10 Number of cases listed: 10

* Find the lowest score, and where it first occurs ..... .

NUMERIC Lo_Quiz  (N2)
       /Lo_Score (F2).
VAR LABEL
    Lo_Quiz  'First quiz where lowest score occurs'
    Lo_Score 'Lowest observed quiz score'.

/*  Old trick: Start with a "minimum" larger than any  */
/*  value which could occur                            */
COMPUTE Lo_Score = 99.
COMPUTE Lo_Quiz  = 0.

VECTOR Quizes=Quiz1 TO Quiz5.

LOOP #QNum = 1 TO 5.
.  DO IF   MISSING(Quizes(#QNum)).
.     COMPUTE Lo_Quiz  = #QNum.
.     COMPUTE Lo_Score = $SYSMIS.
.  ELSE IF Quizes(#QNum) LT Lo_Score.
.     COMPUTE Lo_Quiz  = #QNum.
.     COMPUTE Lo_Score = Quizes(#QNum).
.  END IF.
END LOOP IF MISSING(Lo_Score).

*  Average, dropping (1 instance of) lowest score ..... .
NUMERIC Mean.5 Mean.4 (F5.2).
VAR LABEL
    Mean.5 'Mean of all five quiz scores'
    Mean.4 'Mean, dropping the lowest quiz score'.

VECTOR #QAdj (5,F2).
LOOP #QNum = 1 TO 5.
.  COMPUTE #QAdj(#QNum) = Quizes(#QNum).
END LOOP.
COMPUTE #QAdj(Lo_Quiz) = $SYSMIS.

COMPUTE Mean.5 = MEAN(Quiz1  TO Quiz5).
COMPUTE Mean.4 = MEAN(#QAdj1 TO #QAdj5).

LIST.

List
|-------------------------|------------------------|
|Output Created |16-JUN-2006 13:40:22 |
|-------------------------|------------------------|
KID QUIZ1 QUIZ2 QUIZ3 QUIZ4 QUIZ5 Lo_Quiz Lo_Score Mean.5 Mean.4

 01     7     5     6     5     5      02        5   5.60   5.75
 02     7     7     5     6     6      03        5   6.20   6.50
 03     4     5     5     3     .      05        .   4.25   4.25
 04     4     7     6     5     7      01        4   5.80   6.25
 05     2     2     2     2     3      01        2   2.20   2.25
 06     4     2     .     4     2      03        .   3.00   3.00
 07     6     6     6     3     4      04        3   5.00   5.50
 08     6     4     5     3     2      05        2   4.00   4.50
 09     7     5     8     4     4      04        4   5.60   6.00
 10     6     8     7     7     4      05        4   6.40   7.00

Number of cases read: 10 Number of cases listed: 10
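
If the position of the lowest quiz is not needed, the adjusted total and mean can probably be computed without a loop. A minimal sketch, assuming Quiz1 to Quiz5 are contiguous in the file as in the listing above; the names N_Taken, Lo_Obs, Sum_Drop and Mean_Drop are hypothetical. It relies on the NVALID, MIN, SUM and MEAN functions, which skip missing arguments. Subtracting the minimum removes exactly one instance of the lowest score, so ties need no special handling.

```spss
* Count the quizzes actually taken .
COMPUTE N_Taken = NVALID(Quiz1 TO Quiz5).
* Lowest observed score (MIN skips missing values) .
COMPUTE Lo_Obs = MIN(Quiz1 TO Quiz5).
* Drop one instance of the lowest score only if all five quizzes were taken .
DO IF N_Taken EQ 5.
.  COMPUTE Sum_Drop  = SUM(Quiz1 TO Quiz5) - Lo_Obs.
.  COMPUTE Mean_Drop = Sum_Drop / 4.
ELSE IF N_Taken GT 0.
.  COMPUTE Sum_Drop  = SUM(Quiz1 TO Quiz5).
.  COMPUTE Mean_Drop = MEAN(Quiz1 TO Quiz5).
END IF.
FORMATS N_Taken Lo_Obs (F2) Sum_Drop (F3) Mean_Drop (F5.2).
LIST.
```

For the ten test students, Mean_Drop should match the Mean.4 column above. Unlike Lo_Score in the loop version, Lo_Obs holds the lowest observed score even when a quiz was missed.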