Statistical significance without sampling?

Statistical significance without sampling?

SB-9
Hi,

I have recently produced a report of student retention rates at a
university. I have data on all students, hence no sampling has taken place.
However, I have been asked whether a change in student retention rates from
one year to the next is statistically significant. To me, this question does
not make sense, as statistical significance is a measure of the
probability that a sample is representative of a population, and in this
case we have information on all students. Am I missing something? Can it be
meaningful to test for statistical significance in this situation?

Thanks.

Re: Statistical significance without sampling?

rich reeves
Hi Scott,
Many institutions do compare these rates without regard to the sampling
issues you mention, and there is a growing body of evidence on the
inaccuracy that can arise when doing so.  Titus (2007) published a paper
on this in Research in Higher Education, Vol. 48: "Detecting Selection
Bias, Using Propensity Score Matching, and Estimating Treatment Effects:
Application to the Private Returns to a Master's Degree."  If you can get
a copy, it would be worth your time.  A number of papers on this topic
have also appeared in the medical literature over the past decade or so.

Good luck.
rich


Re: Statistical significance without sampling?

Richard Ristow
In reply to this post by SB-9

If I understand correctly, rich reeves is addressing a different
question: whether the test is commonly done correctly, given that you
accept it can be done at all.

The question you ask is a recurring one and a fairly deep one, so it's
worth an occasional revisit. You may want to look at thread
"Significant difference - Means", on this list Tue, 19 Dec 2006
<09:05:53 +0100> ff., a discussion that went over many issues including
this one.

Your data can be regarded from two points of view, both of which are
legitimate but which have different implications. I can only say,
firmly: you should know which point of view you are taking, what you
think it means, and why its implications for inferential analysis are
what they are.

One point of view is, "[we] have data on all students, hence no
sampling has taken place." Here, you're taking your universe of
discourse as the students at your university. Then, there is no
question of comparing using inferential statistics. You have the exact
values (presumably) for this year and last year; and their difference
is, definitively, whatever it is.

But another point of view is to regard each year's experience as a
(multi-dimensional) sample point, in a space of possible experience.
That is, the set of students enrolled each year is a sample, subject to
random fluctuations, from the population that might be considered for
enrollment. Their experiences at the school are a sample, subject to
random fluctuations, from the possible experiences and happenings to
students at the school. And the outcome, retention or not, is
influenced by these factors with random elements. So the outcome
becomes a measure subject to random influence, and a legitimate subject
for inferential statistics.

If you take this view (it's one I have a good deal of sympathy for),
you must remember you are not comparing this year and last year as
exact experiences, but as samples in the probability space of likely
experience, given conditions bearing on the students of each year.

Then, of course, once you've decided this is a legitimate subject for
inferential statistical analysis, you have to get into methodology -
what I take to be rich reeves's questions. Among other things, your
enrolled students are certainly not a sample selected at random,
equiprobably, from a definable population of candidates. But here we
move from *whether* to *how*, and many others can advise better than I
on your problem.
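Richard's second point of view can be made concrete with a small sketch. The counts below are hypothetical (two cohorts of 1,100 with retention two percentage points apart), and the sketch is in Python rather than SPSS, purely for illustration: treat each year's retained/lost counts as a binomial sample and test whether the two proportions differ.

```python
# Treating each year's cohort as a sample: two-proportion z-test
# on hypothetical retention counts (illustration only).
import math

def two_prop_z(x1, n1, x2, n2):
    """Two-sided z-test for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # normal CDF via the error function
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# e.g. 85.0% vs 87.0% retention in cohorts of 1,100
z, p = two_prop_z(935, 1100, 957, 1100)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

With n = 1,100 per cohort, a two-point difference lands around p = 0.18: visible, but hardly conclusive on its own, and none of this addresses the selection issues raised earlier in the thread.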

Re: Statistical significance without sampling?

statisticsdoc
Richard, as usual, is spot on here.  In order to conduct significance tests,
one has to work within an inferential framework, and decide that students in
specific years do not exhaust the population of interest.

Best,

Stephen Brand

For personalized and professional consultation in statistics and research
design, visit
www.statisticsdoc.com



Re: Statistical significance without sampling?

Bob Schacht-3
In reply to this post by Richard Ristow

Richard,
Thank you for your thoughtful review of these questions. May I suggest
that there is a THIRD possibility to consider: that if one measured
retention rates on a different day, the answers might be different? In
other words, this might be a measurement issue. Of course, if the date
of ascertainment is foolproof (perhaps the last day of the semester?),
then the issue is moot. But since we're discussing a potentially larger
set of circumstances, I thought it might be well to bring this up. For
example, if the scores in question were not retention rates but, say,
SAT scores, one might argue that the "sample" is defined not only by
which students took the test but by what day they took it, such that an
individual's score might differ from one day to the next (due to
circumstantial factors such as hangovers, relationship problems,
extracurricular events, etc.)

Bob Schacht



Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814

Stats in administrative settings

SB-9
In reply to this post by statisticsdoc
Many thanks to everyone for their helpful responses to my recent question, "Statistical significance without sampling?". Assuming we take the position that this 'population' is actually a sample, I have a related question, which begins to venture from the *whether* to the *how*:

What are best practices for applying tests of statistical significance within an administrative rather than an academic or research setting? Any resources that can be recommended for this issue?

To give a concrete example, co-workers simply want to get a sense of whether the following changes in retention are statistically significant or due to random fluctuations:  83.4 (1998)    83.3 (1999)    86.2 (2000)    85.4 (2001)    87.5 (2001)    86.9 (2002)    85.8 (2003)    81.7 (2004). Each cohort's n = 1100.

In this situation I have said that "a 2% or greater change is unlikely to have been due to mere chance, and is due at least in part to a change in one of the underlying variables that influence retention." I have based this on the fact that in this situation a 2% difference will produce a Pearson's Chi-Square Asymp. Sig. (2-sided) value of .337. Is this a sensible approach?

Any other ideas for dealing with reporting stats in administrative settings, where (1) the consumers have little or no knowledge of statistics (not that I am an expert myself), (2) there is little time that can be put into producing the report, and (3) consumers are so rushed that they have little time to look at reports in any depth? In other words, I am looking for something workable and simple given the circumstances. However, I am also interested in how to 'do this the right way', assuming these constraints did not exist.

Regards,

Scott



Statistical tests for check lists

Bob Schacht-3
In reply to this post by Bob Schacht-3
I have some check-list questions that are of the "check all that apply"
variety, such as "Do you need any of the following help with work?" The
list consists of half a dozen items, plus "other" and "None of these".

Suppose I now want to compare the responses to this question across two
groups. According to my statistical training, I can't really treat this
as a one-independent-variable, one-dependent-variable analysis, because
one of the variables can have multiple responses. My training says that,
in essence, I have to treat each response in the check list as a
separate variable, and test the association of each response by group.
Is this still the case, or have new multiple-response methods been
developed so that I can treat the multiple-response variable as a single
variable?
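The item-by-item approach described above can be sketched as follows. The item names, groups, and counts are invented for illustration, and the language is Python rather than SPSS (where the equivalent would be one CROSSTABS per item):

```python
# Each "check all that apply" box treated as its own 0/1 variable,
# cross-tabulated against group. Respondents are hypothetical, and the
# tiny counts only show the mechanics: with real data, check expected
# cell counts and consider a multiplicity correction across items.
ITEMS = ["transport", "training", "equipment"]

# (group, set of boxes checked) per respondent
responses = [
    ("A", {"transport", "training"}), ("A", {"transport"}),
    ("A", set()),                     ("A", {"training"}),
    ("B", {"equipment"}),             ("B", {"training"}),
    ("B", set()),                     ("B", {"transport", "equipment"}),
]

def chi2_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

for item in ITEMS:
    a = sum(1 for g, s in responses if g == "A" and item in s)
    b = sum(1 for g, s in responses if g == "A" and item not in s)
    c = sum(1 for g, s in responses if g == "B" and item in s)
    d = sum(1 for g, s in responses if g == "B" and item not in s)
    print(item, [[a, b], [c, d]], round(chi2_2x2(a, b, c, d), 3))
```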

A second question also bedevils me: the meaning of an unchecked box in a
multiple-response question. There is a difference between a blank box
when all other boxes are also blank, and a blank box when at least one
other box is checked. In the former case, it would appear that the
question has been skipped, and the blanks should be treated as missing
data; in the latter, a blank carries an actual negative value, as a
rejected alternative. I neglected to define coding rules to
differentiate between these two possibilities. Any suggestions as to
snazzy ways to deal with this situation so as to avoid the ambiguity?

Bob Schacht

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814

Re: Statistical tests for check lists

Beadle, ViAnn
There are some tests of proportions within Custom Tables which might apply to your situation. Many marketing research organizations have come up with their own analysis techniques, which are proprietary and confidential; you might search the marketing research literature for some ideas.

Unless your survey instrument has a check box for the equivalent of "None" in a set of check boxes, you cannot be sure that "no checks" really means no checks. An alternative is to pose the question as a series of yes/no choices for each item.



Reply | Threaded
Open this post in threaded view
|

Re: Statistical tests for check lists

Dennis Deck
In reply to this post by Bob Schacht-3
There is no one right answer to your questions; it depends on the
purpose of the analysis, the content of the items, and the distribution.

For descriptive purposes I typically treat each as a separate question.
However, for inferential analysis it may be appropriate to create a
scale by summing them (assuming they are coded 0/1) or counting the
checks.  But check the distribution of the derived scores, as it may not
be anything close to normal.

As for distinguishing unchecked from missing: you mentioned that "None
of the above" was a choice.  One approach (assuming the default is
missing and a check is coded as 1):

COUNT #NChecked = List1 TO List10 (1) .
* Blanks are deliberate rejections if any box, or None, was checked .
DO IF (None=1 OR #NChecked >= 1) .
+  RECODE List1 TO List10 (MISSING=0) .
END IF .
But for some surveys

Dennis Deck, PhD
RMC Research Corporation
[hidden email]
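Dennis's DO IF/RECODE logic can be mirrored outside SPSS. A minimal sketch in Python (item names and data are hypothetical): blanks become an explicit 0 ("rejected") only when the respondent engaged with the question at all, and stay missing otherwise.

```python
# Distinguish "skipped the question" (all blanks -> stay missing) from
# "rejected this option" (some box checked -> blanks become 0).
def clean(row):
    """row maps each checkbox (including 'none_of') to 1 or None."""
    engaged = row["none_of"] == 1 or any(
        v == 1 for k, v in row.items() if k != "none_of")
    if engaged:
        return {k: (1 if v == 1 else 0) for k, v in row.items()}
    return dict(row)  # question skipped: leave everything missing

print(clean({"a": 1, "b": None, "none_of": None}))     # b becomes 0
print(clean({"a": None, "b": None, "none_of": None}))  # all stay None
```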


Re: Stats in administrative settings

Richard Ristow
In reply to this post by SB-9
(Marta and Stephen: I'm copying this to you,
because I'd be fascinated by your reactions.)

At 06:13 PM 5/10/2007, Scott Bucher wrote:

>What are best practices for applying tests of
>statistical significance within an
>administrative rather than an academic or
>research setting? Any resources that can be recommended for this issue?

The short answer is: be careful with them. In
particular, don't sell colleagues on the idea that
a result is meaningful simply because a test
produces a very low p-value. First, consider
sources of error not accounted for by the test
you're using. Second, consider PRACTICAL, not
just statistical, significance: the differences
you see may be real, yet small enough to have no
bearing on, say, how the institution is running.

A side issue: I'm not checking your arithmetic, but you write,

>I have said [below] that "a 2% or greater change
>is unlikely to have been due to mere chance." I
>have based this on the fact that in this
>situation a 2% difference will produce a Pearson's
>Chi-Square Asymp. Sig. (2-sided) value of .337.
>Is this a sensible approach?

The ".337" is probably the wrong number.
Certainly, you wouldn't report significance at
p=.337; on the other hand, with your sample size,
a 2% difference probably would be observable.

Going on -

>Co-workers want to get a sense of if
>the  following changes in retention are
>statistically significant or due to random fluctuations:
>   83.4(1998)     83.3(1999)    86.2(2000)    85.4(2001)
>   87.5(2001)     86.9(2002)    85.8(2003)    81.7(2004).
>Each cohort's n = 1100.

To illustrate a principle: it's always good to
format your data so it's easy to read, and then
*read* it. For example, above, year 2001 appears
twice; you probably wouldn't catch that, without
lining the figures up like this, and looking.
(Anyway, I didn't.) From here on, I'm 'promoting'
the second row one year, like this:

   83.4(1998)     83.3(1999)    86.2(2000)    85.4(2001)
   87.5(2002)     86.9(2003)    85.8(2004)    81.7(2005).

First, always look at a statistic, as well as
testing it. My reaction to those numbers is, "It
doesn't look like much is happening. Well, 2004
may be low; I wonder if that means anything?"

In your best year (promoted 2002), retention was
7% better than in your worst (promoted 2005):
87.5% vs. 81.7%. Is that large enough that you
think it reflects on the institution's relative
success in those years? (Without "2005", the best
is 5% higher than the worst: 87.5% vs. 83.3%.)

Then, you write "each cohort's n = 1100".
Immediately I get suspicious: you have EXACTLY
the same class size every year? Or, where did
that '1100' come from? (Illustrating, that is,
that one part of data analysis is looking for
things where the question is simply whether
to believe them.)

Going on: I've converted your data so it can be
cross-tabulated. (Syntax to do that is in the
Appendix, below.) Here's how it comes out (and
this is a good many lines, for a posting):


CROSSTABS
   /TABLES=Year  BY Outcome
   /FORMAT= AVALUE TABLES
   /STATISTIC=CHISQ CTAU
   /CELLS= COUNT ROW ASRESID
   /COUNT ROUND CELL .

Crosstabs
|-----------------------------|---------------------------|
|Output Created               |15-MAY-2007 20:31:56       |
|-----------------------------|---------------------------|
Case Processing Summary [suppressed - no missing data]

   Year
* Outcome  Whether retained, or lost
Crosstabulation
|----|----|---------------|-----------------------|-----------|
|    |    |               |Outcome  Whether       |Total      |
|    |    |               |retained, or lost      |           |
|    |    |               |---------------|-------|           |
|    |    |               |1  Retained    |2  Lost|           |
|----|----|---------------|---------------|-------|-----------|
|Year|1998|Count          |917            |183    |1100       |
|    |    |% within Year  |83.4%          |16.6%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |-1.6           |1.6    |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |1999|Count          |916            |184    |1100       |
|    |    |% within Year  |83.3%          |16.7%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |-1.7           |1.7    |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2000|Count          |948            |152    |1100       |
|    |    |% within Year  |86.2%          |13.8%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |1.2            |-1.2   |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2001|Count          |939            |161    |1100       |
|    |    |% within Year  |85.4%          |14.6%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |.3             |-.3    |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2002|Count          |963            |138    |1101       |
|    |    |% within Year  |87.5%          |12.5%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |2.4            |-2.4   |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2003|Count          |956            |144    |1100       |
|    |    |% within Year  |86.9%          |13.1%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |1.9            |-1.9   |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2004|Count          |944            |156    |1100       |
|    |    |% within Year  |85.8%          |14.2%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |.8             |-.8    |           |
|    |    |Residual       |               |       |           |
|    |----|---------------|---------------|-------|-----------|
|    |2005|Count          |899            |201    |1100       |
|    |    |% within Year  |81.7%          |18.3%  |100.0%     |
|    |    |---------------|---------------|-------|-----------|
|    |    |Adjusted       |-3.3           |3.3    |           |
|    |    |Residual       |               |       |           |
|----|----|---------------|---------------|-------|-----------|
|Total    |Count          |7482           |1319   |8801       |
|         |% within Year  |85.0%          |15.0%  |100.0%     |
|---------|---------------|---------------|-------|-----------|

Chi-Square Tests
|---------------|---------|--|---------------|
|               |Value    |df|Asymp. Sig.    |
|               |         |  |(2-sided)      |
|---------------|---------|--|---------------|
|Pearson        |24.433(a)|7 |.001           |
|Chi-Square     |         |  |               |
|---------------|---------|--|---------------|
|Likelihood     |24.189   |7 |.001           |
|Ratio          |         |  |               |
|---------------|---------|--|---------------|
|Linear-by-Linea|.159     |1 |.690           |
|r Association  |         |  |               |
|---------------|---------|--|---------------|
|N of Valid     |8801     |  |               |
|Cases          |         |  |               |
|---------------|---------|--|---------------|
a 0 cells (.0%) have expected count less than 5.
   The minimum expected count is 164.86.

Symmetric Measures
|---------------|---------|-----|-----------|------------|-------|
|               |         |Value|Asymp. Std.|Approx. T(b)|Approx.|
|               |         |     |Error(a)   |            |Sig.   |
|---------------|---------|-----|-----------|------------|-------|
|Ordinal by     |Kendall's|-.003|.009       |-.384       |.701   |
|Ordinal        |tau-c    |     |           |            |       |
|---------------|---------|-----|-----------|------------|-------|
|N of Valid Cases         |8801 |           |            |       |
|-------------------------|-----|-----------|------------|-------|
a Not assuming the null hypothesis.
b Using the asymptotic standard error assuming the null hypothesis.
..........................................
Here we go:

A. The chi-square test certainly is significant: p=.001. Great, but
1. The chi-square test is based on an assumption
of random effects for each student. Watch for
'common-mode' random effects: real effects that
can raise or suppress retention one year, but
which can themselves be put down to random
variation between years. (Were economic
conditions a bit worse in 2005? Was the cohort of
potential students small that year, and other
institutions grabbed more of the best candidates?)
2. The ordinal test (Kendall's tau-c) shows no
hint of significance, i.e. there's no evidence of
a trend. That's certainly a reason to suspect
random year-to-year fluctuations. (I don't know
whether tau-c is the best ordinal test in this
instance. I didn't look up Marta García-Granero's
tutorials on non-parametric tests, but that's the way to go.)
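
(For anyone without SPSS handy, the Pearson chi-square and the tau-c above can be cross-checked directly from the published counts. This is a sketch in Python with scipy, not the original analysis; it assumes scipy's variant='c' of kendalltau, added in SciPy 1.7, matches SPSS's Kendall's tau-c, and it codes Outcome as 1=retained, 2=lost, as in the data set.)

```python
# Cross-check of the full 8-year crosstab, using the counts from the table above.
import numpy as np
from scipy.stats import chi2_contingency, kendalltau

years    = [1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005]
retained = [917, 916, 948, 939, 963, 956, 944, 899]
lost     = [183, 184, 152, 161, 138, 144, 156, 201]
table = np.array([retained, lost]).T          # 8 rows (years) x 2 cols (outcome)

# Pearson chi-square (correction only affects 2x2 tables, but be explicit)
chi2, p, df, expected = chi2_contingency(table, correction=False)
print(f"Pearson chi-square = {chi2:.3f}, df = {df}, p = {p:.3f}")
# SPSS reports 24.433, df 7, p = .001

# Expand the weighted table into one case per student for Kendall's tau-c
year_cases    = np.repeat(years + years, retained + lost)
outcome_cases = np.repeat([1] * 8 + [2] * 8, retained + lost)
tau, p_tau = kendalltau(year_cases, outcome_cases, variant='c')
print(f"Kendall's tau-c = {tau:.3f}")
# SPSS reports -.003, Approx. Sig. .701
```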

B. A better criterion than "2% difference" is the
adjusted standardized residuals for the cells:
look for absolute values greater than 2. By that
standard, retention in "2002" is high, and in "2005" notably low.
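
(A sketch of that computation, assuming the usual Haberman formula for the adjusted standardized residual, (O - E) / sqrt(E * (1 - row/N) * (1 - col/N)), which is what SPSS CROSSTABS prints as "Adjusted Residual".)

```python
import numpy as np

table = np.array([  # rows = years 1998..2005, cols = (retained, lost)
    [917, 183], [916, 184], [948, 152], [939, 161],
    [963, 138], [956, 144], [944, 156], [899, 201],
])
N = table.sum()
row = table.sum(axis=1, keepdims=True)   # cohort sizes
col = table.sum(axis=0, keepdims=True)   # retained / lost totals
E = row * col / N                        # expected counts under independence
adj = (table - E) / np.sqrt(E * (1 - row / N) * (1 - col / N))
for year, (r, l) in zip(range(1998, 2006), adj):
    print(year, round(float(r), 1), round(float(l), 1))
# 2002 should come out near +2.4 (retained), 2005 near -3.3, matching SPSS
```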

C. "2005" keeps standing out. A true
post-hoc test for differences would be the right
way to go; I don't know how to do one on contingency
tables. As a simple-minded substitute, I re-ran without "2005":

Chi-Square Tests
|---------------|---------|--|---------------|
|               |Value    |df|Asymp. Sig.    |
|               |         |  |(2-sided)      |
|---------------|---------|--|---------------|
|Pearson        |14.148(a)|6 |.028           |
|Chi-Square     |         |  |               |
|---------------|---------|--|---------------|
|Likelihood     |14.030   |6 |.029           |
|Ratio          |         |  |               |
|---------------|---------|--|---------------|
|Linear-by-Linea|8.024    |1 |.005           |
|r Association  |         |  |               |
|---------------|---------|--|---------------|
|N of Valid     |7701     |  |               |
|Cases          |         |  |               |
|---------------|---------|--|---------------|
a 0 cells (.0%) have expected count less than 5.
   The minimum expected count is 159.69.

Symmetric Measures
|---------------|---------|-----|-----------|------------|-------|
|               |         |Value|Asymp. Std.|Approx. T(b)|Approx.|
|               |         |     |Error(a)   |            |Sig.   |
|---------------|---------|-----|-----------|------------|-------|
|Ordinal by     |Kendall's|-.026|.009       |-2.788      |.005   |
|Ordinal        |tau-c    |     |           |            |       |
|---------------|---------|-----|-----------|------------|-------|
|N of Valid Cases         |7701 |           |            |       |
|-------------------------|-----|-----------|------------|-------|
a Not assuming the null hypothesis.
b Using the asymptotic standard error assuming the null hypothesis.

Well, well, well. Now the chi-square doesn't look
like much (p=.028 is pretty much nothing, with
this sample size), but now the ordinal-by-ordinal
measure (tau-c) is looking real: p=.005. (The
negative value means a trend toward BETTER
retention, since "lost" is coded higher than "retained".)

SO: recognizing that all this analysis raises
questions of multiple comparisons (though
probably the p-values are good enough to stand up
after corrections), it looks like: there's a
trend of improving retention, interrupted by year
"2005", the poorest retention on record.

NOW: what does this mean? And here, you can't do
it just by statistics; you have to start knowing
what seems to have been happening to the
institution, and the environment.

Fun, isn't it? Go forth, be fruitful and multiply
(and divide, and take means and standard errors),
-Richard


=======================================================
APPENDIX: Test data, conversion to counts, and listings
=======================================================
*  ................................................................. .
*  .................   Test data               ..................... .
DATA LIST FREE / InString (A12).
BEGIN DATA
    83.4(1998)    83.3(1999)    86.2(2000)    85.4(2001)
    87.5(2002)    86.9(2003)    85.8(2004)    81.7(2005)
END DATA.

NUMERIC Year   (F4)
         Retent (PCT6.1).

COMPUTE Year   = NUMBER(SUBSTR(InString,6,4),F4).
COMPUTE Retent = NUMBER(SUBSTR(InString,1,4),F4.1).

NUMERIC Retained Lost    (F6).

*  "Each cohort's n = 1100".
COMPUTE Retained = 1100*(Retent/100).
COMPUTE Lost     = 1100*(1-Retent/100).

COMPUTE Retained = RND(Retained).
COMPUTE Lost     = RND(Lost).
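
*  (An aside on the rounding, sketched in Python rather than SPSS: SPSS's RND
*  rounds halves away from zero, while Python's built-in round() rounds halves
*  to even, so floor(x + 0.5) is used here to match RND for these positive
*  values. This is also why the 2002 cohort totals 1101 rather than 1100:
*  87.5% of 1100 is 962.5, which rounds up to 963 retained, and 12.5% is
*  137.5, which rounds up to 138 lost.)

```python
import math

def spss_rnd(x):
    # Round half away from zero, for x >= 0, to mimic SPSS RND
    return math.floor(x + 0.5)

pcts = [83.4, 83.3, 86.2, 85.4, 87.5, 86.9, 85.8, 81.7]
for pct in pcts:
    retained = spss_rnd(1100 * pct / 100)
    lost     = spss_rnd(1100 * (1 - pct / 100))
    print(pct, retained, lost, retained + lost)   # 2002 row sums to 1101
```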

.  /*--  LIST  /*-*/.

*  .................   Post after this point   ..................... .
*  ................................................................. .
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |15-MAY-2007 20:31:56       |
|-----------------------------|---------------------------|
InString     Year Retent Retained   Lost

83.4(1998)   1998  83.4%     917     183
83.3(1999)   1999  83.3%     916     184
86.2(2000)   2000  86.2%     948     152
85.4(2001)   2001  85.4%     939     161
87.5(2002)   2002  87.5%     963     138
86.9(2003)   2003  86.9%     956     144
85.8(2004)   2004  85.8%     944     156
81.7(2005)   2005  81.7%     899     201

Number of cases read:  8    Number of cases listed:  8


VARSTOCASES
  /MAKE Number FROM Retained Lost
  /INDEX = Outcome "Whether retained, or lost"(2)
  /KEEP  = Year Retent
  /NULL  = KEEP.

Variables to Cases
Generated Variables
|-------|---------------|
|Name   |Label          |
|-------|---------------|
|Outcome|Whether        |
|       |retained, or   |
|       |lost           |
|-------|---------------|
|Number |<none>         |
|-------|---------------|

Processing Statistics
|-------------|-|
|Variables In |5|
|Variables Out|4|
|-------------|-|


VALUE LABEL Outcome (1) 'Retained' (2) 'Lost'.

.  /**/  LIST  /*-*/.

List
Year Retent Outcome Number

1998  83.4%      1     917
1998  83.4%      2     183
1999  83.3%      1     916
1999  83.3%      2     184
2000  86.2%      1     948
2000  86.2%      2     152
2001  85.4%      1     939
2001  85.4%      2     161
2002  87.5%      1     963
2002  87.5%      2     138
2003  86.9%      1     956
2003  86.9%      2     144
2004  85.8%      1     944
2004  85.8%      2     156
2005  81.7%      1     899
2005  81.7%      2     201

Number of cases read:  16    Number of cases listed:  16


WEIGHT BY Number.