Fisher Exact Test and Monte Carlo

Glenys Lafrance
Hi listers,

The following questions are related so I hope this post does not
contravene the protocol of one topic per post.  If so, let me know and in
future I will break the questions up.

a) My study involves 245 cases.  When I try to compute Fisher's exact test
for tables (none larger than 4 x 4), it sometimes fails for lack of memory
or for lack of time.  I have 1 GB of RAM, a 1.15 GHz processor,
6.5 GB of free disk space, and the SPSS workspace currently set to 4096 KB
(same as the swap file).  How much memory is enough?  How much time is
enough?  I am using 10 minutes.  Is it possible to override the default
time limit?

b) Is there any macro that will instruct SPSS to use Monte Carlo
simulation if the requested Fisher's exact test times out or fails for
lack of memory?

c) When I select Monte Carlo, the dialogue box says it will use Fisher's
when resources permit.  But it never seems to do this (even when I know
the computation succeeds when I select Fisher instead of MC).  I set
the MC specs to a 99.9% confidence level and 50,000 samples, and it
computes very quickly, so I am not sure why the FET does not.

d) When I ask for MC, I usually get a value for Fisher's, but no
asymptotic significance.  There is a significance value in the MC column,
and a CI.  What is the best way to report this (APA if possible)?

e) With exact tests installed, is there a list of tests/procedures that
use them?

Much thanks as always,

Glenys Lafrance

Re: Fisher Exact Test and Monte Carlo

Anthony Babinec
I created a 4x4 table with a table total of
245 and ran it in StatXact, from which SPSS Exact Tests
gets these routines. On my problem, StatXact ran for almost
an hour and then quit with an insufficient memory message.

This situation calls for the Monte Carlo approach. You can
set the number of samples and the confidence level for the Monte Carlo
p-value to your liking, and get an answer relatively quickly. The Pearson,
likelihood-ratio, and Fisher's chi-square statistics (the last is
generalized beyond the 2x2 table as the Fisher-Freeman-Halton test) are
all asymptotically equivalent but differ in finite samples. You can obtain
their values and Monte Carlo significance levels relatively quickly.

Whether it is possible to "trap" the SPSS memory message or time-out
message from the exact approach and re-invoke CROSSTABS with Monte
Carlo, I don't know.

I don't know why SPSS omits the asymptotic df and p-value for Fisher, but
the asymptotic df is (r-1)*(c-1) just as it is for the Pearson
or Likelihood ratio chi-square. The significance level for the
chi-square can be obtained from a table lookup of the chi-square value
and the degrees of freedom.
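
As a rough illustration of that table lookup, here is a minimal Python
sketch (not SPSS code) that computes the asymptotic df for an r x c table
and the upper-tail chi-square probability using only the standard library.
The function names asymptotic_df and chi2_sf are made up for this example,
and the statistic you pass in would be whatever value SPSS prints for the
test.

```python
from math import erfc, exp, gamma, sqrt


def asymptotic_df(n_rows, n_cols):
    """Asymptotic degrees of freedom for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Works for positive integer df only. Starts from the closed form for
    df = 1 or df = 2 and steps up two degrees of freedom at a time with
    Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1).
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2:
        tail, nu = erfc(sqrt(half)), 1   # df = 1
    else:
        tail, nu = exp(-half), 2         # df = 2
    while nu < df:
        tail += half ** (nu / 2.0) * exp(-half) / gamma(nu / 2.0 + 1.0)
        nu += 2
    return tail


# A 4 x 4 table has 9 df; 16.919 is the usual 5% critical value for 9 df.
df = asymptotic_df(4, 4)
print(df, round(chi2_sf(16.919, df), 3))
```

Keep in mind that this gives the asymptotic significance only; with small
expected counts it can differ from the exact or Monte Carlo value.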

In SPSS, CROSSTABS and NPAR TESTS are where exact test versions
are found.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Glenys Lafrance
Sent: Wednesday, December 13, 2006 10:22 AM
To: [hidden email]
Subject: Fisher Exact Test and Monte Carlo


Re: Fisher Exact Test and Monte Carlo

David Hitchin
Quoting Glenys Lafrance <[hidden email]>:

> d) When I ask for MC, I usually get a value for Fishers, but no
> significance stat.  There is a significance value in the MC column,
> and a CI.  What is the best way to report this (APA if possible)?

Most statistical tests work something like this:
Do some calculations on your data, and get a test statistic, such as a
t-value, an F-value, or a chi-squared value. Note the degrees of
freedom and look up the test statistic in a table (or get the computer
to do the equivalent) to find the p-value (significance).

People used to this approach sometimes find Fisher's exact test
rather confusing, because there is no test statistic, we don't make
explicit use of the number of degrees of freedom, and what the procedure
does is to deliver a p-value (significance) directly.

Nothing more to do, nothing to look up in a table, the p-value is given
to you directly.

Now there are problems with Fisher's exact test, especially on tables
larger than two by two. In effect the computer has to compare the table
that you have with all other possible tables with the same marginal
totals - which may be an astronomically large number. If you allow the
computer sufficient time and space it will generate all of those tables
and compare them with your observed data.
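
To make the "compare with all other possible tables" idea concrete, here
is a small Python sketch of brute-force enumeration. It is only an
illustration under stated assumptions, not the algorithm SPSS or StatXact
uses: the function names are invented for this example, tables are ranked
by their probability given the margins, and the approach is only practical
for tiny tables.

```python
from math import factorial, prod


def table_probability(table):
    """Probability of a table given its row and column totals."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    num = prod(factorial(m) for m in rows + cols)
    den = factorial(sum(rows)) * prod(factorial(x) for r in table for x in r)
    return num / den


def compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_margins(rows, cols):
    """Yield every table of non-negative counts with these margins."""
    if len(rows) == 1:
        yield (tuple(cols),)  # the last row is forced by what is left
        return
    for row in compositions(rows[0], cols):
        remaining = [c - x for c, x in zip(cols, row)]
        for rest in tables_with_margins(rows[1:], remaining):
            yield (row,) + rest


def exact_p_value(table):
    """Sum the probabilities of all tables no more likely than the observed."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    p_obs = table_probability(table)
    p_value, n_tables = 0.0, 0
    for t in tables_with_margins(rows, cols):
        n_tables += 1
        p = table_probability(t)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            p_value += p
    return p_value, n_tables


# Hypothetical 2 x 2 table, used only to show the mechanics.
p, n = exact_p_value([[3, 1], [1, 3]])
print(n, round(p, 4))  # 5 tables, p = 34/70 = 0.4857
```

With 8 cases there are only 5 tables to visit; the count grows explosively
with the table total and dimensions, which is why a 4 x 4 table with 245
cases can exhaust time or memory.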

There are some cunning algorithms which take short cuts, but even they
can't fully enumerate some large problems. In this case the computer
looks at a random sample of all possible tables, and the larger this
sample is, the closer you expect the calculated p-value to be to the
true value calculated from all tables.  Use only a few of the possible
tables, and you can't expect to be very close to the true p-value, so
there is a large confidence interval around the printed p-value. As you
examine more and more, the confidence interval shrinks and the p-value
approaches (asymptotically) the true value.
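
The sampling idea can be sketched in a few lines of Python. This is a
minimal illustration, not the SPSS implementation: random tables with the
observed margins are drawn by shuffling the column labels of the cases,
tables are ranked by their probability, and the confidence interval uses a
plain normal approximation (an assumption of this sketch; SPSS may compute
its interval differently). All function names are invented for the example.

```python
import random
from math import factorial, prod, sqrt
from statistics import NormalDist


def table_probability(table):
    """Probability of a table given its row and column totals."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    num = prod(factorial(m) for m in rows + cols)
    den = factorial(sum(rows)) * prod(factorial(x) for r in table for x in r)
    return num / den


def random_table(rows, cols, rng):
    """Draw one table with fixed margins by shuffling column labels."""
    labels = [j for j, c in enumerate(cols) for _ in range(c)]
    rng.shuffle(labels)
    table, start = [], 0
    for r in rows:
        counts = [0] * len(cols)
        for j in labels[start:start + r]:
            counts[j] += 1
        table.append(counts)
        start += r
    return table


def monte_carlo_p_value(table, samples=10000, confidence=0.99, seed=0):
    """Estimate the exact p-value and a confidence interval around it."""
    rng = random.Random(seed)
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    p_obs = table_probability(table)
    hits = sum(
        table_probability(random_table(rows, cols, rng)) <= p_obs * (1 + 1e-9)
        for _ in range(samples)
    )
    p_hat = hits / samples
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / samples)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical 2 x 2 table whose exact p-value is 34/70 (about 0.486).
for n in (1000, 50000):
    print(n, monte_carlo_p_value([[3, 1], [1, 3]], samples=n, confidence=0.999))
```

Running it with 1,000 and then 50,000 samples shows the interval
tightening around the estimate, which is the shrinking described above.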

Usually the p-value from the Fisher exact test will be close to the
p-value obtained from a chi-square test, and where it differs this is
because of the inadequacies of the chi-squared approximation as applied to
contingency tables, especially those of the 2 x 2 variety, and those with
small expected values in some cells. The controversial Yates' correction
tries to get this right, but Fisher's exact test actually does, given
sufficient time and space.

David Hitchin