Chi-square problem - large number of cells and validity of the significance test


chengfoh
Hi everyone:

I am not a very experienced SPSS user, and I am currently running a chi-square test on my data to assess association. Here is my case:

Rows x columns = 20 x 6 (different groups; I have tried hard, but there is no way to collapse them)
Variables: All are categorical (nominal)
Test performed = chi-square test of independence
Total sample size = 150

Because of the large number of cells (rows x columns), the chi-square test cannot give a valid p-value: too many cells have an expected frequency below 5, which violates one of the test's assumptions. In this case:

a) Since the p-value is not useful, can I pick the cells with the highest percentages and discuss the results using descriptive statistics only? Would that be acceptable for a write-up?

b) In another data set I have a small number of cells (3 x 3), and the p-value is significant (p < 0.05). To identify which cells are driving the association, should I look for adjusted standardized residuals greater than 2 or less than -2 to pick out the cells that are actually significant?

I would really appreciate help from anyone familiar with this.
Many thanks.
Foh


Re: Chi-square problem - large number of cells and validity of the significance test

Bruce Weaver
Administrator
For tables larger than 2x2, the chi-square approximation is still pretty good so long as all expected counts are greater than 1, and no more than 20% are less than 5.  

   https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes/chisqr_assumptions

In the event that condition is not met, you could use the "exact test" option if you have that module.
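The rule of thumb above can be checked by hand from the marginal totals. Here is a minimal sketch in plain Python; the function names and the 20 x 6 table are made up for illustration (the original poster's data were not given). It computes the expected counts under independence and applies the two conditions as stated: all expected counts greater than 1, and no more than 20% below 5.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def approximation_looks_ok(table):
    """Apply the rule of thumb quoted above to a table of observed counts."""
    flat = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in flat) / len(flat)
    return min(flat) > 1 and share_below_5 <= 0.20


# Hypothetical 20 x 6 table with N = 150 (illustration only):
# five rows with 2 in every cell, fifteen rows with 1 in every cell.
table = [[2] * 6 for _ in range(5)] + [[1] * 6 for _ in range(15)]
print(sum(map(sum, table)))           # total sample size
print(approximation_looks_ok(table))  # whether the rule of thumb is met
```

Note that with 120 cells and N = 150 the average expected count is only 1.25, so a 20 x 6 table of that size cannot satisfy the 20% condition no matter how the counts are distributed: at least 96 cells would need an expected count of 5 or more, which would require N of at least 480.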

For the 3x3 table, looking at the standardized residual is one option.  Alternatively, you can partition the overall table into orthogonal components -- I have an example of this in my notes on categorical data (chapter 3 at the link below).  And finally, I understand that multiple comparisons via z-tests have been added to CROSSTABS in v19.  (I'm still on v18, so have not seen them firsthand.)

 https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes -- see chapter 3
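For the residual approach, here is a rough sketch of how adjusted standardized residuals are computed, assuming the usual definition (observed minus expected, divided by the estimated standard error of that difference). The 3 x 3 counts are hypothetical and the function name is my own; in SPSS the same quantities should come from the adjusted standardized residual option in the CROSSTABS cell display.

```python
from math import sqrt


def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Estimated variance of (observed - expected) under independence.
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / sqrt(variance))
        result.append(out_row)
    return result


# Hypothetical 3 x 3 counts, for illustration only.
counts = [[20, 10, 5], [10, 15, 10], [5, 10, 20]]
for row in adjusted_residuals(counts):
    print(["%6.2f" % value for value in row])
```

Under independence these residuals are approximately standard normal, so cells beyond roughly +2 or -2 are the ones that stand out. With nine cells examined at once, that cutoff is only a rough guide, not a formal test for each cell.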

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Chi-square problem - large number of cells and validity of the significance test

chengfoh
Hi Bruce:

Thanks so much for your advice; it is very helpful. I will now proceed with the statistics.

Also, Fisher's exact test is usually applied when the sample size is small, but does it matter if I use it on a larger sample, such as n = 150?


Cheng Foh
Re: Chi-square problem - large number of cells and validity of the significance test

Bruce Weaver
Administrator
Glad to hear you found it helpful.  

A couple points re Fisher's exact test (aka the Fisher-Irwin test).  First, it is designed for the situation where all of the marginal totals are fixed.  If all marginal totals are fixed in advance, then use FET regardless of sample size.  But in practice, this situation does not arise very often, it seems to me.  One example would be where you have a fixed number of cases in two groups (e.g., males and females), and then do a median split on some continuous variable to obtain the other dichotomy.  But doing a median split throws away a lot of info, and some other method (e.g., t-test or logistic regression) would generally be better.  
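To make the "fixed marginal totals" point concrete, here is a minimal sketch of the 2x2 Fisher exact test in plain Python. It treats all four margins as fixed, so the top-left cell follows a hypergeometric distribution. The two-sided p-value sums the probabilities of all tables no more likely than the observed one, which is one common convention and an assumption here, not necessarily what SPSS reports. The function name is illustrative, and the sketch does not cover larger tables such as 20 x 6.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)   # smallest top-left count the margins allow
    highest = min(row1, col1)      # largest top-left count the margins allow
    p_observed = prob(a)
    # Small tolerance so ties with the observed table are not lost to rounding.
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: 3 and 1 in the first group, 1 and 3 in the second.
print(fisher_exact_2x2(3, 1, 1, 3))
```

Because the calculation conditions only on the margins, nothing in it depends on the sample being small; the sample size mainly affects how long the enumeration takes.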

The more common use of FET is for dealing with expected counts < 5.  But in most of those cases, you'd be better off using the N-1 chi-square, IMO.  For more info, scroll down to "2x2 Tables: Advice from Campbell (2007)" on that webpage I gave last time.  See also Campbell's nice website, that includes a calculator for the N-1 chi-square.

  http://www.iancampbell.co.uk/twobytwo/calculator.htm
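For anyone who wants to see what the N-1 chi-square does, here is a short sketch for a 2x2 table. It is the ordinary Pearson chi-square multiplied by (N - 1)/N, referred to a chi-square distribution with 1 df. The function name and the example counts are mine, not taken from Campbell's site, so treat this as an illustration and not a replacement for his calculator.

```python
from math import erfc, sqrt


def n_minus_1_chi_square(a, b, c, d):
    """N-1 chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column total must be positive")
    statistic = (n - 1) * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, for illustration only.
statistic, p_value = n_minus_1_chi_square(10, 20, 20, 10)
print(round(statistic, 4), round(p_value, 4))
```

As I read Campbell's advice, this version is meant for 2x2 tables where all expected counts are at least 1, with the Fisher-Irwin test kept for tables that fall below that; please check the webpage for the exact wording.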

HTH.

