Re: Chi-square problem - large number of cells and significance test validity
Posted by Bruce Weaver on Mar 21, 2011; 1:07pm
URL: http://spssx-discussion.165.s1.nabble.com/Chi-suqare-problem-large-cells-number-and-significant-test-validity-tp4210454p4227294.html
For tables larger than 2x2, the chi-square approximation is still pretty good so long as all expected counts are greater than 1, and no more than 20% are less than 5.
https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes/chisqr_assumptions
In the event that condition is not met, you could use the "exact test" option if you have that module.
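As a rough illustration of the rule of thumb above, here is a minimal Python sketch (not SPSS syntax) that computes the expected counts for a table and checks both conditions. The function names and the example 3x3 table are made up for illustration; they do not come from the original data.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_expected_counts(observed, min_count=1.0, small=5.0, max_share=0.20):
    """Apply the rule of thumb quoted above: all expected counts greater
    than min_count, and no more than max_share of them below small."""
    expected = [e for row in expected_counts(observed) for e in row]
    share_small = sum(e < small for e in expected) / len(expected)
    return {
        "min_expected": min(expected),
        "share_below_5": share_small,
        "ok": min(expected) > min_count and share_small <= max_share,
    }


# Hypothetical 3x3 table of observed counts (illustration only).
table = [
    [12, 5, 3],
    [8, 10, 2],
    [4, 6, 10],
]
print(check_expected_counts(table))
```

For the 20 x 6 table in the question, no computation is needed: with N = 150 spread over 120 cells, the mean expected count is 150 / 120 = 1.25, so at most 30 cells could have an expected count of 5 or more and the condition cannot be met.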
For the 3x3 table, looking at the standardized residuals is one option. Alternatively, you can partition the overall table into orthogonal components -- I have an example of this in my notes on categorical data (chapter 3 at the link below). And finally, I understand that multiple comparisons via z-tests have been added to CROSSTABS in v19. (I'm still on v18, so have not seen them firsthand.)
https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes -- see chapter 3
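To make the residual approach concrete, here is a minimal Python sketch (not SPSS syntax) of the adjusted standardized residuals the question asks about, computed as (observed - expected) / sqrt(expected * (1 - row total / N) * (1 - column total / N)). The function names and the 3x3 table are hypothetical, and the cutoff of 2 is only the rough screening value mentioned in the question, with no correction for looking at several cells at once.

```python
import math


def adjusted_residuals(observed):
    """Adjusted standardized residual for every cell of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            # Estimated standard error of (obs - exp) under independence.
            se = math.sqrt(
                exp * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((obs - exp) / se)
        residuals.append(out_row)
    return residuals


def flag_cells(observed, cutoff=2.0):
    """Return (row, column, residual) for cells with |residual| > cutoff."""
    return [
        (i, j, res)
        for i, row in enumerate(adjusted_residuals(observed))
        for j, res in enumerate(row)
        if abs(res) > cutoff
    ]


# Hypothetical 3x3 table of observed counts (illustration only).
table = [
    [12, 5, 3],
    [8, 10, 2],
    [4, 6, 10],
]
for i, j, res in flag_cells(table):
    print(f"row {i}, column {j}: adjusted residual = {res:.2f}")
```

Positive values mark cells with more cases than independence predicts, negative values mark cells with fewer.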
HTH.
chengfoh wrote
Hi everyone:
I am not a very experienced SPSS user and am currently running a chi-square test on my data to assess association. Here is my case:
Rows x columns = 20 x 6 (different groups; I have tried hard, but there is no way to collapse them)
Variables: All are categorical (nominal)
Test performed = chi-square test of independence
Total sample size = 150
Because of the large number of cells (rows x columns), the chi-square test cannot give a valid p-value: too many cells have an expected frequency below 5, which violates one of the test's assumptions. In this case:
a) Since the p-value is not useful, can I pick the cells with the highest percentages and discuss the results using descriptive statistics only? Would that be acceptable for the write-up?
b) In another data set I have a small number of cells (3x3) and the p-value is significant (p<0.05). To identify which cells are driving the association, should I look for adjusted standardized residuals > 2 or < -2 to pick out the cells that are actually significant?
I really need help from those of you who are familiar with this.
Many thanks.
Foh
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).