Hi all,
I have completed a 5x2 contingency table in SPSS which returned a significant chi-square value (P<0.001). My columns are species detected/not detected in surveys and the rows are different habitats (see data below). I expected the proportion of detected to not-detected to differ between habitats. So all good here.

I followed this up with z-tests (under the 'Custom Table' option) which detected significant differences in column proportions for three of the habitats (=rows) (B, C, D). B and C had lower proportions of surveys detecting the species and D had greater than expected. However, the 'reporting rate' of the species (i.e. the number of surveys that detected the species as a percentage of total surveys, which equals the row percentage in the contingency table) was highest in habitat D (20.4% - no surprises there), AND in E (12.8%), which showed no significant difference in proportions. All other row percentages were below 4%.

Furthermore, in a 2x2 table just comparing habitats D and E (which for a priori reasons were the only habitats I expected to have a greater proportion of detected surveys - which they did as row percentages but not in z-tests) there was no difference between them (exact p = 0.079).

I'm confused about how to interpret these results. What exactly does a significant z-test tell me about the proportions of detected/not-detected? Does a significantly greater than expected proportion of detected in habitat X not equate to a significantly lower than expected proportion of not-detected in habitat X? Can I still take note of the fact that in the habitat of row 5 the 'reporting rate' was still high relative to other habitats except D?
The data are as follows (standardised residuals in parentheses):

Habitat        Detected     Not Detected   % detected (row %)
A      Obs     1 (-1.6)     45 (0.5)       2.2
       Exp     4.4          41.6
B      Obs     4 (-2.3)     123 (0.8)      3.1
       Exp     12.2         114.8
C      Obs     3 (-3.4)     175            1.7
       Exp     17.1         160.9
D      Obs     40 (4.9)     156 (-1.6)     20.4
       Exp     18.8         177.2
E      Obs     18 (1.2)     123 (-0.4)     12.8
       Exp     13.5         127.5

Hope anyone can help! :-)

Thanks in advance,
Dean
|
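For readers who want to check where the expected counts and standardised residuals in the table come from, here is a minimal Python sketch. It assumes the usual definitions (expected = row total x column total / grand total, residual = (observed - expected) / sqrt(expected)); the function names are made up for illustration and this is not what SPSS runs internally.

```python
import math

# Observed counts from the table above: habitat -> (detected, not detected)
observed = {
    "A": (1, 45),
    "B": (4, 123),
    "C": (3, 175),
    "D": (40, 156),
    "E": (18, 123),
}


def residual_table(obs):
    """Return {habitat: [(expected, std_residual), ...]}, one pair per column."""
    col_totals = [sum(row[j] for row in obs.values()) for j in range(2)]
    grand_total = sum(col_totals)
    out = {}
    for habitat, row in obs.items():
        row_total = sum(row)
        cells = []
        for j, count in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            # Standardised (Pearson) residual: (O - E) / sqrt(E)
            cells.append((expected, (count - expected) / math.sqrt(expected)))
        out[habitat] = cells
    return out


def pearson_chi_square(obs):
    """Pearson chi-square is the sum of the squared standardised residuals."""
    return sum(r * r for cells in residual_table(obs).values() for _, r in cells)


for habitat, cells in residual_table(observed).items():
    print(habitat, [(round(e, 1), round(r, 1)) for e, r in cells])
print("chi-square:", round(pearson_chi_square(observed), 2))
```

Each residual compares one cell with what the whole table predicts for it, which is why a habitat can have a high row percentage and still a modest residual.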
Yes, the z's for the cells are a remark on the relation to the 'average' difference. So, acacia/euc is not extreme because it falls in the middle. Further, the z for the low proportion for wetland is not as extreme because the N is smaller and thus the power is weaker.

I'm going to call the groups A B C D and E, for the ordered means (1.7, 2.2, 3.1, 12.8, 20.4). The first thing to conclude is that (A B C) are the same and that all the cases are in (D E). Using the z's as tests, (A B C) (D E) is the way that you could show differences, in one style of post-hoc reporting, if that was the full result.

I don't remember if you stated it, but it is possible that D is not "different" from (A B C). In that case, the post-hoc report could be (A B C D) (D E). This would say that D is "not different" from the first three, and it also is not different from the last. Traditionally, these have often been shown by underlining the means that "do not differ."

- This style of report was designed for ANOVA, using tests based on the pooled variance, for groups with equal Ns; it does not work consistently for 2x2 tables, for grossly unequal Ns, or for other paired comparisons that may have different error terms for various comparisons (re: the smallest sample here). But it seems like it might work here.

MOST of the detections are in D and E. The simple difference between them is not (I think you report) tested as different in a 2x2 table. Okay. So, at the conventional test level, they do not differ. However, the "effect size" of their measured difference is still somewhat large. It also is "sensible" in that the mixed environment is less extreme than the pure one. So you might expect a difference to be confirmed if the sampling was more extensive.

--
Rich Ulrich

From: [hidden email]
To: [hidden email]; [hidden email]
Subject: RE: Interpreting Contingency table analysis & z-tests results - PLEASE HELP!
Date: Fri, 27 May 2011 18:43:24 +1000

Thanks very much for your reply Rich,

I'm still a little confused as to how to interpret the z-tests - do they indicate which rows (habitats in my case) have the greatest difference between the numbers of the categories of the columns (no. of surveys that detected/didn't detect in my case)? If so, is this relative to the 'average' difference between column categories across all rows?

...

But I had interpreted the z-tests as indicating that Acacia was contributing to the significance of the chi-square but not Acacia/Euc - despite 12.8% being >4 times the three other habitats (1.7-3.1%), and in a 2x2 table Acacia vs Acacia/Euc was not significant. In summary, the species is detected significantly more frequently in two habitats (Acacia + Acacia/Euc) and these are frequented either equally OR Acacia more so than Acacia/Euc. It is the latter part that is troubling me (hence the follow up 2x2 table).
|
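The 2x2 comparison of the two high-detection habitats (40 of 196 surveys versus 18 of 141) can be sketched in Python as below. This is an illustrative sketch, not SPSS's implementation: it assumes a pooled two-proportion z statistic without continuity correction, and the common two-sided Fisher convention of summing the probabilities of all tables that are no more likely than the observed one. The function names are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(k):
        return math.comb(row1, k) * math.comb(row2, col1 - k) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Habitat with 20.4% detection vs habitat with 12.8% detection
z = two_proportion_z(40, 196, 18, 141)
p_exact = fisher_exact_two_sided(40, 156, 18, 123)
print("z =", round(z, 2), " exact two-sided p =", round(p_exact, 3))
```

The gap between 20.4% and 12.8% is sizeable, but with these sample sizes the test statistic stays just short of the conventional 0.05 cutoff, which fits the point above that more extensive sampling might confirm the difference.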
Administrator
|
In reply to this post by DP_Sydney
I've never used the z-tests you refer to, so gave it a try as follows:
* Each data row is: habitat code, detected code (1 = yes, 2 = no), cell count.
data list list / habitat detected kount (3f5.0).
begin data
1 1 1
1 2 45
2 1 4
2 2 123
3 1 3
3 2 175
4 1 40
4 2 156
5 1 18
5 2 123
end data.
var lab habitat "Habitat" detected "Detected" .
val lab habitat 1 "A" 2 "B" 3 "C" 4 "D" 5 "E" / detected 1 "Yes" 2 "No" .
* Weight by the cell counts so that each row stands for that many surveys.
weight by kount.
* Custom Tables.
CTABLES
  /VLABELS VARIABLES=habitat detected DISPLAY=DEFAULT
  /TABLE habitat [COUNT F40.0] BY detected
  /CATEGORIES VARIABLES=habitat detected ORDER=A KEY=VALUE EMPTY=INCLUDE
  /SIGTEST TYPE=CHISQUARE ALPHA=0.05 INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE
  /COMPARETEST TYPE=PROP ALPHA=0.05 ADJUST=BONFERRONI ORIGIN=COLUMN INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE MERGE=NO.

Does this generate the same z-test results you have? For those who cannot run the syntax, the z-test output looks like this:

Comparisons of Column Proportions^a
Yes No A B A C A D B E
Results are based on two-sided tests with significance level 0.05. For each significant pair, the key of the category with the smaller column proportion appears under the category with the larger column proportion.
a. Tests are adjusted for all pairwise comparisons within a row of each innermost subtable using the Bonferroni correction.

It may be a Friday afternoon thing, but it is not immediately clear to me how to read those results. Here are the column percentages, by the way:

Habitat   Yes       No
A         1.52%     7.23%
B         6.06%     19.77%
C         4.55%     28.14%
D         60.61%    25.08%
E         27.27%    19.77%

If you do have specific a priori contrasts in mind, you might be better off partitioning the overall table in a way that addresses those questions. I have some examples of that in a chapter of notes on chi-square analysis -- item 3 here: https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes.

Notice that for this approach, the likelihood ratio chi-square works out better than Pearson's statistic, because orthogonal components that should add up to a whole DO add up to the whole.

HTH.
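To illustrate the additivity of the likelihood ratio chi-square under partitioning, here is a small Python sketch. The particular partition is an assumption chosen for illustration (habitats A, B, C against D, E, following the a priori expectation stated in the original question); it is not necessarily the partition used in the linked notes, and the helper names are hypothetical.

```python
import math


def g_squared(table):
    """Likelihood-ratio chi-square, G^2 = 2 * sum(O * ln(O / E)), for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                total += obs * math.log(obs / expected)
    return 2 * total


def collapse(rows):
    """Add several habitat rows together into one row of column sums."""
    return [sum(col) for col in zip(*rows)]


# Rows are [detected, not detected]
low = [[1, 45], [4, 123], [3, 175]]   # habitats A, B, C
high = [[40, 156], [18, 123]]         # habitats D, E
full = low + high

within_low = g_squared(low)                               # 3x2 table, 2 df
within_high = g_squared(high)                             # 2x2 table, 1 df
between = g_squared([collapse(low), collapse(high)])      # 2x2 table, 1 df

print("full table G^2:   ", round(g_squared(full), 3))
print("sum of components:", round(within_low + within_high + between, 3))
```

The three components use 2 + 1 + 1 = 4 degrees of freedom, the same as the full 5x2 table, and their G^2 values sum to the overall G^2. The same partition computed with Pearson's statistic would generally not add up exactly.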
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by Rich Ulrich
Thanks Rich!
I think I'm beginning to understand these z-tests. Let me test whether this is the case and whether the following is a correct interpretation: from the results it can be concluded that detection in Euc (your A) and Grass/Forb (your C) is significantly lower than in Acacia/Euc (your D) and Acacia (your E). Furthermore, the overall test suggests that detection in Acacia (your E) is significantly greater than in Acacia/Euc (your D), but this difference does not bear out when partitioning the contingency table into a 2x2 table with these two categories. Last, the small sample size for Wetland (your B) makes it difficult to determine if the detection in this habitat differs from the others (but nevertheless the detection percentage is as low as in your 'A' and 'C'). Have I got it right?

I wonder if you could clarify a couple of points from your email.

"I'm going to call the groups A B C D and E, for the ordered means (1.7, 2.2, 3.1, 12.8, 20.4). The first thing to conclude is that (A B C) are the same and that all the cases are in (D E)".

What did you mean by "...all the cases are in (D E)"? Does this mean that the majority of detection cases are in D and E?

"Using the z's as tests, (A B C) (D E) is the way that you could show differences, in one style of post-hoc reporting..."

How is it that D is not different to E, and both are different to A/B/C (as opposed to A/B/C/D vs E)? Is it because of the greater similarity between D and E relative to D and A/B/C?

Thank you very much for all your help with this - it is making the haze seem clearer!

Cheers,
Dean
Date: Fri, 27 May 2011 11:13:40 -0700
From: [hidden email]
To: [hidden email]
Subject: Re: Interpreting Contingency table analysis & z-tests results - PLEASE HELP!
|
In reply to this post by Bruce Weaver
Thanks Bruce.
I did get the same result (see earlier post) with the Custom Table. I have printed out your notes and will read them over the next couple of days and then digest the material!
Hopefully I'm getting there.

Date: Fri, 27 May 2011 13:37:10 -0700
From: [hidden email]
To: [hidden email]
Subject: Re: Interpreting Contingency table analysis & z-tests results - PLEASE HELP!
|