Dear Listserv members:
A colleague is interested in examining the relationship between two sets of variables (dependent variables and independent variables). Additionally, she would like to investigate whether the relationship varies between men and women. The total sample size is moderate (about 200 participants), with 70% being women. What might be a good approach to use when analyzing these data in light of the objectives? Any suggestions are most appreciated.
Sincerely, Susan Sereika
You state your friend's goals in a very sketchy way, so it is very difficult to give an opinion. For instance, how many variables are involved? 200 cases may be far too few if the variables happen to be even moderately numerous.

Is he/she interested in bivariate or multivariate relations between these variables? For instance, one may be interested in crossing pairs of variables, such as X BY Z BY sex, to see whether the association/correlation of X and Z varies with sex. This may be feasible with 200 cases (140 women, 60 men), but with K variables there are K*(K-1)/2 pairs of variables, which rapidly goes into the hundreds or the thousands as K grows. For K=50, there are 1225 pairs of variables to consider. If one is interested in models involving many variables, such as regression, the number of possible models grows exponentially and, besides, the small number of cases in the sample rapidly becomes a limitation.

Another consideration is whether your friend has any theory, conceptual approach, or problem-oriented goal when facing these data, or is just exploring blindly. What is he/she looking for? Just mining for any kind of non-random-looking pattern, like an astronomer searching for signs of extra-terrestrial intelligence among random electromagnetic cosmic noise, or like John Nash, he of the beautiful mind, parsing newspapers in the worst of his madness? In a sample of 200 she/he may find many promising patterns, but they may be nothing but sample flukes.

Hector
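The pair-count arithmetic in the reply above, and the idea of checking whether a correlation differs between men and women, can be sketched in a few lines of Python. This is only an illustration, not something proposed in the thread: the function names are hypothetical, the 0.05 significance level is an assumed conventional choice, and the comparison of correlations uses Fisher's r-to-z transform, which is one standard option for two independent groups and relies on a normal approximation.

```python
import math
from typing import Tuple


def count_pairs(k: int) -> int:
    """Number of distinct variable pairs among k variables: k*(k-1)/2."""
    return k * (k - 1) // 2


def expected_chance_hits(k: int, alpha: float = 0.05) -> float:
    """Expected number of pairs flagged at level alpha when no pair is truly associated."""
    return alpha * count_pairs(k)


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> Tuple[float, float]:
    """Compare two correlations from independent groups via Fisher's r-to-z transform.

    Returns the z statistic and its two-sided p-value (normal approximation).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal tail area
    return z, p


if __name__ == "__main__":
    print(count_pairs(50))  # 1225, as stated in the reply
    print(f"{expected_chance_hits(50):.0f} pairs expected to look significant by chance")
    # Hypothetical correlations of X and Z: 0.50 among 140 women, 0.20 among 60 men.
    z, p = fisher_z_difference(0.50, 140, 0.20, 60)
    print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With 50 variables and a 0.05 threshold, roughly 61 of the 1225 pairs would be expected to look significant even if none of them were truly related, which is the "sample flukes" problem in numbers.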
In reply to this post by Susan M. Sereika
It is all the same. You may use, for instance, factor analysis to derive a scale representing the 18 variables of one of the sets, one scale for women and a similar scale for men, but the scale for men will be built with data from a sample of about 60 men, i.e. about 3.3 cases per variable, and that is hardly enough for statistically reliable results. Take any other sample of 60 men from the same population and the results are likely to be completely different. The margin of error in the factor loadings and the regression coefficients will be very wide, especially if some of the variables are not very highly correlated (r>0.90 or r>0.95) with some of the others.

Data, as the old saying goes, can always be tortured till they confess. But you had better not. There are better ways to get to the truth.

Hector

-----Original Message-----
From: Susan M. Sereika [mailto:[hidden email]]
Sent: Tuesday, August 22, 2006 1:45 PM
To: 'Hector Maletta'
Subject: RE: Relationship between Sets of Dependent and Independent Variables

Dear Hector:
I agree that sample size is problematic. Is there anything that can be salvaged from this? Would it be reasonable to present the work as exploratory? Or perhaps to apply principal components analysis to obtain a smaller number of derived variables and conduct the analyses with these derived variables using regression analysis? Thank you very much for your thoughts on this.
Sincerely, Susan

-----Original Message-----
From: Hector Maletta [mailto:[hidden email]]
Sent: Tuesday, August 22, 2006 12:09 PM
To: 'Susan M. Sereika'
Subject: RE: Relationship between Sets of Dependent and Independent Variables

Now the situation is clearer, and the answer more definitely negative. The 36 variables are far too many for just 200 cases (fewer than 6 per variable), let alone for 30% of them, i.e. for about 60 men (which is about 1.7 cases per variable, when the old rule of thumb, now discredited as insufficient, was at least 10; nowadays far more than 10 is usually required, depending on the variance of the variables and the strength of the relationship).

Hector

-----Original Message-----
From: Susan M. Sereika [mailto:[hidden email]]
Sent: Tuesday, August 22, 2006 12:54 PM
To: 'Hector Maletta'
Subject: RE: Relationship between Sets of Dependent and Independent Variables

Dear Hector:
Thank you for your very quick and thoughtful reply. The investigation is, for the most part, theoretically driven with respect to the relationship between the two sets of variables. The idea that the relationship may vary by gender/sex is a little more exploratory, although there is some literature to support some relationships. Each set of dependent and independent variables consists of 18 variables, and the variables are subscale scores believed to measure two concepts: beliefs about depression (18 variables) and coping (18 variables). The initial investigation focused on just examining the relationships between the two variable sets and, given the complexity of the data, canonical correlation analysis (CCA) was used. Then the investigation was expanded to consider differences between men and women, and a CCA was conducted within each gender subsample. The smaller subsample sizes are problematic, especially for the male subsample.
Sincerely, Susan
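The cases-per-variable ratios discussed in this exchange can be recomputed with a short Python sketch. It assumes exactly 200 participants with a 70/30 split (the thread says "about 200"), and the helper names are hypothetical. The 10-cases-per-variable figure is the old rule of thumb mentioned in the thread, which the reply itself calls insufficient, so the resulting minimum should be read as a lower bound at best.

```python
def cases_per_variable(n_cases: int, n_variables: int) -> float:
    """Ratio of cases to variables for a given (sub)sample."""
    return n_cases / n_variables


def min_cases_old_rule(n_variables: int, per_variable: int = 10) -> int:
    """Sample size implied by the old 'at least 10 cases per variable' rule of thumb."""
    return n_variables * per_variable


# Assumed figures: 200 participants, 70% women, two sets of 18 subscale scores.
n_total = 200
n_women = round(0.70 * n_total)
n_men = n_total - n_women

if __name__ == "__main__":
    for label, n in (("all", n_total), ("women", n_women), ("men", n_men)):
        for n_vars in (18, 36):
            ratio = cases_per_variable(n, n_vars)
            print(f"{label:>5}: {n:3d} cases / {n_vars} variables = {ratio:.2f} per variable")
    print("Old rule of thumb for 36 variables:", min_cases_old_rule(36), "cases")
```

Even the full sample falls well short of what the old rule would ask for with 36 variables, and the male subsample is far below it for either one set or both.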