Thank you all for your suggestions. Sorry for not being fully clear about the data involved in my question.
I sent the questionnaire to the entire population (N=12000) and received only n=400 responses. Respondents resemble the population in terms of all known characteristics, but - obviously - they might differ on some variables for which I couldn't or didn't control.

Given this new information, do I have any chance of getting published with some statistical analysis of my data (e.g. ANOVA, maybe using the bootstrap - perhaps even bootstrapping variables instead of full observations), or should I trash them?

Nicola

-----Original Message-----
I have a population of N=12000. I want to know the mean (and possibly the standard deviation) of a variable x, bounded between 1 and 7. I took a (let's suppose random) sample of n=400 and estimated mean = 3.14 (standard error = .15) and standard deviation = 2.28. The sample strongly departs from normality. Can I trust such estimates? How much? Can I attach a p-value to them? I need to state formally that, despite a ridiculous response rate, my research is not that bad.
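One way to see how little a 400/12000 response rate pins down, before any modeling assumptions: since x is bounded between 1 and 7, you can compute worst-case bounds on the population mean by assuming all nonrespondents sit at one extreme. This is a standard worst-case (Manski-style) bound, not something from the thread; the numbers are those quoted above.

```python
# Worst-case bounds on the population mean under nonresponse:
# pretend every nonrespondent answered at the scale minimum (1)
# or at the scale maximum (7).  Figures are from the post above.
N = 12000          # population size
n = 400            # respondents
xbar = 3.14        # respondent mean
lo, hi = 1.0, 7.0  # bounds of the scale

lower = (n * xbar + (N - n) * lo) / N
upper = (n * xbar + (N - n) * hi) / N
print(f"population mean lies somewhere in [{lower:.2f}, {upper:.2f}]")
# -> population mean lies somewhere in [1.07, 6.87]
```

With 97% of the population unobserved, the assumption-free interval covers almost the whole 1-7 scale, which is the quantitative version of the caution urged in the replies below.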
At 08:04 AM 12/10/2006, Nicola Baldini wrote:
>I sent the questionnaire to the entire population (N=12000). I received only n=400 responses. Respondents resemble the population in terms of all known characteristics, but - obviously - they might differ on some variables for which I couldn't/didn't control.
>
>Have I chances to get published with some statistical analysis on my data (e.g. anova, maybe using bootstrap --- maybe too bootstrapping variables instead of full observations) or should I trash them?

Whatever is to be done, can't be done by analytic techniques. You need to argue that your results are meaningful, and that you can make sense of what they mean. At best, with this response rate, the interpretation must include words like 'tentative' or 'preliminary'.

>Respondents might differ [from the population] on some variables for which I couldn't/didn't control.

Yes, and on one very big one: their answers to the question you're asking. I'm not a survey analyst, but I believe it's well known that the likelihood that people will answer a question varies with their answers. For example, it's my understanding that:

. If it's a question of satisfaction, the most dissatisfied are most likely to respond, and the mid-scale ones the least.

. People are less likely to respond, the less they respect and trust whoever is asking them. (And your response rate doesn't suggest a lot of respect and trust, overall.) 'Respect and trust' for the questioner are probably related to the responses on your question.

That's just to start with. I could probably think of more; participants with extensive experience with surveys will be able to think of many more.

You haven't said anything about your survey. If it's really just one or a few questions, you simply don't have much information. If it's a survey of some richness, there may be patterns that stand out. You won't be able to say much about the prevalence of those patterns, but sometimes showing their existence is informative.
(Of course, you can estimate a MINIMUM incidence, on the assumption that none of the 97% non-respondents match the pattern.)

Your questions are beyond what can be addressed with statistical methodology. I think methodology, by itself, can say only what we have said: your results must be viewed with deep caution. There's a heavy burden of proof on you, to argue why they shouldn't be ignored.

Beyond that, we come to questions of meaning, and that's for subject specialists, not methodologists.
In reply to this post by nicola.baldini2
Stephen Brand
www.statisticsdoc.com

Nicola,

I would be very concerned about the low response rate - in particular, that the sample is not representative of the larger population. How does it compare with other studies in your area of investigation? How were responses elicited? How would you account for the response rate?

HTH,

Stephen Brand

For personalized and professional consultation in statistics and research design, visit www.statisticsdoc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nicola Baldini
Sent: Sunday, December 10, 2006 8:05 AM
To: [hidden email]
Subject: Re: guessing mean of bounded variable with 1:30 sampling ratio (4)
In reply to this post by Richard Ristow
The 12-item scale I used is the same as in another survey of mine (with a 40% response rate), and a PCA gives the same results for both of them (the same items load on the same factors). Would this give you more confidence in my data, despite the 1:30 response rate?
Responses were elicited by e-mail (two reminders) in the less successful case, and by e-mail (two reminders) plus a phone call in the other one (the sample with the 40% response rate counted only 600 persons; it was much easier to phone them begging them to complete my survey).

At 14.30 12/12/2006 -0500, Richard Ristow wrote:
>Whatever is to be done, can't be done by analytic techniques. You need to argue that your results are meaningful, and you can make sense of what they mean. At the best, with this response rate, the interpretation must include words like 'tentative' or 'preliminary'.
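One common way to put a number on "the same items load on the same factors" across two surveys - not mentioned in the thread, but standard in factor-replication work - is Tucker's congruence coefficient between corresponding loading columns. A sketch with made-up loadings:

```python
import numpy as np

def tucker_congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Illustrative (invented) loadings for one factor in each survey;
# values near 1 (conventionally >= .95) suggest the same factor.
survey_a = [0.71, 0.68, 0.74, 0.65]
survey_b = [0.69, 0.70, 0.72, 0.66]
print(round(tucker_congruence(survey_a, survey_b), 3))
```

A high congruence supports the claim that the instrument behaves the same way in both samples, though, as the replies note, it says nothing about whether the mean estimate itself is biased by nonresponse.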
Stephen Brand
www.statisticsdoc.com

Nicola,

Having a consistent factor structure is good, but it still leaves me concerned about your original question concerning estimation of the mean. With such a small response rate, it is entirely possible that your estimate of the population standard deviation (and variance) could be very biased.

I am not sure why you tried to sample 12,000 people. As a general rule, it is far better to have a smaller but representative sample than a sample that has a larger n but a lower response rate (and hence more risk of bias). It is easier to get a higher response rate, and a more representative sample, from a smaller group of potential participants (as you did in your first study).

I do not want to be presumptuous, but in the future you may want to start by sending out a smaller number of surveys (even just 300) to a random sample, and trying very hard to get a high response rate from that group. A rule of thumb in mail surveys is that you can get 35-40% on the first mailing alone, and then get up over 65% with follow-up mailings and calls (more if you have incentives). However, this rule of thumb assumes that you start with a realistic sample size - one where you can afford the time and resources for additional mail and phone contacts.

If you sent the survey out by e-mail, I wonder how many potential participants had spam filters that sent the e-mail to the trash, sight unseen?

Anyway, I do not wish to be presumptuous. There may be reasons for going after a large sample that apply to your topic. Treat this as "rule of thumb" advice.
HTH,

Stephen Brand

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nicola Baldini
Sent: Wednesday, December 13, 2006 9:45 AM
To: [hidden email]
Subject: Re: guessing mean of bounded variable with 1:30 sampling ratio
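Stephen's rule-of-thumb advice about preferring a small, well-chased sample can be made concrete with the standard sample-size formula plus the finite population correction. This is a textbook calculation, not from the thread; the standard deviation comes from the original post, while the margin of error and confidence level are illustrative choices:

```python
import math

# Required n to estimate a mean to within margin e at 95% confidence,
# with finite population correction (FPC).  sd is from the post;
# e = 0.25 scale points and z = 1.96 are illustrative assumptions.
N = 12000     # population
sd = 2.28     # estimated standard deviation from the post
e = 0.25      # desired margin of error (assumption)
z = 1.96      # 95% confidence

n0 = (z * sd / e) ** 2     # infinite-population sample size
n = n0 / (1 + n0 / N)      # finite population correction
print(math.ceil(n))        # about 312 completed responses needed
```

Roughly 300 completed responses would already bound the mean to a quarter of a scale point, which is why chasing a high response rate from a small random subsample can beat mailing all 12,000.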