Thank you all for your suggestions. Sorry for not being fully clear about the data involved in my question.
I sent the questionnaire to the entire population (N=12000) and received only n=400 responses. Respondents resemble the population in terms of all known characteristics, but - obviously - they might differ on some variables for which I couldn't or didn't control.

Given this new information, do I have any chance of getting published with some statistical analysis of my data (e.g. ANOVA, maybe using the bootstrap - perhaps even bootstrapping variables instead of full observations), or should I trash them?

Nicola

-----Original Message-----
I have a population of N=12000. I want to know the mean (and possibly the standard deviation) of a variable x, bounded between 1 and 7. I took a (let's suppose random) sample of n=400 and estimated mean = 3.14 (standard error = .15) and standard deviation = 2.28. The sample strongly departs from normality. Can I trust such estimates? How much? Can I attach a p-value to them? I need to state formally that, despite a ridiculous response rate, my research is not that bad.
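One way to see how little a 400/12000 response rate pins down, before any modeling assumptions: since x is bounded between 1 and 7, you can compute worst-case bounds on the population mean by assuming all nonrespondents sit at one extreme. This is a standard worst-case (Manski-style) bound, not something from the thread; the numbers are those quoted above.

```python
# Worst-case bounds on the population mean under nonresponse:
# pretend every nonrespondent answered at the scale minimum (1)
# or at the scale maximum (7).  Figures are from the post above.
N = 12000          # population size
n = 400            # respondents
xbar = 3.14        # respondent mean
lo, hi = 1.0, 7.0  # bounds of the scale

lower = (n * xbar + (N - n) * lo) / N
upper = (n * xbar + (N - n) * hi) / N
print(f"population mean lies somewhere in [{lower:.2f}, {upper:.2f}]")
# -> population mean lies somewhere in [1.07, 6.87]
```

With 97% of the population unobserved, the assumption-free interval covers almost the whole 1-7 scale, which is the quantitative version of the caution urged in the replies below.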
At 08:04 AM 12/10/2006, Nicola Baldini wrote:
>I sent the questionnaire to the entire population (N=12000). I received only n=400 responses. Respondents resemble the population in terms of all known characteristics, but - obviously - they might differ on some variables for which I couldn't/didn't control.
>
>Have I chances to get published with some statistical analysis on my data (e.g. anova, maybe using bootstrap --- maybe too bootstrapping variables instead of full observations) or should I trash them?

Whatever is to be done, can't be done by analytic techniques. You need to argue that your results are meaningful, and that you can make sense of what they mean. At best, with this response rate, the interpretation must include words like 'tentative' or 'preliminary'.

>Respondents might differ [from the population] on some variables for which I couldn't/didn't control.

Yes, and on one very big one: their answers to the question you're asking. I'm not a survey analyst, but I believe it's well known that the likelihood that people will answer a question varies with their answers. For example, it's my understanding that:

. If it's a question of satisfaction, the most dissatisfied are most likely to respond, and the mid-scale ones the least.

. People are less likely to respond, the less they respect and trust whoever is asking them. (And your response rate doesn't suggest a lot of respect and trust, overall.) 'Respect and trust' for the questioner are probably related to the responses on your question.

That's just to start with. I could probably think of more; participants with extensive experience with surveys will be able to think of many more.

You haven't said anything about your survey. If it's really just one or a few questions, you simply don't have much information. If it's a survey of some richness, there may be patterns that stand out. You won't be able to say much about the prevalence of those patterns, but sometimes showing their existence is informative.
(Of course, you can estimate a MINIMUM incidence, on the assumption that none of the 97% non-respondents match the pattern.)

Your questions are beyond what can be addressed with statistical methodology. I think methodology, by itself, can say only what we have said: your results must be viewed with deep caution. There's a heavy burden of proof on you, to argue why they shouldn't be ignored.

Beyond that, we come to questions of meaning, and that's for subject specialists, not methodologists.
In reply to this post by nicola.baldini2
Stephen Brand
www.statisticsdoc.com

Nicola,

I would be very concerned about the low response rate - in particular, that the sample is not representative of the larger population. How does it compare with other studies in your area of investigation? How were responses elicited? How would you account for the response rate?

HTH,

Stephen Brand

For personalized and professional consultation in statistics and research design, visit www.statisticsdoc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nicola Baldini
Sent: Sunday, December 10, 2006 8:05 AM
To: [hidden email]
Subject: Re: guessing mean of bounded variable with 1:30 sampling ratio (4)
In reply to this post by Richard Ristow
The 12-item scale I used is the same as in another survey of mine (with a 40% response rate), and a PCA gives the same results for both of them (the same items load on the same factors). Would this give you more confidence in my data, despite the 1:30 response rate?
Responses were elicited by e-mail (two reminders) in the less successful case, and by e-mail (two reminders) plus a phone call in the other one (the sample with the 40% response rate counted only 600 persons; it was much easier to phone them begging them to complete my survey).

At 14.30 12/12/2006 -0500, Richard Ristow wrote:
>Whatever is to be done, can't be done by analytic techniques. You need to argue that your results are meaningful, and you can make sense of what they mean. At the best, with this response rate, the interpretation must include words like 'tentative' or 'preliminary'.
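One common way to put a number on "the same items load on the same factors" across two surveys - not mentioned in the thread, but standard in factor-replication work - is Tucker's congruence coefficient between corresponding loading columns. A sketch with made-up loadings:

```python
import numpy as np

def tucker_congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Illustrative (invented) loadings for one factor in each survey;
# values near 1 (conventionally >= .95) suggest the same factor.
survey_a = [0.71, 0.68, 0.74, 0.65]
survey_b = [0.69, 0.70, 0.72, 0.66]
print(round(tucker_congruence(survey_a, survey_b), 3))
```

A high congruence supports the claim that the instrument behaves the same way in both samples, though, as the replies note, it says nothing about whether the mean estimate itself is biased by nonresponse.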
Stephen Brand
www.statisticsdoc.com

Nicola,

Having a consistent factor structure is good, but it still leaves me concerned about your original question concerning estimation of the mean. With such a small response rate, it is entirely possible that your estimate of the population standard deviation (and variance) could be very biased.

I am not sure why you tried to sample 12,000 people. As a general rule, it is far better to have a smaller but representative sample than a sample that has a larger n but a lower response rate (and hence more risk of bias). It is easier to get a higher response rate, and a more representative sample, from a smaller group of potential participants (as you did in your first study).

I do not want to be presumptuous, but in the future you may want to start by sending out a smaller number of surveys (even just 300) to a random sample, and trying very hard to get a high response rate from that group. A rule of thumb in mail surveys is that you can get 35-40% on the first mailing alone, and then get up over 65% with follow-up mailings and calls (more if you have incentives). However, this rule of thumb assumes that you start with a realistic sample size - one where you can afford the time and resources for additional mail and phone contacts.

If you sent the survey out by e-mail, I wonder how many potential participants had spam filters that sent the e-mail to the trash, sight unseen?

Anyway, I do not wish to be presumptuous. There may be reasons for going after a large sample that apply to your topic. Treat this as "rule of thumb" advice.
HTH,

Stephen Brand

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nicola Baldini
Sent: Wednesday, December 13, 2006 9:45 AM
To: [hidden email]
Subject: Re: guessing mean of bounded variable with 1:30 sampling ratio
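Stephen's rule-of-thumb advice about preferring a small, well-chased sample can be made concrete with the standard sample-size formula plus the finite population correction. This is a textbook calculation, not from the thread; the standard deviation comes from the original post, while the margin of error and confidence level are illustrative choices:

```python
import math

# Required n to estimate a mean to within margin e at 95% confidence,
# with finite population correction (FPC).  sd is from the post;
# e = 0.25 scale points and z = 1.96 are illustrative assumptions.
N = 12000     # population
sd = 2.28     # estimated standard deviation from the post
e = 0.25      # desired margin of error (assumption)
z = 1.96      # 95% confidence

n0 = (z * sd / e) ** 2     # infinite-population sample size
n = n0 / (1 + n0 / N)      # finite population correction
print(math.ceil(n))        # about 312 completed responses needed
```

Roughly 300 completed responses would already bound the mean to a quarter of a scale point, which is why chasing a high response rate from a small random subsample can beat mailing all 12,000.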