Margin of error redux


Margin of error redux

Arthur Kramer

This problem has been vexing me for a while.  When I read in newspapers about surveys, they report a margin of error for the entire survey. I have looked for ways those margins of error could have been calculated, but the only formula I find for a 95% margin of error is this:

 

Sqrt((p)(1-p)/N)*1.96 

 

But to my knowledge, that only pertains to the margin of error of an item on the survey, and N is the number of respondents to the survey.  Can the same formula be used to assess the margin of error of the survey respondents based on the number of people surveyed, as stated above?

 

Thanks for any and all assistance.

 

Arthur Kramer

 

 


Re: Margin of error redux

Art Kendall
See my second reply to the OP.

Art

On 6/13/2011 11:05 AM, Arthur Kramer wrote:

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants

Re: Margin of error redux

Art Kendall
I am sorry, I do not have time to work this out right now.
If you run the syntax below, you'll see the Poisson confidence intervals that SPSS produces without an fpc.
With a finite population correction, the limits should be a little narrower.
In working it out: with the Poisson distribution, the proportion equals its "variance," but I do not recall whether that is the ordinary variance or the sampling variance.

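* Small sample: 1 and 2 hits out of 28 respondents, population of 500.
* popsize is read in but not used below, so no fpc is applied.
New file.
data list list /study hits samplesize popsize (f1,3f5).
begin data
2 1 28 500
3 2 28 500
end data.

* 95% confidence intervals (ALPHA=.05) for hits/samplesize.
PROPOR NUM=hits DENOM=samplesize /LEVEL ALPHA=.05.

* Larger sample: 16 and 17 hits out of 280 respondents, same population.
New file.
data list list /study hits samplesize popsize (f1,3f5).
begin data
2 16 280 500
3 17 280 500
end data.

* 95% confidence intervals (ALPHA=.05) for hits/samplesize.
PROPOR NUM=hits DENOM=samplesize /LEVEL ALPHA=.05.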
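
To get a feel for how much narrower the limits might become, here is a minimal Python sketch of the usual finite population correction factor, sqrt((N-n)/(N-1)), for the two sample sizes in the data above. It assumes the factor is simply multiplied into a standard error; the function name is made up for illustration, and this is not necessarily how PROPOR or any other SPSS procedure would apply a correction.

```python
import math

def fpc_factor(n: int, N: int) -> float:
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

# Sample sizes and population size taken from the two data sets above.
small = fpc_factor(28, 500)
large = fpc_factor(280, 500)

print(f"n=28:  standard error is multiplied by {small:.3f}")
print(f"n=280: standard error is multiplied by {large:.3f}")
```

Under that assumption the correction barely matters when 28 of 500 are sampled, but it shrinks the standard error by roughly a third when 280 of 500 are sampled.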



Art Kendall
Social Research Consultants

On 6/13/2011 2:46 PM, Arthur Kramer wrote:

So, then, using the fpc when the returned sample is 6% (n=28) and the population is N=500, the standard error for the proportion is:

S.E. = sqrt((p*(1-p))/n) * sqrt((N-n)/(N-1))

S.E. = sqrt((.06*.94)/28) * sqrt((500-28)/499) = .044

So, at 95% confidence (1.96 * S.E.), I have a margin of error of about +/- 8.6%?

 

Arthur Kramer
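
A quick way to check this arithmetic is to evaluate each piece separately. The sketch below plugs in the values quoted in the message (p = .06, n = 28, N = 500) and assumes the normal-approximation formula with the fpc; the variable names are made up for illustration. With n*p below 2, the normal approximation is rough here, so the result is only a ballpark figure.

```python
import math

p, n, N = 0.06, 28, 500  # values quoted in the message above

se_srs = math.sqrt(p * (1 - p) / n)   # standard error without the fpc
fpc = math.sqrt((N - n) / (N - 1))    # finite population correction factor
se = se_srs * fpc                     # standard error with the fpc
margin_95 = 1.96 * se                 # 95% margin of error

print(f"SE without fpc: {se_srs:.4f}")
print(f"fpc factor:     {fpc:.4f}")
print(f"SE with fpc:    {se:.4f}")
print(f"95% margin:     +/- {margin_95:.1%}")
```

Evaluated this way, the standard error comes to about .044 and the 95% margin of error to about +/- 8.6%.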

 

 

From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Art Kendall
Sent: Monday, June 13, 2011 12:39 PM
To: [hidden email]
Subject: Re: Margin of error redux

 

See my second reply to the OP.

Art

Art Kendall
Social Research Consultants

Re: Margin of error redux

jmdpulido
In reply to this post by Arthur Kramer
Dear Arthur,

The formula you are talking about, 1.96*sqrt[p*(1-p)/n], comes from the following:

+/- 1.96 is the point on a standard normal distribution N(0,1) that leaves 95% of the probability between -1.96 and +1.96. Thus, using 1.96 implies that you are approximating a binomial distribution by a normal distribution, which is appropriate for large samples (you can check Anderson, Sweeney and Williams: Statistics for Business Administration and Economics for a proof of this result).

sqrt[p*(1-p)/n] is the standard error of a point estimate of a proportion.
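
The two pieces can be put together in a few lines. The following is a minimal sketch in Python, assuming a simple random sample and the normal approximation; the function names and the example values (p = 0.30, n = 400) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_value(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def margin_of_error(p: float, n: int, confidence: float = 0.95) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return z_value(confidence) * standard_error

print(f"z for 95%: {z_value(0.95):.2f}")
print(f"margin for p=0.30, n=400: +/- {margin_of_error(0.30, 400):.1%}")
```

For the hypothetical item above, this gives a critical value of 1.96 and a margin of about +/- 4.5%.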

The usual margin of error reported for a survey assumes p = 50% = (1-p) and takes n to be the total sample size N. As you can easily check in an Excel file, p*(1-p) is a parabola with its maximum at p = 50%, so sqrt[p*(1-p)/n] also peaks there. Thus, the SE is at its maximum at p = 50% for any given N. That's why many surveys state that the reported margin of error is calculated under the hypothesis of maximum uncertainty.
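
The same check can be done in Python instead of a spreadsheet. This sketch scans p from 1% to 99% and reports where the standard error is largest, then computes the "whole survey" margin at p = 50%. The sample size of 1000 is a hypothetical value, not one taken from this thread.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

n = 1000  # hypothetical sample size
grid = [i / 100 for i in range(1, 100)]  # p = 0.01, 0.02, ..., 0.99

# Find the proportion at which the standard error is largest.
worst_p = max(grid, key=lambda p: standard_error(p, n))
headline_margin = 1.96 * standard_error(0.5, n)

print(f"SE is largest at p = {worst_p}")
print(f"headline margin for n = {n}: +/- {headline_margin:.1%}")
```

At p = 50% the formula reduces to 0.98/sqrt(n), which is why a survey of about 1000 respondents is typically reported with a margin of roughly +/- 3%.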

However, this formula does not always apply. First, sometimes the sample is not a simple random sample, so not all individuals have the same probability of being sampled. If this is the case, this formula will typically underestimate the "true" standard error of your estimate of the proportion.

Secondly, sometimes not everyone in the sample gives a "valid" answer. So your "n" is not the total sample size "N" but some smaller "n" that excludes the missing data, because whether you delete the missing data or impute them by some method (e.g., regression), you do not have N independent observations, but a smaller number.
E.g., if you want to calculate the percentage of women who are unemployed, your "n" is not the total number of people (men and women), but only the women with valid data for their labour status (an n much smaller than the total N).
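
To see how much this can matter, the sketch below compares the margin computed with the total sample size against the margin computed with only the valid cases for one item. All three numbers (1000 respondents, 430 women with valid labour-status data, 10% unemployed) are hypothetical and serve only to illustrate the point.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

total_n = 1000  # hypothetical: everyone who returned the survey
valid_n = 430   # hypothetical: women with a valid labour-status answer
p = 0.10        # hypothetical: share of those women who are unemployed

whole_sample = margin_of_error(p, total_n)
item_level = margin_of_error(p, valid_n)

print(f"using total N: +/- {whole_sample:.1%}")
print(f"using valid n: +/- {item_level:.1%}")
```

With these made-up numbers, using the total N would report about +/- 1.9%, while the valid n for the item gives about +/- 2.8%.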

Thirdly, there is the "finite population" correction that Art Kendall wisely told you about.

Summing up, I prefer to calculate the margin of error for each question, rather than using the whole sample. This sometimes gives me a smaller confidence interval than the reported margin of error for the whole sample, and sometimes a bigger one. I also tend to prefer a 99% confidence interval, especially for large samples, as in large samples everything tends to be statistically significant.
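
For a sense of what moving from 95% to 99% costs in width, this short sketch compares the two critical values under the same normal approximation. It uses only Python's standard library; the variable names are made up for illustration.

```python
from statistics import NormalDist

z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
z99 = NormalDist().inv_cdf(0.995)  # two-sided 99% critical value
ratio = z99 / z95                  # how much wider the 99% interval is

print(f"z95 = {z95:.3f}, z99 = {z99:.3f}")
print(f"a 99% interval is about {ratio - 1:.0%} wider than a 95% interval")
```

Since the standard error is the same in both cases, the 99% margin is simply the 95% margin multiplied by this ratio, about 1.31.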

Don't hesitate to contact me if you have any more queries.

Jmdpulido@yahoo.es
PhD student in Applied Economics.

Re: Margin of error redux

Arthur Kramer
I understand what you said and agree that it is better to report the margin
of error for each question rather than for the survey as a whole. But most
of the members of the upper administration of my institution, who do not
have a statistical background and are accustomed to seeing the margin of
error reported in newspapers, want to know the margin of error of my
survey. Giving them an explanation similar to the one you provided leaves
them "unsatisfied" with the "statistical jargon."

Arthur Kramer

"Believe half of what you see and none of what you hear."

N. Whitfield
B. Strong



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
jmdpulido
Sent: Thursday, June 16, 2011 5:50 AM
To: [hidden email]
Subject: Re: Margin of error redux
