Can anyone help me get a population standard deviation?


Can anyone help me get a population standard deviation?

Nancy Rusinak
In SPSS?  It automatically assumes a sample and I do not know how to get SPSS to give me a standard deviation for a population.  Many thanks, in advance.

Nancy

Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
Nancy Rusinak wrote:
> In SPSS?  It automatically assumes a sample and I do not know how to get
> SPSS to give me a standard deviation for a population.  Many thanks, in
> advance.
>
> Nancy

Here's one way to get it.

1. Use AGGREGATE to write the sample SD and N for the variable of interest to the working data file.
2. Compute SS = SD^2 x (n-1)
3. Compute Pop. Variance = SS/n
4. Compute Pop. SD = SQRT(Pop. Variance)

E.g.,

data list free / x (f2.0) .
begin data
2 5 4 9 8 7 4 3 1
end data.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /ssd_x=SD(x)
  /n = nu(x).

compute #svarx = ssd_x**2 . /* sample variance of X .
compute #SS_x = #svarx * (n-1) . /* SS for X .
compute #pvarx = #SS_x / n . /* population variance of X .
compute psd_x = SQRT(#pvarx). /* population SD .
formats ssd_x psd_x (f8.3).
var lab
 ssd_x 'Sample SD of X'
 psd_x 'Population SD of X'
.
descrip ssd_x psd_x / stat = mean.
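
As a cross-check outside SPSS, the same four steps can be sketched in Python. This is only an illustration, not part of the SPSS job: it assumes Python's standard `statistics` module, where `stdev` divides by n-1 and `pstdev` divides by n, and the variable names simply mirror the ones used above.

```python
import math
import statistics

x = [2, 5, 4, 9, 8, 7, 4, 3, 1]   # same nine scores as the SPSS example
n = len(x)

ssd_x = statistics.stdev(x)        # step 1: sample SD (denominator n-1)
ss_x = ssd_x ** 2 * (n - 1)        # step 2: SS = SD^2 x (n-1)
pvar_x = ss_x / n                  # step 3: population variance = SS/n
psd_x = math.sqrt(pvar_x)          # step 4: population SD

print(f"sample SD     = {ssd_x:.3f}")
print(f"population SD = {psd_x:.3f}")
```

The last value should agree with `statistics.pstdev(x)`, which computes the n-denominator SD directly.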

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Can anyone help me get a population standard deviation?

Garry Gelade
In reply to this post by Nancy Rusinak

Hi Nancy

You can do this by weighting your cases.  Suppose you have N cases.

COMPUTE a new variable w equal to N/(N-1), then WEIGHT the cases by w.

Then run your DESCRIPTIVES command and the reported SD is the population SD you need.

Garry Gelade
Business Analytic Ltd.

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Nancy Rusinak
Sent: 21 January 2010 18:30
To: [hidden email]
Subject: Can anyone help me get a population standard deviation?

 


Re: Can anyone help me get a population standard deviation?

Hector Maletta
In reply to this post by Bruce Weaver
I am not sure about Bruce's formula. The sample SD is sqrt(SS/n). The
ESTIMATE of the population SD, based on the sample, is sqrt(SS/(n-1)). If
I'm right about this, the SPSS SD should be multiplied by sqrt(n/(n-1)) to
get the estimated population SD.
On the other hand, if Nancy's dataset itself represents the whole of the
population (e.g. a census), perhaps Nancy believes that the population
variance should be computed directly. But that would not be right:
(a) The directly computed population variance is SS/n, just as in a sample
(a population is a sample of size n=N, with a sampling ratio of 1:1).
(b) On the other hand, a measurement of a population (e.g. a census) is just
one sample measurement (out of many measurements you can take, with
different census takers or at different times of day, etc.), and therefore
even a census is a sample, with a standard error of the estimate (possibly
quite small unless the census takers are very sloppy). From this viewpoint,
again the SD of the one (sampled) take of the census is sqrt(SS/n); the
estimate of the SD for the whole "population" of various possible
measurements of the same (human) population would be sqrt(SS/(n-1)), which
in this case is also sqrt(SS/(N-1)).
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 21 January 2010 18:57
To: [hidden email]
Subject: Re: Can anyone help me get a population standard deviation?


Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
Hector, FWIW, Excel gives the same results I obtained with SPSS:

2.728 from STDEV (with division by n-1)
2.572 from STDEVP (with division by N)

I don't have time to try Gerry's weighting suggestion right now, but off the top of my head, I think that it will work for variances, but not standard deviations.



Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
In reply to this post by Garry Gelade
Garry Gelade wrote:
> Hi Nancy
>
> You can do this by weighting your cases.  Suppose you have N cases.
>
> COMPUTE a new variable equal to N/(N-1). WEIGHT the cases by w.
>
> Then run your DESCRIPTIVES command and the reported SD is the population SD
> you need.

Hi Garry.  First, I apologize for misspelling your name in my other post.  Second, SPSS always divides by n-1 when computing a variance, so your weight would have to be (n-1)/n.  But even then, it doesn't work, as far as I can tell.  It is the squared deviation scores that would have to be so weighted, not the raw scores themselves.  I.e.,

1. Use AGGREGATE to add the mean of X to the file
2. Compute the squared deviation from the mean for each score
3. let W = (n-1)/n, and weight by W
4. Use DESCRIPTIVES (or MEANS) to get the mean of the squared deviation score from step 2

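* Steps 1-2: add the mean of X to the file, then compute the squared deviations.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK= /mean_x = MEAN(x).
compute sqdev = (x - mean_x)**2.
compute w = (n-1)/n. /* assumes n is already in the file, e.g. from the earlier AGGREGATE .
weight by w.
descrip sqdev / stat = mean .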

This weighted mean of squared deviations = the population variance (with division by N).
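
The point about weighting can be checked numerically. The sketch below is in Python rather than SPSS. It assumes that a weighted variance is computed as sum(w*(x - mean)^2) / (sum(w) - 1); that is my reading of how case weights enter the variance and is not stated anywhere in this thread. The helper names are hypothetical.

```python
import statistics

x = [2, 5, 4, 9, 8, 7, 4, 3, 1]
n = len(x)
pop_var = statistics.pvariance(x)   # SS/n, the value we are trying to reproduce


def weighted_mean(values, weights):
    """Weighted mean: sum(w * v) / sum(w)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    """Weighted variance with denominator sum(weights) - 1 (assumed convention)."""
    mean = weighted_mean(values, weights)
    ss = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return ss / (sum(weights) - 1)


# Weighting the raw scores by a constant does not reproduce SS/n.
for c in (n / (n - 1), (n - 1) / n):
    print(f"w = {c:.4f}: weighted variance = {weighted_variance(x, [c] * n):.4f}")

# The mean of the squared deviations does reproduce it.
mean_x = statistics.fmean(x)
sqdev = [(v - mean_x) ** 2 for v in x]
w = (n - 1) / n
print(f"weighted mean of sqdev = {weighted_mean(sqdev, [w] * n):.4f}")
print(f"population variance    = {pop_var:.4f}")
```

Under these assumptions a constant weight cancels out of a weighted mean, so in this sketch the weighting step does not change the final number: the plain mean of the squared deviations is already SS/n.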


Re: Can anyone help me get a population standard deviation?

Hector Maletta
In reply to this post by Bruce Weaver
Bruce,
According to many standard textbooks, such as the classic H. Blalock "Social
Statistics", the sample standard deviation is s = sqrt(SS/n) (equation 6.3),
but a footnote specifies: "Some texts define s with n-1 in the denominator
instead of n. We shall later define delta=sqrt(SS/(n-1)) with delta being an
unbiased estimate of sigma [the population SD] for RANDOM samples." (note 1
of section 6.4, 1980 edition). Blalock notes that some authors present s
directly with n-1 in the denominator, but he prefers using n, and then
noting the bias of that formula before introducing the correction in the
denominator in order to get an unbiased estimate of the population SD.
The difference is of course appreciable only for small samples.

Hector
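
To put rough numbers on "only for small samples": whatever the data, the n-denominator SD equals the (n-1)-denominator SD multiplied by sqrt((n-1)/n). A small Python sketch of that ratio follows; it is illustrative only, and the sample sizes and the function name are arbitrary choices of mine.

```python
import math


def sd_ratio(n: int) -> float:
    """Ratio of the n-denominator SD to the (n-1)-denominator SD."""
    return math.sqrt((n - 1) / n)


for n in (5, 10, 30, 100, 1000):
    print(f"n = {n:5d}  ratio = {sd_ratio(n):.4f}")
```

The ratio moves toward 1 as n grows, which is the sense in which the choice of denominator stops mattering for large samples.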

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 21 January 2010 20:44
To: [hidden email]
Subject: Re: Can anyone help me get a population standard deviation?


Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
OK, I see that we are just having a problem with terminology.  Here is how I understood the original question.

1. SPSS computes the SD with n-1 in the denominator.
2. Nancy wants a standard deviation with n in the denominator.

I showed a way to do that.  It now occurs to me that I took an extremely scenic route to the answer--Rube Goldberg would be very proud of that code.  Here's a much more direct way.


data list free / x (f2.0) .
begin data
2 5 4 9 8 7 4 3 1
end data.

* SPSS gives the sample SD -- i.e., denominator is n-1, not N .

* Use AGGREGATE to get SD and N for X.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /s_x=SD(x)
  /n = nu(x).

compute sigma_x = s_x * SQRT((n-1)/n).
format s_x sigma_x (f5.3).
descrip x s_x sigma_x.
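
The one-line conversion follows from the two definitions. Writing SS for the sum of squared deviations from the mean (my notation here, matching the earlier four-step version), a short derivation is:

```latex
s_x = \sqrt{\frac{SS}{n-1}}, \qquad
\sigma_x = \sqrt{\frac{SS}{n}}
         = \sqrt{\frac{SS}{n-1}\cdot\frac{n-1}{n}}
         = s_x \sqrt{\frac{n-1}{n}}
```

For the nine scores above, s_x is about 2.728 and sqrt(8/9) is about 0.943, which gives sigma_x of about 2.572.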


Regarding the issue of terminology, I respectfully disagree with that excerpt from Blalock.  I have always been taught to use the terminology as it is given on the Wikipedia page on the standard deviation:

   http://en.wikipedia.org/wiki/Standard_deviation

The population variance has N as the denominator, and is only computed if one has the entire population of scores (which occurs only rarely).  The sample variance, on the other  hand, uses n-1 as the denominator, and is used when you have a sample from some population.  For random samples, the sample variance is an unbiased estimator of the population variance.

Taking the square roots of those variances yields the corresponding standard deviations.  But note that the sample SD (with n-1 in the denominator) is not an unbiased estimator of the population SD.  The Wikipedia page says, "s is not an unbiased estimator for the standard deviation [sigma]; it tends to underestimate the population standard deviation".  There is another Wikipedia page on "unbiased estimation of the standard deviation" here:

   http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation
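
The difference between the two claims (unbiased variance, biased SD) can be illustrated by brute force. The Python sketch below enumerates every equally likely sample of size 2, drawn with replacement, from a tiny made-up population. The population values and sample size are assumptions chosen only to keep the enumeration small; this is an illustration, not a proof.

```python
import itertools
import math
import statistics

population = [1, 2, 3, 4]                   # hypothetical tiny population
sigma2 = statistics.pvariance(population)   # population variance (denominator N)
sigma = math.sqrt(sigma2)                   # population SD

n = 2
# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(itertools.product(population, repeat=n))

# Average of the (n-1)-denominator statistics over all possible samples.
mean_s2 = statistics.fmean(statistics.variance(s) for s in samples)
mean_s = statistics.fmean(statistics.stdev(s) for s in samples)

print(f"population variance = {sigma2:.4f}, average sample variance = {mean_s2:.4f}")
print(f"population SD       = {sigma:.4f}, average sample SD       = {mean_s:.4f}")
```

In this setup the average sample variance matches the population variance, while the average sample SD falls below the population SD.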

Cheers,
Bruce



Re: Can anyone help me get a population standard deviation?

Hector Maletta
Both approaches are right. The sample SS divided by n is just the average
squared deviation. It can be shown, however, that this is a biased estimate
of the population variance, because one degree of freedom has already been
used up when computing the mean (a prior step for computing the SD). Hence
the correction of the denominator to n-1 in order to estimate the pop SD.
Now, with the theoretical (and usually unknown) population SD, dubbed sigma,
the correct denominator is N, not N-1 (where n is sample size and N pop
size). This is because the mean and the variance (the square of the SD) are
simply the first and second order moments of the variable's distribution
(the kth moment of a variable centred on its mean uses the variable's
deviation raised to the kth power). For instance, the skewness and kurtosis
of a distribution depend on the third and fourth moments. All those moments
are averages of deviations, raised to the kth power, and therefore divided
by N.
Regarding the case of data covering the whole population, as I explained
before, if the variable is itself a random variable, even a full population
enumeration is still a "sample" from the universe of possible measurements,
and affected by random error of measurement. Therefore an unbiased ESTIMATE
of the true pop SD requires dividing by n-1, including the extreme case of
n=N. Since nothing is exempt from random measurement error, not even in
Physics (let alone Social Sciences), all unbiased estimates of the true SD
use n-1 in the denominator.
In analysis of variance, too, total variance is computed with n-1 degrees of
freedom, including k degrees of freedom for the k parameters associated with
factors, interactions and covariates, and n-k-1 for the residual variance,
for a total of n-1 DF. The remaining degree of freedom has been used up by
the mean.
Thus the essential distinction is between the "true" pop SD (divided by N)
and the "unbiased estimate" of that pop SD based on a sample (divided by
n-1). For the sample, its "intrinsic" or "descriptive" SD is based on SS
divided by n, giving the SD of sample units around the sample mean, but for
inferential purposes (unbiased estimate of pop SD) the denominator should be
n-1.
Hector
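
The "moments are averages divided by N" idea can be sketched in Python. This uses the usual moment-based definitions, in which skewness is built from the third central moment and kurtosis from the fourth. The function name is hypothetical, and statistical packages may report bias-adjusted versions of skewness and kurtosis that differ from these values.

```python
import statistics


def central_moment(values, k):
    """k-th central moment: the average of (x - mean)**k, i.e. divided by N."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** k for v in values) / len(values)


x = [2, 5, 4, 9, 8, 7, 4, 3, 1]   # the nine scores used earlier in the thread

m2 = central_moment(x, 2)          # second central moment = population variance
m3 = central_moment(x, 3)
m4 = central_moment(x, 4)

skewness = m3 / m2 ** 1.5          # moment-based skewness
kurtosis = m4 / m2 ** 2            # moment-based kurtosis (3 for a normal distribution)

print(f"m2 = {m2:.4f}, population variance = {statistics.pvariance(x):.4f}")
print(f"skewness = {skewness:.4f}, kurtosis = {kurtosis:.4f}")
```

The second central moment computed this way coincides with the N-denominator variance, which is the sense in which sigma is tied to division by N.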
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 22 January 2010 12:31
To: [hidden email]
Subject: Re: Can anyone help me get a population standard deviation?

OK, I see that we are just having a problem with terminology.  Here is how I
understood the original question.

1. SPSS computes the SD with n-1 in the denominator.
2. Nancy wants a standard deviation with n in the denominator.

I showed a way to do that.  It now occurs to me that I took an extremely
scenic route to the answer--Rube Goldberg would be very proud of that code.
Here's a much more direct way.


data list free / x (f2.0) .
begin data
2 5 4 9 8 7 4 3 1
end data.

* SPSS gives the sample SD -- i.e., denominator is n-1, not N .

* Use AGGREGATE to get SD and N for X.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /s_x=SD(x)
  /n = nu(x).

compute sigma_x = s_x * SQRT((n-1)/n).
format s_x sigma_x (f5.3).
descrip x s_x sigma_x.


Regarding the issue of terminology, I respectfully disagree with that
excerpt from Blalock.  I have always been taught to use the terminology as
it is given on the Wikipedia page on the standard deviation:

   http://en.wikipedia.org/wiki/Standard_deviation

The population variance has N as the denominator, and is only computed if
one has the entire population of scores (which occurs only rarely).  The
sample variance, on the other  hand, uses n-1 as the denominator, and is
used when you have a sample from some population.  For random samples, the
sample variance is an unbiased estimator of the population variance.

Taking the square roots of those variances yields the corresponding standard
deviations.  But note that the sample SD (with n-1 in the denominator) is
not an unbiased estimator of the population SD.  The Wikipedia page says, "s
is not an unbiased estimator for the standard deviation [sigma]; it tends to
underestimate the population standard deviation".  There is another
Wikipedia page on "unbiased estimation of the standard deviation" here:

   http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation

Cheers,
Bruce



Hector Maletta wrote:

>
> Bruce,
> According to many standard textbooks, such as the classic H.Blalock
> "Social
> Statistics", the sample standard deviation is s= sqrt(SS/n) (equation
> 6.3),
> but a footnote specifies: "Some texts define s with n-1 in the denominator
> instead of n. We shall later define delta=sqrt(SS/(n-1)) with delta being
> an
> unbiased estimate of sigma [the population SD] for RANDOM samples." (note
> 1
> of section 6.4, 1980 edition). Blalock notes that some authors present
> directly s with n-1 in the denominator, but he prefers using n, and then
> noting the bias of that formula before introducing the correction in the
> denominator in order to get an unbiased estimate of the population SD.
> The difference is of course significative only for small samples.
>
> Hector
>



=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Can anyone help me get a population standard deviation?

Marta Garcia-Granero
Hector Maletta wrote:

> Both approaches are right. The sample SD divided by n is just the average
> squared deviation. It is, however, shown that this is a biased estimate of
> the population SD because one degree of freedom has been used up already
> when computing the mean (a prior step for computing SD). Thence the
> correction of the denominator to n-1 in order to estimate the pop SD. Now,
> with the theoretical (and usually unknown) population SD, dubbed sigma, the
> correct denominator is N, not N-1 (where n is sample size and N pop size).
> This is because the mean and the SD are simply the first and second order
> moments of the variable distribution (the kth moment of a variable centred
> on its mean uses the variable's deviation raised to the kth power). For
> instance, the skewness and kurtosis of a distribution depend on the third
> and fifth moments.
A minor clarification: kurtosis depends on the FOURTH moment, not the fifth.
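
To make the role of the moments concrete, here is a small Python sketch (for illustration only). The function names are made up for this example, and the formulas are the plain moment-based definitions with N in the denominator, which is an assumption: they are not the small-sample adjusted versions that statistical packages usually report.

```python
import statistics

def central_moment(values, k):
    """k-th central moment: the mean of (x - mean)**k, with N in the denominator."""
    m = statistics.fmean(values)
    return statistics.fmean((x - m) ** k for x in values)

def skewness(values):
    # Third central moment, standardised by the cube of the SD.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    # Fourth central moment, standardised by the squared variance.
    return central_moment(values, 4) / central_moment(values, 2) ** 2

x = [2, 5, 4, 9, 8, 7, 4, 3, 1]  # example values used earlier in this thread
print(f"skewness = {skewness(x):.3f}, kurtosis = {kurtosis(x):.3f}")
```

The second central moment here is the variance with N in the denominator, i.e. the population variance discussed in this thread.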

Best regards and happy weekend.
Marta

--
For miscellaneous SPSS related statistical stuff, visit:
http://gjyp.nl/marta/


Re: Can anyone help me get a population standard deviation?

Hector Maletta
Thanks, Marta, for correcting my involuntary lapsus mentis.
I take it you agree with the rest.
Hector



Re: Can anyone help me get a population standard deviation?

Kornbrot, Diana
In reply to this post by Hector Maletta
Who wants a biased estimate of population SD?
I’d be a bit dubious about any analysis which provided such an estimate.

But Blalock says some sociologists DO prefer the biased estimate.
We wouldn’t want to interfere with these sociologists’ freedom of choice, if they do exist, just as long as they tell us what they are doing and provide full information, including N, in their report.
Best

Diana





Professor Diana Kornbrot
email:  d.e.kornbrot@...
web:    http://web.me.com/kornbrot/KornbrotHome.html
Work
 School of Psychology
 University of Hertfordshire
 College Lane, Hatfield, Hertfordshire AL10 9AB, UK
 voice:   +44 (0) 170 728 4626
 fax:     +44 (0) 170 728 5073
Home
 19 Elmhurst Avenue
 London N2 0LT, UK
 voice:   +44 (0) 208 883  3657
 mobile:  +44 (0) 796 890 2102
 fax:     +44 (0) 870 706 4997






Re: Can anyone help me get a population standard deviation?

Hector Maletta
I concur, as long as data are considered as sample results and are used for inferential purposes. However, the distinction is irrelevant for most practical purposes, except for EXTREMELY small samples. Dividing a SS of, say, 0.5 or 5 by 100 or by 99 gives values that differ by only about 1%. The difference becomes noticeable if your sample is, say, smaller than 30 cases or so. With SS=0.5, dividing by 30 or by 29 gives you a variance of 0.0167 or 0.0172 respectively (an SD of 0.129 or 0.131), and the standard error of the estimate (dividing SD by the sqrt of n) would be respectively 0.0236 and 0.0240, a difference hardly important for most purposes although somewhat noticeable. Now, if you have a sample of 10 or 15 cases, beware of the denominator you use.
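
The arithmetic for the SS=0.5, n=30 case can be checked with a few lines of Python (a sketch for illustration only; the variable names are made up for this example):

```python
import math

ss = 0.5  # sum of squared deviations, as in the example above
n = 30    # sample size

var_n = ss / n         # variance with n in the denominator
var_n1 = ss / (n - 1)  # variance with n-1 in the denominator
sd_n = math.sqrt(var_n)
sd_n1 = math.sqrt(var_n1)
se_n = sd_n / math.sqrt(n)    # standard error of the mean from each SD
se_n1 = sd_n1 / math.sqrt(n)

print(f"variance: {var_n:.4f} vs {var_n1:.4f}")
print(f"SD:       {sd_n:.4f} vs {sd_n1:.4f}")
print(f"SE:       {se_n:.4f} vs {se_n1:.4f}")
```

Note that the SD is the square root of SS/n (or SS/(n-1)), and the standard error divides that SD, not the variance, by the square root of n.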

Hector

 



Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
In reply to this post by Hector Maletta
Hector Maletta wrote
--- snip ---
Thus the essential distinction is between the "true" pop SD (divided by N)
and the "unbiased estimate" of that pop SD based on a sample (divided by
N-1). For the sample, its "intrinsic" or "descriptive" SD is SS divided by
n, giving the SD of sample units around sample mean, but for inferential
purposes (unbiased estimate of pop SD) the denominator should be n-1.
Hector
Hi Hector.  You are still saying that a SD computed using a random sample and with n-1 in the denominator provides an unbiased estimate of the "true" population SD.  This is not correct--it underestimates the population SD.  See the Background section of the second Wikipedia page I gave in my previous post.

   http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Background

Note especially this bit:  "The use of n − 1 instead of n in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the sample variance, and some, but not all of the bias in the estimation of the sample standard deviation."

As for why virtually no one uses an unbiased version of the sample SD, I suspect the answer is that it's considerably more complicated than just taking the square root of the sample variance, and for most applications, the difference doesn't matter that much.
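
To give a feel for the extra work involved, here is a minimal Python sketch of the correction factor usually called c4, which applies only under the assumption that the data come from a normal distribution. The function names are made up for this example.

```python
import math

def c4(n):
    """E[s] / sigma for a normal sample of size n, where s uses n-1."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def unbiased_sd(sample_sd, n):
    """Rescale the usual (n-1) sample SD so it is unbiased under normality."""
    return sample_sd / c4(n)

# The factor approaches 1 quickly as n grows.
for n in (2, 5, 10, 30, 100):
    print(n, round(c4(n), 4))
```

For n = 10 the factor is about 0.973, so the usual sample SD underestimates sigma by roughly 3% on average, and the shortfall shrinks further as n grows.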

Cheers,
Bruce

Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
In reply to this post by Kornbrot, Diana
kornbrot wrote
Who wants a biased estimate of population SD?
I'd be a bit dubious about any analysis which provided such an estimate
Diana, most of us use a biased estimate of the population SD all the time.  See the page I mentioned in my response to Hector.

   http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Background

From that site:

--- start excerpt ---
The use of n − 1 instead of n in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the sample variance, and some, but not all of the bias in the estimation of the sample standard deviation.

It is not possible to find an estimate of the standard deviation which is unbiased for all population distributions, as the bias depends on the particular distribution. Much of the following relates to estimation assuming a normal distribution.
--- end excerpt ---
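
The dependence on the distribution can be illustrated with a small simulation. This is a Python sketch under stated assumptions: samples of size 5, 20000 replications, a fixed seed, and a standard normal versus an exponential distribution with rate 1 (both have a true SD of 1). None of these choices come from the Wikipedia page, and the helper name is made up for this example.

```python
import math
import random

def mean_sd_ratio(draw, true_sd, n=5, reps=20000, seed=1):
    """Average (n-1) sample SD over many simulated samples, divided by the true SD."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        m = sum(sample) / n
        total += math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return total / reps / true_sd

normal_ratio = mean_sd_ratio(lambda rng: rng.gauss(0.0, 1.0), true_sd=1.0)
expo_ratio = mean_sd_ratio(lambda rng: rng.expovariate(1.0), true_sd=1.0)

print(f"normal:      E[s]/sigma is roughly {normal_ratio:.3f}")
print(f"exponential: E[s]/sigma is roughly {expo_ratio:.3f}")
```

Both ratios fall below 1, and the exponential one falls further, so a single correction factor cannot remove the bias for every distribution.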

Cheers,
Bruce

Re: Can anyone help me get a population standard deviation?

Hector Maletta
In reply to this post by Hector Maletta

In a previous message to this thread I referred to the case when the entire population is measured (e.g. in a census). That measurement is but one of the many measurements that could be taken with random variations (different census takers, different time of day, different household member being interviewed as main informant, different data entry workers, etc). If several such measurements are made of the same objects (say, a national population census or the measurement of the diameters of all buttons in a large jar) slightly different results would come out for the mean of any variable (e.g. mean age in a human population, or mean diameter for buttons) due to random differences in the individual measurements of the various persons or buttons.

 

Thus even a census is to be considered a “sample” drawn from the “universe” of possible such “samples” of size n=N. The same is valid for a smaller sample.

 

Random error involved in passing from sample to universe can be seen as composed of two main parts: (1) the random selection of a particular subset of n cases (n<N) and (2) random error incurred while studying that subset with a certain group of interviewers, entering the data with a definite group of data punchers and other random differences between samples, even if two samples are composed of the same subset of cases. The standard error of an estimate, i.e. SD/sqrt(n), would probably be very small for n approaching or equal to N, but would very rarely be zero. The first component (particular subset) will disappear with n=N, but not the second one. The SD of a variable in the census divided by sqrt(N-1) gives an unbiased estimate of this standard error of census results (notice that for not very small N, subtracting or not subtracting 1 in the denominator would cause no perceptible difference in that std error). The standard error of population estimates derived from one single random sample exists for all sample sizes, from 1 to N, although for large N it is probably very small. Imagine dividing the SD of, say, age (an SD probably not larger than 15 years) by the SQRT of the population of the entire country (which is probably millions). For a relatively small country of 25 million people, the SQRT is 5000 and the std error would be 15/5000=0.003 years, or about 1 day of age (26 hours). For a mean age of 30 years, this would be about 0.01% of the mean. For a large country like the US the relative error would be correspondingly smaller.
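
The census arithmetic can be reproduced in a few lines of Python (a sketch for illustration only; the SD of 15 years, the population of 25 million and the mean age of 30 are the assumed figures from the paragraph above, and a year is taken as 365.25 days):

```python
import math

sd_age = 15.0            # assumed SD of age, in years
population = 25_000_000  # assumed size of the census
mean_age = 30.0          # assumed mean age, in years

se_years = sd_age / math.sqrt(population)  # 15 / 5000
se_hours = se_years * 365.25 * 24
relative = se_years / mean_age             # standard error as a share of the mean

print(f"{se_years:.4f} years, about {se_hours:.0f} hours, {relative:.2%} of the mean")
```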

 

One may want to consider random measurement error as distinct from “pure” sampling error (i.e. defining sampling error by just the first component) but this is a matter of words that can hardly make the second part disappear. In fact, when one infers population values from a sample, the error involved is a conflation of the two components.

 

One may further distinguish between “intrinsic” measurement error (e.g. related to the precision of the type of instrument used for the measurement) and “sample-related” measurement error related to the particular circumstances of the actual measurement (choice of particular interviewers and data-entry workers, identity of the informant, time of day, etc.). The examples above do not include the intrinsic imprecision of the instruments (meter, questionnaire), because the instrument is supposed to be the same for all, whatever its intrinsic precision.

 

Hector


From: James Parry [mailto:[hidden email]]
Sent: 22 January 2010 14:46
To: Hector Maletta
Cc: [hidden email]
Subject: Re: Can anyone help me get a population standard deviation?

 


I think it is implicit in mentioning 'sample size' that (n-1) would be the appropriate denominator, and I'd be alarmed if any kind of significance testing were being considered, in that if it's the true population, what inference would there be to draw? However, hypothesis testing wasn't mentioned. But if there is any hypothesis testing being conducted, (n-1) would be the recommended denominator for the SD, in that it admits to non-perfect measurement by decreasing the certainty around the parameter estimate.



James E. Parry
Sr. Sales Engineer,
SPSS An IBM Company
 233 S. Wacker Dr
Chicago, IL



Re: Can anyone help me get a population standard deviation?

Kornbrot, Diana
In reply to this post by Bruce Weaver
Bruce is, as usual, correct: the usual formula is biased, as described in Wikipedia
http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Background

Actually, most of us don’t use estimates of the SD most of the time at all. Most inferential estimates are based on unbiased estimates of variance, not biased estimates of SD.

But it's a salutary reminder that when one does tests of heterogeneity of variance they are based on mean variance, not mean SD. Hence if one is testing the hypothesis that one group is more variable than another, then mean variances for each group rather than mean SDs should be provided.

Similarly, when people use the currently fashionable confidence limits, they are based on a t-distribution for se = SD/√n (presumably that’s unbiased? Bruce).

NEW POINT
Having read the Wikipedia article, I wondered if it is ALWAYS true that families of distributions other than the normal have mean and variance not independent. It is certainly true of many families, but ALL?

Best

Diana




Re: Can anyone help me get a population standard deviation?

Bruce Weaver
Administrator
kornbrot wrote
Bruce is, as usual, correct: the usual formula is biased, as described in
Wikipedia
http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation#Background

Actually, most of us don’t use estimates of the SD most of the time at all.
Most inferential estimates are based on unbiased estimates of variance, not
biased estimates of SD.

But it's a salutary reminder that when one does tests of heterogeneity of
variance they are based on mean variance, not mean SD. Hence if one is
testing the hypothesis that one group is more variable than another, then mean
variances for each group rather than mean SDs should be provided.

Similarly, when people use the currently fashionable confidence limits,
they are based on a t-distribution for se = SD/√n (presumably that’s
unbiased? Bruce).

See note 2 here:

  http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Standard_error_of_the_mean


Bruce

Re: Can anyone help me get a population standard deviation?

Garry Gelade
In reply to this post by Bruce Weaver
You're right. My suggestion was rubbish! It doesn't even work for variances.
Sorry - long day and all that.

Garry

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Bruce Weaver
Sent: 21 January 2010 23:44
To: [hidden email]
Subject: Re: Can anyone help me get a population standard deviation?

Hector, FWIW, Excel gives the same results I obtained with SPSS:

2.728    from STDEV (with division by n-1)
2.572    from STDEVP (with division by N)

I don't have time to try Garry's weighting suggestion right now, but off the
top of my head, I think that it will work for variances, but not standard
deviations.
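
The same two figures can be reproduced outside SPSS and Excel. Below is a minimal Python sketch (for illustration only) using the nine example values from earlier in this thread. It also shows that the population SD can be recovered from the sample SD by multiplying by sqrt((n-1)/n), which is what the SS-based steps in the SPSS example amount to.

```python
import math
import statistics

x = [2, 5, 4, 9, 8, 7, 4, 3, 1]  # example values from earlier in the thread
n = len(x)

sample_sd = statistics.stdev(x)  # divides SS by n-1, like Excel's STDEV
pop_sd = statistics.pstdev(x)    # divides SS by N, like Excel's STDEVP

# The same population SD, recovered from the sample SD alone.
rescaled = sample_sd * math.sqrt((n - 1) / n)

print(f"{sample_sd:.3f} {pop_sd:.3f} {rescaled:.3f}")  # 2.728 2.572 2.572
```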



Hector Maletta wrote:

>
> I am not sure about Bruce's formula. The sample SD is SS divided by n. The
> ESTIMATE of the population SD, based on the sample, is SS divided by n-1.
> If I'm right about this, the SPSS SD should be multiplied by n and divided
> by n-1 to get the estimated population SD.
> On the other hand, if Nancy's dataset itself represents the whole of the
> population (e.g. a census), perhaps Nancy believes that the population
> variance should be computed directly. But that would not be right:
> (a) The directly computed population variance is SS/n, just as in a sample
> (a population is a sample of size n=N, with a sampling ratio of 1:1).
> (b) On the other hand, a measurement of a population (e.g. a census) is
> just one sample measurement (out of many measurements you can take, with
> different census takers or at different times of day, etc.), and therefore
> even a census is a sample, with a standard error of the estimate (possibly
> quite small unless the census takers are very sloppy). From this
> viewpoint, again the SD of the one (sampled) take of the census is SS/n;
> the estimate of the SD for the whole "population" of various possible
> measurements of the same (human) population would be SS/(n-1) which in
> this case is also SS/(N-1).
> Hector