Dear Listers,
We have written a simulation to help us determine a minimum sample size for a survey we'd like to distribute. The simulation generates a series of hypothetical populations ranging in size from 100 to 10,000 values, drawn randomly from a normal distribution (mean 4.0, std dev 1.0). For each population we take random samples of size 10%, 20%, ... 100%. For each sample we compute Student's t-test against the parent population and record any sample whose mean is significantly different from the population mean (p < 0.05). For each sample size we repeat the random selection and testing 1000 times and record the ratio of significantly different samples to total samples (on the assumption that if more than 5% of the samples fail to approximate the population mean, then that sample size is not large enough).

We expected to find that we needed a large relative sample size when the population size was small, but this doesn't appear to be true, which has us confused. Even for a population of 100, taking a sample of 10 seems to be enough to approximate the mean. Since this seems to contradict experience, there must be something wrong with the simulation design. Testing the populations shows that they are indeed normally distributed with the correct mean and std dev, and that the samples are random subsets of the parent population and of the correct size. Our t_test and various other functions (mean, stddev, variance, t_cdf (the t significance test)) have all been verified against SPSS's values and don't seem to be the problem.

Thanks for any insight!

Haijie Ding, Ph.D., Cognitive Psychologist, Ideation Group, Haworth Furniture
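The post doesn't include the simulation code itself, so here is a minimal Python sketch of the procedure as described (the population sizes, sampling fractions, and 1000-repetition loop come from the post; the function name, the fixed seed, the use of random.sample, and the large-sample critical value 1.96 in place of the exact t quantile are illustrative assumptions):

```python
import math
import random

def rejection_rate(pop_size, frac, reps=1000, mu=4.0, sigma=1.0, seed=1):
    """Fraction of samples whose mean tests as significantly different
    (two-sided, p < .05) from the parent population's mean."""
    rng = random.Random(seed)
    # hypothetical population drawn from N(mu, sigma)
    population = [rng.gauss(mu, sigma) for _ in range(pop_size)]
    pop_mean = sum(population) / pop_size
    n = max(2, int(pop_size * frac))
    rejections = 0
    for _ in range(reps):
        sample = rng.sample(population, n)  # random subset, no replacement
        xbar = sum(sample) / n
        var = sum((x - xbar) ** 2 for x in sample) / (n - 1)
        t = (xbar - pop_mean) / math.sqrt(var / n)
        # 1.96 approximates the two-sided 5% critical value for large n
        if abs(t) > 1.96:
            rejections += 1
    return rejections / reps
```

For example, `rejection_rate(1000, 0.10)` tests 1000 samples of 100 values each drawn from a population of 1000.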
You should be able to derive the sample size mathematically given your prespecified type I and type II error requirements.
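As a concrete illustration of that derivation, the standard normal-approximation formula for a two-sided one-sample test can be sketched in Python (the function name and the alpha/power defaults are illustrative, not from the thread; an exact t-based calculation, as power software performs, gives a slightly larger n):

```python
import math
from statistics import NormalDist

def sample_size(effect_size, alpha=0.05, power=0.80):
    """Smallest n for a two-sided one-sample z-test (normal approximation)
    to detect a standardized effect of the given size."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for power = .80
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)

print(sample_size(0.3))  # → 88
```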
>>> Haijie Ding <[hidden email]> 7/10/2006 12:36 AM >>>
In reply to this post by Haijie Ding
Haijie,
It seems as if you are working on a power question that is embedded in another concern, a concern which I don't quite understand. It seems to me that you can get the sample size required for a t-test with a given effect size from any of the many power calculators on the web. So suppose you knew that a sample of 346 was required to give a power of xx for an effect size of yy; would that answer your question?

Gene Maguin
In reply to this post by Haijie Ding
All,
I'd like some help with a question about how to do something. I'm doing data screening for a multilevel analysis, so I'd like to look at a set of analyses on a cluster-by-cluster basis. Here's what I'd like to do:

Loop #i=1 to 41.
Compute pick=0.
If (cluster eq #i) pick=1.
Filter by pick.
IGraph x by y.
Frequencies x y.
Correlation x with y.
Regression y on x /residuals.
Filter off.
End loop.

I understand that Loop-End Loop can't be used with procedures. I also know that Split File can be used, but the result is grouped incorrectly for easy use. (Also, the graph output is totally useless because all graphs are grouped together in a tiled sort of arrangement. Argh.) Is it as simple as:

!Define macro1().
!Do !i=1 !to 41.
Compute pick=0.
If (cluster eq !i) pick=1.
Filter by pick.
IGraph x by y.
Frequencies x y.
Filter off.
!Doend.
!Enddefine.

Thanks, Gene Maguin
In reply to this post by Maguin, Eugene
Dear Gene and Joseph,
Thank you for your kind replies. Yes, we can calculate the sample size based on the effect size, SD, and power. Our question is: what's wrong with our simulation? Furthermore, is there any way to get the sample size (a true/unbiased sample for the population) without presetting the effect size?

Best, Haijie
Haijie,
Asking for the "true" sample size without specifying power is self-contradictory. There is no "true" sample size. Each sample size has a specific margin of error. Whether you are happy with that margin of error depends on the size of the effect you are trying to detect. If you need to detect small differences, you need a larger sample, implying a smaller margin of error (other things being constant, especially the population variance and the level of confidence).

Hector

-----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of Haijie Ding Sent: Tuesday, July 11, 2006 12:16 AM To: [hidden email] Subject: Re: Question on response rate determination simulation
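Hector's point that each sample size carries its own margin of error can be shown numerically; here is a small Python sketch (the 95% confidence default and sigma = 1.0, matching the simulated populations in the original post, are illustrative assumptions):

```python
import math
from statistics import NormalDist

def margin_of_error(n, sigma=1.0, confidence=0.95):
    """Half-width of the confidence interval for a sample mean,
    for a sample of size n from a population with std dev sigma."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # 1.96 for 95%
    return z * sigma / math.sqrt(n)

for n in (10, 50, 100, 500):
    # margins shrink with 1/sqrt(n): roughly 0.62, 0.28, 0.20, 0.09
    print(n, round(margin_of_error(n), 3))
```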