sampling with fixed mean and SD


sampling with fixed mean and SD

huang jialin
Hi,

I am planning to sample cases from a known dataset so that the sample has a fixed mean and SD. The sample size would be between 300 and 500, and sampling is without replacement. Can I do this in SPSS? If so, how?

Thank you for your attention.

Sincerely,
Jialin Huang


Re: sampling with fixed mean and SD

Maguin, Eugene

Jialin,

 

As I remember, SPSS has a specific command, SAMPLE, for sampling cases. If that command is inadequate, please explain the significance of ‘fixed mean and SD’ and ‘sample size is from 300-500’. I understand replacement is not allowed.

 

Gene Maguin

 



Re: sampling with fixed mean and SD

John F Hall
In reply to this post by huang jialin

You can sample in SPSS with:

 

sample <n> from <N>

 

where n is the sample size you want and N is the number of cases in the data set, or you can use:

 

sample <p>

 

where p is the proportion you want to sample expressed as a decimal.
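
For readers who want to see what the two forms do conceptually, here is a minimal sketch in Python (not SPSS syntax, purely as an illustration). The function names sample_n_from and sample_proportion are hypothetical, and the proportion form is assumed to keep each case independently with probability p, so its sample size is only approximately p times N.

```python
import random

def sample_n_from(cases, n, seed=None):
    """Draw exactly n cases without replacement (the 'n from N' form)."""
    rng = random.Random(seed)
    return rng.sample(cases, n)

def sample_proportion(cases, p, seed=None):
    """Keep each case independently with probability p (approximate size)."""
    rng = random.Random(seed)
    return [c for c in cases if rng.random() < p]

population = list(range(1, 1001))            # 1000 case IDs
exact = sample_n_from(population, 300, seed=1)
approx = sample_proportion(population, 0.3, seed=1)
print(len(exact), len(set(exact)))           # sample size and number of distinct cases
print(len(approx))                           # close to 300, varies with the seed
```

Neither form looks at the values of any variable, which is why the sample mean and SD simply land near the population values.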

 

John Hall

 

Email:     [hidden email]

Website: www.surveyresearch.weebly.com

Skype:   surveyresearcher1

Phone:    (+33) (0) 2.33.45.91.47

 

 

 

 


Re: sampling with fixed mean and SD

John F Hall
In reply to this post by huang jialin

Should have said you do that in syntax.

 

From data editor:

 

 

File > New > Syntax

 

. . to open a new syntax file.  Write the command, but make sure you put a full stop (period) at the end of it, then press the green triangle etc.

 

 


Re: sampling with fixed mean and SD

huang jialin
Hi everyone,

Thanks for your replies. Let me elaborate on what I am planning to do.

I have a dataset of 1000 cases, which I am treating as a population (M = 17, SD = 5.1). I am trying to pull out a sample of roughly 300 cases, but its mean needs to be around 15 and its SD around 5.7.

I was wondering whether SPSS has any syntax that I can use. Your help is much appreciated.

Thank you again.

Sincerely,
Jialin Huang




Re: sampling with fixed mean and SD

Rick Oliver-3
SAMPLE 300 FROM 1000.

Depending on your definition of "around", the mean and standard deviation will probably meet your requirement.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]
Phone: 312.893.4922 | T/L: 206-4922





Re: sampling with fixed mean and SD

huang jialin
Rick,

Thanks for your response. Actually, it would be nice to have the exact mean and SD as listed. I am concerned that random sampling may not pull out enough cases from the lower end of the distribution.

Thanks.

Sincerely,
Jialin Huang



Re: sampling with fixed mean and SD

Rick Oliver-3
Someone who knows something about complex samples might have some suggestions.

Rick Oliver

Re: sampling with fixed mean and SD

Michael Kruger
In reply to this post by huang jialin
Huang,


You don't even have to use syntax! From the menu, 'Data, Select Cases,
Random Sample of Cases,  Exactly (and then specify no. of cases you want
to select)...'

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: sampling with fixed mean and SD

Maguin, Eugene
In reply to this post by huang jialin

Ok, that’s what I figured you had in mind. There might be some sort of optimal solution to your question but I don’t know what it is; perhaps others do. In lieu of that optimal solution, this is my first-round suggestion. I’ll assume your dataset has 1000 records of a variable y, with mean=17, sd=5.1.

 

Compute ydev=abs(y-15).

Compute rannum=uniform(1).

Sort cases by rannum.

Select if ($casenum le 300).

Descriptives y.

 

This will give you a sample of 300 cases randomly selected without replacement. Note, though, that as written ydev is computed but never used in the selection, so this is a plain random draw: the mean will be near 17 rather than 15, and the standard deviation near 5.1 rather than 5.7. To pull the mean toward 15, the selection has to depend on ydev, and the resulting distribution probably will be skewed, if it is not already. As an experiment, you could also compute the squared deviation (ydev**2) and make the selection depend on that instead. I think the result will be similar, i.e., mean near 15 and sd still not near 5.7, but skewed.

 

I think you may have to subdivide the ydev distribution into, say, 20 five-percentile-wide ‘bars’. The roughly 50 cases within each bar are randomly numbered, but the number of cases retained from each bar varies in such a way that more cases are selected from bars near the mean and fewer the further you move away from the mean. I’d guess the number to be selected is a ratio of the midpoint height (or area) of the bar for a normal distribution with a mean of 15 to the midpoint height (or area) of the bar for a normal distribution with a mean of 17. This won’t be too easy to code but it won’t be too hard either.

 

I’d try that and see what I get and hope that there really is an optimal solution that smarter people know about.
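
One way to try the ratio-of-heights idea without coding the bars is to give every case its own weight. The Python sketch below (not SPSS syntax) is a per-case variant, not the bar scheme itself: each case is weighted by the ratio of a target normal density (mean 15, SD 5.7) to a source normal density (mean 17, SD 5.1), and 300 cases are drawn without replacement using Efraimidis-Spirakis random keys. The function name density_ratio_sample, the simulated data, and the choice of normal densities are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def density_ratio_sample(values, n, target, source, seed=None):
    """Weighted sampling without replacement (Efraimidis-Spirakis keys).

    Each case gets weight target.pdf(y) / source.pdf(y), so cases that are
    under-represented relative to the target are more likely to be kept.
    Returns the indices of the n selected cases.
    """
    rng = random.Random(seed)
    keyed = []
    for i, y in enumerate(values):
        w = target.pdf(y) / source.pdf(y)
        keyed.append((rng.random() ** (1.0 / w), i))
    keyed.sort(reverse=True)                 # the n largest keys are selected
    return [i for _, i in keyed[:n]]

# Simulated stand-in for the 1000-case file (not the poster's data).
rng = random.Random(2012)
y = [rng.gauss(17, 5.1) for _ in range(1000)]

idx = density_ratio_sample(y, 300, NormalDist(15, 5.7), NormalDist(17, 5.1), seed=1)
sample = [y[i] for i in idx]
print(round(mean(y), 2), round(stdev(y), 2))
print(round(mean(sample), 2), round(stdev(sample), 2))
```

Because the draw is without replacement from a finite file, the sample mean and SD move toward the targets but will generally not hit them exactly, so the descriptives still need checking afterwards.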

 

Gene Maguin

 

 

 

 

 

 

 

 


Re: sampling with fixed mean and SD

huang jialin
In reply to this post by Michael Kruger
Mr. Kruger,

I tried that command, but the sample it produces differs from what I expected. I need the selected cases to have exactly the mean and SD I want.

Thank you.

Sincerely,
Jialin Huang


Re: sampling with fixed mean and SD

huang jialin
In reply to this post by huang jialin
Hi,

Thanks for your help. I appreciate it. I will try these suggestions and see whether they work.

Sincerely,
Jialin Huang


On Tue, Jan 24, 2012 at 2:23 PM, John F Hall <[hidden email]> wrote:

Jialin

 

I’m not quite sure what you are doing, but I used to do something like this when I was teaching.  I had data from a sample survey with 3100 cases (British Social Attitudes series) and wanted to demonstrate the idea of sampling from a population.  I used the “sample” as the population and got students to draw successive samples of size n from N to demonstrate sampling variation of the mean, proportions etc.  In those days we were limited by technology and classroom time, so even with 24 students by 2 samples each, it wasn’t always possible to show that the sampling variation of the mean was approximately normal.  I think SPSS always started with the same seed, so to avoid all students getting the same sample I got them to SET the SEED to a very high integer, usually their date of birth in yymmdd format.  In one class I had three students with the same birth date!

 

I also discovered that you need pretty large samples of, e.g., 400 or 500 from 3100 to get anywhere near the results I needed: 100 from 3100 produced some very erratic means and percentages, but the students learned a lot about sampling variation.

 

You could try using TEMPORARY to select successive samples, but you may need to set the seed first.

 

SET SEED  401207 .

TEMP.

SAMPLE 300 FROM 1000 .

~ ~ ~ ~ ~

TEMP.

SAMPLE 300 FROM 1000 .

~ ~ ~ ~ ~

TEMP.

SAMPLE 300 FROM 1000 .

~ ~ ~ ~ ~
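
The same classroom exercise can be sketched outside SPSS. This minimal Python version (an illustration, not SPSS syntax) fixes the seed once, then draws successive samples of 300 from a simulated 1000-case population that stands in for real data. Each draw leaves the population untouched, which is the role TEMPORARY plays above.

```python
import random
from statistics import mean, stdev

random.seed(401207)                          # same role as SET SEED
population = [random.gauss(17, 5.1) for _ in range(1000)]   # simulated stand-in

sample_means = []
for _ in range(200):
    draw = random.sample(population, 300)    # fresh sample, population unchanged
    sample_means.append(mean(draw))

print(round(mean(population), 2))
print(round(min(sample_means), 2), round(max(sample_means), 2))
print(round(stdev(sample_means), 3))         # spread of the sample means
```

The spread of the 200 sample means shows how far a single random sample of 300 can be expected to stray from the population mean.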

 

Happy sampling

 

John Hall

 


Re: sampling with fixed mean and SD

Jon K Peck
In reply to this post by huang jialin
First of all, note that it might not even be possible to exactly match the mean and sd with the existing cases.

Now is this the only variable involved?  If so, just do this.
1. Draw your sample of 300 at random and then compute the mean and sd.
2. Add the difference between the exact mean and the sample mean (exact minus sample) to each case.

You can make a similar adjustment to the sd by another linear transform.
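
The two adjustments can be combined into a single linear transform. Here is a minimal Python sketch (not SPSS syntax); the function name rescale and the simulated data are assumptions for illustration, and the targets 15 and 5.7 are the ones mentioned earlier in the thread.

```python
import random
from statistics import mean, stdev

def rescale(values, target_mean, target_sd):
    """Linearly transform values so they have the target mean and SD."""
    m, s = mean(values), stdev(values)
    return [target_mean + (v - m) * target_sd / s for v in values]

rng = random.Random(7)
population = [rng.gauss(17, 5.1) for _ in range(1000)]   # simulated stand-in
sample = rng.sample(population, 300)                     # step 1: random draw

adjusted = rescale(sample, 15, 5.7)                      # step 2 plus the sd adjustment
print(round(mean(adjusted), 6), round(stdev(adjusted), 6))
```

The adjusted scores match the targets up to floating-point rounding, but they are no longer the observed scores, and the transform only handles one variable at a time.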

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621





Re: sampling with fixed mean and SD

David Marso
Administrator
In reply to this post by huang jialin
<BEGIN PROCESS: Opening rusty can of worms with sharp rock!>
*WHY*:  This sounds very close to manufacturing data.
Your sample is what it is.
FWIW:  You would be restricting sampling away from the higher end of the distribution.  ergo, you would be *REDUCING* the variability, not increasing it as you seem to request.
Sounds *FISHY* .
--------
<Retiring sharp rock>
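
The point about variability is easy to check on simulated data. In this minimal Python sketch (an illustration, not SPSS syntax), the population values and the cutoff of 20 are hypothetical; cases above the cutoff are dropped and the mean and SD are compared before and after.

```python
import random
from statistics import mean, stdev

rng = random.Random(11)
population = [rng.gauss(17, 5.1) for _ in range(1000)]   # simulated stand-in

cutoff = 20                                              # hypothetical upper limit
restricted = [v for v in population if v <= cutoff]

print(len(restricted))
print(round(mean(population), 2), round(stdev(population), 2))
print(round(mean(restricted), 2), round(stdev(restricted), 2))
```

Cutting off the upper end lowers the mean and also shrinks the SD, so a lower mean combined with a larger SD cannot come from this kind of restriction alone.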


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: sampling with fixed mean and SD

huang jialin
In reply to this post by Jon K Peck
Hi Jon,

Thanks for your suggestion. Unfortunately, I am dealing with more than one variable.

Sincerely,
Jialin Huang



Re: sampling with fixed mean and SD

huang jialin
In reply to this post by David Marso
David,

I think you are totally right. My bad for choosing the wrong word.

What I am trying to do is to see the effects of range restriction. That is why I have the exact mean and SD.

Thanks.

Sincerely,
Jialin Huang



Re: sampling with fixed mean and SD

Rich Ulrich
In reply to this post by huang jialin
If you want to be sure that separate "strata" are represented,
you can do stratified sampling -  N1 from stratum 1, N2 from stratum 2,
and so on.
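
As a rough illustration of drawing N1, N2, ... from separate strata, here is a minimal Python sketch (not SPSS syntax). The function stratified_sample, the three score bands, and the quotas are all hypothetical choices, and the data are simulated.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, quotas, seed=None):
    """Draw quotas[s] cases without replacement from each stratum s."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for case in cases:
        by_stratum[stratum_of(case)].append(case)
    chosen = []
    for s, k in quotas.items():
        chosen.extend(rng.sample(by_stratum[s], k))
    return chosen

rng = random.Random(3)
scores = [rng.gauss(17, 5.1) for _ in range(1000)]       # simulated stand-in

def band(score):
    """Hypothetical three-band split of the score range."""
    if score < 12:
        return "low"
    if score < 22:
        return "middle"
    return "high"

quotas = {"low": 100, "middle": 150, "high": 50}         # hypothetical N1, N2, N3
picked = stratified_sample(scores, band, quotas, seed=3)
print({s: sum(1 for v in picked if band(v) == s) for s in quotas})
```

Setting a larger quota for the low band than its share of the population is what guarantees that the lower end of the distribution is represented.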
 
Do you have a good reason *not*  to use the whole sample?
The main reason for randomizing a whole selection is that
the next step of necessary data collection is too expensive.
When your file has all the data you will use, that is what you
should use, almost all of the time, except for validation strategies.

If you are doing cross-validation, a common practice is to draw
*many* random samples... and then, to report their variability. 
You might want to read up on cross-validation, or on "bootstrap"
samples.

If you do, in fact, achieve *exactly* the original mean and SD on
some criterion variable, drawing your sample from a population,
you will be in a position where every statistician who reads your
work will be (rightfully) highly skeptical.  Unless you hide that
achievement.  An exact match is not a likely outcome for a
randomized sample.  You may need to adjust what you expect, or
to adjust the expectations of whoever is requesting the analysis.

--
Rich Ulrich



Date: Tue, 24 Jan 2012 13:51:11 -0600
From: [hidden email]
Subject: Re: sampling with fixed mean and SD
To: [hidden email]

Rick,

Thanks for your response. Actually, it would be nice to have the exact mean and sd as listed. I am concerned that random sampling may not be able to pull out the lower distribution cases. 

[snip, previous]

Re: sampling with fixed mean and SD

Mike
In reply to this post by David Marso
I agree with David that this is a strange request and
it would be very difficult to obtain a sample with a specific
mean and SD from a larger sample/population.

That being said, let's change perspectives on the problem.
Let's say you have 3000 cases in the larger sample/pop.
First, convert them to z-scores and rank order them from
smallest to largest value.  This assumes you have a symmetric
distribution that is more or less normal.

If you want a sample of 300 cases, then select 150 cases
with a negative z-score and 150 cases with positive z-scores
such that

absolute value(sum(negative z-scores)) = sum(positive z-scores)

The sum of deviations around the mean is zero, so when the
absolute value of the sum of negative deviations equals the
sum of the positive deviations, you have a sample of N=300 that
will produce the specified mean.  Reconvert to original scale
by using a formula like:

original scale score = z-score*(SD) + Mean

You should now have a sample whose mean is equal to the
specified mean.

Note that if you have 150 pairs of z-scores that are the
same in absolute value but one is positive and the other
is negative, then the sample of 300 would reproduce the
desired mean.  But this might be overly restrictive.
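
The mirrored-pairs case can be sketched in Python (not SPSS syntax) as follows. The population here is artificial: it is built so that every deviation occurs once below and once above a mean of 50, which real data will rarely do exactly. All names and numbers are assumptions for the example.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of 3000 integer scores, exactly symmetric about 50:
# case i is 50 - d and case i + 1500 is 50 + d for the same deviation d.
deviations = [rng.randint(1, 30) for _ in range(1500)]
population = [50 - d for d in deviations] + [50 + d for d in deviations]

mean = statistics.mean(population)
sd = statistics.pstdev(population)

# Convert every case to a z-score.
z = [(x - mean) / sd for x in population]

# Pick 150 mirrored pairs at random, without replacement: 300 cases in all.
pair_ids = rng.sample(range(1500), 150)
sample_z = [z[i] for i in pair_ids] + [z[i + 1500] for i in pair_ids]

neg_sum = sum(v for v in sample_z if v < 0)
pos_sum = sum(v for v in sample_z if v > 0)

# original scale score = z-score*(SD) + Mean
sample = [v * sd + mean for v in sample_z]

print("abs(sum of negative z):", abs(neg_sum), "sum of positive z:", pos_sum)
print("population mean:", mean, "sample mean:", statistics.mean(sample))
print("population SD:", sd, "sample SD:", statistics.pstdev(sample))
```

The sample mean matches the population mean because the negative and positive z-scores cancel. The sample SD is not controlled by this selection and will generally differ from the population SD.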

I'm less clear on how to make sure that your sample has the
same SD or variance as the larger sample/pop but maybe
someone else will have an idea.

-Mike Palij
New York University
[hidden email]


On Tue, Jan 24, 2012 at 4:01 PM, David Marso <[hidden email]> wrote:

> <BEGIN PROCESS: Opening rusty can of worms with sharp rock!>
> *WHY*:  This sounds very close to manufacturing data.
> Your sample is what it is.
> FWIW:  You would be restricting sampling away from the higher end of the
> distribution.  ergo, you would be *REDUCING* the variability, not increasing
> it as you seem to request.
> Sounds *FISHY* .
> --------
> <Retiring sharp rock>
>
>
>
> huang jialin wrote
>>
>> Hi,
>>
>> I am planning to sample cases from a known dataset with fixed mean and SD.
>> The sample size is from 300-500. The replacement is not allowed. Can I do
>> it in SPSS? If so, how can I do it?
>>
>> Thank you for your attention.
>>
>> Sincerely,
>> Jialin Huang
>>


Re: sampling with fixed mean and SD

Maguin, Eugene
Huang,

Probably this question should have been asked earlier. What is the full
purpose of this project? I think I saw something about range but the
statement seemed like a comment in passing. And, I think you commented to
Jon that more than one variable is involved. Please elaborate on this part
of the project as well.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Michael Palij
Sent: Tuesday, January 24, 2012 4:32 PM
To: [hidden email]
Subject: Re: sampling with fixed mean and SD

[snip, previous]