How to do repetitive tests on random samples


msherman

Dear list: Is there a way to embed repeated tests (single-variable chi-square) in a macro while drawing a new random sample from the data set each time?

That is, I have a large data set with 10,000 cases. I want to draw a random sample of 100 cases, repeat this 100 times, and obtain the single-variable chi-square value for each sample.

 

Martin F. Sherman, Ph.D.
Professor of Psychology
Director of Masters Education in Psychology: Thesis Track

Loyola University Maryland
Department of Psychology
222 B Beatty Hall
4501 North Charles Street
Baltimore, MD 21210

410-617-2417
[hidden email]

 


Re: How to do repetitive tests on random samples

Andy W
There is a related question here: http://spssx-discussion.1045642.n5.nabble.com/Random-sampling-amp-matrix-of-histograms-problem-td5718425.html

In a nutshell: generate the sample IDs plus the replication number in a second dataset, match the needed information into that simulation dataset, and then use SPLIT FILE when running whatever statistics you need.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
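
A minimal, untested sketch of this stacked-samples idea, applied to the original question (100 samples of 100 cases from 10,000). It is a variant rather than the exact recipe described above: instead of matching on sample IDs, it writes every case once per replication with XSAVE and keeps the 100 cases with the lowest random numbers in each replication. The names myvar, rep, rnd, seq and reps.sav are placeholders, not taken from the thread.

```spss
* Assumes the active dataset holds the 10,000 cases and a categorical variable myvar (placeholder name).
SET SEED=20131029.

* Write each case once per replication, tagged with a fresh random number.
LOOP rep = 1 TO 100.
+  COMPUTE rnd = RV.UNIFORM(0,1).
+  XSAVE OUTFILE='reps.sav' /KEEP=rep rnd myvar.
END LOOP.
EXECUTE.

GET FILE='reps.sav'.
SORT CASES BY rep rnd.

* Number the cases within each replication in random order.
COMPUTE seq = 1.
IF (rep = LAG(rep)) seq = LAG(seq) + 1.
EXECUTE.

* Keep the first 100 cases per replication: a simple random sample without replacement.
SELECT IF (seq LE 100).

* One single-variable chi-square per replication.
SPLIT FILE BY rep.
NPAR TESTS /CHISQUARE=myvar.
SPLIT FILE OFF.
```

This produces one chi-square table per replication in the Viewer. To collect the 100 statistics in a data file instead, the NPAR TESTS step could probably be wrapped in OMS ... OMSEND, though that is not shown here.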

Re: How to do repetitive tests on random samples

David Marso
Administrator
In reply to this post by msherman
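Untested:

DEFINE !ChiSqReps ().
!DO !I = 1 !TO 100
/* TEMPORARY limits the SAMPLE to the next procedure only. */
TEMPORARY.
/* Use your actual values here. */
SAMPLE 100 FROM 10000.
/* myvar is a placeholder; use whatever you run for the 1 sample Chi Sq. */
NPAR TESTS /CHISQUARE=myvar.
!DOEND
!ENDDEFINE.

/* Run the macro. */
!ChiSqReps.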
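
For reference, each pass through the loop computes the one-sample (goodness-of-fit) chi-square asked about in the question. A sketch of the statistic, assuming equal expected frequencies across the k categories and n = 100 cases per sample (the symbols are illustrative, not from the thread):

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = \frac{n}{k} = \frac{100}{k},
\qquad \mathrm{df} = k - 1
```

Here O_i is the observed count in category i of the sample. If the expected proportions are not equal, E_i becomes n times the hypothesised proportion for category i.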

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

FW: How to do repetitive tests on random samples

John F Hall
In reply to this post by msherman

 

David Marso did something like this for me several months ago, when I wanted to show how means and other statistics vary across 100 sub-samples of size n drawn from the main sample of size N.  This was a prelude to explaining sampling variation of the mean in a class; the exercise was supposed to yield, from each student, a series of means that would be approximately normally distributed.  It never did, but they got the general idea.  I may not have it on this computer, but you may be able to find something via Nabble, unless DM can provide a short bit of syntax again :)

 

Panic over: I found it.  See below.

 

John F Hall (Mr)

[Retired academic survey researcher]

 

Email:   [hidden email] 

Website: www.surveyresearch.weebly.com

SPSS start page:  www.surveyresearch.weebly.com/spss-without-tears.html

 

SPSS syntax to provide 100 samples and calculate means of life satisfaction, % very happy and mean/median age.


 

*Marso sample .

* data set 'ql4gb1975.sav' .
* variables: var544 (happy), var545 (lifesat), age .
COMPUTE happy = var544.
COMPUTE lifesat = var545.
COMPUTE caseid = serial.
EXECUTE.

*** OK here we go **.
**First of all oversample **.
* Each case enters each of the 100 samples with probability .12 .
LOOP SAMPLE=1 TO 100.
+  DO IF (UNIFORM(1) < .12).
+    XSAVE OUTFILE='Samples.sav' /KEEP=caseid lifesat age happy sample.
+  END IF.
END LOOP.
*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
GET FILE='Samples.sav'.
FREQUENCIES VARIABLES=SAMPLE.

* Using same ideas as what I posted to Cindy Gregory * .
* Shuffle the cases within each sample, then number them 1, 2, 3, ... .
COMPUTE SCRAMBLE=UNIFORM(1).
SORT CASES BY SAMPLE SCRAMBLE.
IF ($CASENUM=1 OR LAG(SAMPLE) NE SAMPLE) GPCount=1.
IF (MISSING(GPCount)) GPCount=LAG(GPCount)+1.

*ONE OF THE ONLY Places you NEED an EXECUTE *.
EXECUTE.
* Trim every sample to at most 300 cases.
SELECT IF (GPCount LE 300).
FREQUENCIES VARIABLES=SAMPLE.

* One row per sample: mean life satisfaction, mean age, % with happy = 3 .
AGGREGATE OUTFILE=*
  /BREAK=SAMPLE
  /MLIFESAT MEANAGE = MEAN(lifesat age)
  /PctVHapp = PIN(happy,3,3).

FREQUENCIES VARIABLES=MLIFESAT MEANAGE PctVHapp
  /STATISTICS=STDDEV SEMEAN MEAN
  /HISTOGRAM=NORMAL
  /FORMAT=NOTABLE.

DISPLAY LABELS.

* age is no longer in the aggregated file, so run this one on the original data file.
FREQUENCIES VARIABLES=age /FORMAT=NOTABLE /HISTOGRAM=NORMAL /STATISTICS=MEAN STDDEV SEMEAN.


In Data View it produced this:

 

101       7.98      48.47    37.1
101       8.04      48.84    39.0
101       7.70      45.05    33.6
101       7.76      46.51    30.7
101       8.05      47.57    38.2
101       7.95      49.92    49.0
101       7.70      47.91    39.3
101       7.83      46.82    39.6
101       7.86      48.86    38.3
101       8.17      50.07    44.8
101       7.75      48.56    39.1
101       8.08      46.87    44.4
101       7.86      47.87    35.2
101       7.82      47.66    38.8
101       8.05      46.20    39.2
101       8.04      45.98    39.8
101       8.15      46.43    36.4
101       7.85      48.16    34.3
101       7.73      47.69    29.5
101       7.69      47.13    32.8
101       7.88      45.64    33.0
101       8.04      43.84    44.3
101       8.02      47.11    33.9
101       7.82      47.28    38.8
101       7.75      49.66    39.5
101       8.01      51.24    47.0
101       8.00      47.78    41.1
101       7.75      47.09    35.5
101       7.95      45.63    44.5
101       8.06      48.41    36.8
101       7.95      47.90    39.1
101       7.92      45.60    40.0
101       7.70      48.33    37.8
101       7.81      47.92    35.2
101       7.48      45.92    35.9
101       7.88      47.26    35.9
101       7.80      48.01    41.5
101       7.69      46.35    36.5
101       7.54      47.33    31.4
101       7.90      47.65    37.1
101       7.99      47.00    44.2
101       7.73      47.45    35.0
101       8.22      48.72    55.3
101       7.47      47.36    39.8
101       7.90      45.46    39.5
101       7.87      47.36    35.6
101       7.89      49.56    32.7
101       7.80      43.92    31.9
101       7.73      45.24    28.0
101       7.78      49.96    40.4
101       8.01      46.58    43.4
101       7.80      47.31    29.4
101       7.91      47.27    37.5
101       7.94      46.89    41.1
101       7.70      48.46    41.1
101       7.62      46.79    36.3
101       8.01      46.79    39.0
101       7.62      48.49    35.1
101       8.05      47.62    45.5
101       8.08      48.05    47.7
101       7.70      49.87    36.7
101       7.86      47.18    32.3
101       8.24      47.21    43.2
101       7.76      46.51    36.0
101       7.97      48.15    42.6
101       8.23      46.23    48.2
101       7.87      49.18    40.9
101       7.91      46.48    43.0
101       7.74      45.99    45.0
101       8.07      46.96    36.2
101       7.83      43.11    37.3
101       7.78      46.80    35.9
101       7.97      47.44    37.6
101       7.81      47.98    31.1
101       8.12      48.19    46.6
101       7.89      47.26    36.0
101       7.77      47.86    35.6
101       7.46      47.72    32.4
101       8.11      46.54    44.0
101       7.64      47.02    36.4
101       8.04      46.85    42.2
101       7.84      47.73    38.6
101       7.78      45.02    40.0
101       8.05      49.07    43.2
101       7.87      49.58    36.8
101       7.86      45.89    41.1
101       8.06      42.93    40.7
101       7.89      49.43    43.5
101       8.01      47.14    44.4
101       7.95      47.41    37.4
101       7.81      48.30    37.9
101       8.12      45.36    40.2
101       7.71      46.57    41.7
101       7.97      48.35    40.8
101       7.95      46.47    37.5
101       7.66      47.06    35.2
101       8.01      45.85    36.1
101       7.71      47.45    35.7
101       7.95      48.44    43.2
101       8.21      46.29    40.4

 

I just ran this syntax on the *.sav file and got some nice histograms with almost normal distributions:

 

freq mlifesat meanage pctvhapp
    /for not /his nor.

 

Just what the doctor ordered.
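
As a rough check on histograms like these, the spread of the sample means can be compared with the textbook standard error of the mean. A sketch, assuming simple random sampling without replacement of n cases from a file of N cases, with σ the standard deviation of the variable in the full file (these symbols are not part of the syntax above):

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}
```

Under those assumptions, the standard deviation of MLIFESAT or MEANAGE across the 100 samples should land close to this value.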

From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Martin Sherman
Sent: 29 October 2013 17:06
To: [hidden email]
Subject: How to do repetitive tests on random samples

 
