
Re: not an spss question - what is an appropriate sample size?

Posted by Rich Ulrich on Oct 16, 2011; 8:32pm
URL: http://spssx-discussion.165.s1.nabble.com/Importing-from-MS-Access-64-bit-version-tp4907197p4907695.html

I don't know the purpose of their data collection, but it looks
to me as if it is properly designed for an administrative overview.


Someone, overhead, wants to know,
"How accurately are particular agencies reporting their incidents?"

Then they scold everyone with more than 1 error, and schedule
some sort of remediation, or new bosses, for the two agencies with
6 errors each (unless those agencies successfully argue for special
circumstances or unlucky sampling).  Maybe those two have the
least-experienced bosses, since they are among the smallest agencies.
The CI for 5 or more errors per 30 incidents does not include "1";
"1" seems like a viable target.
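That CI remark is easy to check numerically. Here is a minimal sketch using a Wilson score interval for a binomial proportion (my choice of interval; the post does not say which one was used), applied to the two agencies with 6 errors in roughly 30 reviewed records:

```python
import math

def wilson_ci(errors, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = errors / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Agencies E and F: 6 errors in about 30 reviewed records.
lo, hi = wilson_ci(6, 30)
print(f"6/30 errors: 95% CI = ({lo:.3f}, {hi:.3f})")  # roughly (0.095, 0.373)
# A target of 1 error per 30 (about 0.033) falls below the lower bound.
print(1 / 30 < lo)  # True
```

The lower bound of roughly 9.5% is well above 1/30 ≈ 3.3%, which is consistent with the point that an observed 6 errors is not plausibly just bad luck around a true rate of 1-per-30.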

The summary table will be more useful if it is totaled correctly.
"23.57%" for the overall error rate of the second group is obviously
wrong, since a summary percentage cannot be larger than *all* of
the row-percents that it summarizes.  I didn't check the other
numbers, but 16, not 33, is the total of 6+2+3+4+1.
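A quick arithmetic check, with the per-agency figures copied from the second table below:

```python
# Errors found and records reviewed for the second group (agencies F-J).
errors = [6, 2, 3, 4, 1]
reviewed = [32, 25, 25, 31, 27]

total_errors = sum(errors)      # 16, not the 33 shown in the table's Total row
total_reviewed = sum(reviewed)  # 140
print(total_errors, total_reviewed)
print(f"overall error rate: {total_errors / total_reviewed:.1%}")  # about 11.4%
```

So the second group's overall error rate should be about 11.4%, not 23.57%.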

The result of using samples of 30 each is a summary of agency
performances, not an accurate estimate of how much error-free
data exists in the state.  Most of the state's data is what you see
in Agency A, all by itself, or in A+H.
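One way to see this: weighting each agency's observed error rate by its 2009 incident count gives a very different statewide figure than simply pooling the roughly-30-case samples. A sketch using the first group's numbers from the table below:

```python
# First group: agency -> (incidents in 2009, records reviewed, errors found).
agencies = {
    "A": (26941, 55, 1),
    "B": (306, 40, 3),
    "C": (3163, 40, 1),
    "D": (1050, 32, 4),
    "E": (179, 32, 6),
}

total_incidents = sum(n for n, _, _ in agencies.values())
# Pooled rate: treats every sampled case equally, regardless of agency size.
pooled = (sum(e for _, _, e in agencies.values())
          / sum(r for _, r, _ in agencies.values()))
# Size-weighted rate: weights each agency's rate by its share of incidents.
weighted = sum(n * e / r for n, r, e in agencies.values()) / total_incidents

print(f"pooled:   {pooled:.1%}")    # 15/199, about 7.5%
print(f"weighted: {weighted:.1%}")  # about 2.4%, dominated by Agency A
```

The pooled 7.5% describes the ten 30-case audits; the size-weighted figure of roughly 2.4% is the one that speaks to statewide data quality, and it is driven almost entirely by Agency A's low error rate.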


--
Rich Ulrich


Date: Sun, 16 Oct 2011 13:16:07 -0400
From: [hidden email]
Subject: not an spss question - what is an appropriate sample size?
To: [hidden email]

I would appreciate some guidance on a sample size question.

 

I work for an agency which collects statistical crime data from local police agencies.  These agencies serve cities/towns that vary in size, and therefore the total number of criminal 'incidents' they report in a year also varies.

 

I am periodically audited by a higher authority.  This higher authority selects about 10 agencies each time they visit.  They have determined that if they audit 300 incidents, they will have obtained a 'statistically valid' sample of cases for my state.

 

They then divide these n = 300 into roughly equal numbers of cases to audit at each of the ten agencies, so each agency has about 30 cases examined.

 

However, as stated above, the 10 agencies serve different sized populations and report different numbers of incidents per year.

 

Here are the results of the most recent audit:

 

1st group sampled

Agency   Records    Incidents   Errors   Calculated   % of All
         Reviewed   in 2009     Found    Error Rate   Offenses Reviewed
A           55        26941       1        1.8%          0.20%
B           40          306       3        7.5%         13.07%
C           40         3163       1        2.5%          1.26%
D           32         1050       4       12.5%          3.05%
E           32          179       6       18.75%        17.88%
Total      199        31639      15        7.5%          0.63%

2nd group sampled

Agency   Records    Incidents   Errors   Calculated   % of All
         Reviewed   in 2009     Found    Error Rate   Incidents Reviewed
F           32          200       6       18.75%        16.0%
G           25          174       2        8.0%         14.4%
H           25         9217       3       12.0%          0.3%
I           31         1301       4       12.9%          2.4%
J           27         2673       1        3.7%          1.0%
Total      140        13565      33       23.57%         1.0%

 

 

My question to you all: is the methodology specified above reasonable?  If it is not, what questions or issues does it raise?

 

It seems to me that this technique oversamples cases in agencies with few cases, and undersamples cases in larger agencies.
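For comparison, here is what a proportional-to-size allocation of the same 300 audited cases across the ten agencies would look like, a sketch assuming simple proportional stratified allocation, with the 2009 incident counts taken from the tables above:

```python
# 2009 incident counts for the ten audited agencies (from the tables above).
incidents = {"A": 26941, "B": 306, "C": 3163, "D": 1050, "E": 179,
             "F": 200, "G": 174, "H": 9217, "I": 1301, "J": 2673}

total = sum(incidents.values())
n_sample = 300

# Proportional stratified allocation: each agency's share of the 300 audited
# cases matches its share of statewide incidents.
allocation = {a: round(n_sample * n / total) for a, n in incidents.items()}
print(allocation)  # Agency A gets about 179 cases; E, F, and G get 1 each
```

This makes the statewide error estimate more accurate, but it leaves almost no information about the smallest agencies individually, which is presumably why the auditors chose equal 30-case allocations instead. The two designs answer different questions.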