Chi-Square Goodness to Fit with Contingency Tables

Donna Daniels
Hello,

Please accept my apologies if you receive this message more than one time.

I am working on one final statistical test for my dissertation.  I am
looking at the relationship between specific drug use and specific
criminal activity.  To do this, I have proposed to run the Chi-Square
Goodness-of-Fit Test using Contingency Tables.  However, I am not sure
how to do this in my version of the SPSS software.  I have the SPSS
Graduate Pack 14.0.

Thank you in advance for your guidance.

Warm regards,

Donna Daniels
Doctoral Candidate
Walden University

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Chi-Square Goodness to Fit with Contingency Tables

Dominic Lusinchi
Donna,

In order to get some advice you need to provide more information.

By definition a contingency table requires at least two variables. You
mention drug use - that's one variable.

What is it that you are trying to do?

Please be more specific. Thanks.

Dominic Lusinchi
Statistical Consultant
Far West Research
P: 415-664-3032
San Francisco, California
Email: [hidden email]
Web: http://www.farwestresearch.com


Re: Chi-Square Goodness to Fit with Contingency Tables

Donna Daniels
Please forgive my previous email. I pasted the description of what I am
trying to do from my dissertation proposal. The test is the Chi Square
Test for Independence. The variables are the type of drug which includes
marijuana, cocaine (powdered), crack cocaine, ecstasy, methamphetamine,
the illegal use of prescription drugs, and polydrug use; the other
variable is the crime which is the dependent variable in this study.
There are 23 different crimes used in this study. Please review the
additional information below:

The researcher will conduct the Chi-Square test to verify frequencies of
specific drug use and specific crimes committed. Crime data is recorded
in a database as required by the Federal Bureau of Investigation under
the Uniform Crime Reporting Program and the Incident Based Reporting
System. The counts from the databases are considered expected counts
while the participant responses to the survey questions are considered
actual counts.

The data collected based on the type of drug has nominal values.
Therefore, the Chi-Square test will be helpful in determining the actual
count of marijuana, cocaine (powder), crack cocaine, ecstasy,
methamphetamine, prescription drugs, and polydrug use in specific
crimes. “The statistic is the difference between the observed count and
the expected count in each cell, divided by the expected count, summed
over all cells” (Aczel & Sounderpandian, 2006, pp. 680-681).
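For reference, the Pearson statistic in its usual form sums the squared difference in each cell. A sketch in standard notation follows; the symbols O_ij (observed count), E_ij (expected count), R_i and C_j (row and column totals), N (grand total), r and c (numbers of rows and columns) are introduced here for illustration and are not taken from the quoted text. In a test of independence the expected counts are computed from the table's own row and column totals.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad E_{ij} = \frac{R_i \, C_j}{N},
\qquad df = (r - 1)(c - 1)
```

With 7 drug types and 23 crime categories this gives df = 6 x 22 = 132.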

The researcher will gather the data based on police reports submitted to
the Uniform Crime Reporting System as required by the Federal Bureau of
Investigation (FBI). These are the expected counts on the contingency
tables. Once the numbers are entered for all crimes committed, the
researcher will then enter the numbers from the data gathered during the
offender survey. The table for Crime Categories is found on page 113,
Table 13. The purpose of conducting the Chi-Square test is to
determine if specific drug use leads to significant occurrences of
specific crimes.

The researcher will first conduct a Chi-Square Test for Independence. To
do this, a contingency table will be used. There are two classifications
in the contingency table for this study: Drug Type and Crime Category.
The hypothesis test for independence is:

H0: The variables named drug type and crime category are independent of
each other.
H1: The variables named drug type and crime category are not independent.

If the computed value of the chi square statistic is greater than the
critical value, the null hypothesis is rejected.
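The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2 x 2 counts are hypothetical (not study data), the function name is made up for this example, and the critical value 3.841 is the standard table value for df = 1 at alpha = 0.05.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows = two drug types, columns = two crime categories.
observed = [[30, 10],
            [20, 40]]

stat, df = chi_square_independence(observed)
CRITICAL_05_DF1 = 3.841  # table value for df = 1, alpha = 0.05
reject_h0 = stat > CRITICAL_05_DF1
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_h0}")
# prints: chi-square = 16.67, df = 1, reject H0: True
```

For the full 7 x 23 table the critical value would have to be looked up for df = 132 instead.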

I hope this clarifies anyone's questions.

Warm regards,

Donna


Re: Chi-Square Goodness to Fit with Contingency Tables

Swank, Paul R
That is a lot of cells, 161 to be precise. Not only will that require a
large data set, the interpretation will be complicated. Crosstabs will
do it if you have enough data. You need an expected frequency of at
least five in most of the cells.
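As a rough way to check the "expected frequency of at least five" guideline before trusting the test, the expected counts can be computed from the margins and the share of small cells reported. The sketch below uses a hypothetical 3 x 3 table and made-up function names; it is not part of SPSS, which can print expected counts in the Crosstabs output.

```python
def expected_counts(table):
    """Expected counts under independence, computed from the table margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_below(expected, threshold=5.0):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical counts: 3 crime categories (rows) by 3 drug types (columns).
observed = [[12, 3, 5],
            [8, 2, 10],
            [20, 1, 9]]

share = share_below(expected_counts(observed))
print(f"{share:.0%} of cells have an expected count below 5")

# For the full 7 x 23 table, even an AVERAGE expected count of 5 needs:
cells_full_table = 7 * 23           # 161 cells
min_n_for_average = cells_full_table * 5
print(cells_full_table, min_n_for_average)  # prints: 161 805
```

The 805 figure is only an average; with uneven margins many cells would still fall below five at that sample size.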

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston



Re: Chi-Square Goodness to Fit with Contingency Tables

Dominic Lusinchi
In reply to this post by Donna Daniels
OK, so you are cross-classifying drug type and crime type: one is the row
variable, the other is the column variable. You will have a very large table
if you maintain 23 x 7 (?) categories. You will need a relatively large
sample (n>800), in order to avoid having too many empty cells. You might
want to consider consolidating the crime categories in a way that makes
sense, and even the drug categories.

One thing I don't understand in your write-up is the role of the UCR in your
research: you refer to it as the "expected counts". What are you saying here
or trying to do?

Dominic


Re: Chi-Square Goodness to Fit with Contingency Tables

David Hitchin
In reply to this post by Donna Daniels
Quoting Donna Daniels <[hidden email]>:

> The variables are the type of drug which includes marijuana, cocaine
> (powdered), crack cocaine, ecstasy, methamphetamine, the illegal use of
> prescription drugs, and polydrug use; the other variable is the crime
> which is the dependent variable in this study.

The chi-squared test is appropriate when the two variables that you are
considering are of equal status, but if you have a dependent variable
(and therefore also an independent variable) these are NOT of equal status.

Whether or not you find independence is very much a factor related to
sample size; if there is even a very slight connection between drug and
crime, then with a sufficiently large sample size you will find that
these are not independent of each other.
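A small sketch of this sample-size effect, using hypothetical counts: the cell proportions (and so the strength of the association) are held fixed while every count is multiplied by the same factor. The chi-square statistic grows in proportion to the sample size, so a weak association eventually crosses the df = 1 critical value of 3.841.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2 x 2 table with a very slight association (26/24 vs 24/26).
base = [[26, 24],
        [24, 26]]

results = {}
for factor in (1, 10, 100):
    scaled = [[cell * factor for cell in row] for row in base]
    results[factor] = chi_square(scaled)
    n = sum(sum(row) for row in scaled)
    print(f"n = {n:>6}: chi-square = {results[factor]:.2f}")
# n = 100 gives 0.16, n = 1000 gives 1.60, n = 10000 gives 16.00
```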

The chi-square test also produces just a "yes" or "no" answer about
independence, so it is not very informative. It's a little bit like
calculating a correlation coefficient, testing it for significance, but
not looking at the size of the correlation, which is probably the most
useful information.

One way to tackle this is to use the statistics provided with
cross-tabulations.
Analyze > Descriptive Statistics > Crosstabs
Choose the variables to be cross-tabulated and then click the
"Statistics" button.
In the "Nominal" group tick the Lambda box.

When you run the crosstab the results will include a table of
directional measures with three rows of values for lambda, and the
relevant one is the row marked "crime dependent". Although there is a
measure of significance at the right hand side of the table, the really
useful figure will be the "Value" at the left.

The way that the lambda statistic works is like this: Suppose that you
look at each of your participants, and (forgetting drugs for the moment)
try to guess which crimes they have committed. Obviously you will take
into account the relative frequencies of the crimes to get the best
guesses. Now it's just a simple probability calculation to work out how
many guesses are likely to be correct by chance.

Now do the same thing again, but this time take each participant's drug
into account when guessing the crime. Obviously your guesses ought to be
more accurate when you use the second variable, and the lambda measure is
the relative decrease in the probability of error when you include it.

This figure should be much more informative than anything that a
chi-square test can tell you.

Note that the relationship is not symmetrical; you may be able to
predict the crime from the drug better than you can predict the drug
from the crime (or the other way round) and this is especially the case
when your table is long and thin rather than squarish, as in this case
when you have 7 drug categories and 23 crimes.
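The calculation can be sketched as follows, assuming the usual definition of Goodman and Kruskal's lambda, in which the "best guess" is always the most frequent category. The counts and function names are hypothetical and chosen only to show the asymmetry; the function is undefined if every case falls in a single category of the predicted variable.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the ROW variable from the COLUMN variable."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # Correct guesses when always guessing the most frequent row category.
    correct_without = max(row_totals)
    # Correct guesses when guessing the most frequent row within each column.
    correct_with = sum(max(col) for col in zip(*table))
    # Proportional reduction in prediction errors.
    return (correct_with - correct_without) / (n - correct_without)


def transpose(table):
    """Swap rows and columns so the other variable becomes the one predicted."""
    return [list(col) for col in zip(*table)]


# Hypothetical counts: rows = 3 crime categories, columns = 2 drug types.
table = [[30, 5],
         [10, 25],
         [10, 20]]

lambda_crime = goodman_kruskal_lambda(table)             # crime dependent
lambda_drug = goodman_kruskal_lambda(transpose(table))   # drug dependent
print(f"crime dependent: {lambda_crime:.3f}")  # prints: crime dependent: 0.308
print(f"drug dependent:  {lambda_drug:.3f}")   # prints: drug dependent:  0.500
```

In this made-up table, knowing the crime reduces errors in guessing the drug by half, while knowing the drug reduces errors in guessing the crime by about 31 percent.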

David Hitchin
