Re: Statistical Significance Testing Versus Total in CTABLES - Forgot to Mention Something


Feinstein, Zachary
The rows can be proportions or they can be means.  Let's say a mean
example is the number of children in each household and we wish to test
if there are statistically significant differences among the regions.  A
proportion example could be who people are likely voting for in the next
presidential election.  Again, it would be interesting to test whether
the proportion that would choose Hillary Clinton differs among the four
regions.

But the part that I really should clarify is what I mean when I say I
want to test the columns by Total.  When I wish to test North versus
Total I mean North versus Total minus North, or North versus all the
other regions.  That's what the test within a platform like CfMC does.

Thank you and sorry about any of the confusion.  Still hoping there is
an automatic way to do this with SPSS.

Zachary

-----Original Message-----
From: Bob Schacht [mailto:[hidden email]]
Sent: Monday, June 19, 2006 6:55 PM
To: Feinstein, Zachary; [hidden email]
Subject: Re: Statistical Significance Testing Versus Total in CTABLES

At 12:37 PM 6/19/2006, Feinstein, Zachary wrote:
>I am using V14 and was under the impression that CTABLES would
>perform statistical significance testing of columns versus Total
>columns.  Let's say I have four regions: North, South, East, & West.
>The stat. testing appears to test these columns versus each other but
>not versus Total.  I thought that was a new feature, or maybe it was
>just wishful thinking.

Just what is it that you're trying to test? "Testing columns versus
Total" is pushing numbers around. What do you mean? So, you have four
regions.
Do you mean the regions are your "columns"? What, then, are your rows?
Are they alternative values of another categorical variable?

In other words, you have not given us enough information to answer your
question.

Bob Schacht


>
>If anyone knows anything, or if I am missing something, let me know.
>Thanks.
>
>Zachary
>[hidden email]
>

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814

Bob Schacht-3
At 04:11 AM 6/20/2006, Feinstein, Zachary wrote:
>The rows can be proportions or they can be means.  Let's say a mean
>example is the number of children in each household and we wish to test
>if there are statistically significant differences among the regions.

Zachary,
You're still not using the language of statistics correctly (by which I
mean "in a way that other people can understand what you're asking"). In
this example, the "number of children" is the *row* and the *cell contents*
are proportions or means. What you're describing here sounds like a
half-done Analysis of Variance model. In order to test the statistical
significance, you're going to need not just the means, or proportions, but
the variances, because that's what the tests of significance are based on.
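
To illustrate why the variances matter, here is a minimal sketch of a Welch
t statistic computed from summary statistics alone. The function name
`welch_t` and all the numbers are hypothetical, chosen only to show that the
same pair of means can look significant or not depending on the spread.

```python
import math

def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Welch t statistic and approximate degrees of freedom from summaries."""
    se2_a = var_a / n_a          # squared standard error of mean A
    se2_b = var_b / n_b          # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical data: mean children per household, one region vs. the others.
# Identical means and sample sizes; only the variances differ.
t_tight, df_tight = welch_t(2.4, 1.0, 100, 2.0, 1.0, 300)
t_noisy, df_noisy = welch_t(2.4, 9.0, 100, 2.0, 9.0, 300)

print(f"small variance: t = {t_tight:.2f}")
print(f"large variance: t = {t_noisy:.2f}")
```

With the variance nine times larger, the standard error triples and the t
statistic shrinks to a third, so the means by themselves cannot settle the
question.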

Usually in an analysis of this sort, the "rows" are observations of the
variable in question. In your example, the "variable" is the number of
children, and the unit of observation is the household. You initially
presented your problem in what sounded to me like a cross-tabulation,
because you spoke of a "Totals" column. But now I see that's not the
appropriate model.


>A proportion example could be who people are likely voting for in the next
>presidential election.  Again, it would be interesting to test whether
>the proportion that would choose Hillary Clinton differs among the four
>regions.

This looks like a cross-tabulation model, in which the row variable is
"Vote for Hillary?" and the rows are "Yes" and "No," and your column
variable is "Regions" with columns for North, South, etc. The null
hypothesis is that voting for Hillary is independent of what region you're in.
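
A test of that null hypothesis can be sketched as a Pearson chi-square on the
table of counts. This is only an illustration: the helper
`pearson_chi_square` and the counts are hypothetical, and the result is
compared against the standard 5% critical value for 3 degrees of freedom
(7.815) rather than converted to a p-value.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts; columns are North, South, East, West.
counts = [
    [120, 90, 100, 90],    # "Yes"
    [80, 110, 100, 110],   # "No"
]
chi2, df = pearson_chi_square(counts)
CRITICAL_05_DF3 = 7.815  # chi-square critical value, alpha = 0.05, df = 3

print(f"chi-square = {chi2:.2f} on {df} df")
print("reject independence at 5%:", chi2 > CRITICAL_05_DF3)
```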

But what you need in the table then are not proportions, but frequencies,
in order to take sample size into account. Statistical tests of
significance depend on sample size. For example, if the "Yes" vote for
Hillary in "North" is 60%, and the "No" vote is 40%, that's a landslide in
a realistic vote population (e.g., 600,000 yes, vs. 400,000 no), but if it's
based on a sample with 3 votes for Hillary and 2 votes against, the sample
is far too small for that difference to be statistically significant.
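
The effect of sample size can be sketched with a one-sample z test of the
"Yes" share against 50%. The function `one_sample_prop_z` is a hypothetical
helper, and the normal approximation it uses is poor for a sample of 5; it is
shown here only to make the contrast in scale visible.

```python
import math

def one_sample_prop_z(yes, n, p0=0.5):
    """z statistic and two-sided p-value for H0: true proportion == p0."""
    p_hat = yes / n
    se = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Same 60/40 split, very different sample sizes.
z_small, p_small = one_sample_prop_z(3, 5)
z_large, p_large = one_sample_prop_z(600_000, 1_000_000)

print(f"n = 5:         z = {z_small:.2f}, p = {p_small:.3f}")
print(f"n = 1,000,000: z = {z_large:.1f}, p = {p_large:.3g}")
```

The proportion is 60% in both cases; only the count of observations behind
it changes the conclusion.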


>But the part that I really should clarify is what I mean when I say I
>want to test the columns by Total.  When I wish to test North versus
>Total I mean North versus Total minus North, or North versus all the
>other regions.  That's what the test within a platform like CfMC does.

This is where your first example gets confusing, because you wrote about
using means. If you use means, what does "Total" mean? What it really
sounds like you're saying here is that "Total" doesn't mean "Total," but
rather "Other".
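
For the proportion case, "North versus Other" can be sketched as a pooled
two-proportion z test in which the comparison group is every region except
the one being tested. The function name `region_vs_rest_z` and the counts
are hypothetical, and this is not a description of what CfMC or CTABLES
computes internally.

```python
import math

def region_vs_rest_z(yes_by_region, n_by_region, region):
    """Pooled two-proportion z test: one region vs. all other regions combined."""
    yes_a = yes_by_region[region]
    n_a = n_by_region[region]
    # "Total minus region": pool the remaining regions into one group
    yes_b = sum(yes_by_region.values()) - yes_a
    n_b = sum(n_by_region.values()) - n_a
    p_a, p_b = yes_a / n_a, yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical "Yes" counts and sample sizes per region.
yes_counts = {"North": 120, "South": 90, "East": 100, "West": 90}
sample_sizes = {"North": 200, "South": 200, "East": 200, "West": 200}

z_north, p_north = region_vs_rest_z(yes_counts, sample_sizes, "North")
print(f"North vs. rest: z = {z_north:.2f}, p = {p_north:.4f}")
```

Testing North against the remaining regions keeps the two groups
independent, which would not hold if North were compared with a Total that
already contains it.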


>Thank you and sorry about any of the confusion.  Still hoping there is
>an automatic way to do this with SPSS.

It seems to me that your problem, and the data for it, are not precisely
formulated. You seem to be confusing several statistical models. So I
can't think of anything to help you.

Bob

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814