SPSSX Discussion

Show an empty category in a frequency table

anafuster

Apr 22, 2015; 8:43am

This is my first time using SPSS, so I'm a beginner, and I have a problem.
How can I show an empty category in a frequency table?

I have a nominal variable, for example "Q1 - Which list has more movies that you find appealing?". It has 6 possible categories, one for each algorithm {1="ItemItem"; 2="Lucene"; 3="Persmean"; 4="Popular"; 5="SVD"; 6="UserUser"}. (I have defined them as value labels in Variable View.)
But when nobody has chosen one of the algorithms, that category doesn't appear when I run the chi-squared test or the frequency analysis. For example, in the group analysis the answers to this question are:

Observed Count Value Label
6 ItemItem
1 Lucene
0 Persmean
3 Popular
0 SVD
0 UserUser

And the results obtained with the chi-squared test are:

Q1Accuracy
            Observed N   Expected N   Residual
ItemItem    6            3,3          2,7
Lucene      1            3,3          -2,3
Popular     3            3,3          -,3
Total       10

But the expected count should be about 1,7 (10/6, with all six categories) instead of 3,3 (10/3).
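The two expected counts come from dividing the same total by a different number of categories. A minimal Python sketch of that arithmetic (not SPSS syntax; the variable names are illustrative, and equal expected proportions are assumed):

```python
# Observed counts from the post, in value-label order 1..6.
observed = {"ItemItem": 6, "Lucene": 1, "Persmean": 0,
            "Popular": 3, "SVD": 0, "UserUser": 0}

n = sum(observed.values())  # 10 respondents in total

# Default behaviour in the output above: only categories that occur
# in the data take part in the test.
non_empty = [c for c in observed.values() if c > 0]
expected_default = n / len(non_empty)   # 10 / 3

# Intended behaviour: all six defined categories, assumed equally likely.
expected_all = n / len(observed)        # 10 / 6

print(round(expected_default, 1), round(expected_all, 1))
```

With three categories the expected count is 10/3, about 3.3; with all six it is 10/6, about 1.7.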

What I want is for all the categories to appear in the analysis, even those with Observed Count = 0.
Do you know how I can do that?

Thanks in advance!

Andy W

Apr 22, 2015; 11:51am

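Something like this should do it. The range (1,6) makes the test treat every integer value from 1 to 6 as a category, including those with no cases: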

NPAR TESTS
/CHISQUARE=Q1Accuracy (1,6)
/EXPECTED=EQUAL.

Note that for your particular example the cell counts are quite low.
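As a cross-check outside SPSS, the statistic and asymptotic p-value for all six categories can be reproduced by hand. The following is a minimal Python sketch using only the standard library; `chi2_sf` is a hypothetical helper written for this illustration (not an SPSS or library function), and it assumes an integer number of degrees of freedom and equal expected proportions.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    # Start from the closed form for df = 1 or df = 2, then step df up by 2
    # using Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(x / 2))
    else:
        k, q = 2, math.exp(-x / 2)
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

counts = [6, 1, 0, 3, 0, 0]            # all six categories, zeros included
n_total, n_cat = sum(counts), len(counts)
expected = n_total / n_cat             # equal expected counts, 10 / 6
stat = sum((o - expected) ** 2 / expected for o in counts)
df = n_cat - 1
p_asymptotic = chi2_sf(stat, df)

print(round(stat, 3), df, round(p_asymptotic, 6))
```

The empty categories each contribute their full expected count to the statistic, which is why leaving them out changes the result so much.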

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Andy W

Apr 22, 2015; 12:55pm

Never mind the comment about the small counts. I evaluated the test using the exact distribution and it results in the same inferences. See http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2476536 for a more detailed description of the procedure.

I made some R code to create the exact distribution for this case. Here the exact p-value comes out slightly smaller than the asymptotic one, so the null hypothesis is rejected with either distribution.

####################################
library(partitions)
# Replace loc with the folder where you saved Exact_Dist.R
# (file available at "https://dl.dropboxusercontent.com/u/3385251/Exact_Dist.R")
loc <- "C:\\Users\\andrew.wheeler\\Dropbox\\Public\\"
f <- paste0(loc, "Exact_Dist.R")
source(f)

#my small sample test function that returns the exact null distribution
d <- c(6,1,0,3,0,0)
res <- SmallSampTest(d=d,type="Chi")
res[2:5]

#gives the same inferences
chisq.test(d)
chisq.test(d,simulate.p.value = TRUE, B = 1e4)
####################################
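For readers without the sourced file, the idea of an exact null distribution can be sketched directly. The Python below is not the `SmallSampTest` function or the contents of `Exact_Dist.R`; it is a minimal illustration that assumes equal expected proportions and enumerates every way of distributing 10 answers over 6 categories. The function names are hypothetical.

```python
from math import factorial

def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def exact_chisq_equal(counts):
    """Exact p-value and point probability under equal expected proportions."""
    n, k = sum(counts), len(counts)
    # With equal expected counts, chi-square = k * sum(o^2) / n - n, so ranking
    # tables by their sum of squares is the same as ranking by chi-square and
    # avoids floating-point ties.
    observed_ss = sum(o * o for o in counts)
    tail = point = 0
    for table in compositions(n, k):
        ways = factorial(n)              # multinomial coefficient for this table
        for o in table:
            ways //= factorial(o)
        ss = sum(o * o for o in table)
        if ss >= observed_ss:            # at least as extreme as observed
            tail += ways
        if ss == observed_ss:            # exactly as extreme as observed
            point += ways
    total = k ** n                       # number of equally likely outcomes
    return tail / total, point / total

p_exact, p_point = exact_chisq_equal([6, 1, 0, 3, 0, 0])
print(round(p_exact, 9), round(p_point, 9))
```

Full enumeration is only practical for small tables like this one (3003 possible tables here); larger problems need the more careful algorithms that statistical packages use.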


Bruce Weaver

Apr 22, 2015; 1:25pm

Good catch, Andy. I was just about to post the "exact" test results when I saw your post. For those who do not currently have access to SPSS, here are the results for both Chi-square and exact tests (with p-values rounded to 6 decimals):

Test Statistics
                        Q1
Chi-Square              17.600[a]
df                      5
Asymp. Sig.             .003492
Exact Sig.              .003439
Point Probability       .001667
[a] 6 cells (100.0%) have expected frequencies less than 5. The minimum expected cell frequency is 1.7.

Syntax:

DATA LIST free / Q1 (F1).
BEGIN DATA
1 1 1 1 1 1
2
4 4 4
END DATA.

NPAR TESTS
/CHISQUARE=Q1 (1,6)
/EXPECTED=EQUAL
/METHOD=EXACT TIMER(5).

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/


Andy W

Apr 22, 2015; 1:55pm

Thank you, Bruce. I had not noticed that the Exact extension had an option for the univariate case. The p-value my R code produces for this example is 0.003438881, a good sign that my code and SPSS agree! (The point probability is the same in my code as well, 0.001667048.)


anafuster

Apr 22, 2015; 2:10pm

Thank you so much, Andy! With your help I've been able to solve it.
And thank you too, Bruce!

Bruce Weaver

Apr 22, 2015; 7:12pm

In reply to this post by Andy W

Hi Andy. From SPSS, with more decimals displayed this time:

.003491841 <-- Asymptotic p-value from SPSS

.003438881 <-- Exact p-value from SPSS
.003438881 <-- Exact p-value from Andy's R code

.001667048 <-- Point probability from SPSS
.001667048 <-- Point probability from Andy's R code

;-)
