Show an empty category in a frequency table

anafuster
This is my first time using SPSS, so I'm a beginner, and I have a problem.
How can I show an empty category in a frequency table?

I have a nominal variable, for example "Q1 - Which list has more movies that you find appealing?". It has 6 possible categories, one for each algorithm {1="ItemItem"; 2="Lucene"; 3="Persmean"; 4="Popular"; 5="SVD"; 6="UserUser"}. (I have defined them as Value Labels in the Variable View.)
But when nobody has chosen one of the algorithms, that category doesn't appear when I run the chi-squared test or the frequency analysis. For example, in the group analysis the answers to this question are:
       
        Observed Count   Value Label
              6          ItemItem
              1          Lucene
              0          Persmean
              3          Popular
              0          SVD
              0          UserUser

And the results obtained with the chi-squared test are:

Q1Accuracy
            Observed N   Expected N   Residual
ItemItem        6           3,3          2,7
Lucene          1           3,3         -2,3
Popular         3           3,3          -,3
Total          10

But with all 6 categories included, the expected count should be 10/6, about 1,7, instead of 3,3.
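
The arithmetic behind that expectation can be sketched in R (used here only for illustration; the names `observed`, `expected_nonempty` and `expected_all` are made up for this example). With equal expected frequencies, each category's expected count is the total divided by the number of categories in the test, so dropping the three empty categories changes the divisor from 6 to 3.

```r
# Observed counts for the six algorithms, including the empty categories
observed <- c(ItemItem = 6, Lucene = 1, Persmean = 0,
              Popular = 3, SVD = 0, UserUser = 0)

n <- sum(observed)  # 10 respondents in total

# Expected count when only the non-empty categories enter the test
expected_nonempty <- n / sum(observed > 0)   # 10 / 3

# Expected count when all six categories enter the test
expected_all <- n / length(observed)         # 10 / 6

round(c(nonempty = expected_nonempty, all = expected_all), 1)
```

Rounded to one decimal these are 3.3 and 1.7, which is the difference between the table SPSS produced and the one wanted.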

What I want is for all the categories to appear in the analysis, even those with Observed Count = 0.
Do you know how I can do this?

Thanks in advance!

Re: Show an empty category in a frequency table

Andy W
Something like this should do it:

NPAR TESTS
  /CHISQUARE=Q1Accuracy (1,6)
  /EXPECTED=EQUAL.

Note that for your particular example the cell counts are quite low.
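
The (1,6) range tells NPAR TESTS to treat every integer from 1 to 6 as a category, so the empty ones enter the test. As a rough cross-check outside SPSS, the same statistic can be computed by hand in R. This is only a sketch using the counts from the original post; the variable names are illustrative.

```r
# Counts for categories 1..6, with zeros kept for the empty categories
observed <- c(6, 1, 0, 3, 0, 0)

# Equal expected frequencies across all six categories
expected <- rep(sum(observed) / length(observed), length(observed))

# Pearson chi-square statistic and its degrees of freedom
chi_sq <- sum((observed - expected)^2 / expected)
df <- length(observed) - 1

# Asymptotic p-value from the chi-square distribution
p_asymp <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# TRUE here is the "low cell counts" concern: every expected count is below 5
all(expected < 5)
```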
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Show an empty category in a frequency table

Andy W
Never mind the comment about the small counts. I evaluated the test using the exact distribution and it results in the same inferences. See http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2476536 for a more detailed description of the procedure.

I made some R code to create the exact distribution for this case. One thing to note is that using the exact distribution will only make the p-value smaller, so if you reject the null using the asymptotic distribution you will also reject using the exact distribution, which is what happens in this case.

####################################
library(partitions)
loc <- "C:\\Users\\andrew.wheeler\\Dropbox\\Public\\" # replace with the folder where you saved the file
f <- paste0(loc, "Exact_Dist.R")  # file available at "https://dl.dropboxusercontent.com/u/3385251/Exact_Dist.R"
source(f)

# my small-sample test function, which returns the exact null distribution
d <- c(6, 1, 0, 3, 0, 0)
res <- SmallSampTest(d = d, type = "Chi")
res[2:5]

# the asymptotic and Monte Carlo tests give the same inferences
chisq.test(d)
chisq.test(d, simulate.p.value = TRUE, B = 1e4)
####################################
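
For readers who want to see the idea without the sourced file, here is a minimal base-R sketch of an exact test for this one example. It is not the SmallSampTest function from Exact_Dist.R, only a brute-force enumeration under the assumption of equal cell probabilities, and all names in it are illustrative.

```r
d <- c(6, 1, 0, 3, 0, 0)
n <- sum(d)      # 10 observations
k <- length(d)   # 6 categories

# Enumerate every table: fill the first k-1 cells with 0..n, keep the
# combinations that leave a non-negative count for the last cell
grid <- as.matrix(expand.grid(rep(list(0:n), k - 1)))
grid <- grid[rowSums(grid) <= n, , drop = FALSE]
tabs <- cbind(grid, n - rowSums(grid))

# Multinomial probability of each table under equal cell probabilities
prob <- exp(lgamma(n + 1) - rowSums(lgamma(tabs + 1)) - n * log(k))

# With equal expected counts the chi-square statistic increases with the
# sum of squared counts, so comparing these whole numbers avoids
# floating-point ties
ss <- rowSums(tabs^2)
ss_obs <- sum(d^2)

p_exact <- sum(prob[ss >= ss_obs])      # tables at least as extreme
point_prob <- sum(prob[ss == ss_obs])   # tables with the observed statistic

c(exact = p_exact, point = point_prob)
```

Full enumeration is only practical for small n and few categories; here there are just 3003 possible tables.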

Re: Show an empty category in a frequency table

Bruce Weaver
Administrator
Good catch, Andy.  I was just about to post the "exact" test results when I saw your post.  For those who do not currently have access to SPSS, here are the results for both Chi-square and exact tests (with p-values rounded to 6 decimals):

Test Statistics
                      Q1
Chi-Square            17.600[a]
df                    5
Asymp. Sig.           .003492
Exact Sig.            .003439
Point Probability     .001667
[a] 6 cells (100.0%) have expected frequencies less than 5. The minimum expected cell frequency is 1.7.


Syntax:

DATA LIST free / Q1 (F1).
BEGIN DATA
1 1 1 1 1 1
2
4 4 4
END DATA.

NPAR TESTS
  /CHISQUARE=Q1 (1,6)
  /EXPECTED=EQUAL
  /METHOD=EXACT TIMER(5).
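
For anyone following along in R rather than SPSS, the same ten responses can be tabulated with the empty categories kept. This is a sketch, with the labels taken from the original post and the object names chosen only for illustration. Declaring all six levels on the factor is what keeps the zero counts in the table.

```r
# The ten raw responses from the DATA LIST block above
q1 <- c(1, 1, 1, 1, 1, 1, 2, 4, 4, 4)

algorithms <- c("ItemItem", "Lucene", "Persmean",
                "Popular", "SVD", "UserUser")

# Declaring all six levels keeps categories that nobody chose
q1_factor <- factor(q1, levels = 1:6, labels = algorithms)
freq <- table(q1_factor)
freq

# Without explicit levels, only the three observed categories appear
freq_dropped <- table(factor(q1))
freq_dropped
```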



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Show an empty category in a frequency table

Andy W
Thank you Bruce, I had not noticed the Exact extension had an option for the univariate case. The p-value my R code produces for this example is 0.003438881, a good sign that mine and SPSS's agree! (The point probability is the same in my code as well, 0.001667048.)

Re: Show an empty category in a frequency table

anafuster
Thank you so much Andy! With your help I've been able to manage it.
And also thank you Bruce!

Re: Show an empty category in a frequency table

Bruce Weaver
Administrator
In reply to this post by Andy W
Hi Andy.  From SPSS, with more decimals displayed this time:

.003491841  <-- Asymptotic p-value from SPSS

.003438881  <-- Exact p-value from SPSS
.003438881  <-- Exact p-value from Andy's R code

.001667048  <-- Point probability from SPSS
.001667048  <-- Point probability from Andy's R code

;-)

