[This thread refers to the thread: "Poisson - negative binomial" Dec. 19, 2011 and Sep 06, 2012]
I have E(X) and V(X) of a negative-binomially distributed random variable and would like to compute the cumulative negative binomial probability P(X<=2) with

CDF.NEGBIN(quant,thresh,prob)

According to my textbook [1]:

E(X) = k(1-p)/p
V(X) = k(1-p)/p^2
p = E(X)/V(X)
k = E(X) p/(1-p)

(where p is the probability and k is the threshold)

Example: E(X) = 1.0815, V(X) = 1.7697 -> p = 0.611, k = 1.7

My problem is that I cannot reproduce my result with the SPSS function. In this example the result should be 0.8694. Instead SPSS computes 0.5366. See, for example, this recursive Python version:

begin program.
import math
import spss

# probability of exactly x failures before the k-th success
def negbin(x,k,p):
    if x==0:
        return math.pow(p,k)
    return negbin(x-1,k,p)*(1-p)*(x-1+k)/(x-1+1)

# cumulative probability: sum of negbin(0..x); zaehler = counter, summe = running sum
def VertNegBin(x,k,p):
    zaehler=x
    summe=0
    while zaehler>=0:
        summe=summe+negbin(zaehler,k,p)
        zaehler=zaehler-1
    return summe
end program.

*----------------------------------------------------------------.
COMPUTE var = 1.7697.
COMPUTE Lambda_f = 1.0815.
compute wert=2.
*----------------------------------------------------------------.
COMPUTE negProb=LAMBDA_F/var.
COMPUTE thresh=LAMBDA_F*(negProb/(1-negProb)).
execute.
spssinc trans result=CDFnegBinResult /Formula "VertNegBin( wert, thresh,negProb)".
execute.
spssinc trans result=PDFnegBinResult /Formula "negbin( wert, thresh,negProb)".
execute.
COMPUTE SPSS_version=CDF.NEGBIN(wert,thresh,negProb).
EXECUTE.

My question: Is this an SPSS bug, or am I misunderstanding thresh and prob in the SPSS function? If the latter, how should CDF.NEGBIN(quant,thresh,prob) be computed?

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed., Oldenbourg 1995, pp. 203 ff.
Dr. Frank Gaeth
There are two different parameterizations
of the negative binomial in common usage. You are probably
looking at definitions for a different parameterization than SPSS uses.
Look at http://en.wikipedia.org/wiki/Negative_binomial_distribution and the SPSS function help for details.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]

From: drfg2008 <[hidden email]>
Date: 09/11/2012 05:03 AM
Subject: [SPSSX-L] Negative Binomial – SPSS Bug (?)
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Negative-Binomial-SPSS-Bug-tp5715014.html
In reply to this post by drfg2008
Hi Frank,
The negative binomial distribution has two fairly common parameterizations: the one you mention, and the one Statistics uses. In the one Statistics uses, x (quant) is the number of trials needed (including the last trial) to observe k (thresh) successes. In the one you mention, x is the number of failures before k successes are observed.

See http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/topic/com.ibm.spss.statistics.help/syn_transformation_expressions_random_variable_distribution_functions.htm for more details on the negative binomial and all of the other distributions in Statistics.

Cheers,
Alex
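The relationship between the two parameterizations described above can be checked numerically. The following is a minimal sketch in plain Python, not the SPSS implementation; the function names are hypothetical, and the trial-count form is written for integer k only. Under those assumptions, the probability of at most x failures equals the probability of needing at most x + k trials.

```python
import math

def pmf_failures(x, k, p):
    """P(X = x): x failures before the k-th success (textbook form)."""
    # lgamma keeps this valid for non-integer k as well
    log_coef = math.lgamma(x + k) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coef + k * math.log(p) + x * math.log(1 - p))

def pmf_trials(n, k, p):
    """P(N = n): n trials needed, including the last, for k successes (integer k)."""
    if n < k:
        return 0.0
    return math.comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

def cdf_failures(x, k, p):
    """P(X <= x) in the failure-count form."""
    return sum(pmf_failures(i, k, p) for i in range(x + 1))

def cdf_trials(n, k, p):
    """P(N <= n) in the trial-count form."""
    return sum(pmf_trials(i, k, p) for i in range(k, n + 1))

k, p, x = 2, 0.6, 2
print(cdf_failures(x, k, p))    # at most 2 failures
print(cdf_trials(x + k, k, p))  # at most 4 trials: the same event
print(cdf_trials(x, k, p))      # at most 2 trials: a different event
```

With k = 2 and p = 0.6, the first two values agree (about 0.8208), while the third is 0.36. If CDF.NEGBIN counts trials as described above, passing a failure count directly as quant would produce this kind of mismatch.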
In reply to this post by Jon K Peck
Hello,
I have the same problem as Frank and have not managed to solve it. According to http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fsyn_transformation_expressions_random_variable_distribution_functions.htm , SPSS uses a density formula based on a different definition of the negative binomial distribution than the one in Frank's source [1].

Since I have the expected value E and the variance V, I want to compute the two parameters: the probability p and the threshold r. For the definition used in SPSS, I found

E = r/p
V = r*(1-p)/p²

(found at http://de.wikipedia.org/wiki/Negative_Binomialverteilung#Erwartungswert; sorry for the German link, I could not find this in English.)

From this I derive

p = 1/(V/E + 1)
r = E*p

But this leads to two problems. First, the results differ from those obtained with the variant from Frank's source. Second, p is then always < 1/2 (since V/E > 1 under overdispersion), so if E <= 2 then r < 1, and SPSS reports an error.

Thank you in advance,
Silvio

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed., Oldenbourg 1995, pp. 203 ff.
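One possible reading of the discrepancy, offered as an assumption rather than a confirmed diagnosis: E = r/p is the mean of the trial count, whereas a mean and variance estimated from count data usually describe the number of failures. The sketch below (plain Python, hypothetical function names) converts failure-count moments to p and k with the textbook formulas, then shows the moments of the corresponding trial count.

```python
def params_from_failure_moments(mean, var):
    """Moment matching for the failure-count form: E = k(1-p)/p, V = k(1-p)/p^2."""
    if var <= mean:
        raise ValueError("requires var > mean (overdispersion)")
    p = mean / var
    k = mean * p / (1 - p)
    return p, k

def trial_moments(k, p):
    """Mean and variance of the trial count N = X + k."""
    return k / p, k * (1 - p) / p**2

mean, var = 1.0815, 1.7697          # Frank's example
p, k = params_from_failure_moments(mean, var)
trial_mean, trial_var = trial_moments(k, p)
print(round(p, 3), round(k, 2))     # 0.611 1.7
print(trial_mean - mean)            # the shift between the two means equals k
print(trial_var - var)              # the variance is unchanged (difference near 0)
```

Under this reading, the trial count has mean E + k and the same variance V, so E = r/p and V = r*(1-p)/p² would apply to E + k rather than to E itself.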
There are two parameterizations of the negative binomial distribution with which I am familiar. If you search the document below for "4.8", you'll find the negative binomial probability distribution which I think interests you:
Based on the probability distribution function provided in the document above (4.8), one could easily obtain the probability that y equals a specific value using the syntax BELOW my name. Note that I have set the mean at lambda=3 and the variance at variance=4; this represents an overdispersed situation, which is appropriate for the negative binomial distribution. Also, you'll see that I demonstrate how to compute the dispersion parameter ("k") based on the mean and variance. Finally, the AGGREGATE function shows you how to obtain the probability that y is less than or equal to 15.
HTH,

Ryan

--

DATA LIST LIST / lambda variance y.
BEGIN DATA.
3 4 0
3 4 1
3 4 2
3 4 3
3 4 4
3 4 5
3 4 6
3 4 7
3 4 8
3 4 9
3 4 10
3 4 11
3 4 12
3 4 13
3 4 14
3 4 15
END DATA.

compute k = lambda**2 / (variance - lambda).
compute prob_y = ((k / (k + lambda))**k) * (exp(lngamma(k + y)) / (exp(lngamma(y+1))*exp(lngamma(k)))) * (lambda / (k + lambda))**y.
execute.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /cum_prob_15=SUM(prob_y).

Ryan

On Thu, Sep 27, 2012 at 7:04 AM, SD <[hidden email]> wrote:
Hello,
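For readers without SPSS at hand, the same computation can be sketched in plain Python. This mirrors the formula in Ryan's syntax above (mean lambda, dispersion k = lambda**2 / (variance - lambda)). The function names are hypothetical, and the terms are combined on the log scale, which is a choice made here to avoid overflow in the gamma functions.

```python
import math

def negbin_pmf(y, mean, var):
    """P(Y = y) for a negative binomial given its mean and variance (var > mean)."""
    k = mean**2 / (var - mean)            # dispersion parameter
    log_p = (k * math.log(k / (k + mean))
             + math.lgamma(k + y) - math.lgamma(y + 1) - math.lgamma(k)
             + y * math.log(mean / (k + mean)))
    return math.exp(log_p)

def negbin_cdf(y, mean, var):
    """P(Y <= y), summing the pmf from 0 to y."""
    return sum(negbin_pmf(i, mean, var) for i in range(int(y) + 1))

print(negbin_cdf(15, 3, 4))            # Ryan's example: mean 3, variance 4
print(negbin_cdf(2, 1.0815, 1.7697))   # Frank's example: P(X <= 2)
```

With Frank's mean and variance, the second call gives about 0.8694, the value he expected from the failure-count form.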