SPSSX Discussion

Negative Binomial – SPSS Bug (?)

drfg2008

Sep 11, 2012; 10:57am

Negative Binomial – SPSS Bug (?)

[This thread refers to the thread: "Poisson - negative binomial" Dec. 19, 2011 and Sep 06, 2012]

I have E(X) and V(X) of a negative binomially distributed random variable and would like to compute the cumulative negative binomial probability P(X<=2) with:
CDF.NEGBIN(quant,thresh,prob)

---------------------
According to my textbook [1]
E(X)= k(1-p)/p
V(X)=k(1-p)/p^2
p = E(x)/V(x)
k = E(X) p/(1-p)
(where p is the probability and k is the threshold)
--------------------

Example:
E(x) = 1.0815
V(x) = 1.7697
-> p = 0.611, k = 1.7
--------------------
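
The inversion of the textbook formulas can be checked numerically. The following is a minimal Python sketch, assuming the textbook parameterization above (X counts failures before k successes); the function name `params_from_moments` is introduced here purely for illustration.

```python
def params_from_moments(mean, var):
    """Invert E(X) = k(1-p)/p and V(X) = k(1-p)/p^2 for p and k.

    Assumes the failures-before-k-successes parameterization and var > mean.
    """
    if var <= mean:
        raise ValueError("requires var > mean (overdispersion)")
    p = mean / var             # dividing E(X) by V(X) leaves p
    k = mean * p / (1.0 - p)   # solve E(X) = k(1-p)/p for k
    return p, k


p, k = params_from_moments(1.0815, 1.7697)
print(round(p, 3), round(k, 1))  # prints 0.611 1.7

# Round trip: plugging p and k back in recovers the two moments.
mean_back = k * (1.0 - p) / p
var_back = k * (1.0 - p) / p ** 2
```

The round trip at the end is a quick way to confirm that a given pair (p, k) really belongs to the moments one started from.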
My problem is that I cannot reproduce this result with the SPSS function. In the example above the result should be 0.8694; instead SPSS computes 0.5366.
See, for example, this recursive Python version:

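begin program.
import math
import spss

# P(X = x): probability of x failures before k successes (success probability p).
def negbin(x,k,p):
    if x==0:
        return math.pow(p,k)
    # Recursion: P(x) = P(x-1) * (1-p) * (x-1+k) / x
    return negbin(x-1,k,p)*(1-p)*(x-1+k)/(x-1+1)

# P(X <= x): sum of the point probabilities for x, x-1, ..., 0.
# (zaehler = counter, summe = running sum)
def VertNegBin(x,k,p):
    zaehler=x
    summe=0
    while zaehler>=0:
        summe=summe+negbin(zaehler,k,p)
        zaehler=zaehler-1
    return summe
end program.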

*----------------------------------------------------------------.
COMPUTE var = 1.7697.
COMPUTE Lambda_f = 1.0815.
compute wert=2.
*----------------------------------------------------------------.

COMPUTE negProb=LAMBDA_F/var.
COMPUTE thresh=LAMBDA_F*(negProb/(1-negProb)).
execute.

spssinc trans result=CDFnegBinResult /Formula "VertNegBin( wert, thresh,negProb)".
execute.

spssinc trans result=PDFnegBinResult /Formula "negbin( wert, thresh,negProb)".
execute.

COMPUTE SPSS_version=CDF.NEGBIN(wert,thresh,negProb).
EXECUTE.
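
As a cross-check outside SPSS, the same sum can be written without recursion. This is a minimal sketch in plain Python, assuming the textbook parameterization (x failures before k successes, with k not necessarily an integer); `negbin_pmf` and `negbin_cdf` are illustrative names, not SPSS or library functions.

```python
import math


def negbin_pmf(x, k, p):
    """P(X = x) for x failures before k successes; k may be non-integer."""
    log_coef = math.lgamma(x + k) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coef + k * math.log(p) + x * math.log(1.0 - p))


def negbin_cdf(x, k, p):
    """P(X <= x) as the sum of the point probabilities 0..x."""
    return sum(negbin_pmf(i, k, p) for i in range(int(x) + 1))


mean, var = 1.0815, 1.7697
p = mean / var
k = mean * p / (1.0 - p)
print(round(negbin_cdf(2, k, p), 4))
```

With the example values this sum should come out close to the 0.8694 quoted above, which suggests the recursive version is doing what it is meant to do.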

My question: is this an SPSS bug, or am I misunderstanding thresh and prob in the SPSS function? If the latter, how should CDF.NEGBIN(quant,thresh,prob) be computed?

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed. Oldenbourg 1995, pp. 203 ff.

Dr. Frank Gaeth

Jon K Peck

Sep 11, 2012; 12:50pm

Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

There are two different parameterizations of the negative binomial in common usage. You are probably looking at definitions for a different parameterization than SPSS uses.

Look at http://en.wikipedia.org/wiki/Negative_binomial_distribution and the SPSS function help for details.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621


Alex Reutter

Sep 11, 2012; 12:59pm

Re: Negative Binomial – SPSS Bug (?)

In reply to this post by drfg2008

Hi Frank,

The negative binomial distribution has two fairly common parameterizations: the one you mention below, and the one Statistics uses. In the one Statistics uses, x (quant) is the number of trials needed (including the last trial) before k (thresh) successes are observed. In the one you mention below, x is the number of failures before k successes are observed.
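
The relationship between the two parameterizations can be illustrated numerically. The sketch below assumes an integer k so that ordinary binomial coefficients apply, and the function names are hypothetical; it only shows that counting trials instead of failures shifts the argument by k, so that P(failures <= x) equals P(trials <= x + k).

```python
from math import comb


def pmf_failures(x, k, p):
    """P(exactly x failures before the k-th success)."""
    return comb(x + k - 1, x) * p ** k * (1 - p) ** x


def pmf_trials(n, k, p):
    """P(the k-th success occurs on trial n), for n >= k."""
    return comb(n - 1, k - 1) * p ** k * (1 - p) ** (n - k)


k, p, x = 3, 0.6, 2  # illustrative values, not taken from the thread

cdf_failures = sum(pmf_failures(i, k, p) for i in range(x + 1))
cdf_trials = sum(pmf_trials(n, k, p) for n in range(k, x + k + 1))
print(cdf_failures, cdf_trials)  # the two sums agree
```

Under this reading, a probability for x failures corresponds to a quantile of x + k in the trials-based form. How CDF.NEGBIN treats a non-integer thresh is not settled by this sketch.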

See http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/topic/com.ibm.spss.statistics.help/syn_transformation_expressions_random_variable_distribution_functions.htm for more details on the negative binomial and all of the other distributions in Statistics.

Cheers,
Alex

SD

Sep 27, 2012; 11:04am

Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

In reply to this post by Jon K Peck

Hello,

I have the same problem as Frank and have not managed to solve it.

According to http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fsyn_transformation_expressions_random_variable_distribution_functions.htm , SPSS uses a density formula based on a different definition of the negative binomial distribution than the one in Frank's source [1].

Since I have the expected value E and the variance V, I want to compute the two parameters: the probability p and the threshold r.

For the definition used in SPSS, I found
E=r/p
V=r*(1-p)/p²
(found at http://de.wikipedia.org/wiki/Negative_Binomialverteilung#Erwartungswert, sorry for the German link, I could not find this in English.)

Derived from this I get
p=1/(V/E +1)
r=E*p

But this results in two problems:
One is that the results differ from those obtained with the variant from Frank's source.

The other is that, this way, p is always < 1/2 (since V/E > 1 under overdispersion), and consequently, if E <= 2, then r < 1, which causes an SPSS error.
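
The two observations can be reproduced with a few lines of Python. This sketch assumes the trials-based formulas E = r/p and V = r(1-p)/p² quoted above and plugs in Frank's example moments; the function name is illustrative only.

```python
def trials_params_from_moments(E, V):
    """Invert E = r/p and V = r(1-p)/p^2 (X counts trials, not failures)."""
    p = 1.0 / (V / E + 1.0)  # V/E = (1-p)/p, hence 1/p = V/E + 1
    r = E * p                # from E = r/p
    return p, r


p, r = trials_params_from_moments(1.0815, 1.7697)
print(round(p, 4), round(r, 4))

# Whenever V > E the formula gives p < 1/2, and then E <= 2 forces r < 1.
assert p < 0.5 and r < 1.0
```

These values differ from p = 0.611 and k = 1.7 because the formulas here treat E as the mean number of trials, whereas the textbook formulas treat it as the mean number of failures.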

Thank you in advance,
Silvio.

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed. Oldenbourg 1995, pp. 203 ff.

Ryan

Sep 28, 2012; 3:40am

Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

There are two parameterizations of the negative binomial distribution with which I am familiar. If you search the document below for "4.8", you'll find the negative binomial probability distribution which I think interests you:

http://www.drmehrdad.com/tasavir/negativbinomialregression.pdf

Based on the probability distribution function provided in the document above (4.8), one could easily obtain the probability that y equals a specific value using the syntax BELOW my name. Note that I have set the mean at lambda=3 and the variance at variance=4; this represents an overdispersed situation, which is appropriate for the negative binomial distribution. Also, you'll see that I demonstrate how to compute the dispersion parameter ("k") based on the mean and variance. Finally, the AGGREGATE function shows you how to obtain the probability that y is less than or equal to 15.

HTH,

Ryan

DATA LIST LIST / lambda variance y.
BEGIN DATA.
3 4 0
3 4 1
3 4 2
3 4 3
3 4 4
3 4 5
3 4 6
3 4 7
3 4 8
3 4 9
3 4 10
3 4 11
3 4 12
3 4 13
3 4 14
3 4 15
END DATA.

compute k = lambda**2 / (variance - lambda).
compute prob_y = ((k / (k + lambda))**k) * (exp(lngamma(k + y)) / (exp(lngamma(y+1))*exp(lngamma(k)))) * (lambda / (k + lambda))**y.
execute.

AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=
/cum_prob_15=SUM(prob_y).

Ryan
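
For readers without SPSS at hand, the syntax above can be mirrored in Python. This is a sketch under the same assumptions (mean 3, variance 4); the variable names follow the syntax, and `k_textbook` is added here only to show that the mean/dispersion form and the textbook form with p = E(X)/V(X) describe the same distribution.

```python
import math

lam, variance = 3.0, 4.0
k = lam ** 2 / (variance - lam)  # dispersion parameter, as in the COMPUTE above


def prob_y(y):
    """Negative binomial probability of y in the mean/dispersion form."""
    log_coef = math.lgamma(k + y) - math.lgamma(y + 1) - math.lgamma(k)
    return (math.exp(log_coef)
            * (k / (k + lam)) ** k
            * (lam / (k + lam)) ** y)


cum_prob_15 = sum(prob_y(y) for y in range(16))  # y = 0..15, like the AGGREGATE
print(k, cum_prob_15)

# Link to the textbook form from earlier in the thread:
p = lam / variance                 # p = E(X)/V(X)
k_textbook = lam * p / (1.0 - p)   # same expression as Frank's COMPUTE thresh
```

Here k / (k + lambda) plays the role of the success probability p, so both routes arrive at the same k for these moments.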
