Negative Binomial – SPSS Bug (?)


drfg2008
[This thread refers to the thread: "Poisson - negative binomial" Dec. 19, 2011 and Sep 06, 2012]

I have E(X) and V(X) of a negative-binomially distributed random variable and would like to compute the cumulative negative binomial probability P(X<=2) with
CDF.NEGBIN(quant,thresh,prob)

---------------------
According to my textbook [1]
E(X) = k(1-p)/p
V(X) = k(1-p)/p^2
p = E(X)/V(X)
k = E(X) p/(1-p)
(where p is the probability and k is the threshold)
--------------------

Example:
E(x) = 1.0815
V(x) = 1.7697
->  p = 0.611, k = 1.7
--------------------
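
For reference, the two moment relations can be inverted directly: dividing E(X) by V(X) gives p, and substituting p back into E(X) gives k. The following is a minimal Python sketch, not part of the original post. The helper name `params_from_moments` is hypothetical, and it assumes the textbook form in which X counts failures before the k-th success, so that V(X) > E(X).

```python
def params_from_moments(mean, var):
    """Invert E(X) = k(1-p)/p and V(X) = k(1-p)/p**2 for (p, k).

    Hypothetical helper; assumes var > mean > 0 (overdispersion).
    """
    if not (var > mean > 0):
        raise ValueError("requires var > mean > 0")
    p = mean / var              # E(X)/V(X) = p
    k = mean * p / (1.0 - p)    # solve E(X) = k(1-p)/p for k
    return p, k


p, k = params_from_moments(1.0815, 1.7697)
print(round(p, 3), round(k, 1))
```

With the example values this gives p of about 0.611 and k of about 1.7, the figures used in the example.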
My problem is that I cannot reproduce these results with the SPSS function. In the example above the result should be 0.8694; instead SPSS computes 0.5366.
See, for example, this recursive Python version:

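begin program.
import math
import spss

# Negative binomial pmf, x = number of failures before k successes:
# P(X=0) = p**k,  P(X=x) = P(X=x-1) * (1-p) * (x-1+k) / x
def negbin(x,k,p):
    if x==0:
        return math.pow(p,k)
    return negbin(x-1,k,p)*(1-p)*(x-1+k)/x

# Cumulative probability P(X<=x): sum of the pmf over x, x-1, ..., 0
def VertNegBin(x,k,p):
    zaehler=x    # counter
    summe=0      # running sum
    while zaehler>=0:
        summe=summe+negbin(zaehler,k,p)
        zaehler=zaehler-1
    return summe
end program.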

*----------------------------------------------------------------.
COMPUTE var = 1.7697.
COMPUTE Lambda_f = 1.0815.
compute wert=2.
*----------------------------------------------------------------.

COMPUTE negProb=LAMBDA_F/var.
COMPUTE thresh=LAMBDA_F*(negProb/(1-negProb)).
execute.

spssinc trans result=CDFnegBinResult /Formula "VertNegBin( wert, thresh,negProb)".
execute.

spssinc trans result=PDFnegBinResult /Formula "negbin( wert, thresh,negProb)".
execute.


COMPUTE SPSS_version=CDF.NEGBIN(wert,thresh,negProb).
EXECUTE.
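
The expected value of 0.8694 can also be checked outside SPSS. The following is a minimal standalone Python sketch, assuming the textbook failure-count form and the same recurrence as `negbin` in the program block, written as a loop instead of a recursion. The function name `negbin_cdf_failures` is illustrative only.

```python
import math


def negbin_cdf_failures(x, k, p):
    """P(X <= x), where X counts failures before the k-th success.

    Uses P(0) = p**k and P(i) = P(i-1) * (1-p) * (i-1+k) / i.
    k may be non-integer here; this says nothing about what SPSS accepts.
    """
    term = math.pow(p, k)   # P(X = 0)
    total = term
    for i in range(1, int(x) + 1):
        term *= (1.0 - p) * (i - 1 + k) / i
        total += term
    return total


mean, var = 1.0815, 1.7697
p = mean / var
k = mean * p / (1.0 - p)
print(negbin_cdf_failures(2, k, p))
```

The printed value should land close to the 0.8694 quoted in the post, which suggests the recursion itself is sound and that the gap to 0.5366 comes from how the parameters are interpreted.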

My question: Is this an SPSS bug, or am I misunderstanding thresh and prob in the SPSS function? If the latter, how should CDF.NEGBIN(quant,thresh,prob) be called?

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed. Oldenbourg, 1995, pp. 203 ff.
Dr. Frank Gaeth


Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

Jon K Peck
There are two different parameterizations of the negative binomial in common usage.  You are  probably looking at definitions for a different parameterization than SPSS uses.

Look at http://en.wikipedia.org/wiki/Negative_binomial_distribution and the SPSS function help for details.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621




From:        drfg2008 <[hidden email]>
To:        [hidden email]
Date:        09/11/2012 05:03 AM
Subject:        [SPSSX-L] Negative Binomial – SPSS Bug (?)
Sent by:        "SPSSX(r) Discussion" <[hidden email]>








-----
Dr. Frank Gaeth
FU-Berlin

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Negative-Binomial-SPSS-Bug-tp5715014.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD



Re: Negative Binomial – SPSS Bug (?)

Alex Reutter
In reply to this post by drfg2008
Hi Frank,

The negative binomial distribution has two fairly common parameterizations: the one you mention, and the one Statistics uses. In the one Statistics uses, x (quant) is the number of trials needed (including the last trial) before k (thresh) successes are observed. In the one you mention, x is the number of failures before k successes are observed.
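
To illustrate the difference between the two forms, here is a hedged Python sketch (not SPSS code; both function names are made up for illustration, and `math.comb` needs Python 3.8 or later). It assumes an integer threshold k and checks that counting trials instead of failures only shifts the variable by k, so that P(failures <= x) equals P(trials <= x + k).

```python
import math


def pmf_failures(x, k, p):
    """P(exactly x failures before the k-th success), integer k."""
    return math.comb(x + k - 1, x) * p**k * (1 - p) ** x


def pmf_trials(n, k, p):
    """P(the k-th success occurs on trial n), integer k, n >= k."""
    return math.comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)


k, p, x = 2, 0.6, 2
cdf_failures = sum(pmf_failures(i, k, p) for i in range(x + 1))
cdf_trials = sum(pmf_trials(n, k, p) for n in range(k, x + k + 1))
print(cdf_failures, cdf_trials)  # the two sums agree up to rounding
```

Under this reading, passing a failure count as quant asks about a different event than the textbook formula does, which would be enough to produce a mismatch. How CDF.NEGBIN treats a non-integer thresh such as 1.7 is not settled in this thread.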

See http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/topic/com.ibm.spss.statistics.help/syn_transformation_expressions_random_variable_distribution_functions.htm for more details on the negative binomial and all of the other distributions in Statistics.

Cheers,
Alex

Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

SD
In reply to this post by Jon K Peck
Hello,

I have the same problem as Frank and have not managed to solve it.

According to http://pic.dhe.ibm.com/infocenter/spssstat/v21r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fsyn_transformation_expressions_random_variable_distribution_functions.htm , SPSS uses a density formula based on a different definition of the negative binomial distribution than the one in Frank's source [1].

Since I have the expected value E and the Variance V, I want to compute the two parameters probability p and the threshold r.

For the definition used in SPSS, I found
E=r/p
V=r*(1-p)/p²
(found at http://de.wikipedia.org/wiki/Negative_Binomialverteilung#Erwartungswert; sorry for the German link, I could not find this in English.)

From this I derive
p=1/(V/E +1)
r=E*p

But this leads to two problems:
First, the results differ from those obtained with the variant from Frank's source.

Second, p is then always < 1/2 (since V/E > 1 under overdispersion), so whenever E <= 2 we get r < 1, which causes an SPSS error.
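
One possible reading of the first problem, offered as an assumption rather than a confirmed diagnosis: E=r/p is the mean of the number of trials, while the mean in Frank's formulas is the mean number of failures, and the two differ by r. The Python sketch below (hypothetical helper names) checks that both inversions return the same p and r once the trial-count mean is taken as the failure-count mean plus r. The variance is the same in both forms.

```python
def params_from_failure_moments(mean_f, var):
    """Textbook form: E = k(1-p)/p, V = k(1-p)/p**2."""
    p = mean_f / var
    k = mean_f * p / (1.0 - p)
    return p, k


def params_from_trial_moments(mean_t, var):
    """Trial-count form: E = r/p, V = r(1-p)/p**2."""
    p = 1.0 / (var / mean_t + 1.0)
    r = mean_t * p
    return p, r


mean_f, var = 1.0815, 1.7697
p1, k1 = params_from_failure_moments(mean_f, var)
# Trials = failures + successes, so the trial-count mean is mean_f + k1.
p2, r2 = params_from_trial_moments(mean_f + k1, var)
print(p1, k1)
print(p2, r2)  # same values up to rounding
```

With the means matched this way, p comes out near 0.611 in Frank's example, so the bound p < 1/2 would not arise. Whether SPSS then accepts the resulting non-integer threshold is a separate question that this sketch does not answer.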

Thank you in advance,
Silvio.

Literature:
[1] Schlittgen, R.: Einführung in die Statistik. 5th ed. Oldenbourg, 1995, pp. 203 ff.

Re: [SPSSX-L] Negative Binomial – SPSS Bug (?)

Ryan
There are two parameterizations of the negative binomial distribution with which I am familiar. If you search the document below for "4.8", you'll find the negative binomial probability distribution which I think interests you:
 
 
Based on the probability distribution function provided in the document above (4.8), one could easily obtain the probability that y equals a specific value using the syntax BELOW my name. Note that I have set the mean at lambda=3 and the variance at variance=4; this represents an overdispersed situation, which is appropriate for the negative binomial distribution. Also, you'll see that I demonstrate how to compute the dispersion parameter ("k") based on the mean and variance. Finally, the AGGREGATE function shows you how to obtain the probability that y is less than or equal to 15.
 
HTH,
 
Ryan
 
--
 
DATA LIST LIST / lambda variance y.
BEGIN DATA.
3 4 0
3 4 1
3 4 2
3 4 3
3 4 4
3 4 5
3 4 6
3 4 7
3 4 8
3 4 9
3 4 10
3 4 11
3 4 12
3 4 13
3 4 14
3 4 15
END DATA.
 
compute k = lambda**2 / (variance - lambda).
compute  prob_y = ((k / (k + lambda))**k) * (exp(lngamma(k + y)) / (exp(lngamma(y+1))*exp(lngamma(k)))) * (lambda / (k + lambda))**y.
execute.
 
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /cum_prob_15=SUM(prob_y).
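
For readers without SPSS at hand, the same computation can be sketched in Python. This is an illustrative translation of the syntax above, not code from the thread. The function names are invented, and `math.lgamma` stands in for LNGAMMA; the terms are combined on the log scale before exponentiating.

```python
import math


def negbin_pmf_mean_disp(y, lam, variance):
    """P(Y = y) for mean lam and variance > lam (mean/dispersion form)."""
    k = lam**2 / (variance - lam)   # dispersion parameter
    log_pmf = (
        k * math.log(k / (k + lam))
        + math.lgamma(k + y) - math.lgamma(y + 1) - math.lgamma(k)
        + y * math.log(lam / (k + lam))
    )
    return math.exp(log_pmf)


def negbin_cdf_mean_disp(y_max, lam, variance):
    """P(Y <= y_max) as a plain sum of the pmf over 0..y_max."""
    return sum(negbin_pmf_mean_disp(y, lam, variance) for y in range(y_max + 1))


print(negbin_cdf_mean_disp(15, 3.0, 4.0))        # mean 3, variance 4
print(negbin_cdf_mean_disp(2, 1.0815, 1.7697))   # Frank's example
```

With Frank's mean and variance, the second call should land close to the 0.8694 quoted at the start of the thread, because k = lambda**2 / (variance - lambda) is the same k as in the textbook form, with p = k / (k + lambda).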
 
Ryan
On Thu, Sep 27, 2012 at 7:04 AM, SD <[hidden email]> wrote:




--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Negative-Binomial-SPSS-Bug-tp5715014p5715310.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
