Negative Binomial or Pascal and Geometric distribution


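Anscombe (1950 [ANS1]) defines the Negative Binomial (NB) distribution as:

$$P(x) = \frac{\Gamma(k+x)}{x!\,\Gamma(k)} \left(\frac{m}{m+k}\right)^{x} \left(1+\frac{m}{k}\right)^{-k}, \quad x = 0, 1, 2, \ldots$$

where m is the distribution mean, k>0 is a parameter and Γ() is the Gamma function. Letting p=m/(m+k), this expression can be re-written as:

$$P(x) = \frac{\Gamma(k+x)}{x!\,\Gamma(k)}\, p^{x} (1-p)^{k}$$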

Plots of this distribution for p=0.5 and varying values of k are shown below.

[Figure: Negative Binomial, p=0.5, k=0.5, 1, 3, 5]
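The probabilities plotted for p=0.5 can be reproduced with a few lines of code. The following is a minimal sketch, assuming the p-parametrized form P(x) = Γ(k+x)/(x! Γ(k)) p^x (1-p)^k with p=m/(m+k); the function names nb_pmf and nb_pmf_mean_form are illustrative and not taken from any library. Working with log-Gamma values avoids overflow for large x or k.

```python
import math


def nb_pmf(x, k, p):
    """P(X = x) assuming the form Gamma(k+x) / (x! Gamma(k)) * p**x * (1-p)**k.

    k may be any positive real number; p must lie strictly between 0 and 1.
    """
    log_coeff = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coeff + x * math.log(p) + k * math.log(1.0 - p))


def nb_pmf_mean_form(x, k, m):
    """The same probability written in terms of the mean m, using p = m / (m + k)."""
    return nb_pmf(x, k, m / (m + k))


# Probabilities for x = 0..5 at p = 0.5 and the k values used in the plot
for k in (0.5, 1, 3, 5):
    print(k, [round(nb_pmf(x, k, 0.5), 4) for x in range(6)])
```

With k=1 the values halve at each step (0.5, 0.25, 0.125, ...), which is the Geometric case.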

Originally this distribution was introduced as a model of the number of successes in a series of trials before a specified number of failures, k, is observed, where p is the probability of success on each trial; with k=1 (successes before the first failure) it reduces to the Geometric distribution. However, the distribution has been more widely used as a model for count data that are more clustered than one would expect for a purely random process (i.e. more clustered than under a Poisson process). A quick test of whether the Negative Binomial might be appropriate when the Poisson is not is to check whether the sample variance exceeds the sample mean, since the two are equal under a Poisson model. If an observed distribution shows more clustering than can be modeled effectively with a NB distribution, some other form of clustered or contagious distribution may be more effective. The distribution is always positively skewed (right skewed), and for large values of the parameter k it tends to a symmetric distribution.
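The quick variance-versus-mean check can be sketched as follows. This is an illustration only: the function names and the sample counts are made up, and the moment estimate of k relies on the NB relationship variance = m + m²/k, which follows from Anscombe's form of the distribution.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest clustering."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero, so the index is undefined")
    return variance(counts) / m


def moment_estimate_k(counts):
    """Method-of-moments estimate of k from variance = m + m**2 / k.

    Returns None when the variance does not exceed the mean, in which case
    the NB offers no improvement over the Poisson on this criterion.
    """
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)


# Hypothetical clustered counts (not from any real survey)
counts = [0, 0, 0, 1, 0, 5, 0, 2, 0, 7, 0, 0, 1, 0, 4]
print(dispersion_index(counts), moment_estimate_k(counts))
```

An index close to 1 is what a Poisson process would produce; the check is a rough screen rather than a formal test.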

Ehrenberg (1959 [EHR1]) used the NB distribution (based on Anscombe's formulation [ANS1]) with great success to model consumer purchasing behavior. He found that for a very large range of regularly purchased branded products, such as breakfast cereals, canned goods, soft drinks and detergents, the number of units purchased by consumers over time could be modeled using the Negative Binomial. Furthermore, a convenient and effective fit of the model could be obtained by calculating the mean of the sample, m, and the proportion of non-buyers, p(0), both of which are readily available from the survey data. Ehrenberg cites the example of purchases made of a specific product over a 26-week period by a consumer panel of 2000 households, and demonstrates that with the fitting method just described the fit for 0 units is exact, and for up to 10 units is very good. The distribution of recorded purchases did have a very long tail, with a few consumers buying much larger numbers of the product than expected (e.g. 20+). This partly reflects the problem of fitting such distributions where there are varying packaging sizes, brand mixes, bulk offers etc., issues that have increased since the time Ehrenberg produced these findings. However, his core observations regarding consumer purchasing habits and the usefulness of NB models remain broadly valid today.
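Fitting from the mean and the proportion of zeros amounts to solving p(0) = (1 + m/k)^(-k) for k, which has no closed-form solution. The sketch below is one way to do this numerically, using bisection; it is not claimed to be Ehrenberg's own computational procedure, and the function name and the example figures are hypothetical. A solution exists only when exp(-m) < p(0) < 1, that is, when there are more zeros than a Poisson distribution with the same mean would give.

```python
import math


def fit_nb_mean_zero(m, p0, tol=1e-10):
    """Solve p0 = (1 + m/k)**(-k) for k by bisection; return (k, p) with p = m/(m+k)."""
    if not (math.exp(-m) < p0 < 1.0):
        raise ValueError("need exp(-m) < p0 < 1 for an NB fit to exist")
    target = -math.log(p0)

    def h(k):
        # k * ln(1 + m/k) increases from 0 towards m as k grows
        return k * math.log1p(m / k)

    lo, hi = 1e-12, 1.0
    while h(hi) < target:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if h(mid) < target:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return k, m / (m + k)


# Hypothetical survey summary: mean of 3 units per household, 16% non-buyers
k, p = fit_nb_mean_zero(3.0, 0.16)
print(k, p)
```

Because k is chosen to satisfy the zero-class equation, the fitted probability of 0 units matches the observed proportion of non-buyers by construction, which is why the fit for 0 units is exact.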

Key measures for the NB are shown below, where p=m/(m+k) and q=1-p:

Mean

$kp/q$ (equal to m)

Variance

$kp/q^2$ (equal to m+m²/k)

Skewness

$(1+p)/\sqrt{kp}$

Kurtosis (excess)

$(1+4p+p^2)/(kp)$

References

[ANS1] Anscombe F J (1950) Sampling Theory of the Negative Binomial and Logarithmic Series Distributions. Biometrika, 37(3/4), 358-382

[EHR1] Ehrenberg A S C (1959) The Pattern of Consumer Purchases. Applied Statistics, 8(1), 26-41

[JOH1] Johnson N L, Kotz S (1969) Discrete distributions. Houghton Mifflin/J Wiley & Sons, New York

Mathworld/Weisstein E W: Negative Binomial Distribution: http://mathworld.wolfram.com/NegativeBinomialDistribution.html

Wikipedia: Negative Binomial Distribution: http://en.wikipedia.org/wiki/Negative_binomial