Efficient Computation of Ordinary and Generalized Poisson Binomial Distributions

Introduction

The Poisson binomial distribution (in the following abbreviated as PBD) is becoming increasingly important, especially in the areas of statistics, finance, insurance mathematics and quality management. This package provides functions for two types of PBDs: ordinary and generalized PBDs (henceforth referred to as O-PBDs and G-PBDs).

Ordinary Poisson Binomial Distribution

The O-PBD is the distribution of the sum of a number n of independent Bernoulli-distributed random indicators Xi ∈ {0, 1} (i = 1, ..., n): $$X := \sum_{i = 1}^{n}{X_i}.$$ Each of the Xi possesses a predefined probability of success pi := P(Xi = 1) (subsequently P(Xi = 0) = 1 − pi =  : qi). With this, mean, variance and skewness can be expressed as $$E(X) = \sum_{i = 1}^{n}{p_i} \quad \quad Var(X) = \sum_{i = 1}^{n}{p_i q_i} \quad \quad Skew(X) = \frac{\sum_{i = 1}^{n}{p_i q_i(q_i - p_i)}}{\sqrt{Var(X)}^3}.$$ All possible observations are in {0, ..., n}.

Generalized Poisson Binomial Distribution

The G-PBD is defined very similar. Again, it is the distribution of a sum random variables, but here, each Xi ∈ {ui, vi} with P(Xi = ui) =  : pi and P(Xi = vi) = 1 − pi =  : qi. Using ordinary Bernoulli-distributed random variables Yi, Xi can be expressed as Xi = uiYi + vi(1 − Yi) = vi + Yi ⋅ (ui − vi). As a result, mean, variance and skewness are given by $$E(X) = \sum_{i = 1}^{n}{v_i} + \sum_{i = 1}^{n}{p_i (u_i - v_i)} \quad \quad Var(X) = \sum_{i = 1}^{n}{p_i q_i(u_i - v_i)^2} \\Skew(X) = \frac{\sum_{i = 1}^{n}{p_i q_i(q_i - p_i)(u_i - v_i)^3}}{\sqrt{Var(X)}^3}.$$ All possible observations are in {U, ..., V} with $U := \sum_{i = 1}^{n}{\min\{u_i, v_i\}}$ and $V := \sum_{i = 1}^{n}{\max\{u_i, v_i\}}$. Note that the size m := V − U of the distribution does not generally equal n!

Existing R Packages

Computing these distributions exactly is computationally demanding, but in the last few years, some efficient algorithms have been developed. Particularly significant in this respect are the works of Hong (2013), who derived the DFT-CF procedure for O-PBDs, Biscarri, Zhao & Brunner (2018) who developed two immensely faster algorithms for O-PBDs, namely the DC and DC-FFT procedures, and Zhang, Hong and Balakrishnan (2018) who further developed Hong’s (2013) DFT-CF algorithm for G-PBDs (in the following, this generalized procedure is referred to as G-DFT-CF). Still, only a few R packages exist for the calculation of either ordinary and generalized PBDs, e.g. poibin and poisbinom for O-PBDs and GPB for G-PDBs. Before the release of this PoissonBinomial package, there has been no R package that implemented the DC and DC-FFT algorithms of Biscarri, Zhao & Brunner (2018), as they only published a reference implementation for R, but refrained from releasing it as a package. Additionally, there are no comparable approaches for G-PBDs to date.

The poibin package implements the DFT-CF algorithm along with the exact recursive method of Barlow & Heidtmann (1984) and Normal and Poisson approximations. However, both exact procedures of this package possess some disadvantages, i.e. they are relatively slow at computing very large distributions, with the recursive algorithm being also very memory consuming. The G-DFT-CF procedure is implemented in the GPB package and inherits this performance drawback. The poisbinom package provides a more efficient and much faster DFT-CF implementation. The performance improvement over the poibin package lies in the use of the FFTW C library. Unfortunately, it sometimes yields some negative probabilities in the tail regions, especially for large distributions. However, this numerical issue has not been addressed to date. This PoissonBinomial also utilizes FFTW for both DFT-CF and G-DFT-CF algorithms, but corrects that issue. In addition to the disadvantages regarding computational speed (poibin and GPB) or numerics (poisbinom), especially for very large distributions, the aforementioned packages do not provide headers for their internal C/C++ functions, so that they cannot be imported directly by C or C++ code of other packages that use for example Rcpp.

In some situations, people might have to deal with Poisson binomial distributions that include Bernoulli variables with pi ∈ {0, 1}. Calculation performance can be further optimized by handling these indicators before the actual computations. Approximations also benefit from this in terms of accuracy. None of the aforementioned packages implements such optimizations. Therefore, the advantages of this PoissonBinomial package can be summarized as follows:

  • Efficient computation of very large distributions with both exact and approximate algorithms for O-PBDs and G-PBDs
  • Provides headers for the C++ functions so that other packages may include them in their own C++ code
  • Handles (sometimes large numbers of) 0- and 1-probabilities to speed up performance

In total, this package includes 10 different algorithms of computing ordinary Poisson binomial distributions, including optimized versions of the Normal, Refined Normal and Poisson approaches, and 5 approaches for generalized PBDs. In addition, the implementation of the exact recursive procedure for O-PBDs was rewritten so that it is considerably less memory intensive: the poibin implementation needs the memory equivalent of (n + 1)2 values of type double, while ours only needs 3 ⋅ (n + 1).


Exact Procedures

Ordinary Poisson Binomial Distribution

In this package implements the following exact algorithms for computing ordinary Poisson binomial distributions:

Generalized Poisson Binomial Distribution

For generalized Poisson binomial distributions, this package provides:

Examples

Examples and performance comparisons of these procedures are presented in a separate vignette.


Approximations

Ordinary Poisson Binomial Distribution

In addition, the following O-PBD approximation methods are included:

  • the Poisson Approximation approach,
  • the Arithmetic Mean Binomial Approximation procedure,
  • Geometric Mean Binomial Approximation algorithms,
  • the Normal Approximation and
  • the Refined Normal Approximation.

Generalized Poisson Binomial Distribution

For G-PBDs, there are

  • the Normal Approximation and
  • the Refined Normal Approximation.

Examples

Examples and performance comparisons of these approaches are provided in a separate vignette as well.


Handling special cases, zeros and ones

Handling special cases, such as ordinary binomial distributions, zeros and ones is useful to speed up performance. Unfortunately, some approximations do not work well for Bernoulli trials with pi ∈ {0, 1}, e.g. the Geometric Mean Binomial Approximations. This is why handling these values before the actual computation of the distribution is not only a performance tweak, but sometimes even a necessity. It is achieved by some simple preliminary considerations.

Ordinary Poisson Binomial Distributions

  1. All pi = p are equal?
    In this case, we have a usual binomial distribution. The specified method of computation is then ignored. In particular, the following applies:
    1. p = 0: The only observable value is 0, i.e. P(X = 0) = 1 and P(X ≠ 0) = 0.
    2. p = 1: The only observable value is n, i.e. P(X = n) = 1 and P(X ≠ n) = 0.
  2. All pi ∈ {0, 1}(i = 1, ..., n)?
    If one pi is 1, it is impossible to measure 0 successes. Following the same logic, if two pi are 1, we cannot observe 0 and 1 successes and so on. In general, a number of n1 values with pi = 1 makes it impossible to measure 0, ..., n1 − 1 successes. Likewise, if there are n0 Bernoulli trials with pi = 0, we cannot observe n − n0 + 1, ..., n successes. If all pi ∈ {0, 1}, it holds n = n0 + n1. As a result, the only observable value is n1, i.e. P(X = n1) = 1 and P(X ≠ n1) = 0.
  3. Are there pi ∉ {0, 1}?
    Using the deductions from above, we can only observe an “inner” distribution in the range of n1, n1 + 1, ..., n − n0, i.e. P(X ∈ {n1, ..., n − n0}) > 0 and P(X < n1) = P(X > n − n0) = 0. As a result, X can be expressed as X = n1 + Y with Y ∼ PBin({pi|0 < pi < 1}) and |{pi|0 < pi < 1}| = n − n0 − n1. Subsequently, the Poisson binomial distribution must only be computed for Y. Especially, if there is only one pi ∉ {0, 1}, Y follows a Bernoulli distribution with parameter pi, i.e. P(X = n1) = P(Y = 0) = 1 − pi and P(X = n1 + 1) = P(Y = 1) = pi.

These cases are illustrated in the following example:

# Case 1
dpbinom(NULL, rep(0.3, 7))
#> [1] 0.0823543 0.2470629 0.3176523 0.2268945 0.0972405 0.0250047 0.0035721
#> [8] 0.0002187
dbinom(0:7, 7, 0.3) # equal results
#> [1] 0.0823543 0.2470629 0.3176523 0.2268945 0.0972405 0.0250047 0.0035721
#> [8] 0.0002187

dpbinom(NULL, c(0, 0, 0, 0, 0, 0, 0)) # only 0 is observable
#> [1] 1 0 0 0 0 0 0 0
dpbinom(0, c(0, 0, 0, 0, 0, 0, 0)) # confirmation
#> [1] 1

dpbinom(NULL, c(1, 1, 1, 1, 1, 1, 1)) # only 7 is observable
#> [1] 0 0 0 0 0 0 0 1
dpbinom(7, c(1, 1, 1, 1, 1, 1, 1)) # confirmation
#> [1] 1

# Case 2
dpbinom(NULL, c(0, 0, 0, 0, 1, 1, 1)) # only 3 is observable
#> [1] 0 0 0 1 0 0 0 0
dpbinom(3, c(0, 0, 0, 0, 1, 1, 1)) # confirmation
#> [1] 1

# Case 3
dpbinom(NULL, c(0, 0, 0.1, 0.2, 0.4, 0.8, 1)) # only 1-5 are observable
#> [1] 0.0000 0.0864 0.4344 0.3784 0.0944 0.0064 0.0000 0.0000
dpbinom(1:5, c(0, 0, 0.1, 0.2, 0.4, 0.8, 1)) # confirmation
#> [1] 0.0864 0.4344 0.3784 0.0944 0.0064

dpbinom(NULL, c(0, 0, 0.4, 1)) # only 1 and 2 are observable
#> [1] 0.0 0.6 0.4 0.0 0.0
dpbinom(1:2, c(0, 0, 0.4, 1)) # confirmation
#> [1] 0.6 0.4

Generalized Poisson Binomial Distributions

  1. All ui ∈ {0, 1} and all vi = 1 − ui?
    Then, it is an ordinary Poisson binomial distribution with parameters pi′ = pi for all i for which ui = 1 and pi′ = 1 − pi otherwise. This includes all the special cases described above.
  2. All ui = u are equal and all vi = v are equal?
    In this case, we have a linearly transformed ordinary Poisson binomial distribution, i.e. X can be expressed as X = uY + v(n − Y) with Y ∼ PBin(p1, ..., pn). In particular, if all pi = p are also the same, we have a linear transformation of the usual binomial distribution, i.e. X = uZ + v(n − Z) with Z ∼ Bin(n, p). Summarizing this, the following applies:
    1. All pi = 0: The only observable value is n ⋅ v, i.e. P(X = n ⋅ v) = 1 and P(X ≠ n ⋅ v) = 0.
    2. All pi = 1: The only observable value is n ⋅ u, i.e. P(X = n ⋅ u) = 1 and P(X ≠ n ⋅ u) = 0.
    3. All pi = p: Observable values are in {u ⋅ k + v ⋅ (n − k)|k = 0, ..., n} and P(X = u ⋅ k + v ⋅ (n − k)) = P(Z = k).
    4. Otherwise: Observable values are in {u ⋅ k + v ⋅ (n − k)|k = 0, ..., n}) and P(X = u ⋅ k + v(n − k)) = P(Y = k)
  3. All pi ∈ {0, 1}?
    Let I = {i | pi = 1} ⊆ {1, ..., n} and J = {i | pi = 0} ⊆ {1, ..., n}. Then, we have:
    1. All pi = 0: The only observable value is $v^* := \sum_{i = 1}^{n}{v_i}$, i.e. P(X = v*) = 1 and P(X ≠ v*) = 0.
    2. All pi = 1: The only observable value is $u^* := \sum_{i = 1}^{n}{u_i}$, i.e. P(X = u*) = 1 and P(X ≠ u*) = 0.
    3. Otherwise, The only observable value is w* := ∑i ∈ Iui + ∑i ∈ Jvi, i.e. P(X = w*) = 1 and P(X ≠ w*) = 0. Note that the case that any ui = vi is equivalent to pi = 1, because the corresponding random variable Xi has always the same (non-random) value.
  4. Are there pi ∉ {0, 1}?
    Let I, J and w* as above and K = {i | pi > 0  ∧ pi < 1} ⊆ {1, ..., n}. Then, X can be expressed as X = w* + Z with Z = ∑i ∈ KXi following a (reduced) generalized Poisson Bernoulli distribution. In particular, if only one pi ∉ {0, 1}, Z follows a linearly transformed Bernoulli distribution.

These cases are illustrated in the following example:

set.seed(1)
pp <- runif(7)
va <- sample(0:6, 7, TRUE)
vb <- sample(0:6, 7, TRUE)

# Case 1
dgpbinom(NULL, pp, rep(1, 7), rep(0, 7))
#> [1] 8.114776e-05 3.112722e-03 4.063146e-02 2.115237e-01 3.793308e-01
#> [6] 2.735489e-01 8.297278e-02 8.798512e-03
dpbinom(NULL, pp) # equal results
#> [1] 8.114776e-05 3.112722e-03 4.063146e-02 2.115237e-01 3.793308e-01
#> [6] 2.735489e-01 8.297278e-02 8.798512e-03

dgpbinom(NULL, pp, rep(0, 7), rep(1, 7))
#> [1] 8.798512e-03 8.297278e-02 2.735489e-01 3.793308e-01 2.115237e-01
#> [6] 4.063146e-02 3.112722e-03 8.114776e-05
dpbinom(NULL, 1 - pp) # equal results
#> [1] 8.798512e-03 8.297278e-02 2.735489e-01 3.793308e-01 2.115237e-01
#> [6] 4.063146e-02 3.112722e-03 8.114776e-05

dgpbinom(NULL, pp, c(rep(1, 3), rep(0, 4)), c(rep(0, 3), rep(1, 4)))
#> [1] 3.062225e-02 1.998504e-01 3.769239e-01 2.828424e-01 9.450797e-02
#> [6] 1.426764e-02 9.620692e-04 2.331571e-05
dpbinom(NULL, c(pp[1:3], 1 - pp[4:7])) # reorder for 0 and 1; equal results
#> [1] 3.062225e-02 1.998504e-01 3.769239e-01 2.828424e-01 9.450797e-02
#> [6] 1.426764e-02 9.620692e-04 2.331571e-05

# Case 2 a)
dgpbinom(NULL, rep(0, 7), rep(4, 7), rep(2, 7)) # only 14 is observable
#>  [1] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
dgpbinom(7 * 2, rep(0, 7), rep(4, 7), rep(2, 7)) # confirmation
#> [1] 1

# Case 2 b)
dgpbinom(NULL, rep(1, 7), rep(4, 7), rep(2, 7)) # only 28 is observable
#>  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
dgpbinom(7 * 4, rep(1, 7), rep(4, 7), rep(2, 7)) # confirmation
#> [1] 1

# Case 2 c)
dgpbinom(NULL, rep(0.3, 7), rep(4, 7), rep(2, 7))
#>  [1] 0.0823543 0.0000000 0.2470629 0.0000000 0.3176523 0.0000000 0.2268945
#>  [8] 0.0000000 0.0972405 0.0000000 0.0250047 0.0000000 0.0035721 0.0000000
#> [15] 0.0002187
dbinom(0:7, 7, 0.3) # equal results, but on different support set
#> [1] 0.0823543 0.2470629 0.3176523 0.2268945 0.0972405 0.0250047 0.0035721
#> [8] 0.0002187

# Case 2 d)
dgpbinom(NULL, pp, rep(4, 7), rep(2, 7))
#>  [1] 8.114776e-05 0.000000e+00 3.112722e-03 0.000000e+00 4.063146e-02
#>  [6] 0.000000e+00 2.115237e-01 0.000000e+00 3.793308e-01 0.000000e+00
#> [11] 2.735489e-01 0.000000e+00 8.297278e-02 0.000000e+00 8.798512e-03
dpbinom(NULL, pp) # equal results, but on different support set
#> [1] 8.114776e-05 3.112722e-03 4.063146e-02 2.115237e-01 3.793308e-01
#> [6] 2.735489e-01 8.297278e-02 8.798512e-03

# Case 3 a)
dgpbinom(NULL, c(0, 0, 0, 0, 0, 0, 0), va, vb) # only sum(vb) is observable
#>  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
dgpbinom(sum(vb), rep(0, 7), va, vb) # confirmation
#> [1] 1

# Case 3 b)
dgpbinom(NULL, c(1, 1, 1, 1, 1, 1, 1), va, vb) # only sum(va) is observable
#>  [1] 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
dgpbinom(sum(va), rep(1, 7), va, vb) # confirmation
#> [1] 1

# Case 3 c)
dgpbinom(NULL, c(0, 0, 0, 1, 1, 1, 1), va, vb) # only sum(va[4:7], vb[1:3]) is observable
#>  [1] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
dgpbinom(sum(va[4:7], vb[1:3]), c(0, 0, 0, 1, 1, 1, 1), va, vb) # confirmation
#> [1] 1

# Case 4
dgpbinom(NULL, c(0, 0, 0.3, 0.6, 1, 1, 1), va, vb)
#>  [1] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.18 0.00 0.00 0.12 0.42 0.00 0.00 0.28
#> [16] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sure <- sum(va[5:7], vb[1:2])
x.transf <- sum(pmin(va[3:4], vb[3:4])):sum(pmax(va[3:4], vb[3:4]))
dgpbinom(sure + x.transf, c(0, 0, 0.3, 0.6, 1, 1, 1), va, vb)
#> [1] 0.18 0.00 0.00 0.12 0.42 0.00 0.00 0.28
dgpbinom(x.transf, c(0.3, 0.6), va[3:4], vb[3:4]) # equal results
#> [1] 0.18 0.00 0.00 0.12 0.42 0.00 0.00 0.28

dgpbinom(NULL, c(0, 0, 0, 0.6, 1, 1, 1), va, vb)
#>  [1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0.0 0.0 0.4 0.0 0.0 0.0 0.0
#> [20] 0.0 0.0 0.0 0.0 0.0 0.0
sure <- sum(va[5:7], vb[1:3])
x.transf <- va[4]:vb[4]
dgpbinom(sure + x.transf, c(0, 0, 0, 0.6, 1, 1, 1), va, vb)
#> [1] 0.6 0.0 0.0 0.4
dgpbinom(x.transf, 0.6, va[4], vb[4]) # equal results; essentially transformed Bernoulli
#> [1] 0.6 0.0 0.0 0.4

Usage with Rcpp

How to import and use the internal C++ functions in Rcpp based packages is described in a separate vignette.