English Wikipedia: Irwin–Hall distribution
In probability and statistics, the Irwin–Hall distribution, named after Joseph Oscar Irwin and Philip Hall, is a probability distribution for a random variable defined as the sum of a number of independent random variables, each having a uniform distribution.[1] For this reason it is also known as the uniform sum distribution.
Pseudo-random numbers with an approximately normal distribution are sometimes generated by summing several pseudo-random numbers having a uniform distribution, usually for the sake of programming simplicity. Rescaling the Irwin–Hall distribution gives the exact distribution of the random variates generated in this way.
This distribution is sometimes confused with the Bates distribution, which is the mean (not sum) of n independent random variables uniformly distributed from 0 to 1.
Definition
The Irwin–Hall distribution is the continuous probability distribution for the sum of n independent and identically distributed U(0, 1) random variables:
- <math>
X = \sum_{k=1}^n U_k. </math>
The probability density function (pdf) for <math>0\leq x\leq n</math> is given by
- <math>
f_X(x;n)=\frac{1}{(n-1)!}\sum_{k=0}^n (-1)^k{n \choose k} (x-k)_+^{n-1} </math>
where <math>(x-k)_+</math> denotes the positive part of the expression:
- <math> (x-k)_+ = \begin{cases}
x-k & x-k \geq 0 \\ 0 & x-k < 0.\end{cases} </math>
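As a minimal sketch, the density formula above can be evaluated directly in Python. The function name `irwin_hall_pdf` is an illustrative choice, not part of any library, and the alternating sum may lose floating-point accuracy for large n.

```python
from math import comb, factorial


def irwin_hall_pdf(x: float, n: int) -> float:
    """Density at x of the sum of n independent U(0, 1) variables."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if x < 0 or x > n:
        return 0.0  # the density vanishes outside [0, n]
    total = 0.0
    for k in range(n + 1):
        # Positive part: terms with x - k <= 0 contribute nothing.
        # (For n = 1 this fixes the value at the knots by convention.)
        if x - k > 0:
            total += (-1) ** k * comb(n, k) * (x - k) ** (n - 1)
    return total / factorial(n - 1)
```

For example, `irwin_hall_pdf(0.5, 2)` evaluates to 0.5, in agreement with the triangular density of the n = 2 case.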
Thus the pdf is a spline (piecewise polynomial function) of degree n − 1 over the knots 0, 1, ..., n. In fact, for x between the knots located at k and k + 1, the pdf is equal to
- <math>
f_X(x;n) = \frac{1}{(n-1)!}\sum_{j=0}^{n-1} a_j(k,n) x^j </math>
where the coefficients <math>a_j(k,n)</math> may be found from a recurrence relation over k
- <math>
a_j(k,n)=\begin{cases} 1&k=0, j=n-1\\
0&k=0, j<n-1\\
a_j(k-1,n) + (-1)^{n+k-j-1}{n\choose
k}{{n-1}\choose j}k^{n-j-1} &k>0\end{cases}
</math>
The coefficients are also listed as sequence A188816 in the OEIS. The coefficients for the cumulative distribution function are sequence A188668.
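The recurrence over k can be sketched in a few lines of Python. The function name `spline_coefficients` and the row layout (row k holds a_0, ..., a_{n-1} for the interval between knots k and k + 1) are assumptions made for illustration.

```python
from math import comb


def spline_coefficients(n: int) -> list[list[int]]:
    """Return rows[k][j] = a_j(k, n) for k = 0..n-1 and j = 0..n-1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Base case k = 0: only a_{n-1} is nonzero, and it equals 1.
    prev = [0] * n
    prev[n - 1] = 1
    rows = [prev]
    for k in range(1, n):
        # Recurrence step: add the contribution of the knot at k.
        cur = [
            prev[j]
            + (-1) ** (n + k - j - 1) * comb(n, k) * comb(n - 1, j) * k ** (n - j - 1)
            for j in range(n)
        ]
        rows.append(cur)
        prev = cur
    return rows
```

Here `spline_coefficients(3)` returns `[[0, 0, 1], [-3, 6, -2], [9, -6, 1]]`; dividing each row by (n − 1)! = 2 gives the three polynomial pieces of the n = 3 density.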
The mean and variance are n/2 and n/12, respectively.
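These values follow from the independence of the summands, using the standard moments of a single U(0, 1) variable (mean 1/2, second moment 1/3):
- <math>
\operatorname{E}[X] = \sum_{k=1}^n \operatorname{E}[U_k] = \frac{n}{2}, \qquad \operatorname{Var}(X) = \sum_{k=1}^n \operatorname{Var}(U_k) = n\left(\frac{1}{3} - \frac{1}{4}\right) = \frac{n}{12}. </math>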
Special cases
- For n = 1, X follows a uniform distribution:
- <math>
f_X(x)= \begin{cases} 1 & 0\le x \le 1 \\ 0 & \text{otherwise} \end{cases} </math>
- For n = 2, X follows a triangular distribution:
- <math>
f_X(x)= \begin{cases} x & 0\le x \le 1\\ 2-x & 1\le x \le 2 \end{cases} </math>
- For n = 3,
- <math>
f_X(x)= \begin{cases} \frac{1}{2}x^2 & 0\le x \le 1\\ \frac{1}{2}(-2x^2 + 6x - 3)& 1\le x \le 2\\ \frac{1}{2}(3 - x)^2 & 2\le x \le 3 \end{cases} </math>
- For n = 4,
- <math>
f_X(x)= \begin{cases} \frac{1}{6}x^3 & 0\le x \le 1\\ \frac{1}{6}(-3x^3 + 12x^2 - 12x+4)& 1\le x \le 2\\ \frac{1}{6}(3x^3 - 24x^2 +60x-44) & 2\le x \le 3\\ \frac{1}{6}(4 - x)^3 & 3\le x \le 4 \end{cases} </math>
- For n = 5,
- <math>
f_X(x)= \begin{cases} \frac{1}{24}x^4 & 0\le x \le 1\\ \frac{1}{24}(-4x^4 + 20x^3 - 30x^2+20x-5)& 1\le x \le 2\\ \frac{1}{24}(6x^4-60x^3+210x^2-300x+155) & 2\le x \le 3\\ \frac{1}{24}(-4x^4+60x^3-330x^2+780x-655) & 3\le x \le 4\\ \frac{1}{24}(5 - x)^4 &4\le x\le5 \end{cases} </math>
Approximating a Normal distribution
By the central limit theorem, as n increases, the Irwin–Hall distribution approximates ever more closely a Normal distribution with mean <math>\mu=n/2</math> and variance <math>\sigma^2=n/12</math>. To approximate the standard Normal distribution <math>\phi(x)=\mathcal{N}(\mu=0, \sigma^2=1)</math>, the Irwin–Hall distribution can be centered by subtracting its mean n/2 and then rescaled by its standard deviation, the square root of its variance:
- <math>
\phi(x) \overset{n\gg 0}{\approx} \sqrt{\frac{n}{12}} f_X\left(x\sqrt{\frac{n}{12}}+\frac{n}{2};n \right ) </math>
Choosing n = 12 makes the variance equal to 1, so the square root drops out. This gives a computationally simple heuristic: a standard Normal distribution can be approximated by the sum of 12 uniform U(0,1) draws, shifted by 6:
- <math>
\sum_{k=1}^{12}U_k -6 \sim f_X(x+6;12) \mathrel{\dot\sim} \phi(x) </math>
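The heuristic above can be sketched with Python's standard `random` module. The function name `approx_standard_normal` is hypothetical, and the result is only approximately Normal: every draw lies in [−6, 6], so the tails are truncated.

```python
import random


def approx_standard_normal(rng: random.Random) -> float:
    """One approximately N(0, 1) draw: the sum of 12 U(0, 1) draws, minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0
```

Calling the function repeatedly with a seeded generator such as `random.Random(0)` should give a sample mean close to 0 and a sample variance close to 1.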
Like the Bates distribution, the Irwin–Hall distribution takes only an integer parameter. It can be extended to a real-valued parameter N by also adding a uniform random variable of width N − trunc(N).
Extensions to the Irwin–Hall distribution
When the Irwin–Hall distribution is used for data fitting, one problem is its lack of flexibility, because the parameter n must be an integer. However, instead of summing n identically distributed uniform variables, one could also add, for example, U + 0.5U (with independent uniform terms) to cover the case n = 1.5, which gives a trapezoidal distribution.
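A minimal sampling sketch of this extension, assuming the fractional part of n is handled by one extra independent uniform term of that width. The function name `sample_extended_irwin_hall` is illustrative only.

```python
import math
import random


def sample_extended_irwin_hall(n: float, rng: random.Random) -> float:
    """Draw from the real-parameter extension: floor(n) full-width U(0, 1)
    terms plus, if n is not an integer, one uniform term of width n - floor(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    whole = math.floor(n)
    frac = n - whole
    total = sum(rng.random() for _ in range(whole))
    if frac > 0:
        total += frac * rng.random()  # narrower uniform term on [0, frac)
    return total
```

With n = 1.5 this draws U + 0.5U with independent terms, so samples lie in [0, 1.5] and have mean 0.75. Under this construction the variance is (trunc(n) + frac²)/12 rather than n/12.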
The Irwin–Hall distribution has an application to beamforming and pattern synthesis, shown in Figure 1 of references [2][3].
See also
- Bates distribution
- Normal distribution
- Central limit theorem
- Uniform distribution (continuous)
- Triangular distribution
Notes
References
- Hall, Philip. (1927) "The Distribution of Means for Samples of Size N Drawn from a Population in which the Variate Takes Values Between 0 and 1, All Such Values Being Equally Probable". Biometrika, Vol. 19, No. 3/4, pp. 240–245.
- Irwin, J.O. (1927) "On the Frequency Distribution of the Means of Samples from a Population Having any Law of Frequency with Finite Moments, with Special Reference to Pearson's Type II". Biometrika, Vol. 19, No. 3/4, pp. 225–239.
- ↑ Johnson, N.L.; Kotz, S.; Balakrishnan, N. (1995) Continuous Univariate Distributions, Volume 2, 2nd Edition, Wiley (Section 26.9).
- ↑ Шаблон:Cite web
- ↑ https://www.usnc-ursi-archive.org/nrsm/2018/papers/B15-9.pdf