English Wikipedia:Freedman–Diaconis rule


In statistics, the Freedman–Diaconis rule can be used to select the width of the bins to be used in a histogram.[1] It is named after David A. Freedman and Persi Diaconis.

For a set of empirical measurements sampled from some probability distribution, the Freedman–Diaconis rule is designed to approximately minimize the integral of the squared difference between the histogram (i.e., the relative frequency density) and the density of the theoretical probability distribution.

The general equation for the rule is:

<math>\text{Bin width}=2\, { \text{IQR}(x) \over{ \sqrt[3]{n} }}</math>

where <math>\operatorname{IQR}(x) </math> is the interquartile range of the data and <math> n </math> is the number of observations in the sample <math> x. </math>
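The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `freedman_diaconis_bin_width` is hypothetical, and the quartile convention (the standard library's `"inclusive"` method) is an assumption, since the rule itself does not fix how the quartiles are estimated.

```python
import statistics


def freedman_diaconis_bin_width(data):
    """Return 2 * IQR(x) / n^(1/3) for a sequence of numbers."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Assumption: quartiles use the "inclusive" method, which interpolates
    # linearly between order statistics. Other conventions give slightly
    # different IQR values on small samples.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Note: an IQR of zero (heavily tied data) yields a width of zero,
    # which callers would need to handle separately.
    return 2.0 * iqr / n ** (1.0 / 3.0)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8]
    print(freedman_diaconis_bin_width(sample))
```

For the sample 1, ..., 8, the inclusive quartiles are 2.75 and 6.25, so the IQR is 3.5 and, with the cube root of 8 equal to 2, the bin width comes out at about 3.5.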

Other approaches

With the factor 2 replaced by approximately 2.59, the Freedman–Diaconis rule asymptotically matches Scott's normal reference rule for data sampled from a normal distribution.
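
As a rough check of the factor 2.59, assume Scott's rule in its usual form, with bin width <math>3.49\,\sigma/\sqrt[3]{n}</math>, and the fact that a normal distribution has <math>\operatorname{IQR} \approx 1.349\,\sigma</math>. Neither is stated above, so both are assumptions of this sketch. Substituting <math>\sigma \approx \operatorname{IQR}(x)/1.349</math> gives:

<math>\frac{3.49\,\sigma}{\sqrt[3]{n}} \approx \frac{3.49}{1.349}\,\frac{\operatorname{IQR}(x)}{\sqrt[3]{n}} \approx 2.59\,\frac{\operatorname{IQR}(x)}{\sqrt[3]{n}}</math>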

Another approach is Sturges' rule: choose a bin width large enough that there are about <math> 1+\log_2n </math> non-empty bins (Scott, 2009).[2] This works well for n under 200, but has been found to be inaccurate for large n.[3] For a discussion and an alternative approach, see Birgé and Rozenholc.[4]
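
The bin count suggested by Sturges' rule can be sketched as follows. The function name `sturges_bin_count` is hypothetical, and rounding up to a whole number of bins is an assumption of this sketch, since the rule only says "about" <math> 1+\log_2n </math> bins.

```python
import math


def sturges_bin_count(n):
    """Return the Sturges bin count 1 + log2(n), rounded up (an assumption)."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(1 + math.log2(n))


if __name__ == "__main__":
    for n in (50, 200, 1024):
        print(n, sturges_bin_count(n))
```

Because the count grows only logarithmically, a sample of 1024 observations gets just 11 bins, which illustrates why the rule can be too coarse for large n.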

References

Шаблон:Reflist
