Английская Википедия:Indian buffet process

Материал из Онлайн справочника
Перейти к навигацииПерейти к поиску

Шаблон:One source In the mathematical theory of probability, the Indian buffet process (IBP) is a stochastic process defining a probability distribution over sparse binary matrices with a finite number of rows and an infinite number of columns. This distribution is suitable to use as a prior for models with potentially infinite number of features. The form of the prior ensures that only a finite number of features will be present in any finite set of observations but more features may appear as more data points are observed.

Indian buffet process prior

Let <math> Z </math> be an <math> N \times K </math> binary matrix indicating the presence or absence of a latent feature. The IBP places the following prior on <math> Z </math>:

<math>

p(Z) = \frac{\alpha^{K^+}}{\prod_{i=1}^N K_1^{(i)}!}\exp\{-\alpha H_N\}\prod_{k=1}^{K^+} \frac{(N-m_k)!(m_k-1)!}{N!} </math>

where <math> {K} </math> is the number of non-zero columns in <math> Z </math>, <math> m_k </math> is the number of ones in column <math> k </math> of <math> Z </math>, <math> H_N </math> is the <math>N</math>-th harmonic number, and <math> K_1^{(i)} </math> is the number of new dishes sampled by the <math>i</math>-th customer. The parameter <math> \alpha </math> controls the expected number of features present in each observation.

In the Indian buffet process, the rows of <math> Z </math> correspond to customers and the columns correspond to dishes in an infinitely long buffet. The first customer takes the first <math> \mathrm{Poisson}(\alpha) </math> dishes. The <math> i </math>-th customer then takes dishes that have been previously sampled with probability <math> m_k/i </math>, where <math> m_k </math> is the number of people who have already sampled dish <math> k </math>. He also takes <math> \mathrm{Poisson}(\alpha / i) </math> new dishes. Therefore, <math> z_{nk} </math> is one if customer <math> n </math> tried the <math> k </math>-th dish and zero otherwise.

This process is infinitely exchangeable for an equivalence class of binary matrices defined by a left-ordered many-to-one function. <math> \operatorname{lof}(Z) </math> is obtained by ordering the columns of the binary matrix <math> Z </math> from left to right by the magnitude of the binary number expressed by that column, taking the first row as the most significant bit.

See also

References

Шаблон:Reflist