Generalized Pareto distribution

Selected properties (from the distribution infobox):
- Variance: <math>\frac{\sigma^2}{(1-\xi)^2(1-2\xi)}\, \; (\xi < 1/2)</math>
- Skewness: <math>\frac{2(1+\xi)\sqrt{1-2\xi}}{(1-3\xi)}\,\;(\xi<1/3)</math>
- Excess kurtosis: <math>\frac{3(1-2\xi)(2\xi^2+\xi+3)}{(1-3\xi)(1-4\xi)}-3\,\;(\xi<1/4)</math>
- MGF: <math>e^{\theta\mu}\,\sum_{j=0}^\infty \left[\frac{(\theta\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)</math>
- CF: <math>e^{it\mu}\,\sum_{j=0}^\infty \left[\frac{(it\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)</math>
- Method-of-moments estimators: <math>\xi = \frac{1}{2}\left(1 - \frac{(E[X] - \mu)^2}{V[X]}\right)</math>, <math> \sigma = (E[X] - \mu)(1 - \xi)</math>
- Expected shortfall (ES): <math>\begin{cases}\mu + \sigma\left[ \frac{(1-p)^{-\xi} }{1-\xi} + \frac{(1-p)^{-\xi} -1 }{\xi} \right]&,\xi \neq 0\\\mu + \sigma[1- \ln(1-p) ]&,\xi =0\end{cases}</math>[1]
- bPOE: <math>\begin{cases}\frac{ \left(1+\frac{\xi(x-\mu)}{\sigma}\right)^{- \frac{1}{\xi} } }{(1-\xi)^{ \frac{1}{\xi} } } &,\xi \neq 0\\\ e^{ 1 - \left( \frac{x-\mu}{\sigma} \right) }&,\xi =0\end{cases}</math>[1]
In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location <math>\mu</math>, scale <math>\sigma</math>, and shape <math>\xi</math>.[2][3] Sometimes it is specified by only scale and shape[4] and sometimes only by its shape parameter. Some references give the shape parameter as <math> \kappa = - \xi \,</math>.[5]
Definition
The standard cumulative distribution function (cdf) of the GPD is defined by[6]
- <math>F_{\xi}(z) = \begin{cases}
1 - \left(1 + \xi z\right)^{-1/\xi} & \text{for }\xi \neq 0, \\ 1 - e^{-z} & \text{for }\xi = 0. \end{cases} </math>
where the support is <math> z \geq 0 </math> for <math> \xi \geq 0</math> and <math> 0 \leq z \leq - 1 /\xi </math> for <math> \xi < 0</math>. The corresponding probability density function (pdf) is
- <math>f_{\xi}(z) = \begin{cases}
(1 + \xi z)^{-\frac{\xi +1}{\xi }} & \text{for }\xi \neq 0, \\ e^{-z} & \text{for }\xi = 0. \end{cases} </math>
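As a quick numerical check of these definitions, the standard cdf and pdf can be coded directly and compared against SciPy's `genpareto`, whose shape parameter `c` follows the same convention as this article's <math>\xi</math> (a sketch assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.stats import genpareto

def gpd_cdf(z, xi):
    """Standard GPD cdf F_xi(z); support z >= 0 (and z <= -1/xi when xi < 0)."""
    if xi == 0.0:
        return 1.0 - np.exp(-z)
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def gpd_pdf(z, xi):
    """Standard GPD pdf f_xi(z)."""
    if xi == 0.0:
        return np.exp(-z)
    return (1.0 + xi * z) ** (-(xi + 1.0) / xi)

# compare with SciPy's genpareto (shape c corresponds to xi)
z, xi = 1.3, 0.25
assert np.isclose(gpd_cdf(z, xi), genpareto.cdf(z, c=xi))
assert np.isclose(gpd_pdf(z, xi), genpareto.pdf(z, c=xi))
assert np.isclose(gpd_pdf(0.7, 0.0), genpareto.pdf(0.7, c=0.0))
```

Note that the exponent <math>-\frac{\xi+1}{\xi}</math> in the pdf is the same as <math>-1/\xi - 1</math>, which is the form SciPy documents.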
Characterization
The related location-scale family of distributions is obtained by replacing the argument z by <math>\frac{x-\mu}{\sigma}</math> and adjusting the support accordingly.
The cumulative distribution function of <math>X \sim GPD(\mu, \sigma, \xi)</math> (<math>\mu\in\mathbb R</math>, <math>\sigma>0</math>, and <math>\xi\in\mathbb R</math>) is
- <math>F_{(\mu,\sigma,\xi)}(x) = \begin{cases}
1 - \left(1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for }\xi \neq 0, \\ 1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \text{for }\xi = 0, \end{cases} </math> where the support of <math>X</math> is <math> x \geqslant \mu </math> when <math> \xi \geqslant 0 \,</math>, and <math> \mu \leqslant x \leqslant \mu - \sigma /\xi </math> when <math> \xi < 0</math>.
The probability density function (pdf) of <math>X \sim GPD(\mu, \sigma, \xi)</math> is
- <math>f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{\left(-\frac{1}{\xi} - 1\right)}</math>,
again, for <math> x \geqslant \mu </math> when <math> \xi \geqslant 0</math>, and <math> \mu \leqslant x \leqslant \mu - \sigma /\xi </math> when <math> \xi < 0</math>.
The pdf is a solution of the following differential equation:[citation needed]
- <math>\left\{\begin{array}{l}
f'(x) (-\mu \xi +\sigma+\xi x)+(\xi+1) f(x)=0, \\ f(0)=\frac{\left(1-\frac{\mu \xi}{\sigma}\right)^{-\frac{1}{\xi }-1}}{\sigma} \end{array}\right\} </math>
Special cases
- If the shape <math>\xi</math> and location <math>\mu</math> are both zero, the GPD is equivalent to the exponential distribution.
- With shape <math>\xi = -1</math>, the GPD is equivalent to the continuous uniform distribution <math>U(0, \sigma)</math>.[7]
- With shape <math>\xi > 0</math> and location <math>\mu = \sigma/\xi</math>, the GPD is equivalent to the Pareto distribution with scale <math>x_m=\sigma/\xi</math> and shape <math>\alpha=1/\xi</math>.
- If <math>X \sim GPD(\mu = 0, \sigma, \xi)</math>, then <math> Y = \log (X) \sim exGPD(\sigma, \xi)</math> [1]. (exGPD stands for the exponentiated generalized Pareto distribution.)
- The GPD is similar to the Burr distribution.
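The first three special cases can be confirmed numerically by comparing distribution functions (a sketch assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.stats import genpareto, expon, uniform, pareto

sigma = 2.0
x = np.linspace(0.1, 1.9, 5)

# shape xi = 0, location mu = 0: exponential distribution with scale sigma
assert np.allclose(genpareto.cdf(x, c=0.0, scale=sigma), expon.cdf(x, scale=sigma))

# shape xi = -1: uniform distribution on (0, sigma)
assert np.allclose(genpareto.cdf(x, c=-1.0, scale=sigma), uniform.cdf(x, scale=sigma))

# shape xi > 0 with location mu = sigma/xi: Pareto with x_m = sigma/xi, alpha = 1/xi
xi = 0.5
xm = sigma / xi
xs = np.linspace(xm, 3 * xm, 5)
assert np.allclose(genpareto.cdf(xs, c=xi, loc=xm, scale=sigma),
                   pareto.cdf(xs, b=1.0 / xi, scale=xm))
```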
Generating generalized Pareto random variables
If U is uniformly distributed on (0, 1], then
- <math> X = \mu + \frac{\sigma (U^{-\xi}-1)}{\xi} \sim GPD(\mu, \sigma, \xi \neq 0)</math>
and
- <math> X = \mu - \sigma \ln(U) \sim GPD(\mu,\sigma,\xi =0).</math>
Both formulas are obtained by inversion of the cdf.
In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.
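The same inversion can be sketched in Python (assuming NumPy is available; `gpd_rvs` is an illustrative name, not a library function):

```python
import numpy as np

def gpd_rvs(mu, sigma, xi, size, rng):
    """Draw GPD(mu, sigma, xi) samples by inverting the cdf."""
    u = 1.0 - rng.uniform(size=size)  # U uniform on (0, 1]
    if xi == 0.0:
        return mu - sigma * np.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi

rng = np.random.default_rng(0)
samples = gpd_rvs(mu=0.0, sigma=1.0, xi=0.2, size=100_000, rng=rng)
# sanity check: for xi < 1 the mean is mu + sigma/(1 - xi), i.e. 1.25 here
print(samples.mean())
```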
GPD as an Exponential-Gamma Mixture
A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter: if
- <math>X|\Lambda \sim \operatorname{Exp}(\Lambda) </math>
and
- <math>\Lambda \sim \operatorname{Gamma}(\alpha, \beta) </math>
then
- <math>X \sim \operatorname{GPD}(\xi = 1/\alpha, \ \sigma = \beta/\alpha) </math>
Note, however, that since the parameters of the Gamma distribution must be greater than zero, this representation imposes the additional restriction that <math>\xi</math> must be positive.
In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for <math>Y \sim \text{Exponential}(1)</math> and <math>Z \sim \text{Gamma}(1/\xi, 1)</math>, we have <math>\mu + \sigma \frac{Y}{\xi Z} \sim \text{GPD}(\mu,\sigma,\xi)</math>. This is a consequence of the mixture after setting <math>\beta=\alpha</math> and taking into account that the rate parameters of the exponential and gamma distribution are simply inverse multiplicative constants.
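Both representations are easy to check by simulation (a sketch assuming NumPy and SciPy are available; note that NumPy's Gamma sampler is parameterized by scale, i.e. the inverse rate):

```python
import numpy as np
from scipy.stats import genpareto, kstest

rng = np.random.default_rng(1)
alpha, beta, n = 4.0, 2.0, 50_000

# Exp(Lambda) with Lambda ~ Gamma(alpha, rate beta): draw the rate, then the exponential
lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)
x = rng.exponential(scale=1.0 / lam)

# the mixture should match GPD(mu=0, sigma=beta/alpha, xi=1/alpha)
stat, pvalue = kstest(x, genpareto(c=1.0 / alpha, scale=beta / alpha).cdf)
print(stat)  # a small Kolmogorov-Smirnov distance indicates agreement
```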
Exponentiated generalized Pareto distribution (exGPD)
If <math>X \sim GPD(\mu = 0, \sigma, \xi)</math>, then <math> Y = \log (X)</math> is distributed according to the exponentiated generalized Pareto distribution, denoted by <math>Y \sim exGPD(\sigma, \xi)</math>.
The probability density function (pdf) of <math>Y \sim exGPD(\sigma, \xi)\,\, (\sigma > 0)</math> is
- <math> g_{(\sigma, \xi)}(y) = \begin{cases} \frac{e^y}{\sigma}\left( 1 + \frac{\xi e^y}{\sigma} \right)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\
\frac{1}{\sigma} e^{y - e^{y}/\sigma} & \text{for } \xi = 0, \end{cases}</math>
where the support is <math> -\infty < y < \infty </math> for <math> \xi \geq 0 </math>, and <math> -\infty < y \leq \log(-\sigma/\xi)</math> for <math> \xi < 0 </math>.
For all <math>\xi</math>, <math>\log \sigma </math> plays the role of a location parameter. See the right panel for the pdf when the shape <math>\xi</math> is positive.
The exGPD has finite moments of all orders for all <math>\sigma>0</math> and <math>-\infty< \xi < \infty </math>.
The moment-generating function of <math> Y \sim exGPD(\sigma,\xi)</math> is
- <math> M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi}\left(-\frac{\sigma}{\xi}\right)^{s} B(s+1, -1/\xi) & \text{for } s \in (-1, \infty), \ \xi < 0, \\
\frac{1}{\xi}\left(\frac{\sigma}{\xi}\right)^{s} B(s+1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi), \ \xi > 0, \\
\sigma^{s} \Gamma(1+s) & \text{for } s \in (-1, \infty), \ \xi = 0, \end{cases}</math>
where <math>B(a,b) </math> and <math> \Gamma (a) </math> denote the beta function and gamma function, respectively.
The expected value of <math>Y \sim exGPD(\sigma, \xi)</math> depends on both the scale <math>\sigma</math> and the shape <math>\xi</math>; the shape enters through the digamma function <math>\psi</math>:
- <math> E[Y] = \begin{cases} \log\left(-\frac{\sigma}{\xi} \right) + \psi(1) - \psi(-1/\xi+1) & \text{for } \xi < 0, \\
\log\left(\frac{\sigma}{\xi} \right) + \psi(1) - \psi(1/\xi) & \text{for } \xi > 0, \\
\log \sigma + \psi(1) & \text{for } \xi = 0. \end{cases}</math>
Note that, for any fixed <math> \xi \in (-\infty,\infty) </math>, <math> \log \sigma </math> acts as a location parameter of the exponentiated generalized Pareto distribution.
The variance of <math>Y \sim exGPD(\sigma, \xi)</math> depends only on the shape parameter <math> \xi </math>, through the polygamma function of order 1 (also called the trigamma function):
- <math> Var[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi+1) & \text{for } \xi < 0, \\
\psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\
\psi'(1) & \text{for } \xi = 0. \end{cases}</math>
See the right panel for the variance as a function of <math>\xi</math>. Note that <math> \psi'(1) = \pi^2/6 \approx 1.644934 </math>.
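These moment formulas can be checked by simulating <math>Y = \log X</math> for the <math>\xi > 0</math> branch (a sketch assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.special import psi, polygamma  # digamma and polygamma functions

rng = np.random.default_rng(2)
sigma, xi = 1.5, 0.4  # xi > 0 case

# X ~ GPD(0, sigma, xi) by cdf inversion, then Y = log X ~ exGPD(sigma, xi)
u = 1.0 - rng.uniform(size=200_000)  # uniform on (0, 1]
y = np.log(sigma * (u ** (-xi) - 1.0) / xi)

mean_theory = np.log(sigma / xi) + psi(1) - psi(1.0 / xi)
var_theory = polygamma(1, 1) + polygamma(1, 1.0 / xi)
print(y.mean(), mean_theory)  # should agree closely
print(y.var(), var_theory)
```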
Note that the roles of the scale parameter <math>\sigma</math> and the shape parameter <math>\xi</math> under <math>Y \sim exGPD(\sigma, \xi)</math> are separately interpretable, which may lead to more robust and efficient estimation of <math>\xi</math> than working with <math>X \sim GPD(\sigma, \xi)</math> directly [2]. By contrast, the two parameters are intertwined under <math>X \sim GPD(\mu=0,\sigma, \xi)</math> (at least up to the second central moment); see the formula for the variance <math>Var(X)</math>, in which both parameters appear.
Hill's estimator
Assume that <math> X_{1:n} = (X_1, \cdots, X_n) </math> are <math>n</math> observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution <math> F </math> whose tail distribution is regularly varying with tail index <math>1/\xi </math> (hence the corresponding shape parameter is <math>\xi </math>). Specifically, the tail distribution is described as
- <math>
\bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \,\,\,\,\,\text{for some }\xi>0,\,\,\text{where } L \text{ is a slowly varying function.}
</math>
In extreme value theory, it is of particular interest to estimate the shape parameter <math>\xi</math>, especially when <math>\xi</math> is positive (the heavy-tailed case).
Let <math>F_u</math> be the conditional excess distribution function above a threshold <math>u</math>. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions <math>F</math> and large <math>u</math>, <math>F_u</math> is well approximated by the generalized Pareto distribution, which motivated peaks-over-threshold (POT) methods for estimating <math>\xi</math>: the GPD plays the key role in the POT approach.
A renowned estimator using the POT methodology is Hill's estimator, formulated as follows. For <math> 1\leq i \leq n </math>, write <math> X_{(i)} </math> for the <math>i</math>-th largest value of <math> X_1, \cdots, X_n </math>. With this notation, Hill's estimator (see page 190 of Reference 5 by Embrechts et al. [3]) based on the <math>k</math> upper order statistics is defined as
- <math>
\widehat{\xi}_{k}^{\text{Hill}} = \widehat{\xi}_{k}^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1} \sum_{j=1}^{k-1} \log \bigg(\frac{X_{(j)}}{X_{(k)}} \bigg), \,\,\,\,\,\,\,\, \text{for } 2 \leq k \leq n.
</math>
In practice, the Hill estimator is used as follows. First, calculate the estimator <math>\widehat{\xi}_{k}^{\text{Hill}}</math> at each integer <math>k \in \{ 2, \cdots, n\}</math>, and then plot the ordered pairs <math>\{(k,\widehat{\xi}_{k}^{\text{Hill}})\}_{k=2}^{n}</math>. Then, select from the set of Hill estimators <math>\{\widehat{\xi}_{k}^{\text{Hill}}\}_{k=2}^{n}</math> those that are roughly constant with respect to <math>k</math>: these stable values are regarded as reasonable estimates of the shape parameter <math>\xi</math>. If <math> X_1, \cdots, X_n </math> are i.i.d., then the Hill estimator is a consistent estimator of the shape parameter <math>\xi</math> [4].
Note that the Hill estimator <math>\widehat{\xi}_{k}^{\text{Hill}}</math> makes use of the log-transformation of the observations <math> X_{1:n} = (X_1, \cdots, X_n) </math>. (The Pickands estimator <math>\widehat{\xi}_{k}^{\text{Pickands}}</math> also employs the log-transformation, but in a slightly different way [5].)
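A minimal sketch of this procedure (assuming NumPy is available; `hill_estimator` is an illustrative name, and the data are exact Pareto samples with <math>\xi = 0.5</math> so the stable region is visible):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill's estimator of xi based on the k upper order statistics (2 <= k <= n)."""
    xs = np.sort(x)[::-1]  # descending: xs[0] is the largest observation
    return np.mean(np.log(xs[: k - 1] / xs[k - 1]))

rng = np.random.default_rng(3)
# classical Pareto(alpha = 2, x_m = 1) samples: tail index 1/xi = 2, i.e. xi = 0.5
x = rng.pareto(2.0, size=100_000) + 1.0  # numpy's pareto is the Lomax form; +1 recovers x_m = 1
estimates = [hill_estimator(x, k) for k in (100, 500, 2000, 5000)]
print(estimates)  # each should hover near 0.5
```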
See also
- Burr distribution
- Pareto distribution
- Generalized extreme value distribution
- Exponentiated generalized Pareto distribution
- Pickands–Balkema–de Haan theorem
References

1. Template:Cite journal
2. Template:Cite book
3. Template:Cite journal
4. Template:Cite journal
5. Template:Cite book
6. Template:Cite book
7. Castillo, Enrique, and Ali S. Hadi. "Fitting the generalized Pareto distribution to data." Journal of the American Statistical Association 92 (440) (1997): 1609–1620.

Further reading

- Template:Cite journal
- Template:Cite journal
- Template:Cite journal
- Template:Cite book Chapter 20, Section 12: Generalized Pareto Distributions.
- Template:Cite book
- Template:Cite book