English Wikipedia: 97.5th percentile point


Figure (NormalDist1.96.png): 95% of the area under the normal distribution lies within approximately 1.96 standard deviations of the mean.

In probability and statistics, the 97.5th percentile point of the standard normal distribution is a number commonly used for statistical calculations. The approximate value of this number is 1.96, meaning that 95% of the area under a normal curve lies within approximately 1.96 standard deviations of the mean. Because of the central limit theorem, this number is used in the construction of approximate 95% confidence intervals. Its ubiquity is due to the arbitrary but common convention of using confidence intervals with 95% probability in science and frequentist statistics, though other probabilities (90%, 99%, etc.) are sometimes used.[1][2][3][4] This convention seems particularly common in medical statistics,[5][6][7] but is also common in other areas of application, such as earth sciences,[8] social sciences and business research.[9]
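As a rough illustration of the confidence-interval use mentioned above, the following sketch computes an approximate 95% interval for a mean as mean ± 1.96 × (standard error). The function name `approx_ci_95` and the sample values are hypothetical, chosen only for this example. The normal approximation is assumed to be adequate here; for small samples a t-based interval is usually preferred.

```python
import math
import statistics

Z_975 = 1.96  # approximate 97.5th percentile point of N(0, 1)


def approx_ci_95(sample):
    """Return (lower, upper) bounds of an approximate 95% CI for the mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(sample) / math.sqrt(n)
    half_width = Z_975 * sem
    return mean - half_width, mean + half_width


# Hypothetical measurements, made up purely for illustration.
data = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.1, 4.9]
low, high = approx_ci_95(data)
print(f"approximate 95% CI for the mean: ({low:.3f}, {high:.3f})")
```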

There is no single accepted name for this number; it is also commonly referred to as the "standard normal deviate", "normal score" or "Z score" for the 97.5 percentile point, the .975 point, or just its approximate value, 1.96.

If X has a standard normal distribution, i.e. X ~ N(0,1),

<math> \mathrm{P}(X > 1.96) \approx 0.025, \,</math>
<math> \mathrm{P}(X < 1.96) \approx 0.975, \,</math>

and as the normal distribution is symmetric,

<math> \mathrm{P}(-1.96 < X < 1.96) \approx 0.95. \,</math>
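The three probabilities above can be checked numerically. The sketch below evaluates the standard normal CDF through the error function, using the identity Φ(x) = (1 + erf(x/√2))/2. The helper name `std_normal_cdf` is introduced here only for illustration.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


upper_tail = 1.0 - std_normal_cdf(1.96)                   # P(X > 1.96)
lower_part = std_normal_cdf(1.96)                         # P(X < 1.96)
central = std_normal_cdf(1.96) - std_normal_cdf(-1.96)    # P(-1.96 < X < 1.96)

print(f"P(X > 1.96)         = {upper_tail:.4f}")
print(f"P(X < 1.96)         = {lower_part:.4f}")
print(f"P(-1.96 < X < 1.96) = {central:.4f}")
```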

One notation for this number is <math>z_{.975}</math>.[10] From the probability density function of the standard normal distribution, the exact value of <math>z_{.975}</math> is determined by

<math> \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z_{.975}} e^{-x^2/2} \, \mathrm{d}x = 0.975.</math>
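The defining equation has no closed-form solution, but it can be solved numerically. The following is a minimal sketch that finds the root by bisection on the CDF. The function names and the search bracket [-10, 10] are assumptions made for this example, and production libraries generally use faster dedicated approximations.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def solve_quantile(p, lo=-10.0, hi=10.0, tol=1e-12):
    """Find z such that std_normal_cdf(z) = p by bisection (assumes 0 < p < 1)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The CDF is increasing, so the root lies to the right of mid
        # whenever the CDF at mid is still below the target probability.
        if std_normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


z_975 = solve_quantile(0.975)
print(f"z_.975 is approximately {z_975:.6f}")
```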

History

Figure (Youngronaldfisher2.JPG): Ronald Fisher

The use of this number in applied statistics can be traced to the influence of Ronald Fisher's classic textbook, Statistical Methods for Research Workers, first published in 1925. In Table 1 of the same work, he gave the more precise value 1.959964.[11] In 1970, the value truncated to 20 decimal places was calculated to be

1.95996 39845 40054 23552...[12][13]

The commonly used approximate value of 1.96 is therefore accurate to better than one part in 50,000, which is more than adequate for applied work.

Some people even use the value of 2 in the place of 1.96, reporting a 95.4% confidence interval as a 95% confidence interval. This is not recommended but is occasionally seen.[14]
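Both numerical claims in this section can be checked with a few lines of arithmetic. The sketch below computes the relative error of 1.96 against the more precise value 1.959964 quoted above, and the central coverage of ±z using the identity P(-z < X < z) = erf(z/√2). The variable and function names are illustrative only.

```python
import math

Z_PRECISE = 1.959964  # more precise value quoted in the text above

# Relative error made by rounding to 1.96, and the equivalent "one part in N".
rel_error = (1.96 - Z_PRECISE) / Z_PRECISE
one_part_in = 1.0 / rel_error


def central_coverage(z):
    """P(-z < X < z) for a standard normal X."""
    return math.erf(z / math.sqrt(2.0))


print(f"relative error of 1.96: about one part in {one_part_in:,.0f}")
print(f"coverage of +/-1.96:    {central_coverage(1.96):.4f}")
print(f"coverage of +/-2:       {central_coverage(2.0):.4f}")
```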

Software functions

The inverse of the standard normal CDF can be used to compute the value. The following table lists function calls that return the value (approximately 1.96) in some commonly used applications:

Application                       Function call
Excel                             NORM.S.INV(0.975)
MATLAB                            norminv(0.975)
R                                 qnorm(0.975)
Python (SciPy)                    scipy.stats.norm.ppf(0.975)
SAS                               probit(0.975);
SPSS                              COMPUTE x = IDF.NORMAL(0.975,0,1).
Stata                             invnormal(0.975)
Wolfram Language (Mathematica)    InverseCDF[NormalDistribution[0, 1], 0.975][15][16]
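The same inverse-CDF approach extends to the other confidence levels mentioned earlier (90%, 99%, etc.). As a sketch that needs no third-party package, the example below uses `statistics.NormalDist` from the Python standard library (available since Python 3.8). The helper name `two_sided_z` is hypothetical.

```python
from statistics import NormalDist


def two_sided_z(confidence):
    """Critical value z such that P(-z < X < z) = confidence for X ~ N(0, 1)."""
    alpha = 1.0 - confidence
    # A two-sided interval leaves alpha / 2 in each tail.
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} two-sided critical value: {two_sided_z(level):.6f}")
```

For a 95% interval the argument passed to the inverse CDF is 0.975, which is why the 97.5th percentile point appears rather than the 95th.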
