English Wikipedia: Equivalence test

Equivalence tests are a variety of hypothesis tests used to draw statistical inferences from observed data. In these tests, the null hypothesis is defined as an effect large enough to be deemed interesting, specified by an equivalence bound. The alternative hypothesis is any effect that is less extreme than that equivalence bound. The observed data are statistically compared against the equivalence bounds. If the statistical test indicates that the observed data are surprising under the assumption that the true effect is at least as extreme as the equivalence bounds, a Neyman-Pearson approach to statistical inference can be used to reject effect sizes larger than the equivalence bounds with a pre-specified Type 1 error rate.

Equivalence testing originates from the field of clinical trials.^[1] One application, known as a non-inferiority trial, is used to show that a new drug that is cheaper than available alternatives works as well as an existing drug. In essence, equivalence tests consist of calculating a confidence interval around an observed effect size and rejecting effects more extreme than the equivalence bound when the confidence interval does not overlap with the equivalence bound. In two-sided tests, both upper and lower equivalence bounds are specified. In non-inferiority trials, where the goal is to test the hypothesis that a new treatment is not worse than existing treatments, only a lower equivalence bound is specified.
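The confidence-interval rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`mean_diff_ci`, `is_equivalent`, `is_non_inferior`) and the sample data are hypothetical, and the interval uses a large-sample normal approximation rather than the t-distribution. A 90% interval is used because it corresponds to two one-sided tests at the 5% level each.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_diff_ci(x, y, level=0.90):
    """Normal-approximation CI for mean(x) - mean(y) (assumes large samples)."""
    diff = mean(x) - mean(y)
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


def is_equivalent(ci, lower, upper):
    """Two-sided rule: the whole interval lies strictly inside the bounds."""
    return lower < ci[0] and ci[1] < upper


def is_non_inferior(ci, lower):
    """Non-inferiority rule: only the lower equivalence bound is checked."""
    return ci[0] > lower


# Hypothetical outcome scores for a new and an existing treatment.
new_drug = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9]
old_drug = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.0]

ci = mean_diff_ci(new_drug, old_drug)
print("90% CI:", ci)
print("equivalent within +/-0.5:", is_equivalent(ci, -0.5, 0.5))
print("non-inferior (lower bound -0.5):", is_non_inferior(ci, -0.5))
```

For small samples, the normal quantile should be replaced by the matching t-distribution quantile; the decision rule itself stays the same.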

File:Equivalence Test.png

Mean differences (black squares) and 90% confidence intervals (horizontal lines) with equivalence bounds ΔL = -0.5 and ΔU = 0.5 for four combinations of test results that are statistically equivalent or not and statistically different from zero or not. Pattern A is statistically equivalent, pattern B is statistically different from 0, pattern C is practically insignificant, and pattern D is inconclusive (neither statistically different from 0 nor equivalent).
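The four patterns in the caption can be expressed as a small decision table. The sketch below is illustrative only: the function name `classify` and the example intervals are hypothetical. It assumes that equivalence is judged from a 90% confidence interval and that the difference from zero is judged from a 95% confidence interval; the caption itself mentions only the 90% intervals.

```python
def classify(ci90, ci95, lower=-0.5, upper=0.5):
    """Return the caption's pattern letter (A-D) for one result.

    ci90: (low, high) interval used for the equivalence decision
    ci95: (low, high) interval used for the difference-from-zero decision
    """
    equivalent = lower < ci90[0] and ci90[1] < upper  # interval inside bounds
    different = ci95[0] > 0 or ci95[1] < 0            # interval excludes zero
    if equivalent and not different:
        return "A"  # statistically equivalent
    if different and not equivalent:
        return "B"  # statistically different from zero
    if equivalent and different:
        return "C"  # different from zero but practically insignificant
    return "D"      # inconclusive


# Hypothetical intervals, one for each pattern.
examples = {
    "A": ((-0.2, 0.3), (-0.25, 0.35)),
    "B": ((0.3, 0.9), (0.25, 0.95)),
    "C": ((0.1, 0.4), (0.05, 0.45)),
    "D": ((-0.1, 0.7), (-0.2, 0.8)),
}
for expected, (ci90, ci95) in examples.items():
    print(expected, classify(ci90, ci95))
```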

Equivalence tests can be performed in addition to null-hypothesis significance tests.^[2]^[3]^[4]^[5] This might prevent the common misinterpretation of p-values larger than the alpha level as support for the absence of a true effect. Furthermore, equivalence tests can identify effects that are statistically significant but practically insignificant, that is, effects that are statistically different from zero but also statistically smaller than any effect size deemed worthwhile (see the first figure).^[6]

Equivalence tests were originally used in areas such as pharmaceutics, frequently in bioequivalence trials. However, these tests can be applied whenever the research question asks whether the means of two sets of scores are practically or theoretically equivalent. As such, equivalence analyses have seen increased usage in almost all medical research fields. The field of psychology has also been adopting equivalence testing, particularly in clinical trials. Equivalence analyses need not be limited to clinical trials, however, and can be applied in a range of research areas. In this regard, equivalence tests have recently been introduced in the evaluation of measurement devices,^[7]^[8] artificial intelligence,^[9] and exercise physiology and sports science.^[10]

Several tests exist for equivalence analyses; more recently, the two one-sided t-tests (TOST) procedure has been garnering considerable attention. As outlined below, this approach is an adaptation of the widely known t-test.

TOST procedure

A very simple equivalence testing approach is the "two one-sided t-tests" (TOST) procedure.^[11] In the TOST procedure, an upper (Δ_U) and a lower (–Δ_L) equivalence bound are specified based on the smallest effect size of interest (e.g., a positive or negative difference of d = 0.3). Two composite null hypotheses are tested: H₀₁: Δ ≤ –Δ_L and H₀₂: Δ ≥ Δ_U. When both of these one-sided tests can be statistically rejected, we can conclude that –Δ_L < Δ < Δ_U, or that the observed effect falls within the equivalence bounds and is statistically smaller than any effect deemed worthwhile and considered practically equivalent.^[12] Alternatives to the TOST procedure have been developed as well.^[13] A recent modification to TOST makes the approach feasible in cases of repeated measures and assessing multiple variables.^[14]
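The two one-sided tests can be sketched with the Python standard library alone. The following is a minimal illustration under stated assumptions, not the procedure of any particular package: the names `t_cdf` and `tost_welch` are hypothetical, the bounds are given on the raw scale of the data (not as a standardized d), Welch's unequal-variance t-statistic is used as one possible choice, and the Student t distribution function is approximated by Simpson's rule instead of a statistics library.

```python
import math
from statistics import mean, variance


def t_cdf(t, df, steps=2000):
    """Student t CDF, integrating the density with Simpson's rule."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + math.copysign(area, t)


def tost_welch(x, y, low, high, alpha=0.05):
    """TOST for mean(x) - mean(y) against raw-scale bounds low < high."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    diff = mean(x) - mean(y)
    se = math.sqrt(vx + vy)
    # Welch-Satterthwaite degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    # H01: diff <= low, rejected when the statistic is large
    p_low = 1 - t_cdf((diff - low) / se, df)
    # H02: diff >= high, rejected when the statistic is small
    p_high = t_cdf((diff - high) / se, df)
    p = max(p_low, p_high)  # both one-sided tests must reject
    return {"diff": diff, "df": df, "p_low": p_low, "p_high": p_high,
            "p": p, "equivalent": p < alpha}


# Hypothetical measurements from two groups.
x = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 0.2, -0.05, 0.1, -0.15]
y = [0.0, 0.1, -0.1, 0.05, -0.05, 0.2, -0.2, 0.1, 0.0, -0.1]
print(tost_welch(x, y, low=-0.5, high=0.5))
```

Reporting the larger of the two one-sided p-values reflects the requirement that both null hypotheses be rejected before equivalence is concluded.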

Comparison between t-test and equivalence test

The equivalence test can be derived from the t-test.^[7] Consider a t-test at the significance level α_t-test with a power of 1-β_t-test for a relevant effect size d_r. If Δ = d_r and, in addition, α_equiv.-test = β_t-test and β_equiv.-test = α_t-test, i.e. the error types (type I and type II) are interchanged between the t-test and the equivalence test, then the t-test will yield the same results as the equivalence test. To achieve this for the t-test, either the sample size calculation needs to be carried out correctly, or the t-test significance level α_t-test needs to be adjusted, which is referred to as the revised t-test.^[7] Both approaches have difficulties in practice, since sample size planning relies on unverifiable assumptions about the standard deviation, and the revised t-test yields numerical problems.^[7] Using an equivalence test removes these limitations while preserving the test behavior.

The figure below allows a visual comparison of the equivalence test and the t-test when the sample size calculation is affected by differences between the a priori standard deviation <math display="inline">\sigma</math> and the sample's standard deviation <math display="inline">\widehat{\sigma}</math>, which is a common problem. Using an equivalence test instead of a t-test additionally ensures that α_equiv.-test is bounded. The t-test does not ensure this when <math display="inline">\widehat{\sigma} > \sigma</math>, in which case the type II error can grow arbitrarily large. Conversely, <math display="inline">\widehat{\sigma} < \sigma</math> results in the t-test being stricter than the d_r specified in the planning, which may randomly penalize the sample source (e.g., a device manufacturer). This makes the equivalence test safer to use.

File:T-test vs equivalence test.png

Chances to pass (a) the t-test and (b) the equivalence test, depending on the actual error 𝜇. For more details, see ^[7].


References

[1] Template:Cite journal

[2] Template:Cite journal

[3] Template:Cite book

[4] Template:Cite journal

[5] Template:Cite book

[6] Template:Cite journal

[7] siebert2019: Template:Cite journal

[8] Template:Cite book

[9] Template:Cite journal

[10] Mazzolari2022: Template:Cite journal

[11] Template:Cite journal

[12] Template:Cite journal

[13] Template:Cite book

[14] Template:Cite journal
