English Wikipedia: Controlling for a variable

In causal models, controlling for a variable means binning data according to measured values of the variable. This is typically done so that the variable can no longer act as a confounder in, for example, an observational study or experiment.
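The binning idea can be illustrated with a minimal simulation sketch. The data, the variable names (`z`, `x`, `y`) and the effect sizes below are all hypothetical, chosen only to show how comparing within bins of a confounder differs from a naive comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data: z is a binary confounder that raises both the chance of
# treatment x and the outcome y. The true effect of x on y is set to 1.0.
z = rng.integers(0, 2, size=n)
x = (rng.random(n) < 0.2 + 0.6 * z).astype(int)
y = 1.0 * x + 2.0 * z + rng.normal(0.0, 1.0, size=n)


def mean_difference(outcome, treatment):
    """Difference in mean outcome between treated (1) and untreated (0)."""
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()


# Naive comparison ignores z, so the effect of z leaks into the estimate.
naive = mean_difference(y, x)

# Controlling for z: compare within each bin of z, then average the
# within-bin differences weighted by the share of data in each bin.
adjusted = sum(
    mean_difference(y[z == level], x[z == level]) * np.mean(z == level)
    for level in (0, 1)
)

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

Under these assumptions the adjusted estimate should land near the simulated effect of 1.0, while the naive estimate should be noticeably larger.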

When estimating the effect of explanatory variables on an outcome by regression, controlled-for variables are included as inputs in order to separate their effects from the explanatory variables.^[1]

A limitation of controlling for variables is that a causal model is needed to identify important confounders (the backdoor criterion is used for this identification). Without one, a possible confounder might remain unnoticed. Another associated problem is that controlling for a variable which is not a real confounder may turn other variables (possibly not taken into account) into confounders when they were not confounders before. In other cases, controlling for a non-confounding variable may cause underestimation of the true causal effect of the explanatory variables on an outcome (e.g. when controlling for a mediator or its descendant).^[2]^[3] Counterfactual reasoning mitigates the influence of confounders without this drawback.^[3]
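The underestimation caused by controlling for a mediator can be sketched with simulated data. Everything here is an assumption made for illustration: the causal chain `x -> m -> y`, the coefficients, and the helper `ols_coefficients` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical causal chain x -> m -> y with no direct path from x to y.
# The total effect of x on y is therefore 0.8 * 1.5 = 1.2.
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)
y = 1.5 * m + rng.normal(size=n)


def ols_coefficients(outcome, *regressors):
    """Least-squares coefficients of outcome on an intercept plus regressors."""
    design = np.column_stack([np.ones(len(outcome)), *regressors])
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefficients


# Regressing y on x alone recovers the total effect of x.
total_effect = ols_coefficients(y, x)[1]

# Adding the mediator m as a control absorbs the effect that flows through
# it, so the coefficient on x shrinks toward zero.
over_controlled = ols_coefficients(y, x, m)[1]

print(f"total: {total_effect:.2f}, with mediator: {over_controlled:.2f}")
```

In this sketch the second regression understates the effect of `x` because all of that effect passes through `m`.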

Experiments

Experiments attempt to assess the effect of manipulating one or more independent variables on one or more dependent variables. To ensure the measured effect is not influenced by external factors, other variables must be held constant. The variables made to remain constant during an experiment are referred to as control variables.

For example, if an outdoor experiment were to be conducted to compare how different wing designs of a paper airplane (the independent variable) affect how far it can fly (the dependent variable), one would want to ensure that the experiment is conducted at times when the weather is the same, because one would not want weather to affect the experiment. In this case, the control variables may be wind speed, direction and precipitation. If the experiment were conducted when it was sunny with no wind, but the weather changed, one would want to postpone the completion of the experiment until the control variables (the wind and precipitation level) were the same as when the experiment began.

In controlled experiments of medical treatment options on humans, researchers randomly assign individuals to a treatment group or control group. This is done to reduce the confounding effect of irrelevant variables that are not being studied, such as the placebo effect.
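A small sketch of why random assignment helps: because group membership is decided without reference to any background variable, such variables tend to be balanced across groups. The background variable `age` and its distribution are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical trial: age is a background variable that is not being studied.
age = rng.normal(50.0, 10.0, size=n)

# Random assignment: shuffle an equal number of treatment and control labels.
treated = rng.permutation(np.repeat([True, False], n // 2))

# Assignment ignores age, so the two groups should have similar mean ages.
age_gap = age[treated].mean() - age[~treated].mean()

print(f"treated: {treated.sum()}, control: {(~treated).sum()}")
print(f"difference in mean age: {age_gap:.2f}")
```

The gap in mean age is expected to be small relative to the spread of ages, and it shrinks further as the sample grows.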

Observational studies

In an observational study, researchers have no control over the values of the independent variables, such as who receives the treatment. Instead, they must control for variables using statistics.

Observational studies are used when controlled experiments may be unethical or impractical. For instance, if a researcher wished to study the effect of unemployment (the independent variable) on health (the dependent variable), it would be considered unethical by institutional review boards to randomly assign some participants to have jobs and some not to. Instead, the researcher will have to create a sample which includes some employed people and some unemployed people. However, there could be factors that affect both whether someone is employed and how healthy he or she is. Part of any observed association between the independent variable (employment status) and the dependent variable (health) could be due to these outside, spurious factors rather than indicating a true link between them. This can be problematic even in a true random sample. By controlling for the extraneous variables, the researcher can come closer to understanding the true effect of the independent variable on the dependent variable.

In this context the extraneous variables can be controlled for by using multiple regression. The regression uses as independent variables not only the one or ones whose effects on the dependent variable are being studied, but also any potential confounding variables, thus avoiding omitted variable bias. "Confounding variables" in this context means other factors that not only influence the dependent variable (the outcome) but also influence the main independent variable.^[3]
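The omitted variable bias described above can be sketched with a simulated data set. The variable names and coefficients are hypothetical, and `numpy.linalg.lstsq` stands in for whatever regression software a study would actually use.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# Hypothetical data: the confounder influences both the main independent
# variable and the outcome. The true effect of treatment on outcome is 0.5.
confounder = rng.normal(size=n)
treatment = 0.7 * confounder + rng.normal(size=n)
outcome = 0.5 * treatment + 1.0 * confounder + rng.normal(size=n)

intercept = np.ones(n)

# Short regression: the confounder is omitted, so the estimate is biased.
short_design = np.column_stack([intercept, treatment])
short_fit, *_ = np.linalg.lstsq(short_design, outcome, rcond=None)

# Long regression: the confounder is included as a control variable.
long_design = np.column_stack([intercept, treatment, confounder])
long_fit, *_ = np.linalg.lstsq(long_design, outcome, rcond=None)

biased_estimate = short_fit[1]
controlled_estimate = long_fit[1]

print(f"omitting confounder: {biased_estimate:.2f}")
print(f"controlling for it:  {controlled_estimate:.2f}")
```

With the confounder included, the coefficient on `treatment` should be close to the simulated value of 0.5. Without it, the coefficient also picks up part of the confounder's influence.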

OLS Regressions and control variables

The simplest examples of control variables in regression analysis come from Ordinary Least Squares (OLS) estimators. The OLS framework assumes the following:

Linear relationship - OLS statistical models are linear. Hence the relationship between the explanatory variables and the mean of Y must be linear.
Homoscedasticity - This requires homogeneity of variance, that is, equal or similar error variances across the data.
Independence/No autocorrelation - The error term of one observation cannot be influenced by the error terms of other observations.
Normality of errors - The errors are jointly normal and uncorrelated, which implies that the error terms <math>(\epsilon_i)_{i\in N}</math> are independent and identically distributed (iid). This implies that the unobservables between different groups or observations are independent.
No multicollinearity - Independent variables must not be highly correlated with each other. For regressions using matrix notation, the design matrix <math>X</math> must have full column rank, i.e. <math>X^{'}X</math> is invertible.

Accordingly, a control variable can be interpreted as a linear explanatory variable that affects the mean value of Y (Assumption 1), but which does not represent the primary variable of investigation, and which also satisfies the other assumptions above.^[4]
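The no-multicollinearity assumption listed above can be checked numerically. The sketch below is one possible check, using hypothetical regressors `x1` and `x2` and a hypothetical helper `has_full_column_rank`. It tests for exact linear dependence only, and would not flag regressors that are highly but not perfectly correlated.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Full-rank design: an intercept plus two unrelated regressors.
full_rank_design = np.column_stack([np.ones(n), x1, x2])

# Rank-deficient design: the last column is an exact linear combination
# of the other two, so X'X cannot be inverted.
collinear_design = np.column_stack([np.ones(n), x1, 2.0 * x1 + 3.0])


def has_full_column_rank(design):
    """True when the columns of the design matrix are linearly independent."""
    return bool(np.linalg.matrix_rank(design) == design.shape[1])


print(has_full_column_rank(full_rank_design))
print(has_full_column_rank(collinear_design))
```

A control variable that is an exact linear function of the other regressors, like the last column of `collinear_design`, would violate this assumption.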

Example

Consider a study about whether getting older affects someone's life satisfaction. (Some researchers perceive a "u-shape": life satisfaction appears to decline first and then rise after middle age.^[5]) To identify the control variables needed here, one could ask what other variables determine not only someone's life satisfaction but also their age. Many other variables determine life satisfaction. But no other variable determines how old someone is (as long as they remain alive). (All people keep getting older, at the same rate, no matter what their other characteristics.) So, no control variables are needed here.^[6]

To determine the needed control variables, it can be useful to construct a directed acyclic graph.^[3]
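As a rough sketch, a directed acyclic graph can be stored as a mapping from each variable to its direct causes. The graph below is a hypothetical one for the age example, with `income` and `health` as assumed variables. The helper looks only for common causes, which is a simplification and not a full implementation of the backdoor criterion.

```python
# Hypothetical DAG: each key is a variable, each value lists its direct causes.
parents = {
    "age": [],
    "income": ["age"],
    "health": ["age"],
    "life_satisfaction": ["age", "income", "health"],
}


def ancestors(node, parents, skip=None):
    """All variables with a directed path into node that avoids skip."""
    found = set()
    stack = [p for p in parents[node] if p != skip]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(p for p in parents[current] if p != skip)
    return found


def common_causes(treatment, outcome, parents):
    """Causes of the treatment that also reach the outcome by another route."""
    causes_of_treatment = ancestors(treatment, parents)
    causes_of_outcome = ancestors(outcome, parents, skip=treatment)
    return causes_of_treatment & causes_of_outcome


# Nothing in this graph causes age, so no control variables are suggested.
print(common_causes("age", "life_satisfaction", parents))  # set()
```

Asking the same question about `income` instead of `age` would return `{"age"}` in this hypothetical graph, since age influences both income and life satisfaction.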

