English Wikipedia: Exploratory causal analysis

Causal analysis is the field of experimental design and statistical analysis pertaining to establishing cause and effect.^[1]^[2] Exploratory causal analysis (ECA), also known as data causality or causal discovery,^[3] is the use of statistical algorithms to infer associations in observed data sets that are potentially causal under strict assumptions. ECA is a type of causal inference distinct from causal modeling and treatment effects in randomized controlled trials.^[4] It is exploratory research that usually precedes more formal causal research, in the same way that exploratory data analysis often precedes statistical hypothesis testing in data analysis.^[5]^[6]

Motivation

Data analysis is primarily concerned with causal questions.^[3]^[4]^[7]^[8]^[9] For example, did the fertilizer cause the crops to grow?^[10] Or, can a given sickness be prevented?^[11] Or, why is my friend depressed?^[12] The potential outcomes and regression analysis techniques handle such queries when data are collected using designed experiments. Data collected in observational studies require different techniques for causal inference (because of issues such as confounding, for example).^[13] Causal inference techniques used with experimental data require additional assumptions to produce reasonable inferences with observational data.^[14] The difficulty of causal inference under such circumstances is often summed up as "correlation does not imply causation".

Overview

ECA postulates that there exist data analysis procedures performed on specific subsets of variables within a larger set whose outputs might be indicative of causality between those variables.^[3] For example, if we assume every relevant covariate in the data is observed, then propensity score matching can be used to estimate the causal effect of one observational variable on another.^[4] Granger causality can also be used to identify causality between two observational variables under different, but similarly strict, assumptions.^[15]
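The propensity score matching idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the synthetic data, the single binary confounder `c`, the true treatment effect of 2.0, and the helper names. The propensity score is estimated as the treated fraction within each confounder stratum, and each treated unit is matched to the control units with the nearest score (ties are averaged). It is not a substitute for a full matching procedure on real data.

```python
import random
from collections import defaultdict

# Synthetic observational data (assumed for illustration): the binary
# confounder c raises both the chance of treatment t and the outcome y.
rng = random.Random(1)
units = []
for _ in range(4000):
    c = 1 if rng.random() < 0.5 else 0
    t = 1 if rng.random() < (0.8 if c else 0.2) else 0
    y = 2.0 * t + 3.0 * c + rng.gauss(0, 1)  # true treatment effect is 2.0
    units.append((c, t, y))


def mean(values):
    return sum(values) / len(values)


# Propensity score: estimated probability of treatment given the confounder.
treated_count = defaultdict(int)
total_count = defaultdict(int)
for c, t, _ in units:
    total_count[c] += 1
    treated_count[c] += t
propensity = {c: treated_count[c] / total_count[c] for c in total_count}

# Naive comparison of treated and control means ignores the confounder.
naive_effect = mean([y for _, t, y in units if t == 1]) - mean(
    [y for _, t, y in units if t == 0]
)

# Mean control outcome at each propensity score value.
controls_by_score = defaultdict(list)
for c, t, y in units:
    if t == 0:
        controls_by_score[propensity[c]].append(y)
control_mean = {score: mean(ys) for score, ys in controls_by_score.items()}


def matched_control_outcome(score):
    """Outcome of the controls whose propensity score is nearest to `score`."""
    nearest = min(control_mean, key=lambda s: abs(s - score))
    return control_mean[nearest]


# Average effect on the treated, using matched controls as counterfactuals.
matched_effect = mean(
    [y - matched_control_outcome(propensity[c]) for c, t, y in units if t == 1]
)

print(f"naive difference:  {naive_effect:.2f}")
print(f"matched estimate:  {matched_effect:.2f}")
```

In this construction the naive difference is inflated by the confounder, while the matched estimate should land close to the assumed effect of 2.0. The estimate is only trustworthy because the confounder is observed, which is exactly the strict assumption stated above.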

The two broad approaches to developing such procedures are using operational definitions of causality^[5] or verification by "truth" (i.e., explicitly ignoring the problem of defining causality and showing that a given algorithm implies a causal relationship in scenarios when causal relationships are known to exist, e.g., using synthetic data^[3]).

Operational definitions of causality

Clive Granger created the first operational definition of causality in 1969.^[16] Granger made the definition of probabilistic causality proposed by Norbert Wiener operational as a comparison of variances.^[17]
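The comparison of variances can be sketched as follows. This is a minimal illustration, not a full Granger causality test: it assumes a single lag, no intercept, and synthetic data in which `x` drives `y` with a one-step delay, and it reports only the ratio of residual variances rather than a formal test statistic. The function names are hypothetical.

```python
import random


def ols_residual_variance(target, predictors):
    """Least-squares fit of target on one or two predictor columns (no intercept).

    Returns the mean squared residual of the fit.
    """
    if len(predictors) == 1:
        (p,) = predictors
        beta = [sum(a * t for a, t in zip(p, target)) / sum(a * a for a in p)]
    else:
        p, q = predictors
        spp = sum(a * a for a in p)
        sqq = sum(b * b for b in q)
        spq = sum(a * b for a, b in zip(p, q))
        spt = sum(a * t for a, t in zip(p, target))
        sqt = sum(b * t for b, t in zip(q, target))
        det = spp * sqq - spq * spq  # determinant of the 2x2 normal equations
        beta = [(spt * sqq - sqt * spq) / det, (sqt * spp - spt * spq) / det]
    residuals = [
        t - sum(b * col[i] for b, col in zip(beta, predictors))
        for i, t in enumerate(target)
    ]
    return sum(r * r for r in residuals) / len(target)


def granger_variance_ratio(cause, effect):
    """Residual variance using only the effect's past, divided by the
    residual variance when the candidate cause's past is added (lag 1)."""
    present = effect[1:]
    past_effect = effect[:-1]
    past_cause = cause[:-1]
    restricted = ols_residual_variance(present, [past_effect])
    unrestricted = ols_residual_variance(present, [past_effect, past_cause])
    return restricted / unrestricted


# Synthetic data (assumed): x influences y one step later, but not vice versa.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 0.5))

ratio_x_to_y = granger_variance_ratio(x, y)
ratio_y_to_x = granger_variance_ratio(y, x)
print(f"x -> y variance ratio: {ratio_x_to_y:.2f}")
print(f"y -> x variance ratio: {ratio_y_to_x:.2f}")
```

A ratio well above 1 means that the past of the candidate cause reduces the prediction error of the effect, which is the operational criterion. A ratio close to 1 means it adds essentially nothing.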

Some authors prefer using ECA techniques developed using operational definitions of causality because they believe it may help in the search for causal mechanisms.^[5]^[18]

Verification by "truth"

Peter Spirtes, Clark Glymour, and Richard Scheines introduced the idea of explicitly not providing a definition of causality.^[3] Spirtes and Glymour introduced the PC algorithm for causal discovery in 1990.^[19] Many recent causal discovery algorithms follow the Spirtes-Glymour approach to verification.^[20]

Techniques

There are many surveys of causal discovery techniques.^[3]^[5]^[20]^[21]^[22]^[23] This section lists some well-known techniques.

Bivariate (or "pairwise")

Granger causality (there is also the Scholarpedia entry [1])
transfer entropy
convergent cross mapping
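As an illustration of one pairwise technique, transfer entropy from a source series to a target series can be estimated for discrete data by counting. The sketch below is a plug-in estimate with history length 1, applied to assumed synthetic binary series in which `y` is a noisy one-step-delayed copy of `x`. The function name and the data are hypothetical, and practical estimators (such as those in dedicated toolkits) handle longer histories, continuous data, and bias correction.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of lag-1 transfer entropy (in bits) from source to target.

    TE = sum over (y_next, y_now, x_now) of
         p(y_next, y_now, x_now) * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    now_pairs = Counter(zip(target[:-1], source[:-1]))   # (y_now, x_now)
    own_pairs = Counter(zip(target[1:], target[:-1]))    # (y_next, y_now)
    own_now = Counter(target[:-1])                       # y_now
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / now_pairs[(y_now, x_now)]
        p_given_own = own_pairs[(y_next, y_now)] / own_now[y_now]
        te += p_joint * math.log2(p_given_both / p_given_own)
    return te


# Synthetic data (assumed): y copies the previous value of x 90% of the time.
rng = random.Random(0)
n = 5000
x = [rng.randint(0, 1) for _ in range(n)]
y = [0]
for t in range(1, n):
    previous_x = x[t - 1]
    y.append(previous_x if rng.random() < 0.9 else 1 - previous_x)

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE x -> y: {te_x_to_y:.3f} bits")
print(f"TE y -> x: {te_y_to_x:.3f} bits")
```

The asymmetry between the two directions is what a pairwise method reports: the estimate from `x` to `y` should be clearly positive, while the estimate from `y` to `x` should be near zero apart from small-sample bias.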

Multivariate

causation entropy^[24]
PC algorithm^[3]^[25]
FCI algorithm^[3]^[26]
LiNGAM^[27] [2]
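Constraint-based multivariate methods such as the PC algorithm rely on conditional independence tests to decide which edges to remove from a graph. The sketch below shows only that single step, not the full PC algorithm: on assumed synthetic data from a linear Gaussian chain X → Y → Z, it compares the marginal correlation of X and Z with their partial correlation given Y, and computes Fisher z statistics for both. All function names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(a, b, given):
    """Correlation of a and b after linearly controlling for one variable."""
    r_ab = pearson(a, b)
    r_ag = pearson(a, given)
    r_bg = pearson(b, given)
    return (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))


def fisher_z_statistic(r, n, conditioning_size):
    """Approximately standard normal under independence (Gaussian data)."""
    return math.sqrt(n - conditioning_size - 3) * math.atanh(r)


# Synthetic chain (assumed): X -> Y -> Z, with no direct X -> Z link.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]
z = [yi + rng.gauss(0, 1) for yi in y]

r_xz = pearson(x, z)
r_xz_given_y = partial_correlation(x, z, y)
z_marginal = fisher_z_statistic(r_xz, n, 0)
z_conditional = fisher_z_statistic(r_xz_given_y, n, 1)

print(f"corr(X, Z)     = {r_xz:.3f}, z = {z_marginal:.1f}")
print(f"corr(X, Z | Y) = {r_xz_given_y:.3f}, z = {z_conditional:.1f}")
```

X and Z are strongly correlated marginally, but the partial correlation given Y should be close to zero. A constraint-based algorithm would use such a result to drop the X–Z edge, and the conclusion depends on the linearity and Gaussianity assumptions behind the test.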

Many of these techniques are discussed in the tutorials provided by the Center for Causal Discovery (CCD) [3].

Use-case examples

Social science

The PC algorithm has been applied to several different social science data sets.^[3]

Medicine

The PC algorithm has been applied to medical data.^[28] Granger causality has been applied to fMRI data.^[29] CCD tested their tools using biomedical data [4].

Physics

ECA is used in physics to understand the physical causal mechanisms of a system, e.g., in geophysics using the PC-stable algorithm (a variant of the original PC algorithm)^[30] and in dynamical systems using pairwise asymmetric inference (a variant of convergent cross mapping).^[31]

Criticism

There is debate over whether or not the relationships between data found using causal discovery are actually causal.^[3]^[25] Judea Pearl has emphasized that causal inference requires a causal model developed by "intelligence" through an iterative process of testing assumptions and fitting data.^[7]

Responses to this criticism point out that the assumptions used for developing ECA techniques may not hold for a given data set^[3]^[14]^[32]^[33]^[34] and that any causal relationships discovered during ECA are contingent on these assumptions holding true.^[25]^[35]

Software packages

Comprehensive toolkits

Tetrad is an open source GUI-based Java program that provides a collection of causal discovery algorithms.^[36] The algorithm library used by Tetrad is also available as a command-line tool, Python API, and R wrapper.^[37]
Java Information Dynamics Toolkit (JIDT) is an open source Java library for performing information-theoretic causal discovery (e.g., transfer entropy and conditional transfer entropy) [5]. Examples of using the library in MATLAB, GNU Octave, Python, R, Julia and Clojure are provided in the documentation [6].
pcalg is an R package that provides some of the same causal discovery algorithms provided in Tetrad [7].

Specific techniques

Granger causality

R package [8]
Python package [9]

convergent cross mapping

R package [10]

LiNGAM

MATLAB/GNU Octave package [11]

Collections of tools and data are also maintained by the Causality Workbench team [12] and the CCD team [13].

References

[1] Template:Cite journal

[2] Template:Cite journal

[3] Template:Cite book

[4] Template:Cite book

[5] Template:Cite book

[6] Template:Cite book

[7] Template:Cite book

[8] Template:Cite book

[9] Template:Cite book

[10] Template:Cite book

[11] Template:Cite book

[12] Template:Cite book

[13] Template:Cite book

[14] Template:Cite journal

[15] Template:Cite journal

[16] Template:Cite journal

[17] Template:Cite web

[18] Template:Cite book

[19] Template:Cite journal

[20] Template:Cite journal

[21] Template:Cite journal

[22] Template:Cite journal

[23] Template:Cite arXiv

[24] Template:Cite journal

[25] Template:Cite journal

[26] Template:Cite journal

[27] Template:Cite journal

[28] Template:Cite journal

[29] Template:Cite journal

[30] Template:Cite journal

[31] Template:Cite journal

[32] Template:Cite journal

[33] Template:Cite journal

[34] Template:Cite book

[35] Template:Cite book

[36] Template:Cite web

[37] Template:Cite web
