English Wikipedia: Alternating decision tree

An alternating decision tree (ADTree) is a machine learning method for classification. It generalizes decision trees and has connections to boosting.

An ADTree consists of an alternation of decision nodes, which specify a predicate condition, and prediction nodes, which contain a single number. An instance is classified by an ADTree by following all paths for which all decision nodes are true, and summing any prediction nodes that are traversed.

History

ADTrees were introduced by Yoav Freund and Llew Mason.^[1] However, the algorithm as presented had several typographical errors. Clarifications and optimizations were later presented by Bernhard Pfahringer, Geoffrey Holmes and Richard Kirkby.^[2] Implementations are available in Weka and JBoost.

Motivation

Original boosting algorithms typically used either decision stumps or decision trees as weak hypotheses. As an example, boosting decision stumps creates a set of <math>T</math> weighted decision stumps (where <math>T</math> is the number of boosting iterations), which then vote on the final classification according to their weights. Individual decision stumps are weighted according to their ability to classify the data.

Boosting a simple learner results in an unstructured set of <math>T</math> hypotheses, making it difficult to infer correlations between attributes. Alternating decision trees introduce structure to the set of hypotheses by requiring that each new hypothesis build on a hypothesis that was produced in an earlier iteration. The resulting set of hypotheses can be visualized in a tree based on the relationship between a hypothesis and its "parent."

Another important feature of boosted algorithms is that the data is given a different distribution at each iteration. Instances that are misclassified are given a larger weight while accurately classified instances are given reduced weight.

Alternating decision tree structure

An alternating decision tree consists of decision nodes and prediction nodes. Decision nodes specify a predicate condition. Prediction nodes contain a single number. ADTrees always have prediction nodes as both root and leaves. An instance is classified by an ADTree by following all paths for which all decision nodes are true and summing any prediction nodes that are traversed. This is different from binary classification trees such as CART (Classification and regression tree) or C4.5 in which an instance follows only one path through the tree.
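
The traversal described above can be sketched in Python. This is a minimal illustration, not the Weka or JBoost implementation: the class names `PredictionNode` and `DecisionNode`, the functions `score` and `classify`, the choice to label a score of exactly zero as -1, and the example tree with its thresholds and values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Instance = Dict[str, float]


@dataclass
class PredictionNode:
    value: float
    # A prediction node may have several decision nodes below it;
    # an instance visits every one of them.
    children: List["DecisionNode"] = field(default_factory=list)


@dataclass
class DecisionNode:
    condition: Callable[[Instance], bool]
    if_true: PredictionNode
    if_false: PredictionNode


def score(node: PredictionNode, x: Instance) -> float:
    """Sum every prediction node reached by x, following all applicable paths."""
    total = node.value
    for decision in node.children:
        branch = decision.if_true if decision.condition(x) else decision.if_false
        total += score(branch, x)
    return total


def classify(root: PredictionNode, x: Instance) -> int:
    # Assumption: a score of exactly zero is labeled -1.
    return 1 if score(root, x) > 0 else -1


# Hypothetical tree: the root has two decision nodes, one of them nested deeper.
nested = DecisionNode(lambda x: x["b"] < 2, PredictionNode(-0.2), PredictionNode(0.1))
root = PredictionNode(0.5, [
    DecisionNode(lambda x: x["a"] < 1, PredictionNode(0.3, [nested]), PredictionNode(-0.4)),
    DecisionNode(lambda x: x["c"] < 0, PredictionNode(0.6), PredictionNode(-0.1)),
])

x = {"a": 0.5, "b": 3.0, "c": 1.0}
print(score(root, x), classify(root, x))
```

Unlike a CART-style tree, the instance here contributes along both decision nodes under the root, so four prediction nodes are summed rather than one leaf being selected.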

Example

The following tree was constructed using JBoost on the spambase dataset^[3] (available from the UCI Machine Learning Repository).^[4] In this example, spam is coded as 1 and regular email is coded as -1.

An ADTree for 6 iterations on the Spambase dataset.

The following table contains part of the information for a single instance.

An instance to be classified
Feature	Value
char_freq_bang	0.08
word_freq_hp	0.4
capital_run_length_longest	4
char_freq_dollar	0
word_freq_remove	0.9
word_freq_george	0
Other features	...

The instance is scored by summing all of the prediction nodes through which it passes. In the case of the instance above, the score is calculated as

Score for the above instance
Iteration	0	1	2	3	4	5	6
Instance values	N/A	.08 < .052 = f	.4 < .195 = f	0 < .01 = t	0 < 0.005 = t	N/A	.9 < .225 = f
Prediction	-0.093	0.74	-1.446	-0.38	0.176	0	1.66
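
The arithmetic behind the table can be checked with a few lines of Python. The prediction values are copied from the table; the variable names are illustrative only, and the sketch assumes, as the text does, that a positive total means spam.

```python
# Prediction values for iterations 0 through 6, copied from the table.
predictions = [-0.093, 0.74, -1.446, -0.38, 0.176, 0.0, 1.66]

total = sum(predictions)
label = "spam" if total > 0 else "regular email"
print(f"{total:.3f} -> {label}")
```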

The final score of 0.657 is positive, so the instance is classified as spam. The magnitude of the value is a measure of confidence in the prediction. The original authors list three potential levels of interpretation for the set of attributes identified by an ADTree:

Individual nodes can be evaluated for their own predictive ability.
Sets of nodes on the same path may be interpreted as having a joint effect.
The tree can be interpreted as a whole.

Care must be taken when interpreting individual nodes, as the scores reflect a reweighting of the data in each iteration.

Description of the algorithm

The inputs to the alternating decision tree algorithm are:

A set of inputs <math>(x_1,y_1),\ldots,(x_m,y_m)</math> where <math>x_i</math> is a vector of attributes and <math>y_i</math> is either -1 or 1. Inputs are also called instances.
A set of weights <math>w_i</math> corresponding to each instance.

The fundamental element of the ADTree algorithm is the rule. A single rule consists of a precondition, a condition, and two scores. A condition is a predicate of the form "attribute <comparison> value." A precondition is simply a logical conjunction of conditions. Evaluation of a rule involves a pair of nested if statements:

if (precondition)
    if (condition)
        return score_one
    else
        return score_two
    end if
else
    return 0
end if
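
The nested test above maps directly onto a small Python class. This is a sketch under stated assumptions: the `Rule` class, its field names, and the example predicates and scores are hypothetical and chosen only to exercise the three possible outcomes.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Instance = Dict[str, float]
Predicate = Callable[[Instance], bool]


@dataclass
class Rule:
    precondition: Predicate
    condition: Predicate
    score_one: float  # returned when precondition and condition both hold
    score_two: float  # returned when the precondition holds but the condition fails

    def __call__(self, x: Instance) -> float:
        if not self.precondition(x):
            return 0.0  # the rule abstains on this instance
        return self.score_one if self.condition(x) else self.score_two


# Hypothetical rule: precondition "a < 1", condition "b < 2".
rule = Rule(lambda x: x["a"] < 1, lambda x: x["b"] < 2, score_one=0.3, score_two=-0.2)

print(rule({"a": 0.5, "b": 1.0}))  # precondition and condition hold
print(rule({"a": 0.5, "b": 3.0}))  # precondition holds, condition fails
print(rule({"a": 2.0, "b": 1.0}))  # precondition fails
```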

Several auxiliary functions are also required by the algorithm:

<math>W_+(c)</math> returns the sum of the weights of all positively labeled examples that satisfy predicate <math>c</math>
<math>W_-(c)</math> returns the sum of the weights of all negatively labeled examples that satisfy predicate <math>c</math>
<math>W(c) = W_+(c) + W_-(c)</math> returns the sum of the weights of all examples that satisfy predicate <math>c</math>
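
These three sums can be written as short Python helpers. The function names `w_plus`, `w_minus` and `w_total`, the argument layout (parallel sequences of instances, labels and weights), and the toy data are assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

Instance = Dict[str, float]
Predicate = Callable[[Instance], bool]


def w_plus(c: Predicate, xs: Sequence[Instance], ys: Sequence[int], ws: Sequence[float]) -> float:
    """Total weight of positively labeled instances that satisfy c."""
    return sum(w for x, y, w in zip(xs, ys, ws) if y == 1 and c(x))


def w_minus(c: Predicate, xs: Sequence[Instance], ys: Sequence[int], ws: Sequence[float]) -> float:
    """Total weight of negatively labeled instances that satisfy c."""
    return sum(w for x, y, w in zip(xs, ys, ws) if y == -1 and c(x))


def w_total(c: Predicate, xs: Sequence[Instance], ys: Sequence[int], ws: Sequence[float]) -> float:
    """Total weight of all instances that satisfy c."""
    return w_plus(c, xs, ys, ws) + w_minus(c, xs, ys, ws)


# Toy data: four equally weighted instances with a single attribute "a".
xs = [{"a": 0.2}, {"a": 0.8}, {"a": 0.4}, {"a": 0.9}]
ys = [1, -1, -1, 1]
ws = [0.25, 0.25, 0.25, 0.25]
c = lambda x: x["a"] < 0.5

print(w_plus(c, xs, ys, ws), w_minus(c, xs, ys, ws), w_total(c, xs, ys, ws))
```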

The algorithm is as follows:

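function ad_tree
input Set of <math>m</math> training instances

<math>w_i = 1/m</math> for all <math>i</math>
<math>a = \frac{1}{2} \ln \frac{W_+(\text{true})}{W_-(\text{true})}</math>
<math>R_0 =</math> a rule with scores <math>a</math> and <math>0</math>, precondition "true" and condition "true."
<math>\mathcal{P} = \{\text{true}\}</math>
<math>\mathcal{C} =</math> the set of all possible conditions
for <math>j = 1 \dots T</math>
    <math>p \in \mathcal{P}, c \in \mathcal{C}</math> get values that minimize <math>z = 2 \left( \sqrt{W_+(p \wedge c) W_-(p \wedge c)} + \sqrt{W_+(p \wedge \neg c) W_-(p \wedge \neg c)} \right) + W(\neg p)</math>
    <math>\mathcal{P} \mathrel{+}= p \wedge c + p \wedge \neg c</math>
    <math>a_1 = \frac{1}{2} \ln \frac{W_+(p \wedge c) + 1}{W_-(p \wedge c) + 1}</math>
    <math>a_2 = \frac{1}{2} \ln \frac{W_+(p \wedge \neg c) + 1}{W_-(p \wedge \neg c) + 1}</math>
    <math>R_j =</math> new rule with precondition <math>p</math>, condition <math>c</math>, and weights <math>a_1</math> and <math>a_2</math>
    <math>w_i = w_i e^{-y_i R_j(x_i)}</math>
end for
return set of <math>R_j</math>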
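
The loop can be sketched in Python as follows. This is an illustrative sketch, not the Weka or JBoost code. It assumes the usual ADTree formulation: the split criterion is <math>z = 2(\sqrt{W_+(p \wedge c) W_-(p \wedge c)} + \sqrt{W_+(p \wedge \neg c) W_-(p \wedge \neg c)}) + W(\neg p)</math>, the two scores are smoothed log-ratios of positive to negative weight, and each instance is reweighted by <math>e^{-y_i R_j(x_i)}</math>. It also assumes that the candidate conditions are passed in explicitly and that both classes occur in the training data. All function and variable names and the toy dataset are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Instance = Dict[str, float]
Predicate = Callable[[Instance], bool]
Rule = Tuple[Predicate, Predicate, float, float]  # precondition, condition, a1, a2


def weight_sum(pred: Predicate, xs, ys, ws, label: Optional[int] = None) -> float:
    """W(pred) when label is None, otherwise W_+ (label=1) or W_- (label=-1)."""
    return sum(w for x, y, w in zip(xs, ys, ws)
               if (label is None or y == label) and pred(x))


def evaluate(rule: Rule, x: Instance) -> float:
    pre, cond, a1, a2 = rule
    if not pre(x):
        return 0.0
    return a1 if cond(x) else a2


def ad_tree(xs: Sequence[Instance], ys: Sequence[int],
            conditions: Sequence[Predicate], T: int) -> List[Rule]:
    m = len(xs)
    ws = [1.0 / m] * m
    true: Predicate = lambda x: True
    # Root rule: assumes at least one positive and one negative instance.
    a = 0.5 * math.log(weight_sum(true, xs, ys, ws, 1) / weight_sum(true, xs, ys, ws, -1))
    rules: List[Rule] = [(true, true, a, 0.0)]
    preconditions: List[Predicate] = [true]

    for _ in range(T):
        best = None
        for p in preconditions:
            for c in conditions:
                # Default arguments bind p and c now, avoiding late-binding bugs.
                p_c = lambda x, p=p, c=c: p(x) and c(x)
                p_nc = lambda x, p=p, c=c: p(x) and not c(x)
                not_p = lambda x, p=p: not p(x)
                z = 2 * (math.sqrt(weight_sum(p_c, xs, ys, ws, 1) * weight_sum(p_c, xs, ys, ws, -1))
                         + math.sqrt(weight_sum(p_nc, xs, ys, ws, 1) * weight_sum(p_nc, xs, ys, ws, -1))
                         ) + weight_sum(not_p, xs, ys, ws)
                if best is None or z < best[0]:
                    best = (z, p, c, p_c, p_nc)
        _, p, c, p_c, p_nc = best
        preconditions += [p_c, p_nc]  # the set of preconditions grows by two
        a1 = 0.5 * math.log((weight_sum(p_c, xs, ys, ws, 1) + 1) / (weight_sum(p_c, xs, ys, ws, -1) + 1))
        a2 = 0.5 * math.log((weight_sum(p_nc, xs, ys, ws, 1) + 1) / (weight_sum(p_nc, xs, ys, ws, -1) + 1))
        rule = (p, c, a1, a2)
        rules.append(rule)
        # Increase the weight of instances the new rule gets wrong.
        ws = [w * math.exp(-y * evaluate(rule, x)) for x, y, w in zip(xs, ys, ws)]
    return rules


def predict(rules: Sequence[Rule], x: Instance) -> float:
    return sum(evaluate(r, x) for r in rules)


# Toy, perfectly separable data with two hypothetical candidate conditions.
xs = [{"a": 0.1}, {"a": 0.2}, {"a": 0.8}, {"a": 0.9}]
ys = [1, 1, -1, -1]
conditions = [lambda x: x["a"] < 0.15, lambda x: x["a"] < 0.5]
rules = ad_tree(xs, ys, conditions, T=1)
print([round(predict(rules, x), 3) for x in xs])
```

The exhaustive search over every precondition and condition pair keeps the sketch short; the optimizations by Pfahringer, Holmes and Kirkby cited above address the cost of this step.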

The set <math>\mathcal{P}</math> grows by two preconditions in each iteration, and it is possible to derive the tree structure of a set of rules by making note of the precondition that is used in each successive rule.

Empirical results

Figure 6 in the original paper^[1] demonstrates that ADTrees are typically as robust as boosted decision trees and boosted decision stumps. Typically, equivalent accuracy can be achieved with a much simpler tree structure than that produced by recursive partitioning algorithms.

References

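[1] Freund, Y.; Mason, L. (1999). "The alternating decision tree learning algorithm". Proceedings of the Sixteenth International Conference on Machine Learning (ICML '99).

[2] Pfahringer, B.; Holmes, G.; Kirkby, R. (2001). "Optimizing the Induction of Alternating Decision Trees". Proceedings of the Fifth Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2001).

[3] Spambase data set. UCI Machine Learning Repository.

[4] UCI Machine Learning Repository.

External links

An introduction to Boosting and ADTrees (Has many graphical examples of alternating decision trees in practice).
JBoost software implementing ADTrees.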
