Английская Википедия:Chomsky–Schützenberger enumeration theorem

In formal language theory, the Chomsky–Schützenberger enumeration theorem is a theorem derived by Noam Chomsky and Marcel-Paul Schützenberger about the number of words of a given length generated by an unambiguous context-free grammar. The theorem provides an unexpected link between the theory of formal languages and abstract algebra.

Statement

In order to state the theorem, a few notions from algebra and formal language theory are needed.

Let <math>\mathbb{N}</math> denote the set of nonnegative integers. A power series over <math>\mathbb{N}</math> is an infinite series of the form

<math>f = f(x) = \sum_{k=0}^\infty a_k x^k = a_0 + a_1 x^1 + a_2 x^2 + a_3 x^3 + \cdots</math>

with coefficients <math>a_k</math> in <math>\mathbb{N}</math>. The multiplication of two formal power series <math>f</math> and <math>g</math> is defined in the expected way as the convolution of the sequences <math>a_n</math> and <math>b_n</math>:

<math>f(x)\cdot g(x) = \sum_{k=0}^\infty \left(\sum_{i=0}^k a_i b_{k-i}\right) x^k.</math>

In particular, we write <math>f^2 = f(x)\cdot f(x)</math>, <math>f^3 = f(x)\cdot f(x)\cdot f(x)</math>, and so on. In analogy to algebraic numbers, a power series <math>f(x)</math> is called algebraic over <math>\mathbb{Q}(x)</math>, if there exists a finite set of polynomials <math>p_0(x), p_1(x), p_2(x), \ldots , p_n(x)</math> each with rational coefficients such that

<math>p_0(x) + p_1(x) \cdot f + p_2(x)\cdot f^2 + \cdots + p_n(x)\cdot f^n = 0.</math>

A context-free grammar is said to be unambiguous if every string generated by the grammar admits a unique parse tree or, equivalently, only one leftmost derivation. Having established the necessary notions, the theorem is stated as follows.

Chomsky–Schützenberger theorem. If <math>L</math> is a context-free language admitting an unambiguous context-free grammar, and <math>a_k := | L \ \cap \Sigma^k |</math> is the number of words of length <math>k</math> in <math>L</math>, then <math>G(x)=\sum_{k = 0}^\infty a_k x^k</math> is a power series over <math>\mathbb{N}</math> that is algebraic over <math>\mathbb{Q}(x)</math>.

Proofs of this theorem are given by Шаблон:Harvtxt, and by Шаблон:Harvtxt.

Usage

Asymptotic estimates

The theorem can be used in analytic combinatorics to estimate the number of words of length n generated by a given unambiguous context-free grammar, as n grows large. The following example is given by Шаблон:Harvtxt: the unambiguous context-free grammar G over the alphabet {0,1} has start symbol S and the following rules

S → M | U

M → 0M1M | ε

U → 0S | 0M1U.

To obtain an algebraic representation of the power series Шаблон:Tmath associated with a given context-free grammar G, one transforms the grammar into a system of equations. This is achieved by replacing each occurrence of a terminal symbol by x, each occurrence of ε by the integer '1', each occurrence of '→' by '=', and each occurrence of '|' by '+', respectively. The operation of concatenation at the right-hand-side of each rule corresponds to the multiplication operation in the equations thus obtained. This yields the following system of equations:

S = M + U

M = M²x² + 1

U = Sx + MUx²

In this system of equations, S, M, and U are functions of x, so one could also write Шаблон:Tmath, Шаблон:Tmath, and Шаблон:Tmath. The equation system can be resolved after S, resulting in a single algebraic equation:

Шаблон:Tmath.

This quadratic equation has two solutions for S, one of which is the algebraic power series Шаблон:Tmath. By applying methods from complex analysis to this equation, the number <math>a_n</math> of words of length n generated by G can be estimated, as n grows large. In this case, one obtains <math>a_n \in O(2+\epsilon)^n</math> but <math>a_n \notin O(2-\epsilon)^n</math> for each <math>\epsilon>0</math>.^[1]

The following example is from Шаблон:Harvtxt:<math display="block"> \left\{\begin{array} { l } { S \rightarrow X Y } \\ { T \rightarrow a T | T b T | Y c Y } \\ { Y \rightarrow Y a Y | c Y | a b T a Y Y a | X } \\ { X \rightarrow a | b | c } \end{array} \Rightarrow \left\{\begin{array}{l} s(z)=x(z) y(z) \\ t(z)=z t(z)+z t(z)^2+z y(z)^2 \\ y(z)=z y(z)^2+z y(z)+z^4 t(z) y(z)^2+x(z) \\ x(z)=3 z \end{array}\right.\right. </math>which simplifies to<math display="block"> s(z)^8-27\left(z^3-z^2\right) s(z)^5+\ldots+59049 z^{10}=0 </math>

Inherent ambiguity

In classical formal language theory, the theorem can be used to prove that certain context-free languages are inherently ambiguous. For example, the Goldstine language <math>L_G</math> over the alphabet <math>\{a,b\}</math> consists of the words <math>a^{n_1}ba^{n_2}b\cdots a^{n_p}b</math> with <math>p\ge 1</math>, <math>n_i>0</math> for <math>i \in \{1,2,\ldots,p\}</math>, and <math>n_j \neq j</math> for some <math>j \in \{1,2,\ldots,p\}</math>.

It is comparably easy to show that the language <math>L_G</math> is context-free.Шаблон:Sfnp The harder part is to show that there does not exist an unambiguous grammar that generates <math>L_G</math>. This can be proved as follows: If <math>g_k</math> denotes the number of words of length <math>k</math> in <math>L_G</math>, then for the associated power series holds <math>G(x) = \sum_{k=0}^\infty g_k x^k = \frac{1-x}{1-2x}- \frac1x \sum_{k \ge 1} x^{k(k+1)/2-1} </math>. Using methods from complex analysis, one can prove that this function is not algebraic over <math>\mathbb{Q}(x)</math>. By the Chomsky-Schützenberger theorem, one can conclude that <math>L_G</math> does not admit an unambiguous context-free grammar.^[2]

Notes

Шаблон:Reflist

References

Шаблон:Refbegin

Шаблон:Refend

Шаблон:Noam Chomsky

↑ See Шаблон:Harvtxt for a detailed exposition.
↑ See Шаблон:Harvtxt for detailed account.

[1] See Шаблон:Harvtxt for a detailed exposition.

[2] See Шаблон:Harvtxt for detailed account.

[1]

[2]

Партнерские ресурсы
Криптовалюты	Обмен криптовалют - www.bestchange.ru Криптовалютная биржа CoinEx Криптовалютная биржа Binance HIVE OS - операционная система для майнинга e4pool - Мультивалютный пул для майнинга.
Магазины	AliExpress — глобальная виртуальная (в Интернете) торговая площадка, предоставляющая возможность покупать товары производителей из КНР; computeruniverse.net - Интернет-магазин компьютеров(Промо код 5 Евро на первую покупку:FWWC3ZKQ);
Хостинг	DigitalOcean - американский провайдер облачных инфраструктур, с главным офисом в Нью-Йорке и с центрами обработки данных по всему миру;
Разное	Викиум - Онлайн-тренажер для мозга Like Центр - Центр поддержки и развития предпринимательства. Gamersbay - лучший магазин по бустингу для World of Warcraft. Ноотропы OmniMind N°1 - Усиливает мозговую активность. Повышает мотивацию. Улучшает память. Санкт-Петербургская школа телевидения - это федеральная сеть образовательных центров, которая имеет филиалы в 37 городах России. Lingualeo.com — интерактивный онлайн-сервис для изучения и практики английского языка в увлекательной игровой форме. Junyschool (Джунискул) – международная школа программирования и дизайна для детей и подростков от 5 до 17 лет, где ученики осваивают компьютерную грамотность, развивают алгоритмическое и креативное мышление, изучают основы программирования и компьютерной графики, создают собственные проекты: игры, сайты, программы, приложения, анимации, 3D-модели, монтируют видео. Умназия - Интерактивные онлайн-курсы и тренажеры для развития мышления детей 6-13 лет SkillBox - это один из лидеров российского рынка онлайн-образования. Среди партнеров Skillbox ведущий разработчик сервисного дизайна AIC, медиа-компания Yoola, первое и самое крупное русскоязычное аналитическое агентство Tagline, онлайн-школа дизайна и иллюстрации Bang! Bang! Education, оператор PR-рынка PACO, студия рисования Draw&Go, агентство performance-маркетинга Ingate, scrum-студия Sibirix, имидж-лаборатория Персона. «Нетология» — это университет по подготовке и дополнительному обучению специалистов в области интернет-маркетинга, управления проектами и продуктами, дизайна, Data Science и разработки. В рамках Нетологии студенты получают ценные теоретические знания от лучших экспертов Рунета, выполняют практические задания на отработку полученных навыков, общаются с экспертами и единомышленниками. Познакомиться со всеми продуктами подробнее можно на сайте https://netology.ru, линейка курсов и профессий постоянно обновляется. StudyBay Brazil – это онлайн биржа для португалоговорящих студентов и авторов! Студент получает уникальную работу любого уровня сложности и больше свободного времени, в то время как у автора появляется дополнительный заработок и бесценный опыт. Автор24 — самая большая в России площадка по написанию учебных работ: контрольные и курсовые работы, дипломы, рефераты, решение задач, отчеты по практике, а так же любой другой вид работы. Сервис сотрудничает с более 70 000 авторов. Более 1 000 000 работ уже выполнено. StudyBay – это онлайн биржа для англоязычных студентов и авторов! Студент получает уникальную работу любого уровня сложности и больше свободного времени, в то время как у автора появляется дополнительный заработок и бесценный опыт.

Английская Википедия:Chomsky–Schützenberger enumeration theorem

Содержание

Statement

Usage

Asymptotic estimates

Inherent ambiguity

Notes

References

Навигация

Действия на странице

Действия на странице

Персональные инструменты

Навигация

Поиск

Инструменты