English Wikipedia: Chinese character sets

A Chinese character set is a group of Chinese characters. Since the size of a set is the number of elements in it, an introduction to Chinese character sets also introduces the number of characters each set contains.

There are different Chinese character sets for different purposes. The following is an introduction to some representative character sets in history, in modern languages and in information technology.

Historical development

As writing systems developed, the number of Chinese characters kept growing, as shown by the character sets of successive dictionaries.

Number of characters in monolingual Chinese dictionaries
Year	Dictionary	Number of characters
300 BC	Erya	4,300
100 AD	Shuowen Jiezi	9,516
230	Shenglei	11,520
350	Zilin	12,824
543	Yupian	16,917
601	Qieyun	12,150
732	Tangyun	15,000
753	Yunhai jingyuan	26,911
997	Longkan Shoujian	26,430
1011	Guangyun	26,194
1039	Jiyun	53,525
1066	Leipian	31,319
1615	Zihui	33,179
1675	Zhengzitong	33,440
1716	Kangxi Dictionary	46,933
1915	Zhonghua Da Zidian	48,200
1968	Zhongwen Da Cidian	49,888
1989	Hanyu Da Zidian	54,678
1994	Zhonghua Zihai	85,568
2017	Dictionary of Chinese Character Variants	106,330^[1]

Number of characters in bilingual Chinese dictionaries
Year	Dictionary	Language	Number of characters
2003	ABC Chinese–English Comprehensive Dictionary	English	9,638
2003	Dai Kan-Wa Jiten	Japanese	50,305^[2]
2008	Han-Han Dae Sajeon	Korean	53,667^[3]

The total number of Chinese characters is well above 100,000 if variants are counted, as the tables above show.

Modern Chinese characters

Because languages keep developing, there is no definite number of modern Chinese characters. However, a reasonable estimate can be made by surveying the character sets of the relevant standard lists and influential dictionaries in the countries and regions where Chinese characters are used.

Mainland China

The important standards in the People's Republic of China include the List of Frequently Used Characters in Modern Chinese (现代汉语常用字表, with 3,500 characters)^[4] and the List of Commonly Used Characters in Modern Chinese (现代汉语通用字表, with 7,000 characters, including the 3,500 characters in the previous list).^[5] The current standard, however, is the Table of General Standard Chinese Characters, which was released by the State Council in June 2013 to replace the previous two lists and some other standards. It includes 8,105 characters of the Simplified Chinese writing system: 3,500 primary, 3,000 secondary, and 1,605 tertiary. In addition, there are 2,574 Traditional characters and 1,023 variants.^[6]

From 1990 to 1991, the National Leading Group for Teaching Chinese as a Foreign Language and the Chinese Proficiency Test Center of Beijing Language Institute jointly developed the "汉语水平词汇与汉字等级大纲" (Outline of the Graded Vocabulary and Characters for HSK). The character outline contains 2,905 characters, divided into four grades: 800 Grade A characters, 804 Grade B characters, 601 Grade C characters, and 700 Grade D characters.

The most popular modern Chinese character dictionary and word dictionary are Xinhua Zidian and Xiandai Hanyu Cidian, respectively. Each includes over 13,000 characters, comprising Simplified characters, Traditional characters and some variants.

Taiwan

In Taiwan, there are the Chart of Standard Forms of Common National Characters, with 4,808 characters, and the Chart of Standard Forms of Less-Than-Common National Characters, with 6,341 characters. Both lists were released by the Ministry of Education, and together they contain 11,149 characters of the Traditional Chinese writing system.

Hong Kong

In Hong Kong, there is the List of Graphemes of Commonly-Used Chinese Characters for elementary and junior secondary education, with a total of 4,762 characters. This list was released by the Education Bureau and is very influential in educational circles.

Japan

In Japan, there are the jōyō kanji (frequently used Chinese characters designated by the Japanese Ministry of Education, comprising 2,136 characters) and the jinmeiyō kanji (for use in personal names, currently comprising 983 characters).

Korea

In Korea, there are the Basic Hanja for educational use (a subset of 1,800 hanja defined in 1972 by a South Korean educational standard) and the Table of Hanja for Personal Name Use, published by the Supreme Court of Korea in March 1991.^[7] The list expanded gradually, and by 2015 there were 8,142 hanja permitted for use in Korean names.^[8]

Overall estimates

Taking all the character sets mentioned above into account, the total number of modern Chinese characters in the world is over 10,000, probably around 15,000. This estimate is reasonably precise, considering that there are over 100,000 Chinese characters in total, as mentioned above.

A college graduate who is literate in written Chinese knows between three and four thousand characters. Specialists in classical literature or history, who often encounter characters no longer in use, are estimated to have a working vocabulary of between 5,000 and 6,000 characters.

Information technology

The following sections will introduce the Chinese character sets of some encoding standards used in information technology, including GB, Big5 and Unicode.

The GB standard

GB stands for Guobiao, short for Guojia Biaozhun (国家标准, 'national standard') in Putonghua, and is the prefix for the reference numbers of official standards issued by the People's Republic of China.

The first GB Chinese character encoding standard is GB2312, which was released in 1980. It includes 6,763 Chinese characters: 3,755 frequently used ones sorted by Pinyin, and the rest sorted by radicals (indexing components). GB2312 was designed for simplified Chinese characters; traditional characters that have been simplified are not covered. GB2312 is still in use on some computers and on the Web, though newer standards with extended character sets, such as GB13000.1 and GB18030, have been released. The latest GB encoding is GB18030, which supports both simplified and traditional Chinese characters and is consistent with Unicode's character set.^[9]
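The difference in coverage between GB2312 and GB18030 can be illustrated with a minimal Python sketch. It assumes the `gb2312` and `gb18030` codecs shipped with CPython behave like the standards described above; the helper name `can_encode` and the sample characters are chosen here purely for illustration.

```python
def can_encode(ch: str, codec: str) -> bool:
    """Hypothetical helper: True if `codec` has a code for `ch`."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


simplified = "汉"   # simplified form of "Han"
traditional = "漢"  # traditional form of the same character

# GB2312 stores each Chinese character in two bytes.
gb2312_bytes = simplified.encode("gb2312")
print(gb2312_bytes.hex())

# The traditional form was simplified, so GB2312 has no code for it,
# while GB18030 covers both forms.
print(can_encode(traditional, "gb2312"))
print(can_encode(traditional, "gb18030"))
```

The same check can be run on any other character to see which of the two standards covers it.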

The Big5 standard

Big5 encoding was designed by five big IT companies in Taiwan in the early 1980s, and has been the de facto standard for representing traditional Chinese on computers ever since. Big5 is widely used in Taiwan, Hong Kong and Macau. The original Big5 standard included 13,053 Chinese characters, with none of the simplified characters of the Mainland. Chinese characters in the Big5 character set are arranged in radical order. Extended versions of Big5 include Big-5E and Big5-2003, which add some simplified characters and Hong Kong Cantonese characters.^[10]
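As a rough illustration of the point that the original Big5 set carries traditional but not Mainland simplified forms, the following Python sketch uses CPython's built-in `big5` codec. The helper `big5_has` and the two sample characters are illustrative assumptions, not part of any standard.

```python
def big5_has(ch: str) -> bool:
    """Hypothetical helper: True if the Big5 codec can encode `ch`."""
    try:
        ch.encode("big5")
        return True
    except UnicodeEncodeError:
        return False


traditional = "漢"  # traditional form, expected to be in Big5
simplified = "汉"   # Mainland simplified form, expected to be absent

print(big5_has(traditional))
print(big5_has(simplified))

# Characters that Big5 does cover survive an encode/decode round trip.
round_trip = traditional.encode("big5").decode("big5")
print(round_trip == traditional)
```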

The Unicode standard

Unicode is the most influential international standard for multilingual character encoding. It is consistent with (or virtually equivalent to) the standard ISO/IEC 10646. The full version of Unicode can represent a character with a 4-byte code, providing a huge encoding space to cover all characters of all languages in the world. The Basic Multilingual Plane (BMP) is the 2-byte core of Unicode, with 2^16 = 65,536 code points for important characters of many languages. There are 27,522 characters in the CJKV (China, Japan, Korea and Vietnam) Ideographs Area, including all the simplified and traditional Chinese characters in GB2312 and Big5.
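The 2-byte and 4-byte sizes mentioned above can be checked with a short Python sketch. It takes UTF-32 as the 4-byte form and UTF-16 as the 2-byte form for BMP characters, which is an interpretation of the text rather than something the text states; the names `BMP_SIZE` and `in_bmp` are illustrative.

```python
BMP_SIZE = 2 ** 16  # 65,536 code points in the Basic Multilingual Plane


def in_bmp(ch: str) -> bool:
    """Hypothetical helper: True if the character's code point lies in the BMP."""
    return ord(ch) < BMP_SIZE


common = "中"          # U+4E2D, a common character inside the BMP
rare = "\U00020000"    # U+20000, a rare ideograph outside the BMP

print(BMP_SIZE)
print(in_bmp(common), in_bmp(rare))

# UTF-32 always spends 4 bytes per character.
utf32_len = len(common.encode("utf-32-be"))

# UTF-16 spends 2 bytes inside the BMP and 4 bytes (a surrogate pair) outside it.
utf16_common_len = len(common.encode("utf-16-be"))
utf16_rare_len = len(rare.encode("utf-16-be"))
print(utf32_len, utf16_common_len, utf16_rare_len)
```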

In Unicode 15.1, there is a multilingual character set of 149,813 characters, among which 98,682 (about two thirds) are Chinese characters sorted by Kangxi radicals (康熙部首). Even very rarely used characters are available.^[11]
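The proportion quoted above, and the idea of radical ordering, can be spot-checked in Python. This is only a sketch: it tests the first five Kangxi radicals in the main ideograph block, not the whole repertoire, and the variable names are illustrative.

```python
# Figures quoted in the text above.
total_characters = 149_813
chinese_characters = 98_682

share = chinese_characters / total_characters
print(f"{share:.1%}")  # roughly two thirds

# Kangxi radicals 1 to 5 as ordinary ideographs, written as escapes
# so the exact code points are unambiguous.
first_radicals = [
    "\u4e00",  # 一, radical 1
    "\u4e28",  # 丨, radical 2
    "\u4e36",  # 丶, radical 3
    "\u4e3f",  # 丿, radical 4
    "\u4e59",  # 乙, radical 5
]

# If the block is arranged by radical, code points rise with the radical number.
code_points = [ord(ch) for ch in first_radicals]
in_radical_order = code_points == sorted(code_points)
print(in_radical_order)
```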

All 5,009 characters of the Hong Kong Supplementary Character Set (HKSCS)^[12] are included in Unicode. HKSCS was developed by the Hong Kong government as a collection of locally specific Chinese characters that were not available on computers in the early days.

Unicode is becoming more and more popular. UTF-8, an encoding of Unicode, is reported to be used by 98.1% of all websites. It is widely believed that Unicode will ultimately replace all other information interchange codes and internal codes for digital devices.^[13]
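To make the role of UTF-8 concrete, the sketch below shows how many bytes it spends on an ASCII letter, a common Chinese character and a rare ideograph outside the BMP. The sample characters and the helper name `utf8_len` are illustrative choices.

```python
def utf8_len(ch: str) -> int:
    """Hypothetical helper: number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


ascii_letter = "A"       # U+0041
common = "中"            # U+4E2D, inside the BMP
rare = "\U00020000"      # U+20000, outside the BMP

# UTF-8 is variable-length: 1 byte for ASCII, 3 bytes for most Chinese
# characters, and 4 bytes for ideographs beyond the BMP.
print(utf8_len(ascii_letter), utf8_len(common), utf8_len(rare))

common_bytes = common.encode("utf-8")
print(common_bytes.hex())
```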

References

[1] Template:Cite web
[2] Template:Cite web
[3] Template:Cite web
[4] 现代汉语常用字表 [List of Frequently Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988.
[5] 现代汉语通用字表 [List of Commonly Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988.
[6] Template:Cite web
[7] National Academy of the Korean Language (1991).
[8] Template:Cite news
[9] Template:Cite web
[10] Template:Cite web
[11] https://www.unicode.org/versions/stats/
[12] http://www.ogcio.gov.hk/en/business/tech_promotion/ccli/hkscs/
[13] https://w3techs.com/technologies/details/en-utf8
