English Wikipedia:Chinese character encoding

In computing, Chinese character encodings can be used to represent text written in the CJK languages (Chinese, Japanese and Korean) and, rarely, obsolete Vietnamese, all of which use Chinese characters. Several general-purpose character encodings accommodate Chinese characters, and some of them were developed specifically for Chinese.

In addition to Unicode (with the set of CJK Unified Ideographs), local encoding systems exist. The two primary "legacy" local encoding systems are the Chinese Guobiao (or GB, "national standard") system, used in Mainland China and Singapore, and the (mainly) Taiwanese Big5 system, used in Taiwan, Hong Kong and Macau. Guobiao is usually displayed using simplified characters and Big5 is usually displayed using traditional characters. There is, however, no mandated connection between the encoding system and the font used to display the characters; font and encoding are usually tied together only for practical reasons.

The issue of which encoding to use can also have political implications, as GB is the official standard of the People's Republic of China and Big5 is a de facto standard of Taiwan.

In contrast to the situation with Japanese, there has been relatively little overt opposition to Unicode, which solves many of the issues involved with GB and Big5. Unicode is widely regarded as politically neutral, has good support for both simplified and traditional characters, and can be easily converted to and from GB and Big5. Furthermore, Unicode has the advantage of not being limited to Chinese, since it contains character codes for (nearly) every language.
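The conversion mentioned above usually goes through Unicode as a pivot: decode the legacy bytes to Unicode text, then encode that text in the other system. The following is a minimal sketch using Python's built-in `big5` and `gb2312` codecs; the sample text is chosen only for illustration and contains characters present in both charsets.

```python
# Transcode between Big5 and GB 2312 by decoding to Unicode and re-encoding.
text = "中文"  # illustrative sample; both characters exist in Big5 and GB 2312

big5_bytes = text.encode("big5")         # legacy Big5 byte sequence
as_unicode = big5_bytes.decode("big5")   # Big5 -> Unicode
gb_bytes = as_unicode.encode("gb2312")   # Unicode -> GB 2312 (EUC-CN form)

# The same text has different byte values in the two systems.
print(big5_bytes.hex(), gb_bytes.hex())
```

A character that is missing from the target charset makes the final `encode` call raise `UnicodeEncodeError`, so this round trip is only lossless for text both systems can represent.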

Guobiao


The Guobiao (GB) line of character encodings starts with the Simplified Chinese charset GB 2312, published in 1980. Two encoding schemes existed for GB 2312: the commonly used 8-bit EUC-CN encoding, which uses one or two bytes per character, and a 7-bit encoding called HZ^[1] for Usenet posts.^[2] A traditional variant called GB/T 12345 was published in 1990.
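The relationship between the two GB 2312 encoding schemes can be illustrated with Python's built-in `gb2312` (EUC-CN) and `hz` codecs. This is only a sketch on an arbitrary sample string, not a description of how either scheme must be implemented.

```python
text = "中文"  # illustrative sample

euc_cn = text.encode("gb2312")  # Python's "gb2312" codec produces the EUC-CN form
hz = text.encode("hz")          # 7-bit HZ form of the same GB 2312 codes

# EUC-CN sets the high bit on both bytes of each two-byte character.
print(euc_cn.hex(), all(b >= 0x80 for b in euc_cn))

# HZ stays within 7-bit ASCII and brackets GB text with the ~{ and ~} escapes.
print(hz, all(b < 0x80 for b in hz))

# Clearing the high bit of the EUC-CN bytes yields the payload carried inside HZ.
payload = bytes(b & 0x7F for b in euc_cn)
print(payload in hz)
```

Because HZ uses only 7-bit values, it could pass through channels that were not 8-bit clean, which is why it suited Usenet posts.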

The EUC-CN form was later extended into GBK to include all Unicode 1.1 CJK Ideographs in 1993, abandoning the ISO-2022 model. As a result, GBK includes Traditional Chinese characters in addition to the simplified ones in GB 2312.^[3] GBK gained popularity through the widespread Code page 936 implementation found in Microsoft Windows 95.
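The difference in coverage can be checked with Python's built-in `gb2312` and `gbk` codecs. The helper `can_encode` and the sample characters below are illustrative assumptions, not part of any standard.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

simplified = "体"   # simplified form
traditional = "體"  # traditional form of the same character

for encoding in ("gb2312", "gbk"):
    print(encoding, can_encode(simplified, encoding), can_encode(traditional, encoding))

# GBK keeps the two-byte layout of EUC-CN for the characters it adds.
print(len(traditional.encode("gbk")))
```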

In 2000, GB 18030 was published as GBK's successor. This new encoding includes a four-byte UTF which encodes all Unicode code points not previously encoded.^[4] In 2005, a revised GB 18030 was published that added reference glyphs for scripts used by ethnic minorities in China, as well as glyphs from CJK Unified Ideographs Extension B, following updates to Unicode.
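The variable width of GB 18030 can be observed with Python's built-in `gb18030` codec. The sample characters are chosen for illustration: an ASCII letter, a common ideograph, a Mongolian letter and an Extension B ideograph.

```python
samples = [
    "A",           # ASCII letter
    "中",          # ideograph already present in GB 2312 and GBK
    "\u1820",      # Mongolian letter A, outside the GBK two-byte area
    "\U00020000",  # first ideograph of CJK Unified Ideographs Extension B
]

# Map each sample character to the number of bytes GB 18030 uses for it.
widths = {ch: len(ch.encode("gb18030")) for ch in samples}
for ch, width in widths.items():
    print(f"U+{ord(ch):04X} -> {width} byte(s)")

# Every sample survives a round trip, unlike with the older GB encodings.
round_trip_ok = all(ch.encode("gb18030").decode("gb18030") == ch for ch in samples)
print(round_trip_ok)
```

One- and two-byte sequences stay compatible with GBK, while the four-byte form covers the remaining Unicode code points.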

Adobe-GB1 is the corresponding PostScript charset for GB encodings.

Big5

The Big5 family of character encodings starts with the initial definition by the consortium of five companies in Taiwan that developed it.^[5] It is a double-byte character set (DBCS) somewhat similar to Shift JIS, often combined with a single-byte character set (SBCS) such as ASCII. A number of vendor and official extensions exist, of which ETEN, HKSCS (Hong Kong) and Big5-2003 (as a part of CNS 11643 by Taiwan) are the best known.^[6] Adobe-CNS1 is the PostScript charset corresponding to the Big5 family of encodings.
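How single-byte ASCII and double-byte Big5 characters share one byte stream can be sketched as follows. The function `split_big5` is a hypothetical helper written for this illustration; it assumes well-formed input and lead bytes in the range 0x81 to 0xFE.

```python
def split_big5(data: bytes) -> list:
    """Split a well-formed Big5 byte string into one- and two-byte units."""
    units = []
    i = 0
    while i < len(data):
        # A byte of 0x81 or above is taken as the lead byte of a two-byte character.
        width = 2 if data[i] >= 0x81 else 1
        units.append(data[i:i + width])
        i += width
    return units

data = "A中文".encode("big5")  # illustrative mix of ASCII and Big5 characters
print(split_big5(data))
```

As in Shift JIS, the second byte of a Big5 character may fall in the ASCII range (0x40 to 0x7E), so a byte stream has to be scanned from a known character boundary rather than inspected byte by byte.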

Conversion

Before GBK, which includes both traditional and simplified characters, conversion between Traditional Chinese and Simplified Chinese charsets was complicated by the need to transcribe text between the two variants of Chinese, because each charset covers many of the other's characters only in its own variant. Conversion between traditional and simplified Chinese is itself often problematic, because simplification in some cases merged two or more distinct traditional characters into a single simplified form. Traditional-to-simplified (many-to-one) conversion is technically simple. The opposite conversion is one-to-many: when traditional characters are assigned to simplified ones, some choices will inevitably be wrong in some usages, so text that has been converted to GB 2312 often cannot be restored without loss. Simplified-to-traditional conversion therefore often requires usage context or common phrase lists to resolve conflicts. This is less of a problem with newer standards such as GBK, GB 18030 and Unicode, which have separate code points for simplified and traditional characters.
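The asymmetry between the two directions can be sketched with a toy converter. The tables below are hypothetical and deliberately tiny; real converters rely on far larger character and phrase dictionaries, and the greedy phrase lookup shown here is only one possible way to use context.

```python
# Hypothetical miniature tables, for illustration only.
T2S = {"頭": "头", "髮": "发", "發": "发", "後": "后"}  # many-to-one
S2T_DEFAULT = {"头": "頭", "发": "發", "后": "後"}       # default choice per character
S2T_PHRASES = {"头发": "頭髮", "皇后": "皇后"}            # phrases that override the default

def to_simplified(text: str) -> str:
    """Traditional -> simplified: a plain per-character lookup is enough."""
    return "".join(T2S.get(ch, ch) for ch in text)

def to_traditional(text: str) -> str:
    """Simplified -> traditional: try known phrases first, then per-character defaults."""
    longest = max(len(phrase) for phrase in S2T_PHRASES)
    out = []
    i = 0
    while i < len(text):
        for n in range(longest, 1, -1):
            chunk = text[i:i + n]
            if chunk in S2T_PHRASES:
                out.append(S2T_PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched: fall back to the default, which may be wrong in context.
            out.append(S2T_DEFAULT.get(text[i], text[i]))
            i += 1
    return "".join(out)

print(to_simplified("頭髮"), to_simplified("發展"))      # both use 发
print(to_traditional("头发"), to_traditional("发展"))    # 髮 versus 發
print(to_traditional("皇后"), to_traditional("以后"))    # 后 versus 後
```

Without the phrase table, `to_traditional("头发")` would pick the default 發 and produce the wrong word, which is the kind of error the paragraph above describes.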

Another issue is that many of the encoding systems are missing characters. Although the missing characters are often literary and rare in ordinary text, this becomes a problem because people's names often contain them. Examples include the Taiwanese politician Wang Chien-shien, whose name contains the character xuān (煊), which is absent from some character sets, and former Premier of the People's Republic of China Zhu Rongji, whose róng (镕) character is not in GB 2312. The newest GB standard, GB 18030, has the complete character repertoire of Unicode 4.0, including the Unihan extensions in the Supplementary Ideographic Plane.^[2]
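The effect of a missing character can be reproduced with Python's built-in codecs. The sketch below assumes that the character in Zhu Rongji's name is 镕 (U+9555), as in 朱镕基, and simply reports which encodings accept the name.

```python
name = "朱镕基"  # Zhu Rongji; assumes the problematic character is 镕 (U+9555)

results = {}
for encoding in ("gb2312", "gbk", "gb18030", "utf-8"):
    try:
        name.encode(encoding)
        results[encoding] = "ok"
    except UnicodeEncodeError as exc:
        # exc.start and exc.end delimit the part of the string that failed.
        results[encoding] = "cannot encode " + name[exc.start:exc.end]

for encoding, status in results.items():
    print(encoding, status)

# A lossy fallback substitutes "?" for the missing character.
lossy = name.encode("gb2312", errors="replace")
print(lossy.decode("gb2312"))
```

Substituting a placeholder keeps the text encodable but damages the name, which is why gaps in a character set matter most for personal names.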

References

[1] Template:IETF RFC
[2] Template:Cite book
[3] Template:Cite web
[4] Authoritative mapping table between GB18030-2000 and Unicode. ICU – International Components for Unicode. 2001-02-21. Accessed 2016-10-13.
[5] Template:Cite web
[6] Template:Cite web

