Internationalized domain name
An internationalized domain name (IDN) is an Internet domain name that contains at least one label displayed in software applications, in whole or in part, in a non-Latin script or alphabet, or in Latin-alphabet characters with diacritics or ligatures. These writing systems are encoded by computers in multibyte Unicode. Internationalized domain names are stored in the Domain Name System (DNS) as ASCII strings using Punycode transcription.
The DNS, which performs a lookup service to translate mostly user-friendly names into network addresses for locating Internet resources, is restricted in practice to the use of ASCII characters, a practical limitation that initially set the standard for acceptable domain names. The internationalization of domain names is a technical solution to translate names written in language-native scripts into an ASCII text representation that is compatible with the DNS. Internationalized domain names can only be used with applications that are specifically designed for such use; they require no changes in the infrastructure of the Internet.
IDN was originally proposed in December 1996 by Martin Dürst[1][2] and implemented in 1998 by Tan Juay Kwang and Leong Kok Yong under the guidance of Tan Tin Wee. After much debate and many competing proposals, a system called Internationalizing Domain Names in Applications (IDNA)[3] was adopted as a standard, and has been implemented in several top-level domains.
In IDNA, the term internationalized domain name means specifically any domain name consisting only of labels to which the IDNA ToASCII algorithm (see below) can be successfully applied. In March 2008, the IETF formed a new IDN working group to update[4] the current IDNA protocol. In April 2008, UN-ESCWA together with the Public Interest Registry (PIR) and Afilias launched the Arabic Script in IDNs Working Group (ASIWG), which comprised experts in DNS, ccTLD operators, business, and academia, as well as members of regional and international organizations. Chaired by Afilias's Ram Mohan, ASIWG aims to develop a unified IDN table for the Arabic script, and is an example of community collaboration that helps local and regional experts engage in global policy development, as well as technical standardization.[5]
In October 2009, the Internet Corporation for Assigned Names and Numbers (ICANN) approved the creation of internationalized country code top-level domains (IDN ccTLDs) in the Internet that use the IDNA standard for native language scripts.[6][7] In May 2010, the first IDN ccTLDs were installed in the DNS root zone.[8]
Internationalizing Domain Names in Applications
Internationalizing Domain Names in Applications (IDNA) is a mechanism defined in 2003 for handling internationalized domain names containing non-ASCII characters.
Although the Domain Name System supports non-ASCII characters, applications such as e-mail and web browsers restrict the characters that can be used as domain names for purposes such as a hostname. Strictly speaking, the restrictions on the characters that can be used in domain names come from the network protocols these applications use, not from the applications themselves or from the DNS. To retain backward compatibility with the installed base, the IETF IDNA Working Group decided that internationalized domain names should be converted to a suitable ASCII-based form that could be handled by web browsers and other user applications. IDNA specifies how this conversion between names written in non-ASCII characters and their ASCII-based representation is performed.
An IDNA-enabled application can convert between the internationalized and ASCII representations of a domain name. It uses the ASCII form for DNS lookups but can present the internationalized form to users who presumably prefer to read and write domain names in non-ASCII scripts such as Arabic or Hiragana. Applications that do not support IDNA will not be able to handle domain names with non-ASCII characters, but will still be able to access such domains if given the (usually rather cryptic) ASCII equivalent.
ICANN issued guidelines for the use of IDNA in June 2003, and it was already possible to register .jp domains using this system in July 2003 and .info[9] domains in March 2004. Several other top-level domain registries started accepting registrations in 2004 and 2005. IDN Guidelines were first created[10] in June 2003, and were updated[11] in November 2005 in response to phishing concerns. An ICANN working group focused on country-code domain names at the top level was formed in November 2007[12] and promoted jointly by the country code supporting organization and the Governmental Advisory Committee. Additionally, ICANN supports the community-led Universal Acceptance Steering Group, which seeks to promote the usability of IDNs and other new gTLDs in all applications, devices, and systems.[13]
Mozilla 1.4, Netscape 7.1, and Opera 7.11 were among the first applications to support IDNA. A browser plugin is available for Internet Explorer 6 to provide IDN support. Internet Explorer 7.0[14] and Windows Vista's URL APIs provide native support for IDN.[15]
ToASCII and ToUnicode
The conversions between ASCII and non-ASCII forms of a domain name are accomplished by a pair of algorithms called ToASCII and ToUnicode. These algorithms are not applied to the domain name as a whole, but rather to individual labels. For example, if the domain name is www.example.com, then the labels are www, example, and com. ToASCII or ToUnicode is applied to each of these three separately.
The details of these two algorithms are complex. They are specified in RFC 3490. Following is an overview of their workings.
ToASCII leaves ASCII labels unchanged. It fails if the label is unsuitable for the Domain Name System. For labels containing at least one non-ASCII character, ToASCII applies the Nameprep algorithm. This converts the label to lowercase and performs other normalization. ToASCII then translates the result to ASCII, using Punycode.[16] Finally, it prepends the four-character string "xn--".[17] This four-character string is called the ASCII Compatible Encoding (ACE) prefix. It is used to distinguish labels encoded in Punycode from ordinary ASCII labels. The ToASCII algorithm can fail in several ways. For example, the final string could exceed the 63-character limit of a DNS label. A label for which ToASCII fails cannot be used in an internationalized domain name.
The function ToUnicode reverses the action of ToASCII, stripping off the ACE prefix and applying the Punycode decode algorithm. It does not reverse the Nameprep processing, since that is merely a normalization and is by nature irreversible. Unlike ToASCII, ToUnicode always succeeds, because it simply returns the original string if decoding fails. In particular, this means that ToUnicode does not affect a string that does not begin with the ACE prefix.
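The label-by-label behaviour described above can be sketched with Python's standard library, whose `encodings.idna` module implements the RFC 3490 (IDNA2003) functions. This is a minimal illustration, not a reference implementation: the helper names `domain_to_ascii` and `domain_to_unicode` are hypothetical, and because the library's `ToUnicode` raises an error on failure, the sketch catches that error to mirror the "return the original string" rule.

```python
import encodings.idna as idna2003  # stdlib implementation of RFC 3490 (IDNA2003)


def domain_to_ascii(domain: str) -> str:
    """Apply ToASCII to each label of a dotted domain name (hypothetical helper)."""
    labels = domain.split(".")
    # ToASCII returns bytes and raises UnicodeError if a label is unusable,
    # for example when the encoded label exceeds 63 characters.
    return ".".join(idna2003.ToASCII(label).decode("ascii") for label in labels)


def domain_to_unicode(domain: str) -> str:
    """Apply ToUnicode to each label, keeping the input label if decoding fails."""
    result = []
    for label in domain.split("."):
        try:
            result.append(idna2003.ToUnicode(label))
        except UnicodeError:
            result.append(label)  # fall back to the original string
    return ".".join(result)


print(domain_to_ascii("www.B\u00fccher.example"))    # www.xn--bcher-kva.example
print(domain_to_unicode("xn--bcher-kva.example"))    # bücher.example
```

Pure ASCII labels such as `www` pass through unchanged, while a label made of 70 copies of "ü" is rejected by `domain_to_ascii` because its encoded form is longer than a DNS label allows.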
Example of IDNA encoding
IDNA encoding may be illustrated using the example domain Bücher.example. (German: Bücher, "books".) This domain name has two labels, Bücher and example. The second label is pure ASCII and is left unchanged. The first label is processed by Nameprep to give bücher, and then converted to Punycode to result in bcher-kva. It is then prefixed with xn-- to produce xn--bcher-kva. The resulting name suitable for use in DNS records and queries is therefore xn--bcher-kva.example.
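The individual steps of this example can be reproduced one at a time. The following sketch assumes Python's standard library, using the `nameprep` function from `encodings.idna` and the built-in `punycode` codec; the variable names are illustrative only.

```python
from encodings.idna import nameprep  # Nameprep profile used by IDNA2003

ACE_PREFIX = "xn--"

label = "B\u00fccher"                                   # "Bücher"
prepared = nameprep(label)                              # case folding and normalization
encoded = prepared.encode("punycode").decode("ascii")   # Punycode transcription
ace_label = ACE_PREFIX + encoded                        # prepend the ACE prefix
dns_name = ".".join([ace_label, "example"])             # ASCII label is left as-is

print(prepared)   # bücher
print(encoded)    # bcher-kva
print(dns_name)   # xn--bcher-kva.example
```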
Arabic Script IDN Working Group (ASIWG)
While the Arab region represents 5 percent of the world's population, it accounts for a mere 2.6 percent of global Internet usage. Moreover, the percentage of Internet users among the population in the Arab world is low at 11 percent, compared to the global rate of 21.9 percent. However, Internet usage in the region grew by 1,426 percent between 2000 and 2008, a large increase, particularly compared to the average world growth rate of 305.5 percent over the same period. It is reasonable to infer, therefore, that the usage growth could have been even more significant if the DNS had been available in Arabic characters. The introduction of IDNs offers many potential new opportunities and benefits for Arab Internet users by allowing them to establish domains in their native languages and alphabets, and to create a whole range of services and localized applications on top of those domains.[18]
Top-level domain implementation
In 2009, ICANN decided to implement a new class of top-level domains, assignable to countries and independent regions, similar to the rules for country code top-level domains. However, the domain names may be any desirable string of characters, symbols, or glyphs in the language-specific, non-Latin alphabet or script of the applicant's language, within certain guidelines to assure sufficient visual uniqueness.
The process of installing IDN country code domains began with a long period of testing in a set of subdomains in the test top-level domain. Eleven domains used language-native scripts or alphabets, such as "δοκιμή",[19] meaning test in Greek.
These efforts culminated in the creation of the first internationalized country code top-level domains (IDN ccTLDs) for production use in 2010.
In the Domain Name System, these domains use an ASCII representation consisting of the prefix "xn--" followed by the Punycode translation of the Unicode representation of the language-specific alphabet or script glyphs. For example, the Cyrillic name of Russia's IDN ccTLD is "рф". In Punycode representation, this is "p1ai", and its DNS name is "xn--p1ai".
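The same mapping applies to a top-level label. As a brief check, assuming Python's built-in `idna` codec (which follows the 2003 IDNA rules), the Cyrillic label can be converted to its DNS form and back:

```python
tld_unicode = "\u0440\u0444"  # the Cyrillic label "рф"

# Unicode form -> ASCII form stored in the DNS
dns_label = tld_unicode.encode("idna").decode("ascii")

# ASCII form -> Unicode form shown to users
display_label = dns_label.encode("ascii").decode("idna")

print(dns_label)  # xn--p1ai
```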
Non-IDNA or non-ICANN registries that support non-ASCII domain names
Other registries support non-ASCII domain names. The company ThaiURL.com in Thailand supports ".com" registrations via its own IDN encoding, ThaiURL. However, since most modern browsers only recognize IDNA/Punycode IDNs, ThaiURL-encoded domains must be typed in or linked to in their encoded form, and they will be displayed thus in the address bar. This limits their usefulness; however, they are still valid and universally accessible domains.
Several registries support Punycode emoji characters as emoji domains.
ASCII spoofing concerns
The use of Unicode in domain names makes it potentially easier to spoof websites, because the visual representation of an IDN string in a web browser may make a spoof site appear indistinguishable from the legitimate site being spoofed, depending on the font used. For example, the Unicode character U+0430 (Cyrillic small letter a) can look identical to the Unicode character U+0061 (Latin small letter a), used in English. As a concrete example, using the Cyrillic letters а, е, і, р (U+0430, "A"; U+0435, "Ie"/"Ye", essentially identical to Latin letter e; U+0456, essentially identical to Latin letter i; and U+0440, "Er", essentially identical to Latin letter p), the URL wіkіреdіа.org is formed, which is virtually indistinguishable from the visual representation of the legitimate wikipedia.org (possibly depending on typefaces).
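The difference that the eye misses is easy to detect in code. The sketch below is a crude heuristic, assuming only Python's `unicodedata` module: it takes the first word of each letter's Unicode character name as a stand-in for its script. The helper name `scripts_in_label` is hypothetical, and this is not a description of the policies that browsers or registries actually apply.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Return the first word of each letter's Unicode name, e.g. LATIN or CYRILLIC."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


genuine = "wikipedia"
# Same appearance, but i, p, e and a are replaced by U+0456, U+0440, U+0435, U+0430
spoof = "w\u0456k\u0456\u0440\u0435d\u0456\u0430"

print(genuine == spoof)                   # False
print(sorted(scripts_in_label(genuine)))  # ['LATIN']
print(sorted(scripts_in_label(spoof)))    # ['CYRILLIC', 'LATIN']
print(spoof.encode("idna"))               # an "xn--" label, unlike the genuine name
```

A label that mixes scripts in this way is a reasonable candidate for being displayed in its ASCII form rather than its Unicode form.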
Top-level domains accepting IDN registration
Many top-level domains have started to accept internationalized domain name registrations at the second or lower levels. Afilias (.INFO) offered the first gTLD IDN second-level registrations in 2004 in the German language.[20]
DotAsia, the registrar for the TLD Asia, conducted a 70-day sunrise period starting May 11, 2011 for second-level domain registrations in the Chinese, Japanese and Korean scripts.[21]
Timeline
- 1996-12: Martin Dürst's original Internet Draft proposing UTF-5 (the first example of what is known today as an ASCII-compatible encoding (ACE)) – UTF-5 was first defined at the University of Zürich[22][23][24]
- 1998-03: Early Research on IDN at National University of Singapore (NUS), Center for Internet Research (formerly Internet Research and Development Unit – IRDU) led by Tan Tin Wee (T. W. Tan)[25] (IDN Project team – Tan Juay Kwang and Leong Kok Yong) and subsequently continued under a team at Bioinformatrix Pte. Ltd. (BIX Pte. Ltd.) – an NUS spin-off company led by S. Subbiah.
- 1998-06: Korean Language Domain Name System is developed by Kang, Hee-Seung at KAIST (Korea Advanced Institute of Science and Technology)[26]
- 1998-07: Geneva INET'98 conference with a BoF (Birds of a Feather) discussion on iDNS and APNG General Meeting and Working Group meeting.
- 1998-07: Asia Pacific Networking Group (APNG, now still in existence[27] and distinct from a gathering known as APSTAR)[28] iDNS Working Group formed.[29]
- 1998-10: James Seng, a former student of Tan Tin Wee at Sheares Hall, NUS, and student researcher at Technet and IRDU, Computer Center, NUS, was recruited by CEO S. Subbiah to lead further IDN development at BIX Pte. Ltd.
- 1999-02: iDNS Testbed launched by BIX Pte. Ltd. under the auspices of APNG with participation from CNNIC, JPNIC, KRNIC, TWNIC, THNIC, HKNIC, and SGNIC led by James Seng[30]
- 1999-02: Presentation of Report on IDN at Joint APNG-APTLD meeting, at APRICOT'99
- 1999-03: Endorsement of the IDN Report at APNG General Meeting 1 March 1999.
- 1999-06: Grant application by APNG jointly with the Centre for Internet Research (CIR), the National University of Singapore, to the International Development Research Center (IDRC), a Canadian Government funded international organization to work on IDN for IPv6. This APNG Project was funded under the Pan Asia R&D Grant administered on behalf of IDRC by the Canadian Committee on Occupational Health and Safety (CCOHS). Principal Investigator: Tan Tin Wee of National University of Singapore.[31]
- 1999-07: Tout, Walid R. (WALID Inc.) filed IDNA patent application number US1999000358043 "Method and system for internationalizing domain names". Published 2001-01-30.[32]
- 1999-07: Internet Draft on UTF5 by James Seng, Martin Dürst and Tan Tin Wee.[33] Renewed 2000.[34]
- 1999-08: APTLD and APNG form a working group to look into IDN issues chaired by Kilnam Chon.[35]
- 1999-10: BIX Pte. Ltd. and National University of Singapore together with New York Venture Capital investors, General Atlantic Partners, spun off the IDN effort into 2 new Singapore companies – i-DNS.net International Inc. and i-Email.net Pte. Ltd. that created the first commercial implementation of an IDN solution for both domain names and IDN email addresses respectively.
- 1999-11: IETF IDN Birds-of-a-Feather session in Washington was initiated by i-DNS.net at the request of IETF officials.
- 1999-12: i-DNS.net International Pte. Ltd. launched the first commercial IDN. It was in Taiwan and in Chinese characters under the top-level IDN TLD ".gongsi" (meaning loosely ".com") with endorsement by the Minister of Communications of Taiwan and some major Taiwanese ISPs, with reports of over 200,000 names sold in a week in Taiwan, Hong Kong, Singapore, Malaysia, China, Australia and USA.
- Late 1999: Kilnam Chon initiated Task Force on IDNS which led to the formation of MINC, the Multilingual Internet Names Consortium.[36]
- 2000-01: IETF IDN Working Group formed chaired by James Seng and Marc Blanchet.
- 2000-01: The second-ever commercial IDN launch was IDN TLDs in the Tamil Language, corresponding to .com, .net, .org, and .edu. These were launched in India with IT Ministry support by i-DNS.net International.
- 2000-02: Multilingual Internet Names Consortium (MINC) Proposal BoF at IETF Adelaide.[37]
- 2000-03: APRICOT 2000 Multilingual DNS session.[38]
- 2000-04: WALID Inc. (with IDNA patent-pending application 6182148) started Registration & Resolving Multilingual Domain Names.
- 2000-05: Interoperability Testing WG, MINC meeting. San Francisco, chaired by Bill Manning and Y. Yoneya, 12 May 2000.
- 2000-06: Inaugural Launch of the Multilingual Internet Names Consortium (MINC) in Seoul[39] to drive the collaborative roll-out of IDN starting from the Asia Pacific.[40]
- 2000-07: Joint Engineering Task Force (JET) was initiated in Yokohama to study technical issues, led by JPNIC (K. Konishi) and TWNIC (Kenny Huang).
- 2000-07: Official Formation of CDNC (Chinese Domain Name Consortium) to resolve issues related to and to deploy Han Character domain names, founded by CNNIC, TWNIC, HKNIC and MONIC in May 2000.[41][42]
- 2001-03: ICANN Board IDN Working Group formed.
- 2001-07: Japanese Domain Name Association: JDNA Launch Ceremony (July 13, 2001) in Tokyo, Japan.
- 2001-07: Urdu Internet Names System (July 28, 2001) in Islamabad, Pakistan, Organised Jointly by SDNP and MINC.[43]
- 2001-07: Presentation on IDN to the Committee Meeting of the Computer Science and Telecommunications Board, National Academies USA (July 11–13, 2001) at University of California School of Information Management and Systems, Berkeley, CA.[44]
- 2001-08: MINC presentation and outreach at the Asia Pacific Advanced Network annual conference, Penang, Malaysia, 20 August 2001
- 2001-10: Joint MINC-CDNC Meeting in Beijing 18–20 October 2001.
- 2001-11: ICANN IDN Committee formed,[45] Ram Mohan (Afilias) appointed as Charter Member.
- 2001-12: Joint ITU-WIPO Symposium on Multilingual Domain Names organized in association with MINC, 6–7 December 2001, International Conference Center, Geneva.
- 2003-01: ICANN IDN Guidelines Working Group formed with membership from leading gTLD and ccTLD registries.
- 2003-01: Free implementations of stringprep, Punycode, and IDNA are released in GNU Libidn.
- 2003-03: Publication of RFC 3454, RFC 3490, RFC 3491 and RFC 3492.
- 2003-06: Publication of ICANN IDN guidelines for registries.[46] Adopted by .cn, .info, .jp, .org, and .tw registries.
- 2004-05: Publication of RFC 3743, Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean.
- 2005-03: First Study Group 17 of ITU-T meeting on Internationalized Domain Names.[47]
- 2005-05: .IN ccTLD (India) creates an expert IDN Working Group to create solutions for 22 official languages. Ram Mohan was appointed lead for technical implementation. C-DAC appointed a linguistic expert.
- 2006-04: ITU Study Group 17 meeting in Korea gave final approval to the Question on Internationalized Domain Names.[48]
- 2006-06: Workshop on IDN at ICANN meeting at Marrakech, Morocco.
- 2006-11: ICANN GNSO IDN Working Group created to discuss policy implications of IDN TLDs. Ram Mohan elected Chair of the IDN Working Group.[49]
- 2006-12: ICANN meeting in São Paulo discusses status of [clarification needed]
- 2007-01: Tamil and Malayalam variant table work completed by India's C-DAC and Afilias.
- 2007-03: ICANN GNSO IDN Working Group completes work, Ram Mohan presents a report at ICANN Lisboa meeting.[50]
- 2007-10: Eleven IDNA top-level domains were added to the root nameservers in order to evaluate the use of IDNA at the top level of the DNS.[51][52]
- 2008-01: ICANN: Successful Evaluations of .test IDN TLDs.[53]
- 2008-02: IDN Workshop: IDNs in Indian Languages and Scripts,[54] ICANN, DIT, Afilias, C-DAC, NIXI lead.
- 2008-04: IETF IDNAbis WG chaired by Vint Cerf continues the work to update IDNA.[55]
- 2008-04: Arabic Script IDN Working Group (ASIWG)[56] founded by Ram Mohan (Afilias) and Alexa Raad (PIR) in Dubai.
- 2008-06: ICANN board votes to develop final fast-track implementation proposal for a limited number of IDN ccTLDs.[57]
- 2008-06: Arabic Script IDN Working Group (ASIWG) membership[58] expands to Egypt, Iran, Kuwait, Pakistan, Saudi Arabia, Syria, UAE, Malaysia, UN ESCWA, APTLD, ISOC Africa, and invited experts Michael Everson and John Klensin.
- 2008-10: ICANN Seeks Interest in IDN ccTLD Fast-Track Process.[59]
- 2009-09: ICANN puts IDN ccTLD proposal on agenda for Seoul meeting in October 2009.[60]
- 2009-10: ICANN approves the registration of IDN names in the root of the DNS through the IDN ccTLD Fast-Track process at its meeting in Seoul, 26–30 October 2009.[61]
- 2010-01: ICANN announces that Egypt, the Russian Federation, Saudi Arabia, and the United Arab Emirates were the first countries to have passed the Fast-Track String Evaluation within the IDN ccTLD domain application process.[62]
- 2010-05: The first implementations go live. They are the ccTLDs in the Arabic alphabet for Egypt, Saudi Arabia, and the United Arab Emirates.[8]
- 2010-08: The IETF publishes the updated "IDNA2008" specifications as RFC 5890–5894.
- 2010-12: ICANN Board IDN Variants Working Group formed[63] to oversee and track the IDN Variant Issues Project. Members of the working group are Ram Mohan (Chair), Jonne Soininen, Suzanne Woolf, and Kuo-Wei Wu.
- 2012-02: International email was standardized, utilizing IDN.[64]
See also
References
External links
- RFC 3454, "Preparation of Internationalized Strings ('stringprep')"
- RFC 5890, "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework"
- RFC 5891, "Internationalized Domain Names in Applications (IDNA): Protocol"
- RFC 5892, "The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)"
- RFC 5893, "Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)"
- ICANN Internationalized Domain Names.
- IDN Language Table Registry
- Unicode Technical Report #36 – Security Considerations for the Implementation of Unicode and Related Technology
- [1] Template:Cite news
- [2] Template:Cite news
- [3] Template:Cite IETF
- [4] Template:Cite news
- [5] Template:Cite news
- [6] Template:Cite press release
- [7] Template:Cite news
- [8] Template:Cite press release
- [9] Mohan, Ram, German IDN, German Language Table (archived), March 2003
- [10] Dam, Mohan, Karp, Kane & Hotta, IDN Guidelines 1.0, ICANN, June 2003
- [11] Karp, Mohan, Dam, Kane, Hotta, El Bashir, IDN Guidelines 2.0, ICANN, November 2005
- [12] Template:Cite news
- [13] Template:Cite web
- [14] Template:Cite web
- [15] Template:Cite web
- [16] RFC 3492, Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA), A. Costello, The Internet Society (March 2003)
- [17] Template:Cite web
- [18] Template:Cite web
- [19] IANA Report on Delegation of Eleven Evaluative Internationalised Top-Level Domains [dead link]
- [20] Template:Cite web
- [21] Dot-Asia releases IDN dates, Managing Internet IP, April 14, 2011.
- [22] Template:Cite news
- [23] Template:Cite web
- [24] Template:Cite web
- [25] Template:Cite web
- [26] Template:Cite web
- [27] Template:Cite web
- [28] Template:Cite web
- [29] Template:Cite web
- [30] Template:Cite web
- [31] Template:Cite web
- [32] Template:Cite web
- [33] Template:Cite news
- [34] Template:Cite news
- [35] Template:Cite web
- [36] Template:Cite web
- [37] Template:Cite web
- [38] Template:Cite web
- [39] Template:Cite web
- [40] Template:Cite web
- [41] Template:Cite web
- [42] Template:Cite web
- [43] Template:Cite web
- [44] Template:Cite web
- [45] Template:Cite web
- [46] Template:Cite web
- [47] Template:Cite web
- [48] Template:Cite web
- [49] Template:Cite web
- [50] Mohan, Ram, GNSO IDN Working Group, Outcomes Report (PDF), ICANN
- [51] Template:Cite web
- [52] Template:Cite web
- [53] Template:Cite web
- [54] Template:Cite web
- [55] IDNAbis overview (2008)
- [56] Template:Cite web
- [57] Template:Cite web
- [58] Template:Cite web
- [59] Template:Cite web
- [60] Proposed Final Implementation Plan: IDN ccTLD Fast Track Process, 30 September 2009
- [61] Regulator approves multi-lingual web addresses, Silicon Republic, 30 October 2009
- [62] Template:Cite web
- [63] Template:Cite web
- [64] Template:Cite IETF