English has been the dominant language of the internet since its invention, because it was invented by English speakers. But as the internet has spread around the globe, that dominance has been eroded, and the internet has evolved to deal with the change. Both what is written and how it's written have proven to be technical and social challenges.
At its inception, the internet was only available in ASCII, which stands for American Standard Code for Information Interchange. The original ASCII was a 7-bit encoding, meaning each character was represented by seven binary digits (1s and 0s), which allowed for just 128 characters: the unaccented letters of the English alphabet, plus digits, punctuation and control codes. It simply wasn't possible to communicate through a teletype machine (which ASCII was developed for) in anything but unaccented Latin characters, and for many years this was true of the internet as well.
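The arithmetic behind that limit can be sketched in a few lines of Python. This is only an illustration: the helper name `fits_in_ascii` is made up for this example and does not come from any standard.

```python
# A 7-bit code has 2**7 distinct values, numbered 0 to 127.
ASCII_SIZE = 2 ** 7


def fits_in_ascii(text: str) -> bool:
    """Return True if every character's code point is below 128."""
    return all(ord(ch) < ASCII_SIZE for ch in text)


print(ASCII_SIZE)                 # 128
print(fits_in_ascii("internet"))  # True
print(fits_in_ascii("\u00df"))    # False: ß (U+00DF) lies outside the 7-bit range
```

Anything with a code point of 128 or above, including every accented letter, simply has no representation in the original encoding.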
In the late Eighties, engineers at Apple and Xerox proposed a new encoding, one which used 16 bits and could thus support far more characters: 65,536 rather than 128. This system came to be known as Unicode: a universal encoding which would be capable of representing every single one of the world's writing systems, from Հայոց գրեր (Armenian) to ᑎᑎᕋᐅᓯᖅ ᓄᑖᖅ (Inuktitut) - as well as diacritics and non-English Latin characters like á and ß.
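To make the jump from 7 to 16 bits concrete, here is a small Python sketch that looks up the Unicode code points of a few characters from the examples above and checks which range each one fits in. The sample list and the `describe` helper are illustrative choices, not part of Unicode itself.

```python
SEVEN_BIT_SIZE = 2 ** 7     # 128 values, the original ASCII range
SIXTEEN_BIT_SIZE = 2 ** 16  # 65,536 values, the range of the 16-bit proposal

# One sample character per script mentioned in the text.
samples = [
    "A",       # plain Latin letter
    "\u00e1",  # á
    "\u00df",  # ß
    "\u0540",  # Հ (Armenian)
    "\u144e",  # ᑎ (Inuktitut syllabics)
]


def describe(ch: str) -> str:
    """Summarise a character's code point and the ranges it fits in."""
    cp = ord(ch)
    return (
        f"{ch} U+{cp:04X} "
        f"7-bit: {cp < SEVEN_BIT_SIZE}, 16-bit: {cp < SIXTEEN_BIT_SIZE}"
    )


for ch in samples:
    print(describe(ch))
```

Only the first sample fits in seven bits; all five fit comfortably in sixteen.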
The availability of Unicode supported the rapid expansion of other languages online, and by 2010 it was found on more than half of the pages on the web. Like the pages that they served, internet addresses - both the site names and the top-level domain names - had been restricted to ASCII. But in the same year, the first internationalised country code top-level domains became available, meaning that alongside .org, there was now .السعودية (Saudi Arabia) and many more.
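Under the hood, the domain name system still carries ASCII: an internationalised label is converted to an ASCII-compatible form beginning with `xn--`, using an encoding called Punycode. As a rough sketch, Python's built-in `idna` codec shows the round trip. Note that this codec follows the older 2003 version of the rules, so it may not match what modern registries accept for every label.

```python
# The Saudi Arabian label from the text, without its leading dot.
label = "السعودية"

# Convert to the ASCII-compatible form that is actually stored in the DNS.
ascii_form = label.encode("idna").decode("ascii")

# Convert back again to check that nothing was lost on the way.
round_trip = ascii_form.encode("ascii").decode("idna")

print(ascii_form)           # an ASCII string starting with "xn--"
print(round_trip == label)  # True
```

The browser shows the Arabic form to the reader, while the ASCII form is what travels across the network.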
Like the minority or unrecognised national groups passed over by ISO country codes, which were discussed in the story of .scot, many minority language groups have also been poorly served by the state-based architecture of the internet.
The first non-national language group to make a claim for its own domain was Catalan, with over four million native speakers and a history of advocacy for independence. Catalan speakers had long been reluctant to use national domains such as .es or .it, despite living in those countries, and wanted a top-level domain which would highlight the distinct Catalan language and culture. In 2005, ICANN created the much-envied .cat domain, with most of the thanks due to Amadeu Abril i Abril, an EU lawyer and diplomat and one of only two Spaniards on ICANN's board - and a Catalan to boot. The success of the Catalans in turn inspired a host of other groups to make plans to promote their own languages on the web. Basque (.eus), Breton (.bzh), and Galician (.gal) organisations all succeeded with their claims - and, crucially, each domain stayed under the control of a cultural organisation rooted in the language community itself, which ensured it would be protected. (Although, admittedly, a few feline domain hacks have snuck through on .cat.)
DotCYM was set up in Aberystwyth in 2006 to campaign for a Welsh-language domain name. The Welsh language has about three quarters of a million speakers, more than the population of many independent nations. It deserved a domain of its own. But the campaign hit a problem: "cym", based on Cymru, the Welsh word for Wales, is the international three-letter code for the Cayman Islands. While the original set of country-code top-level domain names uses the ISO two-letter codes (e.g. .de), ICANN has specified that the corresponding three-letter codes (e.g. .deu) are also reserved for future use. After four years of wrangling, .cym was awarded not to the Welsh, but to the tiny British territory in the Caribbean, population 56,000.
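The clash is easy to see once the two-letter and three-letter codes sit side by side. The Python sketch below uses a hand-picked two-entry sample of ISO 3166-1 codes, not the full standard, and `is_reserved_label` is a hypothetical helper written for this illustration rather than any real ICANN check.

```python
# Hand-picked sample of ISO 3166-1 codes (two-letter -> three-letter).
# Illustrative only: the real standard covers every country and territory.
ISO_3166_SAMPLE = {
    "DE": "DEU",  # Germany
    "KY": "CYM",  # Cayman Islands
}


def is_reserved_label(label: str) -> bool:
    """Hypothetical check: does the label match a country code in the sample?"""
    code = label.lstrip(".").upper()
    return code in ISO_3166_SAMPLE or code in ISO_3166_SAMPLE.values()


print(is_reserved_label(".de"))     # True: two-letter code for Germany
print(is_reserved_label(".cym"))    # True: three-letter code for the Cayman Islands
print(is_reserved_label(".cymru"))  # False: five letters, so no clash
```

On this logic a longer label such as .cymru sidesteps the reservation entirely, which is the route the campaign took next.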
Undaunted, the campaign tried again in 2011, this time following a public vote which chose .cymru as the domain for the Welsh language. But unlike in Scotland, where the Scottish government lent its support to .scot and the Edinburgh-based non-profit which pushed for it, the Welsh government refused to support a local bid for a Welsh-language domain, saying that it had to support open market competition.
In 2014, DotCYM got its wish for a Welsh-language domain, but not for control of it. ICANN created both .cymru and the English-language .wales domains, but they were delegated to Nominet, the English company which runs .uk. To add insult to injury, when Nominet launched its sales websites for the new domains, it used Google Translate to produce the Welsh content.
DotCYM shut its doors, noting on its website that "there is no reason for the continuation of the company after the Welsh Government gave ownership of the Welsh name and brand online to a private English company that’s not answerable to the Welsh people." As of 2015, .cym remains the property of the Cayman Islands, and unused.