Sat, Nov 25, 2006 - Page 9

How engineers built the Web's Tower of Babel

New standards mean that all the world's languages can now finally be used on the World Wide Web

By Kieren McCarthy  /  THE GUARDIAN , LONDON

At a UN meeting last month, a bespectacled Swede made a small, barely noticed announcement that nevertheless represented a pivotal moment in the history of the Internet.

"Regarding the technical implementation for the World Wide Web, we are done," Patrik Fältström told the Internet Governance Forum.

By "we are done," he meant that following a decade of hard work by a global consortium of engineers and linguists, they had finally decided on a document that will enable all the world's languages to be fully represented on the Internet. People will be able to type in addresses in their own language, search in their own language and move around the Internet in their own language.

The challenge was every bit as immense as it sounds. The Internet was designed to work with the English alphabet -- "a" to "z," and the numbers "0" through "9." Useful symbols rapidly made their way into the system -- plus, minus, dash, and so on -- each represented with a particular code (or, as Internet engineers insist on calling it, an "identifier"). Agreeing on identifiers was easy at first, but as Internet use spread across the globe, people started asking for more to be added to fit other languages, whether an accent on a letter, or an entirely different alphabet.
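
To make the limitation concrete, here is a minimal Python sketch of the restricted character set described above. The helper name is_plain_label is hypothetical, and the assumption that the permitted set is the unaccented English letters, the digits and the hyphen is an illustration rather than a rule quoted by the engineers in this story.

```python
import string

# Assumption for illustration: only unaccented English letters, digits and
# the hyphen are allowed in one dot-separated part of an address.
ALLOWED = set(string.ascii_letters + string.digits + "-")


def is_plain_label(label: str) -> bool:
    """Return True if the label is non-empty and uses only allowed characters."""
    return bool(label) and all(ch in ALLOWED for ch in label)


print(is_plain_label("faltstrom"))  # True
print(is_plain_label("fältström"))  # False
```

Under this assumption, any name with an accent or a non-Latin script fails the check, which is the gap the new standard sets out to close.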

Global balancing act

As languages have spread and developed, some elements have changed and some stayed the same. Some have grown to have different meanings. Some look identical and are anything but.

One thing is for certain: Everyone is unshakeable in their belief that their language is as valid as any other. No matter how wonderful the Internet is, it does not override culture and history. The result has been a very careful balance.

"No script and no person will be happy with the definition of identifiers," Fältström said.

"Everyone will be unhappy. We just have to find a standard that makes people the least unhappy," he said.

It can be difficult for an English speaker to grasp the problem. For example, the small dots over the "a" and "o" in Fältström's surname carry significance and meaning. Because Swedish uses the same Western alphabet, English speakers are able to view them as an "a" and an "o" with some dots. Not so with different alphabets. Fortunately, there is a real-world example that makes this global balancing act more understandable.

Richard Haigh is a British Web designer from the city of Nottingham and the proud owner of "£.com." He has decided he wants to use the site to cover the debate over Britain's possible future adoption of the euro.

"When it does kick off, I want to provide somewhere where people can voice their concerns," he explains.

Despite having "no personal belief either way," he thinks that he's on to something unique with his pound-symbol domain name.

But Haigh doesn't actually own "£.com." He owns "xn--9a.com" -- the identifier used to represent the pound symbol. In fact, "£.com" doesn't (strictly speaking) exist.

Why? Ask John Klensin, who is, along with Fältström, the person most responsible for unusual additions to the Internet's domain name system. He is blunt about Haigh's Web address.

"The £.com domain shouldn't exist -- it has been prohibited all along," he said.

When told it clearly does exist, he is unremitting: "If [the Web address] resolves, it is probably another bug. Somehow it has been sneaked through."
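
How a pound sign becomes "xn--9a" can be sketched in a few lines of Python. This is an illustration, not the registries' actual procedure: it relies on Python's built-in "punycode" codec, adds the "xn--" prefix by hand, and skips the normalization and validity checks a real system would apply. The helper names to_ascii_label and to_unicode_label are hypothetical.

```python
# Prefix that marks an encoded label in the domain name system.
ACE_PREFIX = "xn--"


def to_ascii_label(label: str) -> str:
    """Convert one dot-separated part of a name to its ASCII-only form."""
    if label.isascii():
        return label  # nothing to encode
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Reverse the conversion for a label that carries the prefix."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


print(to_ascii_label("£"))         # xn--9a
print(to_unicode_label("xn--9a"))  # £
```

The conversion itself is purely mechanical. Whether a given symbol is permitted in a name at all is a separate policy question, which is the point Klensin is making.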
