Engineers seek to teach the Web new languages

Programmers who set online standards want the Internet to use characters from other languages, such as Japanese, but there are drawbacks.

Responding to exploding use of the Internet in other countries, programmers who devise online standards want to reinvent the ABCs of the Web and email.

The Internet Engineering Task Force (IETF) is working on a project that will allow foreign characters to become part of Internet domain names, which now permit only the unaccented letters of the Roman alphabet, along with digits and hyphens.

Although the content of Web sites can be written in or translated into virtually any language using any alphabet, the corresponding email addresses and Web site location names, or URLs, must be spelled in the Roman alphabet. The IETF wants to broaden the pool of characters to include those used in Chinese, Japanese, Russian, Arabic, Thai, Greek, Hindi, Hebrew, Sanskrit and other languages.

IETF members working on the Internationalized Domain Name project are studying the technological and cultural issues surrounding foreign-language characters online, said Thomas Narten, a senior software engineer at IBM and an area director for IETF. The group, a division of the nonprofit Internet Society, will then begin writing standards to accommodate foreign characters.

The project's biggest hurdle involves enlarging the potential pool of letters and symbols from the American Standard Code for Information Interchange, or ASCII.

ASCII characters include letters, numbers, punctuation and control codes (such as a character that marks the end of a line). The 128-character set is too small to support complicated Asian languages, particularly Chinese and Japanese, which have thousands of characters and combinations. Nor does it support many of the accents, slashes and other symbols common in Western languages such as Norwegian, German and French.
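The limit described above can be illustrated with a short sketch. It assumes only that ASCII assigns codes 0 through 127; the function names and the sample strings (including a Chinese word for "toy") are hypothetical and chosen purely for illustration, not drawn from any IETF proposal.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character has a code below 128 (the ASCII range)."""
    return all(ord(ch) < 128 for ch in text)


def non_ascii_chars(text: str) -> list:
    """List the characters in text that fall outside the 128-character ASCII set."""
    return [ch for ch in text if ord(ch) >= 128]


if __name__ == "__main__":
    # Illustrative samples: a plain domain name, two accented Western
    # European words, and a two-character Chinese word.
    for sample in ["etoys.com", "Z\u00fcrich", "fran\u00e7ais", "\u73a9\u5177"]:
        print(sample, fits_in_ascii(sample), non_ascii_chars(sample))
```

Only the first sample passes the check; the accented German and French words fail on a single character each, and the Chinese word fails on every character.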

"It's a hard problem with a lot of gotchas," Narten said. "But the market is demanding it. If the IETF doesn't deliver a solution, someone else will do one, and it won't be a standard."

IETF's impetus to overcome the ASCII challenge is compelling: Foreign businesses are clamoring to bolster their online presence with Web sites and email in their native tongues.

More domain names are now registered outside the United States than inside. The most popular domain name suffix after the American ".com," ".net" and ".org" is Japan's ".jp"--even though URLs don't accommodate Japanese characters.

According to Forrester Research, Americans will account for only one-third of the Net's population by 2003. This year will be the first in which English speakers are a minority on the Internet, where American-style English has been the lingua franca since the network's inception.

Consumer pressure will likely goad the IETF to set foreign character standards by the end of the year, said Mike Roberts, CEO of the Internet Corp. for Assigned Names and Numbers (ICANN). The Marina del Rey, Calif.-based organization is responsible for global policies affecting Internet addresses.

"The big picture perspective is that engineers have needed to expand the permitted character set for Internet standards for 5 to 6 years," Roberts said. "This is a very hot item and has a lot of lets-get-this-done-now pressure. We'll see some standards in months, not years."

Domain names in Chinese, Japanese, Russian, Arabic and other characters present a potential boon for businesses in countries that have lagged behind the United States and Western Europe in user acceptance of the Internet. The availability of such characters could hasten the pace of Internet adoption abroad, experts said.

"Think of the poor Chinese customer who may be interested in buying something online," said Alex Pressman, president of Uniscape.com, which provides multilingual Web sites and international e-commerce consulting. "If they have to type in 'etoys.com,' instead of some Chinese word for 'toys,' it's just not the same thing. This could be really great for marketers."

China is perhaps the most promising online frontier. Internet connections for China's 1.3 billion citizens more than quadrupled, from 2 million to 9 million, last year alone. Connections are expected to reach more than 20 million by the end of 2000, according to accounting firm Ernst & Young.

But some experts said the availability of foreign-character URLs and email addresses presents as many problems as it solves.

If a Chinese person can use nothing but Chinese and never has to learn foreign letters, the Internet's impact as a global marketplace erodes, said Desmond Wong, national director for business development in China at Ernst & Young.

Wong did not deny the need for Chinese characters on the Internet. The nation represents one-sixth of the world's population and has the world's largest wireless communications network--key as the Internet becomes the mainstay of handheld computers and phones, not just desktop computers.

But he wondered whether Chinese characters online would increase China's isolation. He likened the situation to the VHS vs. Betamax struggle that forced video companies to make two kinds of tapes for VCRs in the '70s and '80s, before VHS prevailed.

"If one-sixth of the world's population use VHS, and others use Beta, how do the groups communicate?" Wong asked. "Chinese folks can get on using Chinese domain names, but will an American or English or French person be able to access that? I don't think so."