In addition, CNET News.com has found that contrary to Google co-founder Sergey Brin's promise to inform users when their search results are censored, the company frequently filters out sites without revealing it.
Some of the blackballing appeared to be a mistake. The University of Pennsylvania's entire engineering school server--which hosted one Falun Gong site--was blocked from Google.cn, Google's China site. So was an Essex County Web site, which sports the word "sex"--as in "Essex"--in its domain name. Google.cn also doesn't display search.msn.com to someone who's hunting for the rival Microsoft service.
And the results can be haphazard. A search in English on "Tiananmen Square" turned up some sites but not others. Tsquare.tv, a site devoted to the protest and subsequent massacre, was filtered out, but Wikipedia's write-up appeared. And an image search revealed the iconic photo of a student blocking a column of tanks before the 1989 massacre. Search results also appear to vary depending on whether they're done in English or in Chinese characters.
In a series of conversations starting Wednesday, Google representatives responded to CNET News.com's queries by saying that some Web site blockages are human errors that should be expected when any new service is introduced, and others represent a concerted attempt to comply with Chinese censorship laws. By Thursday, a handful of blackballed sites, such as the engineering school and Budweiser.com, had been cleared to appear on Google.cn, though Guinness.com had not.
When launching its China-based search site this week, Google defended its decision to comply with the dictates of China's ruling Communist Party by saying the new service expands access to information for Chinese users. But its choice has been controversial, not least because Google's corporate motto is "Don't be evil."
Google's China launch comes as scrutiny of search engine providers is increasing and criticism of their choice to comply with repressive regimes is growing. Congress is [...] in the next few weeks, and on Wednesday, Rep. Chris Smith blasted Google for "collaborating with (democracy activists') persecutors."
Because access from China to the U.S. Google.com site is limited for financial and political reasons, the vast majority of Chinese are forced to turn to domestic search engines instead. Google's Brin has estimated that Google.com is available to only half of the country's users. Other reports say that when search terms such as "Tiananmen Square" are typed in on Google.com, the site immediately becomes unreachable for a few hours.
Bill Albert, a spokesman for the Washington, D.C.-based National Campaign to Prevent Teen Pregnancy, said it was "discouraging" to find that his group has been banned from Google.cn, especially since it hasn't been blackballed by Yahoo's China site or by Microsoft's Chinese version of MSN. "While our focus is on U.S. rates of teen pregnancy and birth we do have a lot of people coming from foreign countries, and we certainly would like to keep that line of communication open," Albert said.
A search for "teen pregnancy" through Google's U.S. Web site lists the group's home page as the first result. But in an identical search through Google.cn, the campaign's Web site is not listed. Google does not inform users that it was deleted.
Google said in a statement Wednesday that its filters are "intended to block the minimum required to comply with (Chinese) laws and regulations."
In a second statement to CNET News.com, the company added: "As with most brand-new services, our launch is immediately followed by a process of identifying and correcting bugs or other technical issues. Google.cn is no exception, and we will continue to refine our processes to ensure that we are filtering the minimum necessary, and that notices are properly displayed in all instances results have been filtered." (Google refuses to make its list of off-limits Web sites public.)
The buggy Chinese filtering stands out as a rare black eye for a company that prides itself on superior search technology, has a $126 billion market capitalization and boasts on its payroll one of the world's highest concentrations of computer science doctoral degrees.
A September 2000 Chinese government directive says that Internet content providers must restrict information that may "harm the dignity and interests of the state" or that foster "evil cults" or "damage the social stability." Alcohol and teen pregnancy sites are not listed as off-limits categories.
Many Web sites censored from Google's Chinese results touch on topics known to be unpopular with the Communist Party: the Tiananmen protest and massacre, political criticism in general, Tibet, Taiwan and Falun Gong (a growing movement that combines traditional Chinese breathing exercises with meditation and that's been renounced by the Chinese government as a cult). But others are more puzzling, such as jokes and alcohol.
Similarly, Lesbian.com is permitted by Yahoo and Microsoft, but not Google. Neworder.box.sk, a computer security site, and the matchmaking site Date.com are blackballed only by Google.
"Our focus tends to be more North America and Europe, but we are a bit concerned because we have been expanding into other regions, and China does represent a large potential market for us," said Michael Ellis, privacy and security manager for Date.com.
Scaling the Chinese firewall
To test the effectiveness of search censorship in China, CNET News.com wrote a computer program to check 4,600 Internet host names compiled by the Open Net Initiative for use in earlier tests of Chinese filtering. Web sites that were indexed by Google.com and MSN.com but not their Chinese counterparts were identified. Only a subset was tested against Yahoo because its Chinese Web site was frequently nonresponsive, and the program tested only host names, not individual Web pages.
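The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program CNET News.com actually ran: the function names, the lookup callables and the sample host names are all hypothetical, and the stand-in sets replace real search-engine queries.

```python
from typing import Callable, Iterable, List


def find_filtered_hosts(
    host_names: Iterable[str],
    in_reference_index: Callable[[str], bool],
    in_chinese_index: Callable[[str], bool],
) -> List[str]:
    """Return hosts present in the reference index but absent from the Chinese one."""
    filtered = []
    for host in host_names:
        # Only a host the reference engine knows about can count as filtered;
        # a host missing from both indexes tells us nothing about censorship.
        if in_reference_index(host) and not in_chinese_index(host):
            filtered.append(host)
    return filtered


def filtered_share(filtered: List[str], tested: List[str]) -> float:
    """Fraction of the tested host names that were flagged as filtered."""
    return len(filtered) / len(tested) if tested else 0.0


# Made-up stand-ins for real search-engine lookups (illustration only).
reference_index = {"example.org", "example.com", "example.net"}
chinese_index = {"example.org"}
hosts = ["example.org", "example.com", "example.net", "unindexed.example"]

filtered = find_filtered_hosts(
    hosts,
    lambda h: h in reference_index,
    lambda h: h in chinese_index,
)
print(filtered, filtered_share(filtered, hosts))
```

Because the check works on host names only, a site with a single censored page and a site that is removed entirely look the same to it.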
The results showed that Google blocked the most sites, filtering out about 13 percent of the host names tested compared with MSN's 10 percent. But while both MSN and Google deleted pornography and political sites from search listings, Google also singled out more humor sites and more sites related to homosexuality--and it was the only search engine to block information related to alcohol, dating and marijuana.
Danene Sorace, director of the Network for Family Life Education at Rutgers University, said she's not pleased that the university's Sex, etc. site is being filtered out by Google.cn. "The challenge, of course, is that sexual health information often gets mixed up in pornography," Sorace said. "What we are about is about sexual health, and that often gets lost when you apply these kinds of filtering programs."
Google.cn's censorship was not just overinclusive. Like the other search engines, it frequently was underinclusive as well. The pro-marijuana site HighTimes.com was blocked, but its alternate domain name of 420.com was not (420 is a slang term associated with marijuana use). Bacardi.com was missing, but the company's French, German, Canadian and Italian country-code sites were still available. While Penthouse.com and Playboy.com were invisible, searching on the magazines' titles offered an Amazon.com subscription link.
Mickey Spiegel, senior researcher in the Asia division of Human Rights Watch (blocked by Google and Yahoo but not Microsoft), said Google.cn was "a step backwards in terms of freedom of expression issues."
"It will leave the Chinese populace with less and less ability to, in a sense, think for themselves about some of the issues facing them today," Spiegel said. "They are going to have a restricted diet of info, and that is going to color how they view the world. It's a big story, and it's a stain on their image."
Adrienne Verrilli, communications director for the Sexuality Information and Education Council (blocked from Google.cn), said valuable, life-saving Web sites often get blocked in censorship sweeps.
"I guess the Chinese people aren't allowed to get good sexual health information," Verrilli said. "That's unfortunate and disappointing. We have such good information for the Chinese, who are going to be steeped in their own HIV/AIDS crisis very shortly."
Google's Brin told Fortune magazine this week that "if there's any kind of material blocked by local regulations, we put a message to that effect at the bottom of the search engine." Tests show, however, that the message tends to appear only for political sites such as Tibet and Falun Gong, and not the other categories of information censored from Google.cn.
Google's earlier missteps
This is not the first time that the world's most famous search company has encountered problems when trying to sort out the difference between what's sex and what's not.
A 2004 investigation by CNET News.com found that Google's SafeSearch filter technology incorrectly blocked many innocuous Web sites based solely on strings of letters such as "sex," "girls" or "porn" embedded in their domain names. PartsExpress.com, ALittleGirlsBoutique.com, RomansInSussex.co.uk, ArkansasExtermination.com and BassExpert.com were incorrectly identified as pornographic.
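A short Python sketch shows how a filter keyed to letter strings can misfire on those domain names. It assumes the simplest possible approach, a case-insensitive substring match, and is not a description of how SafeSearch actually works; the function name and the blocked-string tuple are hypothetical.

```python
# Strings named in the article; the matching rule itself is an assumption.
BLOCKED_STRINGS = ("sex", "girls", "porn")


def naive_is_blocked(domain: str) -> bool:
    """Flag a domain if any blocked string appears anywhere in its name."""
    name = domain.lower()
    return any(blocked in name for blocked in BLOCKED_STRINGS)


domains = [
    "PartsExpress.com",
    "ALittleGirlsBoutique.com",
    "RomansInSussex.co.uk",
    "ArkansasExtermination.com",
    "BassExpert.com",
]

# Every one of these innocuous names contains a blocked string by accident.
flagged = [d for d in domains if naive_is_blocked(d)]
print(flagged)
```

A match on letters alone cannot tell "Essex" or "Sussex" from the word embedded in them, which is why such filters sweep in unrelated sites.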
Many of the same problems have plagued Internet filters for the last decade. One 1996 report, for instance, showed that CyberPatrol blocked National Rifle Association and gay and lesbian Web sites, and CyberSitter cordoned off Usenet newsgroups such as alt.feminism and soc.support.fat-acceptance. In a famously embarrassing incident in 1996, America Online's errant dirty-word filter prevented residents of the British town Scunthorpe from signing up as new customers.
China's government has an extensive Internet filtering process in place that controls which overseas Web sites its citizens can access. (A 2005 study by the Open Net Initiative called it "quite thorough.") With that filtering as a guide, foreign companies are expected to build their own lists of Web sites to delete from Chinese search listings.
Sharon Hom, executive director of Human Rights in China, with offices in Hong Kong and New York, said her group conducted its own test on Wednesday on both Google's U.S. and China search sites in English and Chinese. In a search for "HRIC" in English on Google.com, the group's Web site was the top result, and using the Chinese interface it was the second result. In the same search in Chinese on Google.cn, the site did not appear in the first 100 results.
Hom said Google justifies its action by saying it must make trade-offs to be able to provide fast, accessible search. "What Google has, unfortunately, done is taken its enormous clout and technology and put it at the service of the Chinese government, who already have the most state-of-the-art surveillance and censorship in the world," she said.
It's not just Google's Web search site that looks different to Chinese users. A search for "Tibet" on Google News through the Google.com site shows links to articles about a benefit for Tibet House, a speech by exiled Tibetan leader the Dalai Lama, and at the fifth spot, a story about the Chinese government censoring information.
That's a sharp contrast with news search results on Google.cn. In English, a search there for news articles about Tibet brings up four results: one about archaeology in Tibet, one with translations of seemingly random sentences, a girl's blog about her first love, and a news story about camel farming that mentions Tibet once. Using Chinese characters to search for "Tibet" news on Google.cn brought up thousands of sites but none among the top 10 results that mentioned Google, Chinese censorship or anything controversial.
A search on news at Google.cn for "Tibet" and "freedom" in English returned no results, while 144 appeared with the same search on Google.com.
CNET News.com's Elinor Mills and Anne Broache contributed to this report.