Microsoft Corp's Beijing-based researchers are analyzing Web surfing patterns to guess computer users' gender, age and other demographic information, a technology the advocacy group Reporters Without Borders worries could be misused by the Chinese government.
As detailed in a paper presented at the May 2007 WWW conference in Banff, Canada, Microsoft's researchers examined the Web surfing histories of people whose gender and age they knew, then used that data to predict how likely each gender or age group was to visit certain Web sites. The researchers grouped similar Web sites together, assuming people of similar demographic profiles visited similar sites.
Researchers found that with the resulting formulas, their guesses about gender improved about 30 percent, and guesses about age improved 50 percent, compared with baseline algorithms.
In the paper, the researchers said they planned to extend their research to include attributes such as occupation and geographic location.
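The general idea described above, learning from users whose demographics are known and then scoring sites by who tends to visit them, can be illustrated with a short sketch. This is not the algorithm from the Microsoft paper, whose details are not given here. It is a simplified averaging scheme, and the data, site names and function names (`site_profiles`, `predict`) are hypothetical.

```python
from collections import Counter, defaultdict

# Toy, made-up training data: (known gender label, sites visited).
# None of these values come from the paper.
LABELED_HISTORY = [
    ("female", ["news.example", "crafts.example"]),
    ("female", ["crafts.example", "recipes.example"]),
    ("male", ["news.example", "autos.example"]),
    ("male", ["autos.example", "sports.example"]),
]


def site_profiles(history):
    """For each site, estimate the share of its labeled visitors in each group."""
    counts = defaultdict(Counter)
    for label, sites in history:
        for site in set(sites):  # count each user at most once per site
            counts[site][label] += 1
    profiles = {}
    for site, per_label in counts.items():
        total = sum(per_label.values())
        profiles[site] = {label: n / total for label, n in per_label.items()}
    return profiles


def predict(profiles, visited):
    """Average the profiles of the known sites a user visited.

    Returns the highest-scoring group, or None if no visited site is known.
    """
    known = [site for site in visited if site in profiles]
    if not known:
        return None
    scores = Counter()
    for site in known:
        for label, share in profiles[site].items():
            scores[label] += share / len(known)
    return max(scores, key=scores.get)


profiles = site_profiles(LABELED_HISTORY)
guess = predict(profiles, ["autos.example", "news.example"])
```

In this toy data, a user who visited only sites seen among one group's labeled users would be assigned to that group, while a user whose sites are all unknown gets no guess at all. A real system would need far more data and, as the article notes, some way of grouping similar sites together.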
Reporters Without Borders, a press freedom advocacy group, said in an e-mailed statement on Friday that it was concerned the Chinese government could use this type of technology to track down Internet users who access controversial material online.
"We believe it is unacceptable to carry out this kind of sensitive research in a country such as China where 50 people are currently in prison because of what they posted online," the organization wrote.
Microsoft declined to grant an interview on the subject. In a written statement sent through its outside public relations agency, the company said the research does not focus on personal identification and used no information that could lead Microsoft to identify individual users.
Separately, Microsoft on Friday added copyrighted books to its online library, saying it has permission to offer the works to searchers on the Internet.
By making deals with authors and publishing houses to include their works in the Live Search Books index, Microsoft sidesteps a controversy triggered by Google's plan to offer the world's written works online.
"We have paid particular attention to ensuring that we are only including books in our index that our publishing partners have given us permission to include," Live Search program manager Betsy Aoki said in a posting on the Redmond, Washington-based technology giant's Web site.
"So our customers and partners can feel secure in our stance on copyright protection," she said.
Listed among the publishers adding their books to Microsoft's virtual shelves are McGraw-Hill Cos, Cambridge University Press, Rodale and Simon and Schuster.
Searchers can read books online or follow links to Web sites where they can buy them.
Google launched its book project in 2004 aiming to scan all literary works and post them online. The firm has stored in its searchable database classic works in the public domain, along with copyrighted books included with or without the publishers' permission.
After outcries from publishing houses and authors, Google modified its online library to offer only summaries of copyrighted works along with information regarding where to buy or borrow the books.