Sun, Jun 03, 2007 - Page 12

Google engineers try to read users' minds

Millions of times a day, users click away from Google, disappointed that they couldn't find exactly what they were looking for. And this is where a secretive section of the company's inner sanctum comes in

NY TIMES NEWS SERVICE, MOUNTAIN VIEW, CALIFORNIA

At other times, complaints highlight more complex problems. In 2005, Bill Brougher, a Google product manager, complained that typing the phrase "teak patio Palo Alto" didn't return a local store called the Teak Patio.

So Singhal fired up one of Google's prized and closely guarded internal programs, called Debug, which shows how its computers evaluate each query and each Web page. He discovered that Theteakpatio.com did not show up because Google's formulas were not giving enough importance to links from other sites about Palo Alto.

It was also a clue to a bigger problem: finding local businesses is important to users, but Google often has to rely on only a handful of sites for clues about which businesses are best. Within two months of Brougher's complaint, Singhal's group had written a new mathematical formula to handle queries for hometown shops.

But Singhal often doesn't rush to fix everything he hears about, because each change can affect the rankings of many sites. "You can't just react on the first complaint," he says. "You let things simmer."

The reticent Manber (he declines to give his age) would discuss his search-quality group only in the vaguest of terms. It operates in small teams of engineers. Some, like Singhal's, focus on systems that process queries after users type them in. Others work on features that improve the display of results, like extracting snippets -- the short, descriptive text that gives users a hint about a site's content.

Other members of Manber's team work on what happens before users can even start a search: maintaining a giant index of all the world's Web pages. Google has hundreds of thousands of customized computers scouring the Web to serve that purpose. In its early years, Google built a new index every six to eight weeks. Now it rechecks many pages every few days.

And Google does more than simply build an outsized, digital table of contents for the Web. Instead, it actually makes a copy of the entire Internet -- every word on every page -- that it stores in each of its huge customized data centers so it can comb through the information faster. Google recently developed a new system that can hold far more data and search through it far faster than the company could before.

As Google compiles its index, it calculates a number it calls PageRank for each page it finds. This was the key invention of Google's founders, Larry Page and Sergey Brin. PageRank tallies how many times other sites link to a given page. Sites that are more popular, especially with sites that have high PageRanks themselves, are considered likely to be of higher quality.
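The idea that a link counts for more when it comes from a highly ranked page can be illustrated with a small sketch. This is not Google's code: it is a textbook-style iteration over a made-up four-page link graph. The function name `pagerank`, the page names, the damping value of 0.85 and the choice to spread the rank of link-less pages evenly are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page in a link graph by repeated averaging.

    links maps a page name to the list of pages it links to.
    damping and iterations are illustrative assumptions, not Google's settings.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same rank for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical graph: a, b and d all link to c, and c links back to a.
    example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(example_links).items()):
        print(page, round(score, 3))
```

In this toy graph, page c ends up with the highest score because three pages link to it. Page a comes second even though it has only one inbound link, because that link comes from the highly ranked c. Pages b and d, which nothing links to, are left with only the baseline share.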

Singhal has developed a far more elaborate system for ranking pages, which involves more than 200 types of information, or what Google calls "signals." PageRank is but one signal. Some signals are on Web pages -- like words, links, images and so on. Some are drawn from the history of how pages have changed over time. Some signals are data patterns uncovered in the trillions of searches that Google has handled over the years.
