Mon, Dec 08, 2008 - Page 9 News List

The search for quality on the Web

The explosion of information on the Internet is overwelming. Semantic technology will help ensure that knowlege is not left to the mercy of popularity and money

By Riza Berkan

In the not-so-distant future, students will be able to graduate from high school without ever touching a book. Twenty years ago, they could graduate from high school without ever using a computer. In only a few decades, computer technology and the Internet have transformed the core principles of information, knowledge and education.

Indeed, today you can fit more books on the hard disk of your laptop computer than in a bookstore carrying 60,000 titles. The number of Web pages on the Internet is rumored to have exceeded 500 billion, enough to fill 10 modern aircraft carriers with the equivalent number of 500-page, 453g books.

Such analogies help us visualize the immensity of the information explosion and ratify the concerns that come with it. Web search engines are the only mechanism with which to navigate this avalanche of information, so they should not be mistaken for an optional accessory, one of the buttons to play with, or a tool to locate the nearest pizza store. Search engines are the single most powerful distribution points of knowledge, wealth, and yes, misinformation.

When we talk about Web search, the first name that pops up is, of course, Google. It is not far-fetched to say that Google made the Internet what it is today. It shaped a new generation of people who are strikingly different from their parents. Baby boomers might be the best placed to appreciate this, since they experienced Rock ‘n’ Roll as kids and Google as parents.

Google’s design was based on statistical algorithms. But search technologies that are based on statistical algorithms cannot address the quality of information, simply because high-quality information is not always popular, and popular information is not always high-quality. You can collect statistics until the cows come home, but you cannot expect statistics to produce an effect beyond what they are good for.

In addition, statistics collection systems are backward-looking. They need time for people to make referrals and time to collect them. Therefore, new publications and dynamic pages that change their content frequently are already beyond the scope of the popularity methods, and searching this material is vulnerable to rudimentary techniques of manipulation.

For example, the inefficiencies of today’s search engines have created a new industry called Search Engine Optimization, which focuses on strategies to make Web pages rank high against the popularity criteria of Google-esque search engines. It is a billion-dollar industry. If you have enough money, your Web page can be ranked higher than many others that are more credible or higher quality. Since the emergence of Google, quality information has never been so vulnerable to the power of commercialism.

Information quality, molded in the shadow of Web search, will determine the future of mankind, but ensuring quality will require a revolutionary approach, a technological breakthrough beyond statistics. This revolution is underway, and it is called semantic technology.

The underlying idea behind semantic technology is to teach computers how the world operates. For example, when a computer encounters the word “bill,” it would know that “bill” has 15 different meanings in English. When the computer encounters the phrase “killed the bill,” it would deduce that “bill” can only be a proposed law submitted to a legislature, and that “kill” could mean only “stop.”

This story has been viewed 2387 times.

Comments will be moderated. Remarks containing abusive and obscene language, personal attacks of any kind or promotion will be removed and the user banned.

TOP top