Computers may be good at crunching numbers, but can they crunch feelings?
The rise of blogs and social networks has fueled a bull market in personal opinion: reviews, ratings, recommendations and other forms of online expression. For computer scientists, this fast-growing mountain of data is opening a tantalizing window onto the collective consciousness of Internet users.
An emerging field known as sentiment analysis is taking shape around one of the computer world’s unexplored frontiers: translating the vagaries of human emotion into hard data.
This is more than just an interesting programming exercise. For many businesses, online opinion has turned into a kind of virtual currency that can make or break a product in the marketplace.
Yet many companies struggle to make sense of the caterwaul of complaints and compliments that swirl around their products online. As sentiment analysis tools begin to take shape, they could not only help businesses improve their bottom lines, but also eventually transform the experience of searching for information online.
Several new sentiment analysis companies are trying to tap into the growing business interest in what is being said online.
“Social media used to be this cute project for 25-year-old consultants,” said Margaret Francis, vice president for product at Scout Labs in San Francisco. Now, she said, top executives “are recognizing it as an incredibly rich vein of market intelligence.”
Scout Labs, which is backed by the venture capital firm started by the CNet founder Halsey Minor, introduced a subscription service that allows customers to monitor blogs, news articles, online forums and social networking sites for trends in opinions about products, services or topics in the news.
In early May, the ticket marketplace StubHub used Scout Labs’ monitoring tool to identify a sudden surge of negative blog sentiment after rain delayed a Yankees-Red Sox game.
Stadium officials mistakenly told hundreds of fans that the game had been canceled, and StubHub denied fans’ requests for refunds, on the grounds that the game had actually been played. But after spotting trouble brewing online, the company offered discounts and credits to the affected fans. It is re-evaluating its bad-weather policy.
“This is a canary in a coal mine for us,” said John Whelan, StubHub’s director of customer service.
Jodange, based in Yonkers, New York, offers a service geared toward online publishers that lets them incorporate opinion data drawn from more than 450,000 sources, including mainstream news sources, blogs and Twitter.
Based on research by Claire Cardie, a former Cornell computer science professor, and Jan Wiebe of the University of Pittsburgh, the service uses a sophisticated algorithm that not only evaluates sentiments about particular topics, but also identifies the most influential opinion holders.
Jodange, whose early investors include the National Science Foundation, is working on a new algorithm that could use opinion data to predict future developments, like forecasting the effect of newspaper editorials on a company’s stock price.
For casual Web surfers, simpler incarnations of sentiment analysis are sprouting up in the form of lightweight tools like Tweetfeel, Twendz and Twitrratr. These sites allow users to take the pulse of Twitter users about particular topics.
A quick search on Tweetfeel, for example, reveals that 77 percent of recent tweeters liked the movie Julie and Julia. But the same search on Twitrratr reveals a few misfires. The site assigned a negative score to a tweet reading “julie and julia was truly delightful!!” That same message ended with “we all felt very hungry afterwards” — and the system took the word “hungry” to indicate a negative sentiment.
While the more sophisticated algorithms used by Scout Labs, Jodange and Newssift employ advanced analytics to avoid such pitfalls, none of these services works perfectly. “Our algorithm is about 70 to 80 percent accurate,” said Francis, who added that its users can reclassify inaccurate results so the system learns from its mistakes.
Translating the slippery stuff of human language into binary values will always be an imperfect science, however. “Sentiments are very different from conventional facts,” said Seth Grimes, the founder of the suburban Maryland consulting firm Alta Plana, who points to the many cultural factors and linguistic nuances that make it difficult to turn a string of written text into a simple pro or con sentiment. “‘Sinful’ is a good thing when applied to chocolate cake,” he said.
The simplest algorithms work by scanning keywords to categorize a statement as positive or negative, based on a simple binary analysis (“love” is good, “hate” is bad). But that approach fails to capture the subtleties that bring human language to life: irony, sarcasm, slang and other idiomatic expressions. Reliable sentiment analysis requires parsing many linguistic shades of gray.
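The keyword approach can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method of any service named here; the word lists and the function name `keyword_sentiment` are invented for the example.

```python
import re

# Hypothetical word lists, for illustration only; not any vendor's lexicon.
POSITIVE = {"love", "delightful", "great", "liked"}
NEGATIVE = {"hate", "awful", "terrible", "hungry"}

def keyword_sentiment(text):
    """Label text by counting positive and negative keyword hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(keyword_sentiment("I hate this policy"))
# Sarcasm slips through: the word "love" alone decides the label.
print(keyword_sentiment("I just love waiting two hours in the rain"))
# A harmless remark is flagged because "hungry" is on the negative list.
print(keyword_sentiment("we all felt very hungry afterwards"))
```

The last two calls show why counting keywords is not enough: a sarcastic complaint is labeled positive, and a neutral comment about appetite is labeled negative.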
“We are dealing with sentiment that can be expressed in subtle ways,” said Bo Pang, a researcher at Yahoo who co-wrote Opinion Mining and Sentiment Analysis, one of the first academic books on sentiment analysis.
To get at the true intent of a statement, Pang developed software that looks at several different filters, including polarity (is the statement positive or negative?), intensity (what is the degree of emotion being expressed?) and subjectivity (how partial or impartial is the source?).
For example, a preponderance of adjectives often signals a high degree of subjectivity, while noun- and verb-heavy statements tend toward a more neutral point of view.
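As a rough illustration of the adjective heuristic, the sketch below scores a sentence by the share of its recognized words that are adjectives. It is not Pang's software: the tiny word lists stand in for a real part-of-speech tagger, and the function name `subjectivity` is an assumption made for the example.

```python
import re

# Tiny hand-made word lists, purely illustrative. A real system would
# rely on a trained part-of-speech tagger instead.
ADJECTIVES = {"delightful", "awful", "gorgeous", "boring", "sinful", "wonderful"}
NOUNS_VERBS = {"game", "rain", "fans", "company", "refunds", "movie",
               "delayed", "offered", "told"}

def subjectivity(text):
    """Return the share of recognized words that are adjectives (0.0 to 1.0)."""
    words = re.findall(r"[a-z]+", text.lower())
    adjectives = sum(w in ADJECTIVES for w in words)
    others = sum(w in NOUNS_VERBS for w in words)
    total = adjectives + others
    return adjectives / total if total else 0.0

# Noun- and verb-heavy sentence: scores as neutral.
print(subjectivity("The company offered refunds"))
# Adjective-heavy sentence: scores as highly subjective.
print(subjectivity("A gorgeous, wonderful, delightful movie"))
```

A higher score suggests a more opinionated statement, though a ratio this simple would be only one signal among several in practice.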
As sentiment analysis algorithms grow more sophisticated, they should begin to yield more accurate results that may eventually point the way to better filtering mechanisms. They could become a part of everyday Web use.
“I see sentiment analysis becoming a standard feature of search engines,” said Grimes, who suggests that such algorithms could begin to influence both general-purpose Web searching and more specialized searches in areas like e-commerce, travel reservations and movie reviews.
Pang envisions a search engine that fine-tunes results for users based on sentiment. For example, it might influence the ordering of search results for certain kinds of queries like “best hotel in San Antonio.”
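One way such sentiment-aware ordering might work is to blend each result's ordinary relevance score with a sentiment score drawn from opinion data. The sketch below is a guess at the general shape, not a description of any search engine; the `Result` fields, the `rerank` function and the default weight of 0.3 are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Result:
    title: str
    relevance: float  # 0.0 to 1.0, from the ordinary search ranking (assumed)
    sentiment: float  # -1.0 (negative) to 1.0 (positive), from opinion data (assumed)

def rerank(results, weight=0.3):
    """Order results by a weighted blend of relevance and sentiment."""
    def blended(r):
        # Rescale sentiment from [-1, 1] to [0, 1] before mixing.
        return (1 - weight) * r.relevance + weight * (r.sentiment + 1) / 2
    return sorted(results, key=blended, reverse=True)

hotels = [
    Result("Hotel A", relevance=0.9, sentiment=-0.8),
    Result("Hotel B", relevance=0.8, sentiment=0.9),
]
for r in rerank(hotels):
    print(r.title)
```

With the default weight, the well-reviewed Hotel B moves ahead of the slightly more relevant but poorly reviewed Hotel A; setting `weight=0` restores the plain relevance order.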
As search engines begin to incorporate more and more opinion data into their results, the distinction between fact and opinion may start blurring to the point where, as David Byrne once put it, “facts all come with points of view.”