It is a rare criticism of elite American university students that they do not think big enough. But that is exactly the complaint from some of the largest technology companies and the federal government.
At the heart of this criticism is data. Researchers and workers in fields as diverse as biotechnology, astronomy and computer science will soon find themselves overwhelmed with information. Better telescopes and genome sequencers are as much to blame for this data glut as are faster computers and bigger hard drives.
While consumers are just starting to comprehend the idea of buying external hard drives for the home capable of storing a terabyte of data, computer scientists need to grapple with data sets thousands of times as large and growing ever larger. (A single terabyte equals 1,000 gigabytes and could store about 1,000 copies of the Encyclopedia Britannica.)
The next generation of computer scientists has to think in terms of what could be described as Internet scale. Facebook, for example, uses more than 1 petabyte of storage space to manage its users’ 40 billion photos. (A petabyte is about 1,000 times as large as a terabyte, and could store about 500 billion pages of text.)
It was not long ago that the notion of one company having anything close to 40 billion photos would have seemed tough to fathom. Google, meanwhile, churns through 20 times that amount of information every single day just running data analysis jobs. In short order, DNA sequencing systems, too, will generate many petabytes of information a year.
“It sounds like science fiction, but soon enough, you’ll hand a machine a strand of hair, and a DNA sequence will come out the other side,” said Jimmy Lin, an associate professor at the University of Maryland, during a technology conference last week.
The big question is whether the person on the other side of that machine will have the wherewithal to do something interesting with an almost limitless supply of genetic information.
At the moment, companies like IBM and Google have their doubts.
For the most part, university students have used rather modest computing systems to support their studies. They are learning to collect and manipulate information on personal computers or what are known as clusters, where computer servers are cabled together to form a larger computer. But even these machines fail to churn through enough data to really challenge and train a young mind meant to ponder the mega-scale problems of tomorrow.
“If they imprint on these small systems, that becomes their frame of reference and what they’re always thinking about,” said Jim Spohrer, a director at IBM’s Almaden Research Center.
Two years ago, IBM and Google set out to change the mindset at universities by giving students broad access to some of the largest computers on the planet. The companies then outfitted the computers with software that Internet companies use to tackle their toughest data analysis jobs.
And, rather than building a big computer at each university, the companies created a system that let students and researchers tap into giant computers over the Internet.
This year, the National Science Foundation, a federal government agency, issued a vote of confidence in the project by splitting US$5 million among 14 universities that want to teach their students how to grapple with big data questions.
The types of projects the 14 universities have already tackled veer into the mind-bending.
For example, Andrew Connolly, an associate professor at the University of Washington, has turned to the high-powered computers to aid his work on the evolution of galaxies. Connolly works with data gathered by large telescopes that inch their way across the sky taking pictures of various objects.
The largest public database of such images available today comes from the Sloan Digital Sky Survey, which has about 80 terabytes of data, Connolly said. A new system called the Large Synoptic Survey Telescope is set to take more detailed images of larger chunks of the sky and produce about 30 terabytes of data each night. Connolly’s graduate students have been set to work trying to figure out ways of coping with this much information.
Purdue is looking to carry techniques used to map the interactions between people in social networks into the biological realm. Researchers there are creating complex diagrams that illuminate the links between chemical reactions taking place in cells.
A similar effort at the University of California, Santa Barbara, centers on making a simple software interface — akin to the Google search bar — that will let researchers examine huge biological data sets for answers to specific queries.
Lin has encouraged his students to illuminate data with the help of Hadoop, an open-source software package that companies like Facebook and Yahoo use to split vast amounts of information into more manageable chunks.
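The divide-and-conquer model behind Hadoop can be sketched in a few lines. What follows is a minimal, locally simulated illustration of the MapReduce pattern Hadoop implements — a word count, the canonical example — not code from Lin's course: the map step emits (word, 1) pairs, the framework groups pairs by key (here, a simple sort), and the reduce step sums each group.

```python
# Word count in the MapReduce style that Hadoop popularized, simulated
# locally. In a real cluster, the map and reduce phases run in parallel
# across machines and the grouping ("shuffle") is done by the framework.
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by word, then sum each group's counts.
    for word, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

docs = ["big data big ideas", "data at internet scale"]
print(dict(reduce_phase(map_phase(docs))))
# → {'at': 1, 'big': 2, 'data': 2, 'ideas': 1, 'internet': 1, 'scale': 1}
```

The appeal of the pattern is that neither function needs to know how large the data set is or how many machines are involved; scaling from megabytes to petabytes is the framework's problem, not the programmer's.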
One of these projects included a deep dive into the reams of documents released after the government’s probe into Enron, to create an analysis system that could identify how one employee’s internal communications had been connected to those from other employees and who had originated a specific decision.
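One way such a system might connect communications — a hypothetical sketch, not the students' actual implementation — is to build a directed graph of who e-mailed whom and walk reply chains backward to the message that started a thread:

```python
# Hypothetical sketch: link employees by sender -> recipient edges and
# trace a reply chain back to the thread's originator. The message
# tuples stand in for parsed e-mail headers from a corpus like Enron's.
from collections import defaultdict

# (message_id, sender, recipient, in_reply_to)
messages = [
    ("m1", "alice", "bob",   None),   # alice starts the thread
    ("m2", "bob",   "carol", "m1"),
    ("m3", "carol", "dave",  "m2"),
]

graph = defaultdict(set)   # sender -> set of recipients they wrote to
parent = {}                # message_id -> message it replies to
sender_of = {}             # message_id -> who sent it
for mid, sender, recipient, reply_to in messages:
    graph[sender].add(recipient)
    sender_of[mid] = sender
    if reply_to:
        parent[mid] = reply_to

def originator(mid):
    """Follow the reply chain back to the thread's first message."""
    while mid in parent:
        mid = parent[mid]
    return sender_of[mid]

print(originator("m3"))  # → alice
```

Run over millions of real messages rather than three toy tuples, the same two structures — the contact graph and the reply chains — are what let an analyst see how one employee's communications connect to everyone else's and who set a decision in motion.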
Lin shares the opinion of numerous other researchers that learning these types of analysis techniques will be vital for students in the coming years.
“Science these days has basically turned into a data-management problem,” Lin said.
By donating their computing wares to the universities, Google and IBM hope to train a new breed of engineers and scientists to think in Internet scale.
Of course, it’s not all good will backing these gestures. IBM is looking for big data experts who can complement its consulting in areas like health care and financial services. It has already started working with customers to put together analytics systems built on top of Hadoop. Meanwhile, Google promotes just about anything that creates more information to index and search.
Nonetheless, the universities and the government benefit from IBM and Google providing access to big data sets for experiments, simpler software and raw computing power.
“Historically, it has been tough to get the type of data these researchers need out of industry,” said James French, a research director at the National Science Foundation.
“But we’re at this point where a biologist needs to see these types of volumes of information to begin to think about what is possible in terms of commercial applications,” he said.