How algorithms rule the world

The NSA revelations highlight the role sophisticated algorithms play in sifting through masses of data, but more surprising is their widespread use in our everyday lives — should we be more wary of their power?

By Leo Hickman  /  The Guardian, LONDON

Sat, Jul 06, 2013 - Page 9

On Aug. 4, 2005, the police department of Memphis, Tennessee, made so many arrests over a three-hour period that it ran out of vehicles to transport the detainees to jail. Three days later, 1,200 people had been arrested across the city, a new police department record. Operation Blue Crush was hailed as a huge success.

Larry Godwin, the city’s new police director at the time, quickly rolled out the scheme and by 2011 crime across the city had fallen by 24 percent. When it was revealed that Blue Crush faced budget cuts earlier this year, there was a public outcry.

“CRUSH” policing is now perceived to be so successful that it has reportedly been mimicked across the globe, including in countries such as Poland and Israel. In 2010, it was reported that two police forces in Britain were using it, but their identities were not revealed.

CRUSH stands for “Criminal Reduction Utilizing Statistical History.”

Translated, it means predictive policing, or, more accurately, police officers guided by algorithms. A team of criminologists and data scientists at the University of Memphis first developed the technique using IBM predictive analytics software. Put simply, they compiled crime statistics from across the city over time and overlaid them with other datasets, such as social housing maps and outside temperatures, then instructed algorithms to search for correlations in the data to identify crime “hot spots.” The police then flooded those areas with highly targeted patrols.
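As a rough illustration of the “hot spot” idea only, the sketch below counts incidents per map grid square and returns the busiest squares. It is not the Memphis team’s method or IBM’s software: the coordinates, the 1km cell size and the function name hot_spots are invented for the example, and the step of overlaying other datasets is left out.

```python
from collections import Counter

# Hypothetical incident locations as (x_km, y_km) positions on a city map.
incidents = [
    (0.2, 0.3), (0.4, 0.1), (0.7, 0.9), (0.5, 0.5),
    (3.1, 2.2), (3.4, 2.8),
    (5.9, 1.1),
]

def hot_spots(points, cell_km=1.0, top_n=2):
    """Count incidents per square grid cell and return the busiest cells."""
    counts = Counter(
        (int(x // cell_km), int(y // cell_km)) for x, y in points
    )
    return counts.most_common(top_n)

# Each result is ((cell_x, cell_y), incident_count), busiest first.
print(hot_spots(incidents))
```

A real system would weigh far more than raw counts, but the output has the same shape: a ranked list of places to send patrols.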

“It’s putting the right people in the right places on the right day at the right time,” said Richard Janikowski, an associate professor in the Department of Criminology and Criminal Justice at the University of Memphis, when the scheme was launched.

However, not everyone is comfortable with the idea. Some critics have dubbed it Minority Report policing, in reference to the science-fiction film in which psychics are used to guide a “precrime” police unit.

The use of algorithms in policing is one example of their increasing influence on our lives and, as their ubiquity spreads, so too does the debate on whether we should allow ourselves to become so reliant on them — and who, if anyone, is policing their use.

Such concerns were sharpened further by the continuing revelations about how the US National Security Agency (NSA) has been using algorithms to help it interpret the colossal amounts of data it has collected from its covert dragnet of international telecommunications.

“For datasets the size of those the NSA collect, using algorithms is the only way to operate for certain tasks,” said James Ball, the Guardian’s data editor and part of the paper’s NSA files reporting team.

“The problem is how the rules are set: It’s impossible to do this perfectly. If you’re, say, looking for terrorists, you’re looking for something very rare. Set your rules too tight and you’ll miss lots of, probably most, potential terror suspects, but set them more broadly and you’ll drag lots of entirely blameless people into your dragnet, who will then face further intrusion or even formal investigation. We don’t know exactly how the NSA or GCHQ [Britain’s Government Communications Headquarters] use algorithms — or how extensively they’re applied, but we do know they use them, including on the huge data trawls revealed in the Guardian,” Ball said.
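The trade-off Ball describes can be put into numbers. The sketch below is a back-of-the-envelope calculation, not a description of any agency’s system: the population size, the number of real targets and both error rates are made-up figures chosen only to show the effect of tight versus broad rules.

```python
def screening_outcomes(population, true_targets, hit_rate, false_alarm_rate):
    """Return (targets flagged, targets missed, innocent people flagged)."""
    flagged_targets = round(true_targets * hit_rate)
    missed_targets = true_targets - flagged_targets
    innocents = population - true_targets
    flagged_innocents = round(innocents * false_alarm_rate)
    return flagged_targets, missed_targets, flagged_innocents

# Hypothetical scenario: 1,000,000 people, of whom 10 are real targets.
tight = screening_outcomes(1_000_000, 10, hit_rate=0.3, false_alarm_rate=0.0001)
broad = screening_outcomes(1_000_000, 10, hit_rate=0.9, false_alarm_rate=0.01)

print("tight rules:", tight)
print("broad rules:", broad)
```

Under these assumed figures, the tight rules miss seven of the 10 targets while flagging about 100 blameless people; the broad rules miss only one target but flag about 10,000 blameless people.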

From dating Web sites and London trading floors to online retailing and Internet searches (Google’s search algorithm is now a more closely guarded commercial secret than the recipe for Coca-Cola), algorithms are increasingly determining our collective futures.

“Bank approvals, store cards, job matches and more all run on similar principles,” Ball said. “The algorithm is the god from the machine powering them all, for good or ill.”

So what is an algorithm?

Panos Parpas, a lecturer in the quantitative analysis and decision science (“QUADS”) section of the Department of Computing at Imperial College London, says that wherever we use computers, we rely on algorithms.

“There are lots of types, but algorithms, explained simply, follow a series of instructions to solve a problem. It’s a bit like how a recipe helps you to bake a cake. Instead of having generic flour or a generic oven temperature, the algorithm will try a range of variations to produce the best cake possible from the options and permutations available,” Parpas said.
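Parpas’ cake analogy corresponds to what programmers call an exhaustive search: try every combination of options and keep the best. The sketch below is a toy version under invented assumptions. The scoring function cake_score and the lists of flour amounts and oven temperatures are hypothetical stand-ins for whatever a real problem would measure.

```python
from itertools import product

def cake_score(flour_g, oven_c):
    """Made-up quality score that peaks at 250g of flour and 180 degrees C."""
    return -abs(flour_g - 250) - abs(oven_c - 180)

# Hypothetical options to try.
flour_options = [200, 225, 250, 275]
oven_options = [160, 170, 180, 190]

# Try every (flour, temperature) pairing and keep the highest-scoring one.
best = max(
    product(flour_options, oven_options),
    key=lambda combo: cake_score(*combo),
)

print(best)
```

With four options for each ingredient there are only 16 combinations to check; the same approach becomes expensive quickly as the number of variables grows, which is why more sophisticated algorithms exist.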

Parpas added that algorithms are not a new phenomenon.

“They’ve been used for decades — back to Alan Turing and the codebreakers, and beyond — but the current interest in them is due to the vast amounts of data now being generated, and the need to process and understand it. They are now integrated into our lives. On the one hand, they are good because they free up our time and do mundane processes on our behalf. The questions being raised about algorithms at the moment are not about algorithms per se, but about the way society is structured with regard to data use and data privacy. It’s also about how models are being used to predict the future. There is currently an awkward marriage between data and algorithms. As technology evolves, there will be mistakes, but it is important to remember they are just a tool. We shouldn’t blame our tools,” he said.

The “mistakes” Parpas refers to are events such as the “flash crash” of May 6, 2010, when the Dow Jones Industrial Average fell 1,000 points in just a few minutes, only for the market to recover 20 minutes later.

The reasons for the sudden plummet have never been fully explained, but most financial observers blame a “race to the bottom” by the competing quantitative trading (“quant”) algorithms widely used to perform high-frequency trading.

Scott Patterson, a Wall Street Journal reporter and author of The Quants, likens the use of algorithms on trading floors to flying a plane on autopilot. The vast majority of trades these days are performed by algorithms, but when things go wrong, as happened during the flash crash, humans can intervene.

“By far the most complicated algorithms are to be found in science, where they are used to design new drugs or model the climate, but they are done within a controlled environment with clean data. It is easy to see if there is a bug in the algorithm,” Parpas said. “The difficulties come when they are used in the social sciences and financial trading, where there is less understanding of what the model and output should be, and where they are operating in a more dynamic environment. Scientists will take years to validate their algorithm, whereas a trader has just days to do so in a volatile environment.”

Most investment banks now have a team of computer science doctorates coding algorithms, said Parpas, who used to work on such a team.

“With City trading, everyone is running very similar algorithms,” he said. “They all follow each other, meaning you get results such as the flash crash. They use them to speed up the process and to break up big trades to disguise them from competitors when a big investment is being made. They will run new algorithms for a few days to test them, before letting them loose with real money. In currency trading, an algorithm lasts for about two weeks before it is surpassed by a new one. In equities, which is a less complicated market, they will run for a few months before a new one replaces them. It takes a day or two to write a currency algorithm. It’s hard to find out information about them because, for understandable reasons, they don’t like to advertise when they are successful. Goldman Sachs, though, has a strong reputation for having a brilliant team of algorithm scientists. PhD students in this field will usually be employed within a few months by an investment bank.”
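The practice Parpas mentions of breaking up big trades can be sketched in its simplest form as splitting one large order into equal smaller ones. This is an illustration only, not any bank’s algorithm: the function name slice_order and the share figures are hypothetical, and real systems would also vary the size and timing of each piece.

```python
def slice_order(total_shares, max_child_size):
    """Split a parent order into child orders no larger than max_child_size."""
    if total_shares < 0 or max_child_size <= 0:
        raise ValueError("total_shares must be >= 0 and max_child_size > 0")
    full_slices, remainder = divmod(total_shares, max_child_size)
    children = [max_child_size] * full_slices
    if remainder:
        children.append(remainder)
    return children

# Hypothetical example: 10,500 shares sent in pieces of at most 2,000.
print(slice_order(10_500, 2_000))
```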

The idea that the world’s financial markets (and, hence, the well-being of our pensions, shareholdings, savings, etc.) are now largely determined by algorithmic vagaries is unsettling enough for some. However, as the NSA revelations have shown, the bigger questions surrounding algorithms center on governance and privacy.

How are they being used to access and interpret “our” data, and by whom?

Ian Brown, associate director of the University of Oxford’s Cyber Security Centre, says we all urgently need to consider the implications of allowing commercial interests and governments to use algorithms to analyze our habits.

“Most of us assume that ‘big data’ is munificent. The laws in the US and UK say that much of this [the NSA revelations] is allowed, it’s just that most people don’t realize yet, but there is a big question about oversight. We now spend so much of our time online that we are creating huge data-mining opportunities,” Brown said.

Brown says algorithms are now programmed to look for “indirect, non-obvious” correlations in data.

“For example, in the US, healthcare companies can now make assessments about a good or bad insurance risk based, in part, on the distance you commute to work,” he said. “They will identify the low-risk people and market their policies to them. Over time, this creates or exacerbates societal divides.”

University of Pennsylvania professor Oscar Gandy has done research into “secondary racial discrimination,” whereby credit and health insurance, which rely greatly on zip codes, can discriminate against racial groups because they happen to live very close to other racial groups that score badly.

Brown harbors similar concerns over the use of algorithms to aid policing, as seen in Memphis, where CRUSH’s algorithms have reportedly linked some racial groups to particular crimes.

“If you have a group that is disproportionately stopped by the police, such tactics could just magnify the perception they have of being targeted,” Brown said.

Viktor Mayer-Schonberger, professor of Internet governance and regulation at the Oxford Internet Institute, also warns against humans seeing causation when an algorithm identifies a correlation in vast swaths of data.

“This transformation presents an entirely new menace: penalties based on propensities,” he writes in his book Big Data: A Revolution That Will Transform How We Live, Work and Think, which is co-authored by Kenneth Cukier, The Economist data editor.

“That is the possibility of using big-data predictions about people to judge and punish them even before they’ve acted. Doing this negates ideas of fairness, justice and free will. In addition to privacy and propensity, there is a third danger. We risk falling victim to a dictatorship of data, whereby we fetishize the information, the output of our analyses, and end up misusing it. Handled responsibly, big data is a useful tool of rational decision-making. Wielded unwisely, it can become an instrument of the powerful, who may turn it into a source of repression, either by simply frustrating customers and employees or, worse, by harming citizens,” Mayer-Schonberger writes.

Mayer-Schonberger presents two very different real-life scenarios to illustrate how algorithms are being used. First, he explains how the analytics team working for US retailer Target can now calculate whether a woman is pregnant and, if so, when she is due to give birth.

“They noticed that these women bought lots of unscented lotion at around the third month of pregnancy and that a few weeks later they tended to purchase supplements, such as magnesium, calcium and zinc. The team ultimately uncovered around two dozen products that, used as proxies, enabled the company to calculate a ‘pregnancy prediction’ score for every customer who paid with a credit card or used a loyalty card or mailed coupons. The correlations even let the retailer estimate the due date within a narrow range, so it could send relevant coupons for each stage of the pregnancy,” he writes.
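A score built from proxy products can be as simple as adding up a weight for each telltale item in a shopper’s basket. The sketch below illustrates that general idea only. It is not Target’s model: the product names are taken from the passage above, but the weights, the PROXY_WEIGHTS table and the function pregnancy_score are invented for the example.

```python
# Hypothetical weights: how strongly each proxy product counts toward the score.
PROXY_WEIGHTS = {
    "unscented lotion": 3,
    "magnesium supplement": 2,
    "calcium supplement": 2,
    "zinc supplement": 2,
}

def pregnancy_score(basket):
    """Sum the weights of the proxy products present, counting each once."""
    return sum(PROXY_WEIGHTS.get(item, 0) for item in set(basket))

# Two hypothetical shoppers.
print(pregnancy_score(["unscented lotion", "zinc supplement", "bread"]))
print(pregnancy_score(["bread", "milk"]))
```

A retailer would then act only on scores above some cutoff, which brings back the same tight-versus-broad dilemma that applies to any predictive rule.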

Harmless targeting, some might say, but what happens when, as has reportedly already occurred, a father is mistakenly sent the diaper discount vouchers intended for his teenage daughter, whom a retailer has identified as pregnant before her own father knows?

Mayer-Schonberger’s second example throws up even more potential dilemmas and pitfalls.

“Parole boards in more than half of all US states use predictions founded on data analysis as a factor in deciding whether to release somebody from prison or to keep him incarcerated,” he writes.

Christopher Steiner, author of Automate This: How Algorithms Came to Rule Our World, has identified a wide range of instances where algorithms are being used to provide predictive insights — often within the creative industries.

In his book, he tells the story of a Web site developer called Mike McCready who has developed an algorithm to analyze and rate hit records. Using a technique called advanced spectral deconvolution, the algorithm breaks up each hit song into its component parts — melody, tempo, chord progression and so on — and then uses that to determine common characteristics across a range of No. 1 records.

McCready’s algorithm correctly predicted — before they were even released — that the debut albums by both Norah Jones and Maroon 5 contained a disproportionately high number of hit records.

The next logical step — for profit-seeking record companies, perhaps — is to use algorithms to replace the human songwriter, but is that really an attractive proposition?

“Algorithms are not yet writing pop music,” Steiner said.

He pauses, then laughs.

“Not that we know of, anyway. If I were a record company executive or pop artist, I wouldn’t tell anyone if I’d had a No. 1 written by an algorithm,” he said.

Steiner says we should not automatically see algorithms as a malign influence on our lives, but we should debate their ubiquity and their wide range of uses.

“We’re already halfway towards a world where algorithms run nearly everything. As their power intensifies, wealth will concentrate towards them. They will ensure the 1 percent to 99 percent divide gets larger. If you’re not part of the class attached to algorithms, then you will struggle. The reason why there is no popular outrage about Wall Street being run by algorithms is because most people don’t yet know or understand it,” Steiner said.

However, Steiner says we should welcome their use when they are used appropriately to aid and speed our lives.

“Retail algorithms don’t scare me,” he said. “I find it useful when Amazon tells me what I might like. In the US, we know we will not have enough GP [general practitioner] doctors in 15 years, as not enough are being trained, but algorithms can replace many of their tasks. Pharmacists are already seeing some of their prescribing tasks replaced by algorithms. Algorithms might actually start to create new, mundane jobs for humans. For example, algorithms will still need a human to collect blood and urine samples for them to analyze.”

There can be a fine line, though, between “good” and “bad” algorithms.

“I don’t find the NSA revelations particularly scary. At the moment, they just hold the data,” Steiner said. “Even the best data scientists would struggle to know what to do with all that data, but it’s the next step that we need to keep an eye on. They could really screw up someone’s life with a false prediction about what they might be up to.”