The overall volume of metadata collected by the NSA is reflected in the agency’s secret budget request to the US Congress this year. The budget document, disclosed by Snowden, showed that the agency is pouring money and manpower into creating a metadata repository capable of taking in 20 billion “record events” daily and making them available to NSA analysts within 60 minutes.
The spending includes support for the “Enterprise Knowledge System,” which has a US$394 million multi-year budget and is designed to “rapidly discover and correlate complex relationships and patterns across diverse data sources on a massive scale,” according to a 2008 document.
The data is automatically computed to speed queries and discover new targets for surveillance.
A top-secret document titled “Better Person Centric Analysis” describes how the agency looks for 94 “entity types,” including phone numbers, e-mail addresses and IP addresses. In addition, the NSA correlates 164 “relationship types” to build social networks and what the agency calls “community of interest” profiles, using queries like “travelsWith, hasFather, sentForumMessage, employs.”
A 2009 PowerPoint presentation provided more examples of data sources available in the “enrichment” process, including location-based services like GPS, online social networks, billing records and bank codes for transactions in the US and overseas.
At a US Senate Intelligence Committee hearing on Thursday last week, Alexander was asked if the agency ever collected or planned to collect bulk records about Americans’ locations based on cellphone tower data.
He replied that it was not doing so as part of the call log program authorized by the Patriot Act, but said a fuller response would be classified.
If the NSA does not immediately use the phone and e-mail logging data of a US citizen, it can be stored for later use, at least under certain circumstances, according to several documents.
One 2011 memo said that after a court ruling narrowed the scope of the agency’s collection, the data in question was “being buffered for possible ingest” later.
A year earlier, an internal briefing paper from the NSA Office of General Counsel showed that the agency was allowed to collect and retain raw traffic, which includes both metadata and content, about “US persons” for up to five years online and for an additional 10 years offline for “historical searches.”