The leaked presentation about PRISM detailed the ability for the NSA to collect emails, photos, videos, and more from nine companies, including Microsoft, Google, Facebook, and Apple. There has been a tremendous amount of debate about whether PRISM allows “direct” access to these companies’ servers, somewhat limited access through some type of intermediate portal, or only the fulfillment of individual legally-required requests. All nine of the companies listed as “Current Providers” have also denied that PRISM has direct access to their servers.
(Note: Backblaze did not share customer data with PRISM. This post is about whether the NSA is using Backblaze Storage Pod technology that we open sourced.)
Regardless of what the NSA has actually gotten access to and how they have done so, it’s clear that their intent is to collect an astounding amount of digital data. How much data?
On November 1, 2009, TechCrunch reported (based on a book review of The Secret Sentry: The Untold History of the National Security Agency) that the NSA intends to build a new $2 billion Utah-based datacenter to store a yottabyte of surveillance data by the end of 2015.
It’s nearly impossible to wrap your head around how much data is in a yottabyte, but since Backblaze is in the business of backing up mass volumes of data, we decided to give some context with a blog post a couple of weeks later: “NSA might want some Backblaze pods.”
We estimated it would cost over $100 trillion and require datacenters the size of Delaware and Rhode Island combined to store that much data.
To put it in other terms, Facebook announced that it has 250 petabytes of data. A yottabyte is a billion petabytes. The NSA intends to have the capacity to store all of Facebook’s data 4 million times over.
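For anyone who wants to check that math, here is a quick sketch in Python. It assumes decimal (SI) units, where a petabyte is 10^15 bytes and a yottabyte is 10^24 bytes; the variable names are ours, purely for illustration.

```python
# Assumption: decimal (SI) units, not binary (1024-based) units.
PETABYTE = 10**15   # bytes
YOTTABYTE = 10**24  # bytes

# The 250 petabyte figure for Facebook quoted above.
facebook_bytes = 250 * PETABYTE

# How many petabytes make up one yottabyte.
petabytes_per_yottabyte = YOTTABYTE // PETABYTE

# How many copies of Facebook's data would fit in one yottabyte.
facebook_copies = YOTTABYTE // facebook_bytes

print(f"{petabytes_per_yottabyte:,} petabytes in a yottabyte")
print(f"{facebook_copies:,} copies of Facebook's data in a yottabyte")
```

Using binary units instead would shift the numbers slightly, but not the order of magnitude.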
So, why do we think the NSA may be using Backblaze Storage Pods?
With the goal of storing an outrageous amount of data, there would be several design goals:
- Density – being able to fit as much of that data in as small a space as possible is critical in order to not require building state-sized data centers.
- Cost – while the NSA is the largest spy-agency spender in the world, when storage costs are measured in billions and trillions, cost becomes a significant factor in what is possible.
Both of these are important and something the Backblaze Storage Pods have been optimized for, but the NSA may benefit most from one unusual aspect of our systems: open source hardware.
When SGI bought supercomputer maker Cray in 1996, our CTO, who worked there at the time, said the running joke was, “SGI sold no units this quarter, but made a healthy profit.” That wasn’t magic accounting. It was the NSA requiring that purchases not be disclosed. The problem was, revenue still had to be reported.
If the NSA purchased billions of dollars of storage equipment from EMC, NetApp, or Hitachi, it would be nearly impossible to keep that size of an order under wraps.
The beauty of open source hardware is that the NSA can build these systems themselves. How much does the NSA value secrecy?
The CIA and other government agencies
For a brief moment several years ago, Backblaze sold Storage Pods. (We haven’t since and don’t now.) At the time, two government agencies bought Storage Pods from us with exactly this goal in mind: they wanted a few units to test, and if those worked, they would build a team internally to manufacture their own Storage Pods based on the open source hardware specifications.
Neither of those agencies was going to use that storage for surveillance.
On September 8, 2011, however, we received an email titled, “Visit with the CIA”. It stated:
You also might be aware that the CIA has a new five year funded initiative to centralize data services into a large private cloud. I have information on that if you are interested. The project is in discovery mode now and I am assisting the Office of the CTO in becoming aware of all the potential technologies that might be deployed. I am organizing a technology tour ( my fourth one) concentrating on infrastructure and security.
A few days later they sent us a presentation:
And a week later a few folks arrived at our office and the agenda was:
The meeting agenda should be an overview of your products and technology as it relates to the mission of the new private cloud project ( see attachments) that is now funded beginning this year. The meeting should be very interactive allowing discussions about applications of your technology.
During the meeting, the team was interested in how to store large volumes of data and cost efficiency. Neither the identity of nor the data of our customers was ever of interest or brought up.
So does the NSA store surveillance data on Backblaze Storage Pods?
We don’t know for sure, and the NSA is certainly not publishing their storage architecture. However, the multiple government agencies using and exploring Backblaze Storage Pods, together with the pods’ characteristics as highly dense, cost-efficient, and open source systems, make them a very likely candidate. Perhaps another leak will answer that question in the future!