Backblaze just ordered 100 petabytes’ worth of hard drives, and yes, we’ll use nearly all of them in Q4. In fact, we’ll begin the process of sourcing the Q1 hard drive order in the next few weeks.
What are we doing with all those hard drives? Let’s take a look.
Our First 10 Petabyte Backblaze Vault
Ken clicked the submit button and 10 Petabytes of Backblaze Cloud Storage came online, ready to accept customer data. Ken (aka the Pod Whisperer) is one of our Datacenter Operations Managers at Backblaze, and with that one click he activated Backblaze Vault 1093, which was built with 1,200 Seagate 10 TB drives (model: ST10000NM0086). After formatting and configuring the disks, 10.12 Petabytes of free space remain for customer data. Back in 2011, when Ken started at Backblaze, he was amazed that we had amassed as much as 10 Petabytes of data storage.
The Seagate 10 TB drives we deployed in vault 1093 are helium-filled drives. We had previously deployed 45 HGST 8 TB helium-filled drives, which taught us one of the benefits of helium drives: they consume less power than traditional air-filled drives. Here’s a quick comparison of the power consumption of several high-density drive models we deploy:
|MFR|Model|Fill|Size|Idle (1)|Operating (2)|
|---|---|---|---|---|---|
|Seagate|ST8000DM002|Air|8 TB|7.2 watts|9.0 watts|
|Seagate|ST8000NM0055|Air|8 TB|7.6 watts|8.6 watts|
|HGST|HUH728080ALE600|Helium|8 TB|5.1 watts|7.4 watts|
|Seagate|ST10000NM0086|Helium|10 TB|4.8 watts|8.6 watts|

(1) Idle: Average idle consumption in watts as reported by the manufacturer.
(2) Operating: The maximum operational consumption in watts as reported by the manufacturer, typically for read operations.
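To put the per-drive figures in perspective, here is a rough sketch that scales them up to a full vault. It assumes, purely for comparison, that each model fills a 1,200-drive vault (not every model listed is deployed that way), and it counts drive power only, ignoring the Storage Pods themselves and any cooling. The names `DRIVES` and `vault_watts` are made up for this example.

```python
# Manufacturer-reported figures from the table above:
# model -> (fill, idle watts, operating watts)
DRIVES = {
    "ST8000DM002": ("Air", 7.2, 9.0),
    "ST8000NM0055": ("Air", 7.6, 8.6),
    "HUH728080ALE600": ("Helium", 5.1, 7.4),
    "ST10000NM0086": ("Helium", 4.8, 8.6),
}

# Assumption for comparison only: every model fills a 1,200-drive vault.
DRIVES_PER_VAULT = 1200


def vault_watts(per_drive_watts, drive_count=DRIVES_PER_VAULT):
    """Total draw in watts for drive_count drives at the given per-drive figure."""
    return per_drive_watts * drive_count


for model, (fill, idle, operating) in DRIVES.items():
    print(f"{model:<16} {fill:<6} idle {vault_watts(idle):>7.0f} W  "
          f"operating {vault_watts(operating):>7.0f} W")
```

Under these assumptions, a vault of the helium-filled 10 TB drives would idle at about 5,760 watts, versus about 8,640 watts for the air-filled ST8000DM002, a difference of roughly 2,880 watts for the drives alone.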
I’d like 100 Petabytes of Hard Drives To Go, Please
The 1,200 Seagate 10 TB drives are just the beginning. The next Backblaze Vault will be configured with 12 TB drives, which will give us 12.2 petabytes of storage in one vault. We are currently building and adding two to three Backblaze Vaults a month to our cloud storage system, so we are going to need more drives. When we did all of our “drive math,” we decided to place an order for 100 petabytes of hard drives, made up of 10 TB and 12 TB models. Gleb, our CEO and occasional blogger, exhaled mightily as he signed the biggest purchase order in company history. Wait until he sees the one for Q1.
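For the curious, the “drive math” can be roughed out in a few lines. This is only a sketch, not our actual purchasing model: it assumes the 100 petabytes is raw capacity in decimal units, that every vault holds 1,200 drives, and it prices out the two extremes (all 10 TB drives or all 12 TB drives) rather than the real mix. The function names are made up for this example.

```python
import math

TB_PER_PB = 1000          # decimal units, the way drive capacities are quoted
DRIVES_PER_VAULT = 1200   # 20 Storage Pods x 60 drives
ORDER_PB = 100            # assumption: the order is counted as raw capacity


def drives_needed(order_pb, drive_tb):
    """Whole drives required to reach order_pb of raw capacity."""
    return math.ceil(order_pb * TB_PER_PB / drive_tb)


def vaults_filled(order_pb, drive_tb):
    """How many 1,200-drive vaults that many drives would populate."""
    return drives_needed(order_pb, drive_tb) / DRIVES_PER_VAULT


for drive_tb in (10, 12):
    print(f"{drive_tb} TB drives: {drives_needed(ORDER_PB, drive_tb):,} drives, "
          f"about {vaults_filled(ORDER_PB, drive_tb):.1f} vaults")
```

That works out to somewhere between roughly 7 and 8.3 vaults’ worth of drives. At two to three vaults a month, a quarter uses six to nine vaults, so an order of this size lasts about one quarter.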
400 Petabytes of Cloud Storage
When we added Backblaze Vault 1093, we crossed over 400 Petabytes of total available storage. For those of you keeping score at home, we reached 350 Petabytes about three months ago, as you can see in the chart below.
Backblaze Vault Primer
All of the storage capacity we’ve added in the last two years has been on our Backblaze Vault architecture, with vault 1093 being the 60th one we have placed into service. Each Backblaze Vault consists of 20 Backblaze Storage Pods logically grouped together into one storage system. Today, each Storage Pod contains sixty 3 ½” hard drives, giving each vault 1,200 drives. Early vaults were built on Storage Pods with 45 hard drives, for a total of 900 drives in a vault.
A Backblaze Vault accepts data directly from an authenticated user. Each data blob (object, file, group of files) is divided into 20 shards (17 data shards and 3 parity shards) using our erasure coding library. Each of the 20 shards is stored on a different Storage Pod in the vault. At any given time, several vaults stand ready to receive data storage requests.
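The 17 + 3 layout also explains the usable capacity of a vault: 17 of every 20 shards hold data, so about 85% of the raw space is available before formatting overhead. The sketch below works through that arithmetic and shows how a blob might be cut into 17 equal data shards. It is an illustration only. The helper names are made up, zero-padding the last shard is an assumption, and computing the 3 parity shards is the job of the erasure coding library and is not shown.

```python
DATA_SHARDS = 17
PARITY_SHARDS = 3
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS  # one shard per Storage Pod
DRIVES_PER_VAULT = 1200


def usable_pb(drive_tb, drives=DRIVES_PER_VAULT):
    """Capacity left for data after parity, in decimal petabytes."""
    raw_pb = drives * drive_tb / 1000
    return raw_pb * DATA_SHARDS / TOTAL_SHARDS


def split_into_data_shards(blob: bytes):
    """Split blob into 17 equal-length data shards, zero-padding the tail.

    Assumption: padding with zero bytes; parity shards are NOT computed here.
    """
    shard_len = -(-len(blob) // DATA_SHARDS)  # ceiling division
    padded = blob.ljust(shard_len * DATA_SHARDS, b"\0")
    return [padded[i * shard_len:(i + 1) * shard_len]
            for i in range(DATA_SHARDS)]


print(f"10 TB vault: {usable_pb(10):.2f} PB usable before formatting")
print(f"12 TB vault: {usable_pb(12):.2f} PB usable before formatting")

shards = split_into_data_shards(b"x" * 100)
print(len(shards), "data shards of", len(shards[0]), "bytes each")
```

The 10 TB result (10.2 petabytes) sits just above the 10.12 petabytes vault 1093 reported after formatting and configuration, and the 12 TB result lines up with the 12.2 petabytes expected from the next vault. Because any 3 of the 20 shards are parity, a blob remains recoverable with up to 3 shards, and therefore up to 3 Storage Pods, unavailable.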
Drive Stats for the New Drives
In our Q3 2017 Drive Stats report, due out in late October, we’ll start reporting on the 10 TB drives we are adding. It looks like the 12 TB drives will come online in Q4. We’ll also get a better look at the 8 TB consumer and enterprise drives we’ve been following. Stay tuned.
Other Big Data Clouds
We have always been transparent here at Backblaze, including about how much data we store, how we store it, and even how much it costs to do so. Very few others do the same. But if you have information on how much data a company or organization stores in the cloud, let us know in the comments. Please include the source and make sure the data is not considered proprietary. If we get enough tidbits, we’ll publish a “big cloud” list.