The Backblaze Storage Pod

Our secret to the least expensive cloud storage in the world!

The History

Offering both an unlimited online backup service and the world's least expensive cloud storage service takes a lot of data storage. When Backblaze started out, we needed a way to store our customers' data inexpensively and safely. We developed the Backblaze Storage Pod as the key building block of our cloud storage. Later, we grouped 20 Storage Pods into a Backblaze Vault to optimize the reliability and durability of the entire system. A Storage Pod holds 45 (and, in newer versions, 60) hard drives in a 4U server made from commercially available parts. Over the years we've introduced new and improved designs, and with each new version we open-source the hardware design. That's right: anyone can build their own high-capacity storage server for as little as a nickel ($0.05) per gigabyte, and many people have done exactly that. Join the millions of people who have read how we've been making data storage affordable since 2009.

Photo of a Backblaze server

The Ecosystem

After the first blog post, a few resourceful people started building Backblaze Storage Pods for themselves and began sending in pictures and stories of their homemade pods. They loved the fact that we published the parts list along with the instructions, so they could modify their pods to meet their unique needs. Even companies like Shutterfly and Netflix were inspired to build their own storage pods.

Resources

Storage Pods and Vault Pods

The Backblaze Storage Pod is just one part of building a cloud storage service. To boot the machine, you need a software layer. Backblaze uses all free software, Debian and Apache, to connect the pods to our network. Starting at the bottom, a 45-drive pod has its hard drives connected through SATA controllers. We then use the fdisk tool on Linux to create one partition per drive.

At this point, a Storage Pod can have one of two personalities: an individual Storage Pod or a Backblaze Vault Storage Pod. For individual Storage Pods, we cluster 15 hard drives into a single RAID6 volume with two parity drives (out of the 15). The RAID array is created with the mdadm utility.
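To make the RAID6 grouping concrete, here is a minimal Python sketch. It only builds the mdadm command lines as lists and does not run them. The device names (`/dev/sda1` and so on), the array names (`/dev/md0` and so on), and the split of a 45-drive pod into three 15-drive arrays are assumptions for illustration; the text above only states 15 drives per RAID6 volume with two used for parity.

```python
from string import ascii_lowercase

DRIVES_PER_POD = 45    # drive count from the text
DRIVES_PER_ARRAY = 15  # drives per RAID6 volume, from the text
PARITY_PER_ARRAY = 2   # RAID6 dedicates two drives' worth of capacity to parity


def drive_names(count):
    """Return hypothetical partition names: /dev/sda1 ... /dev/sdz1, /dev/sdaa1, ..."""
    names = []
    for i in range(count):
        if i < 26:
            suffix = ascii_lowercase[i]
        else:
            suffix = ascii_lowercase[i // 26 - 1] + ascii_lowercase[i % 26]
        names.append(f"/dev/sd{suffix}1")
    return names


def mdadm_commands(partitions, per_array=DRIVES_PER_ARRAY):
    """Build (but do not execute) one 'mdadm --create' command per group of drives."""
    commands = []
    for n, start in enumerate(range(0, len(partitions), per_array)):
        members = partitions[start:start + per_array]
        commands.append([
            "mdadm", "--create", f"/dev/md{n}",
            "--level=6", f"--raid-devices={len(members)}",
            *members,
        ])
    return commands


def usable_tb(drive_tb, drives=DRIVES_PER_POD):
    """Usable capacity in TB once two drives per array are set aside for parity."""
    arrays = drives // DRIVES_PER_ARRAY
    return arrays * (DRIVES_PER_ARRAY - PARITY_PER_ARRAY) * drive_tb


if __name__ == "__main__":
    for command in mdadm_commands(drive_names(DRIVES_PER_POD)):
        print(" ".join(command))
    print(usable_tb(4), "TB usable with hypothetical 4 TB drives")
```

Under these assumptions, a pod of 4 TB drives has 180 TB of raw capacity and 156 TB usable, because each 15-drive array gives up two drives' worth of space to parity.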

Each Backblaze Vault Storage Pod is one of the 20 pods needed to create a Backblaze Vault. A Backblaze Vault divides a file into 20 pieces (17 data and 3 parity) and places one piece of the file on each of the 20 Storage Pods in the Vault. We use our own implementation of Reed-Solomon to encode and distribute the files across the 20 pods, achieving 99.99999% data durability. We open-sourced our Reed-Solomon encoding implementation as well.
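The 17-plus-3 layout can be illustrated with a little arithmetic. The Python sketch below splits a payload into 17 equal data shards and reports the storage overhead and failure tolerance that follow from the numbers above. It does not compute the three parity shards, which would require a real Reed-Solomon encoder. The zero-padding scheme and all function names are assumptions, not the actual Backblaze implementation.

```python
DATA_SHARDS = 17    # from the text
PARITY_SHARDS = 3   # from the text
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS


def split_into_data_shards(payload: bytes) -> list[bytes]:
    """Split payload into 17 equal-length data shards, zero-padding the tail.

    Padding with zero bytes is an assumption made to keep the sketch simple.
    """
    shard_len = -(-len(payload) // DATA_SHARDS)  # ceiling division
    padded = payload.ljust(shard_len * DATA_SHARDS, b"\0")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(DATA_SHARDS)]


def storage_overhead() -> float:
    """Bytes stored across the Vault per byte of file data."""
    return TOTAL_SHARDS / DATA_SHARDS


def survives(unavailable_pods: int) -> bool:
    """With 3 parity shards, any 17 of the 20 shards are enough to rebuild a file."""
    return unavailable_pods <= PARITY_SHARDS


if __name__ == "__main__":
    shards = split_into_data_shards(b"example file contents" * 10)
    print(len(shards), "data shards of", len(shards[0]), "bytes each")
    print(f"overhead: {storage_overhead():.3f}x")
    print("3 pods down:", survives(3), "| 4 pods down:", survives(4))
```

The overhead works out to 20/17, or roughly 1.18 bytes stored per byte of data, while the file stays recoverable with up to three of the 20 pods unavailable.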

For both Storage Pods and Vault Pods, we use the EXT4 file system and only allow access to these totally self-contained systems through HTTPS running custom Backblaze application layer logic in Apache Tomcat. One of the most important concepts here is that storing or retrieving data with a Backblaze Storage Pod or a Backblaze Vault Pod always happens through HTTPS. There is no iSCSI, no NFS, no SQL, no Fibre Channel. None of those technologies scales as cheaply or as reliably, goes as big, or can be managed as easily as pods with their own IP addresses waiting for requests on HTTPS. We built our own software layer to monitor pods and vaults and to decide where to store data and how to encrypt, deduplicate, and index it.
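As a rough picture of what "everything goes through HTTPS" means for a client, the Python sketch below builds the requests that would store and fetch a file on a single pod. The Backblaze application layer and its API are not described here, so the host name, the `/files/<id>` path, and the use of PUT and GET are entirely hypothetical. The requests are constructed but never sent.

```python
import urllib.request

# Hypothetical pod address; each pod is described above as having its own IP address.
POD_HOST = "pod-0001.example.com"


def build_store_request(pod_host: str, file_id: str, payload: bytes) -> urllib.request.Request:
    """Build an HTTPS PUT that would upload payload to a pod (hypothetical URL layout)."""
    url = f"https://{pod_host}/files/{file_id}"
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_fetch_request(pod_host: str, file_id: str) -> urllib.request.Request:
    """Build an HTTPS GET that would download a file from a pod (hypothetical URL layout)."""
    return urllib.request.Request(f"https://{pod_host}/files/{file_id}", method="GET")


if __name__ == "__main__":
    store = build_store_request(POD_HOST, "abc123", b"file bytes")
    fetch = build_fetch_request(POD_HOST, "abc123")
    print(store.get_method(), store.full_url)
    print(fetch.get_method(), fetch.full_url)
    # Sending would be urllib.request.urlopen(store); omitted because the host is made up.
```

The point of the sketch is the shape of the interaction: a client needs nothing more than an HTTPS library and a pod address, with no block-level or file-system protocol such as iSCSI or NFS involved.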