Petabytes on a budget: How to build cheap cloud storage

By Tim Nufire | September 1st, 2009

Backblaze 67 Terabyte Server


At Backblaze, we provide unlimited storage to our customers for only $5 per month, so we had to figure out how to store hundreds of petabytes of customer data in a reliable, scalable way—and keep our costs low. After looking at several overpriced commercial solutions, we decided to build our own custom Backblaze Storage Pods: 67 terabyte 4U servers for $7,867.

What we actually provide for our customers is online backup for home and for business. However, in this post, we’ll share how to make one of these storage pods, and you’re welcome to use this design. Our hope is that by sharing, others can benefit and, ultimately, refine this concept and send improvements back to us. Evolving and lowering costs is critical to our continuing success at Backblaze.

Below is a video that shows a 3-D model of the Backblaze Storage Pod. Continue reading to learn the exact details of the design.

You can download the full 3-D model of the Backblaze Storage Pod here.

Backblaze Needs Plenty of Reliable, Cheap Storage

To say that Backblaze needs lots of storage is an understatement. We’re a backup service, so our datacenter contains a complete copy of all of our customers’ data, plus multiple versions of files that change. In rough terms, every time one of our customers buys a hard drive, Backblaze needs another hard drive. A long time ago we stopped measuring storage in our datacenter in gigabytes or terabytes and started measuring in petabytes.

To get a sense of what this looks like, here is a shot of me deploying new pods in our datacenter. The small stack of six pods in the rack I’m working on contains just under half a petabyte of storage.

Tim in Datacenter

To offer our service at a reasonable price, we need affordable storage at a multi-petabyte scale.

No One Sells Cheap Storage, so We Designed It

Before realizing that we had to solve this storage problem ourselves, we considered Amazon S3, Dell or Sun Servers, NetApp Filers, EMC SAN, etc. As we investigated these traditional off-the-shelf solutions, we became increasingly disillusioned by the expense. When you strip away the marketing terms and fancy logos from any storage solution, data ends up on a hard drive. But when we priced various off-the-shelf solutions, the cost was 10 times as much as the raw hard drives (or more). Here’s a comparison chart of the price for one petabyte from various vendors:
Cost of a Petabyte Chart
Based on the expense, we decided to build our own Backblaze Storage Pods. We had two primary goals: keep upfront costs low by using consumer-grade drives and readily available commodity components, and be as power- and space-efficient as possible by using green components and squeezing a lot of storage into a small box.

The result is a 4U rack-mounted Linux-based server that contains 67 terabytes at a material cost of $7,867, the bulk of which goes to purchase the drives themselves. This translates to just three-tenths of one penny per gigabyte per month over the course of three years. Even including the surrounding costs—such as electricity, bandwidth, space rental, and IT administrators’ salaries—Backblaze spends one-tenth of the price in comparison to using Amazon S3, Dell Servers, NetApp Filers, or an EMC SAN.
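The per-gigabyte figure above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the pod cost, capacity, and three-year period quoted in this post, and it assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB). It covers hardware cost only, not electricity, bandwidth, space, or salaries.

```python
# Figures quoted in the post; the decimal unit conversions are an assumption.
POD_COST_USD = 7867
POD_CAPACITY_TB = 67
MONTHS = 36  # three years


def cost_per_gb_month(cost_usd: float, capacity_tb: float, months: int) -> float:
    """Hardware cost in dollars per gigabyte per month of service life."""
    capacity_gb = capacity_tb * 1000  # assumes 1 TB = 1,000 GB
    return cost_usd / capacity_gb / months


def pods_per_petabyte(capacity_tb: float) -> float:
    """Number of pods needed for one petabyte (assumes 1 PB = 1,000 TB)."""
    return 1000 / capacity_tb


rate = cost_per_gb_month(POD_COST_USD, POD_CAPACITY_TB, MONTHS)
per_petabyte = pods_per_petabyte(POD_CAPACITY_TB) * POD_COST_USD

print(f"${rate:.5f} per GB per month ({rate * 100:.2f} cents)")
print(f"Raw pod hardware per petabyte: ${per_petabyte:,.0f}")
```

With these inputs the rate comes out at roughly a third of a cent per gigabyte per month, which matches the "three-tenths of one penny" stated above.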

What Makes a Backblaze Storage Pod

A Backblaze Storage Pod is a self-contained unit that puts storage online. It’s made up of a custom metal case with commodity hardware inside. Specifically, one pod contains one Intel Motherboard with four SATA cards plugged into it. The nine SATA cables run from the cards to nine port multiplier backplanes that each have five hard drives plugged directly into them (45 hard drives in total).
Backblaze Pod Items

Above is an exploded diagram, and you can see a detailed parts list in Appendix A at the bottom of this post. The two most important factors to note are that the cost of the hard drives dominates the price of the overall pod and that the rest of the system is made entirely of commodity parts.

Wiring It Up: How to Assemble a Backblaze Storage Pod

The power wiring diagram of a Backblaze Storage Pod is seen below. Power supply units (PSUs) provide most of their power on two different voltages: 5V and 12V. We use two power supplies in the pod because 45 drives draw a lot of 5V power, yet high wattage ATX PSUs provide most of their power on 12V. This is not an accident: 1,500 watt and larger ATX power supplies are designed for powerful 3-D graphics cards that need the extra power on the 12V rail. We could have switched to a power supply designed for servers, but two ATX PSUs are cheaper.
Server Power Wiring Diagram

PSU1 powers the front three fans and port multiplier backplanes 1, 2, 3, 4, and 7. PSU2 powers everything else. (See Appendix A for a detailed list of the custom connectors on each PSU.) To power the port multiplier backplanes, the power cables run from the PSUs through four holes in the divider metal plate that holds the fans at the center of the case (near the base of the fans) and then continue to the underside of the nine backplanes. Each port multiplier backplane has two molex male connectors on the underside. Hard drives draw the most power during initial spin-up, so if you power up both PSUs at the same time, the pod can draw a large (14 amp) spike of 120V power from the socket. We recommend powering up PSU1 first, waiting until the drives are spun up (and the power draw decreases to a reasonable level), and then powering up PSU2. Fully booted, the entire pod will draw approximately 4.8 amps idle and up to 5.6 amps under heavy load.
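To put the current figures in more familiar terms, here is a rough sketch that converts the quoted amperages at 120V into volt-amperes and counts the drives behind each PSU. It assumes a simple volts-times-amps calculation (apparent power, ignoring power factor), and the list of backplanes on PSU2 is inferred from "PSU2 powers everything else" rather than stated directly.

```python
LINE_VOLTAGE = 120  # volts, as stated in the post
DRIVES_PER_BACKPLANE = 5

# Backplanes on PSU1 are listed in the post; PSU2's set is inferred as the rest.
PSU1_BACKPLANES = [1, 2, 3, 4, 7]
PSU2_BACKPLANES = [n for n in range(1, 10) if n not in PSU1_BACKPLANES]


def apparent_power_va(amps: float, volts: float = LINE_VOLTAGE) -> float:
    """Apparent power in volt-amperes; real watts depend on power factor."""
    return amps * volts


for label, amps in [("spin-up spike", 14.0), ("idle", 4.8), ("heavy load", 5.6)]:
    print(f"{label:>14}: {apparent_power_va(amps):6.0f} VA")

print("Drives on PSU1:", len(PSU1_BACKPLANES) * DRIVES_PER_BACKPLANE)
print("Drives on PSU2:", len(PSU2_BACKPLANES) * DRIVES_PER_BACKPLANE)
```

The split shows why staggering the power-up helps: each PSU only has to spin up its own share of the 45 drives, rather than all of them hitting the socket at once.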

Below is a picture of a partially assembled Backblaze Storage Pod (click on the photo for a larger image). The metal case has screws mounted on the bottom, facing upward, where we attach nylon standoffs (the small white pieces in the picture below). Nylon helps dampen vibration, and this dampening is a critical aspect of server design. The circuit boards shown on top of the nylon standoffs are a few of the nine SATA port multiplier backplanes that take a single SATA connection on their underside and allow five hard drives to be mounted vertically and plugged into the topside of the board. All the power and SATA cables run underneath the port multiplier backplanes. One of the backplanes in the picture below is fully populated with hard drives to show the positioning.

Backblaze Server Partial Assembly

A note about drive vibration: The drives vibrate too much if you leave them sitting as shown in the picture above, so we add an “anti-vibration sleeve” (essentially a rubber band) around the hard drive in between the red metal grid and the drives. This seats the drives tightly in the rubber. We also lay a large (16″ x 17″ x 1/8″) piece of foam along the top of the hard drives after all 45 are in the case. The lid then screws down on top of the foam to hold the drives securely. In the future, we will dedicate an entire blog post to vibration.

The SATA wiring diagram is seen below.
SATA Wiring Diagram
The Intel Motherboard has four SATA cards plugged into it: three SYBA two-port SATA cards and one Addonics four-port card. The nine SATA cables connect to the top of the SATA cards and run in tandem with the power cables. All nine SATA cables measure 36 inches and use locking 90-degree connectors on the backplane end and non-locking straight connectors into the SATA cards.

A note about SATA chipsets: Each of the port multiplier backplanes has a Silicon Image SiI3726 chip so that five drives can be attached to one SATA port. Each of the SYBA two-port PCIe SATA cards has a Silicon Image SiI3132, and the four-port PCI Addonics card has a Silicon Image SiI3124 chip. We use only three of the four available ports on the Addonics card because we have only nine backplanes. We don’t use the SATA ports on the motherboard because, despite Intel’s claims of port multiplier support in their ICH10 south bridge, we noticed strange results in our performance tests. Silicon Image pioneered port multiplier technology, and their chips work best together.
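The port and drive counts described above can be modeled in a short sketch. The card and chipset names come from this post, but the order in which backplanes are assigned to card ports is a hypothetical choice made for illustration; the real assignment is defined by the SATA wiring diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SataCard:
    name: str
    chipset: str
    ports: int
    ports_used: int


# Three two-port PCIe cards plus one four-port PCI card (three ports used).
CARDS = [
    SataCard("SYBA PCIe #1", "SiI3132", 2, 2),
    SataCard("SYBA PCIe #2", "SiI3132", 2, 2),
    SataCard("SYBA PCIe #3", "SiI3132", 2, 2),
    SataCard("Addonics PCI", "SiI3124", 4, 3),
]
DRIVES_PER_BACKPLANE = 5  # one SiI3726 port multiplier per backplane


def backplane_map(cards):
    """Assign backplanes 1..N to used card ports in card order (hypothetical order)."""
    mapping = {}
    backplane = 1
    for card in cards:
        for port in range(card.ports_used):
            mapping[backplane] = (card.name, port)
            backplane += 1
    return mapping


mapping = backplane_map(CARDS)
total_drives = len(mapping) * DRIVES_PER_BACKPLANE
spare_ports = sum(card.ports - card.ports_used for card in CARDS)

for backplane, (card_name, port) in mapping.items():
    print(f"backplane {backplane}: {card_name}, port {port}")
print(f"{len(mapping)} backplanes, {total_drives} drives, {spare_ports} spare port")
```

The totals agree with the text: nine backplanes, 45 drives, and one unused port on the Addonics card.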

A Backblaze Storage Pod Runs Free Software

A Backblaze Storage Pod isn’t a complete building block until it boots and is on the network. The pods run 64-bit Debian 4 Linux with the JFS file system, and they are self-contained appliances, where all access to and from the pods is through HTTPS. Below is a layer cake diagram.
Software Layering Cake Diagram
Starting at the bottom, there are 45 hard drives exposed through the SATA controllers. We then use the fdisk tool on Linux to create one partition per drive. On top of that, we cluster 15 hard drives into a single RAID6 volume with two parity drives (out of the 15). The RAID6 is created with the mdadm utility. On top of that is the JFS file system, and the only access we then allow to this totally self-contained storage building block is through HTTPS running custom Backblaze application layer logic in Apache Tomcat 5.5. After taking all this into account, the formatted (useable) space is 87 percent of the raw hard drive totals. One of the most important concepts here is that data is always stored in or retrieved from a Backblaze Storage Pod through HTTPS. There is no iSCSI, no NFS, no SQL, no Fibre Channel. None of those technologies scales as cheaply or as reliably, goes as big, or can be managed as easily as stand-alone pods, each with its own IP address, waiting for requests on HTTPS.
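As a rough illustration of the layering just described, the sketch below groups 45 single-partition drives into three 15-drive RAID6 sets and works out the share of raw capacity left after parity. The device names (/dev/sda1 and so on) and md numbers are hypothetical, and the script only prints mdadm command lines; it does not run them. File system overhead is not modeled.

```python
DRIVES = 45
DRIVE_TB = 1.5
ARRAY_SIZE = 15
PARITY_PER_ARRAY = 2  # RAID6 keeps two drives' worth of parity per array


def device_name(index: int) -> str:
    """Hypothetical Linux name for drive `index` (0-based): sda..sdz, sdaa.."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    if index < 26:
        return f"/dev/sd{letters[index]}"
    return f"/dev/sd{letters[index // 26 - 1]}{letters[index % 26]}"


def raid_groups(drives: int, array_size: int):
    """Split the first partition of each drive into consecutive RAID sets."""
    partitions = [device_name(i) + "1" for i in range(drives)]
    return [partitions[i:i + array_size] for i in range(0, drives, array_size)]


def mdadm_command(md_index: int, members) -> str:
    """Build (not run) an mdadm command line for one RAID6 array."""
    return (f"mdadm --create /dev/md{md_index} --level=6 "
            f"--raid-devices={len(members)} " + " ".join(members))


groups = raid_groups(DRIVES, ARRAY_SIZE)
data_drives = sum(len(group) - PARITY_PER_ARRAY for group in groups)
usable_fraction = data_drives / DRIVES

for md_index, members in enumerate(groups):
    print(mdadm_command(md_index, members))
print(f"{data_drives} data drives of {DRIVES}: {usable_fraction:.1%} of raw capacity, "
      f"{data_drives * DRIVE_TB:.1f} TB before file system overhead")
```

Parity alone leaves 13 of every 15 drives for data, or about 86.7 percent, which is in line with the 87 percent figure quoted above.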

A Backblaze Storage Pod is a Building Block

We have been extremely happy with the reliability and excellent performance of the pods, and a Backblaze Storage Pod is a fully contained storage server. But the intelligence of where to store data and how to encrypt it, deduplicate it, and index it is all at a higher level (outside the scope of this blog post). When you run a datacenter with thousands of hard drives, CPUs, motherboards, and power supplies, you are going to have hardware failures—it’s irrefutable. Backblaze Storage Pods are building blocks upon which a larger system can be organized that doesn’t allow for a single point of failure. Each pod in itself is just a big chunk of raw storage for an inexpensive price; it is not a “solution” in itself.

Cloud Storage: The Next Step

The first step to building cheap cloud storage is to already have cheap storage, and above we demonstrate how to create your own. If all you need is cheap storage, this may suffice. If you need to build a cloud, you’ve got more work ahead of you.

Building a cloud includes not only deploying a large quantity of hardware, but, critically, deploying software to manage it. At Backblaze we have developed software that de-duplicates and chops data into blocks; encrypts and transfers it for backup; reassembles, decrypts, re-duplicates, and packages the data for recovery; and monitors and manages the entire cloud storage system. This process is proprietary technology that we have developed over the years.

You may have your own system for this process and incorporate the Backblaze Storage Pod design, or you may simply seek inexpensive storage that won’t be deployed as part of a cloud. In either case, you’re free to use the storage pod design above. If you do, we would appreciate a credit to Backblaze and welcome any insights, though this isn’t a requirement. Please note that because we’re not selling the design or the storage pods themselves, we provide neither support nor warranties.

Coming next: In the next few weeks, we’ll talk about iPhone vibration sensors, swiss cheese pod designs, why electricity costs more than bandwidth, and more about the design of big cloud storage.

Credits and Standing on the Shoulders of Giants

The Backblaze Storage Pod design would not have been possible without an enormous amount of help, usually requested with little notice, from some amazingly smart and generous people who answered our questions, worked with us, and provided key insights at critical moments. First, we thank Chris Robertson for the inspiration to build our own storage and his early work on prototypes; Kurt Schaefer for advice on metal work and the concept of “furniture” for circuit boards; Dominic Giampaolo from Apple Computer for his advice on hard drives, vibration, and certifications; Stuart Cheshire from Apple Computer and Nick Tingle from Alcatel-Lucent for low-level network advice; Aaron Emigh (EVP & GM, Core Technology) at Six Apart for his help on initial design work; Gary Orenstein for insight into drive reliability and the storage industry in general; Jonathan Beck for invaluable advice on vibration, fans, cooling, and case design; Steve Smith (Senior Design Manager), Imran Pasha (Director of Software Engineering), and Alex Chervet (Director of Strategic Marketing) at Silicon Image who helped us debug SATA protocol problems and loaned us 10 different SATA cards to test against; James Lee at Chyang Fun Industries in Taiwan for customizing SATA boards to simplify our design; Wes Slimick, Richard Crockett, Don Shields, and Robert Knowles from Western Digital for their help debugging Western Digital drive logs; Christa Carey, Jennifer Hurd, and Shirley Evely at Protocase for putting up with hundreds of small 3-D case design tweaks; Chester Yeung at Central Computer for coming through quickly and repeatedly with locally supplied parts when it really mattered; Mason Lee at Zippy for power supply advice and custom wiring harnesses; and Angela Lai for knowing just the right people and providing gracious introductions.

Finally, we thank the thousands of engineers who slaved away for millions of hours to bring us the pod components that are either inexpensive or totally free, such as the Intel Processor, Gigabit Ethernet, ridiculously dense hard drives, Linux, Tomcat, JFS, etc. We realize we’re standing on the shoulders of giants.

Appendix A: Detailed Backblaze Storage Pod Parts List

  • 1.5 TB SATA Data Drive: Seagate ST31500341AS 1.5TB Barracuda 7200.11 SATA 3Gb/s 3.5″
  • 4U Enclosure: Custom Designed 4U Red Backblaze Storage Pod Enclosure
  • 760 Watt Power Supply: Zippy PSM-5760 760 Watt Power Supply with Custom Wiring (qtys of 200+)
  • Port Multiplier Backplanes: Chyang Fun Industry (CFI Group) CFI-B53PM 5 Port Backplane (SiI3726)
  • 3.3 GHz Intel Core 2 CPU: Intel E8600 Wolfdale 3.33 GHz LGA 775 65W Dual-Core Processor
  • 2 Port PCIe SATA II Card: Syba SD-SA2PEX-2IR PCI Express SATA II Controller Card (SiI3132)
  • 4 Port PCI SATA II Card: Addonics ADSA4R5 4-Port SATA II PCI Controller (SiI3124)
  • Intel BOXDG43NB LGA 775 G43 ATX Motherboard
  • Case Fan: Mechatronics G1238M12B1-FSR 120 x 38 mm 2,800 RPM 12V Fan
  • 4GB DDR2 800 RAM: Kingston KVR800D2N6K2/4G 2x2GB 240-Pin SDRAM DDR2 800 (PC2 6400)
  • 80 GB PATA Boot Drive: Western Digital Caviar WD800BB 80GB 7200 RPM IDE Ultra ATA100 3.5″
  • On/Off Switch: FrozenCPU ele-302 Bulgin Vandal Momentary LED Power Switch 12″ 2-pin
  • SATA II Cable, 90 Degrees/straight with Locking Connectors
  • Nylon Backplane Standoffs: Fastener SuperStore 1/4″ Round Nylon Standoffs Female/Female 4-40 x 3/4″
  • HD Anti-Vibration Sleeves: Aero Rubber Co. 3.0 x .500 inch EPDM (0.03″ Wall)
  • Power Supply Vibration Dampener: Vantec VDK-PSU Power Supply Vibration Dampener
  • Fan Mount (front): Acousti Ultra Soft Anti-Vibration Fan Mount AFM02
  • Fan Mount (middle): Acousti Ultra Soft Anti-Vibration Fan Mount AFM03
  • Nylon Screws: Small Parts MPN-0440-06P-C Nylon Pan Head Phillips Screw 4-40 x 3/8″
  • Foam Rubber Pad: House of Foam 16″ x 17″ x 1/8″ Foam Rubber Pad

Custom wiring harnesses for PSU1 (1st Zippy power supply):

  • 5x 4-pin 90-degree molex HD connectors with two connectors each. Length should be 36″ to the farthest connector, 32.5″ to the closest (3.5″ apart)
  • 3x 4-pin 12V fan connectors that should be 32″ in length with extender and RPM signal that can attach to motherboard

Custom wiring harnesses for PSU2 (2nd Zippy power supply):

  • 1x 24-pin motherboard connector, 8″
  • 1x 4-pin ATX12V for CPU, 8″
  • 4x 4-pin 90-degree molex HD connectors with two connectors each. Length should be 36″ to the farthest connector, 32.5″ to the closest (3.5″ apart)
  • 1x 4-pin 90-degree molex connector, 12″ long
  • 3x 3-pin, 12V fan connectors, 12″ long, with extender for RPM signal that can attach to motherboard
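The harness lists can be cross-checked against the backplane count with a small sketch. It assumes each backplane needs two molex connectors (as described in the wiring section), takes PSU1's backplanes from the post, and infers PSU2's backplanes as the remaining four. The dictionary layout and field names are made up for this example.

```python
MOLEX_PER_BACKPLANE = 2  # "two molex male connectors on the underside"

# Cable counts are from the harness lists; PSU2's backplane set is inferred.
HARNESSES = {
    "PSU1": {"dual_molex_cables": 5, "fan_connectors": 3, "backplanes": [1, 2, 3, 4, 7]},
    "PSU2": {"dual_molex_cables": 4, "fan_connectors": 3, "backplanes": [5, 6, 8, 9]},
}


def molex_balance(psu: dict) -> int:
    """Molex connectors supplied by the dual cables minus those the backplanes need."""
    supplied = psu["dual_molex_cables"] * 2
    needed = len(psu["backplanes"]) * MOLEX_PER_BACKPLANE
    return supplied - needed


total_fans = sum(psu["fan_connectors"] for psu in HARNESSES.values())

for name, psu in HARNESSES.items():
    print(f"{name}: molex balance {molex_balance(psu)}, "
          f"{psu['fan_connectors']} fan connectors")
print("Total fan connectors:", total_fans)
```

A balance of zero for both supplies means the dual-connector cables exactly cover the nine backplanes; the single 12″ molex connector on PSU2 is left out of the count because this post does not say what it powers.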

SATA Chipsets

  • SiI3726 on each port multiplier backplane to attach five drives to one SATA port.
  • SiI3132 on each of the three PCIe SATA cards to attach two backplanes each (six ports total)
  • SiI3124 on the one PCI SATA card to attach up to four port multiplier backplanes (we only use three of the four ports)

Tim Nufire

Chief Cloud Officer at Backblaze
Chief Cloud Officer and co-founder - While Tim stays busy fussing with the Backblaze cloud, designing Storage Pods and managing Operations, he'd much rather be taking Grommit, his Goldendoodle, for a walk.
  • Scott

    Love this write-up as well as the ones on all the improved versions that followed.

  • Wow, very informative read. I’m looking for a solution as I’m running out of storage space (DAS) and alphabet letters for my PC (40TB). Need to build a server and just DIY NAS it.

  • mixtile

    Mixtile is a community-based hardware designer and manufacturer based in
    Silicon Valley and Shenzhen. We provide rapid and customized designs
    of connected devices and help innovative makers and hardware startups to
    productize their ideas and achieve business success. We organize and
    sponsor meetups in Mountain View about hardware, manufacturing and other
    tech topics.

  • Debian remembered me

  • JulianEL

    Would love to try this out on FreeBSD and ZFS. (would need more RAM though… and maybe one ssd for metadata cache.)

  • will kenderdine
    We are thankful for the time and care you took in creating this post!……
    clearly a very useful site! If I may ask, could you venture a look at my
    post… we totally appreciate your time in helping us = )

  • Magnum

    ZFS (Raidz-2)!!!

  • whats-in-name

    Bravo! Really Nice one, It is a beauty! and fun to read the article.

  • HA

    Do you sell the pods?

  • Ken

    JFS really? Why not ZFS ?

  • Tyler Schock

    Great Build!! I’m all about DIY solutions, especially in computer related projects….But I do have one question, Why didn’t you plan for larger drives to save money on the raw storage? You spent $80 dollars a Gig, when you could have stayed around the $45 to $65 mark per Gig. That’s about one third savings in your primary expenditure, the hard drives (which are faster, larger, and arguably more reliable – referring to the WD4003FZEX as an example – )

    I love the design of the custom case, it satisfies cooling needs and it’s well laid out. We’ve been looking into expanding our storage and this just might do the trick, or at least inspire something along the same lines.

    This is great work. Keep it up!

    • MurderMostFowl

      You probably already realized this by now, but this article was written in 2009. Price/TB has gone down significantly since then.
      (I’m an old guy who remembers dropping $400 on a 270MB Quantum IDE HDD back in the day. I was in heaven… prior to that I was using a 40MB Plus Hardcard 40 which by that time had become ancient and slow.)

      The first RAID system I ever got to work on was a Fast Wide SCSI II array with an unbelievably large RAID5 array of 8x8GB drives. :-) Those were the days!

      • Quantum Fireball to be exact right? I have the exact same drive collecting dust somewhere.

        • MurderMostFowl

          Yep! That was it! Loved that little drive

      • Crysta

        I still remember the 34 MB drive in the PC my family had back in the days… Back when the 1.44MB Floppy was considered “roomy”…

        We still have a C64 (yes, Commodore 64) that still works, I just haven’t used it in so long that I haven’t a clue what to do!

        • tinynot

          and of course the TI [Texas Instruments], anyone remember that little gem? i remember back when we hooked up the Nintendo right to the TV, inline with the antenna, right guys, two wires to our rabbit ears! and the very rich had the only phones in their cars, we had either home land lines, or used the payphones which were almost everywhere, now they are only like in some venues, and the airports, maybe. the 1MB sticks of ram was like $100, back in the early 90’s,
          i know. that is what i started off with for my 486dx2 i think it was with the HDD being a WD [Western Digital], 350MB/$100. i waited while the 286 and 386 passed me by, right after them 1086 things. i was in my early 20’s and didn’t have much money back then. those in this generation for the most part do not realize how fast tech is now, and how fast the speed of new tech is going, leaps and bounds. i seen the other day on 60 mins the new AI stuff they have now, WOW. and the way the military/gov’s of the world is going to use it, really scares me, a lot.

  • Marcus2012

    This is actually pretty awesome, I wonder what the write speed is though, and does it support ZFS, and copying the ENTIRE system to two duplicate systems for redundancy?

  • petabytesaresweet

    Man, this is an excellent and quite thorough writeup. Would you mind sharing how you monitor if one of the drives goes down?

  • Philippe

    Great post! I would like to know the available storage out of the 45 x 1.5 TB.. Considering the Raid architecture.

    • djbolivia

      It sounded like 87% free storage capacity after overhead, so I’m guessing 87% of 67.5 TB?

  • Akshar Dave

    How much are you paying for power and cooling? You might want to look at HP SL4500 servers which can house 60 x 6TB drives = 360TB in 4.3 U … Ideal for what you need.

    • tinynot

      WOW, 60 X 6TB drives, and they are all SSD? all i want for XMAS is a 15TB samsung, is that too much to ask? i dont think so, well. and a 2TB flashdrive, (a real one). nice server Akshar, i would love to see it, is there a url for images of it? or is it hidden so that no one may copy it? i fully understand if it is.

  • I’d love some more info on deployment! How many of these do you build in a day? How many techs do you have building pods all day? How often do you deploy these to the data center? And most curiously, when you get shipments of supplies, do they come in huge containers? You must have huge bins of spare parts sitting around all of the time, right?

    • Hi Flavio! We don’t share some of those stats, but we do get a lot of these deployed every month. You can take a look at ->, they build these systems and can answer most shipping/building questions for you!

  • Taylor Brazelton

    Excellent overview. Really enjoyed the good read on the backblaze servers.

  • whycantibeanon

    Thanks for the excellent technical overview!
