The Cost of Cloud Storage

By | June 29th, 2017

[Figure: The cost of the cloud as a percentage of revenue]

This week, we’re celebrating the one-year anniversary of the launch of Backblaze B2 Cloud Storage. Today’s post is focused on giving you a peek behind the curtain at the costs of providing cloud storage. Why? Over the last 10 years, the most common question we get is still “how do you do it?” In this multibillion-dollar global industry exhibiting exponential growth, none of the other major players seems willing to discuss the underlying costs. By exposing a chunk of the Backblaze financials, we hope to provide a better understanding of what it costs to run “the cloud,” and continue our tradition of sharing information for the betterment of the larger community.

Context
Backblaze built one of the industry’s largest cloud storage systems and we’re proud of that accomplishment. We bootstrapped the business and funded our growth through a combination of our own business operations and just $5.3M in equity financing ($2.8M of which was invested into the business – the other $2.5M was a tender offer to shareholders). To do this, we had to build our storage system efficiently and run as a real, self-sustaining business. After more than a decade in the data storage business, we have developed a deep understanding of cloud storage economics.

Definitions
I promise we’ll get into the costs of cloud storage soon, but some quick definitions first:

    Revenue: Money we collect from customers.
    Cost of Goods Sold (“COGS”): The costs associated with providing the service.
    Operating Expenses (“OpEx”): The costs associated with developing and selling the service.
    Income/Loss: What is left after subtracting COGS and OpEx from Revenue.

I’m going to focus today’s discussion on the Cost of Goods Sold (“COGS”): What goes into it, how it breaks down, and what percent of revenue it makes up. Backblaze is a roughly break-even business with COGS accounting for 47% of our revenue and the remaining 53% spent on our Operating Expenses (“OpEx”) like developing new features, marketing, sales, office rent, and other administrative costs that are required for us to be a functional company.
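
As a rough illustration of how these definitions fit together, here is a minimal Python sketch. It normalizes revenue to 100 units so the 47% and 53% shares from the post read directly as amounts; the function and variable names are hypothetical, and this is not Backblaze's actual accounting.

```python
# Illustrative only: revenue is normalized to 100 units (an assumption),
# so the shares quoted in the post read directly as amounts.
REVENUE = 100
COGS_SHARE_PCT = 47  # COGS as a percent of revenue, from the post
OPEX_SHARE_PCT = 53  # OpEx as a percent of revenue, from the post


def income(revenue, cogs_pct, opex_pct):
    """Income/Loss = Revenue - COGS - OpEx, with both costs given as a percent of revenue."""
    cogs = revenue * cogs_pct / 100
    opex = revenue * opex_pct / 100
    return revenue - cogs - opex


# A result of zero is what "roughly break-even" means in these terms.
print(income(REVENUE, COGS_SHARE_PCT, OPEX_SHARE_PCT))
```

With these shares, the two cost buckets consume all of revenue, which is why the rest of the post can treat COGS as the lever that determines pricing.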

This post’s focus on COGS should let us answer the commonly asked question of “how do you provide cloud storage for such a low cost?”

Breaking Down Cloud COGS

Providing a cloud storage service requires the following components (COGS and OpEx – below we break out COGS):
[Figure: Cloud infrastructure costs as a percentage of revenue]

  • Hardware: 23% of Revenue
    Backblaze stores data on hard drives. Those hard drives are “wrapped” with servers so they can connect to the public and store data. We’ve discussed our approach to how this works with our Vaults and Storage Pods. Our infrastructure is purpose-built for data storage. That is, we thought about how data storage ought to work, and then built it from the ground up. Other companies may use different storage media like Flash, SSD, or even tape. But it all serves the same function of being the thing the data is actually stored on. For today, we’ll think of all this as “hardware.”

    We buy storage hardware that, on average, will last 5 years (60 months) before needing to be replaced. To account for hardware costs in a way that can be compared to our monthly expenses, we amortize them and recognize 1/60th of the purchase price each month.

    Storage Pods and hard drives are not the only hardware in our environment. We also have to buy the cabinets and rails that hold the servers, core servers that manage accounts/billing/etc., switches, routers, power strips, cables, and more. (Our post on bringing up a data center goes into some of this detail.) However, Storage Pods and the drives inside them make up about 90% of all the hardware cost.

  • Data Center (Space & Power): 8% of Revenue
    “The cloud” is a great marketing term and one that has caught on for our industry. That said, all “clouds” store data on something physical like hard drives. Those hard drives (and servers) are actual, tangible things that take up actual space on earth, not in the clouds.

    At Backblaze, we lease space in colocation facilities which offer a secure, temperature-controlled, reliable home for our equipment. Other companies build their own data centers. It’s the classic rent vs. buy decision, but it always ends with hardware in racks in a data center.

    Hardware also needs power to function. Not everyone realizes it, but electricity is a significant cost of running cloud storage. In fact, some data center space is billed simply as a function of an electricity bill.

    Every hard drive storing data adds incremental space and power needs. This is a cost that scales with storage growth.

    I also want to make a comment on taxes. We pay sales and property tax on hardware, and it is amortized as part of the hardware section above. However, it’s valuable to think about taxes when considering the data center since the location of the hardware actually drives the amount of taxes on the hardware that gets placed inside of it.

  • People: 7% of Revenue
    Running a data center requires humans to make sure things go smoothly. The more data we store, the more human hands we need in the data center. All drives will fail eventually. When they fail, “stuff” needs to happen to get a replacement drive physically mounted inside the data center and filled with the customer data (all customer data is redundantly stored across multiple drives). The individuals who are associated specifically with managing the data center operations are included in COGS since, as you deploy more hard drives and servers, you need more of these people.

    Customer Support is the other group of people who are part of COGS. As customers use our services, questions invariably arise. To serve our customers and get questions answered expediently, we staff customer support from our headquarters in San Mateo, CA. They do an amazing job! Staffing models, internally, are a function of the number of customers and the rate of acquiring new customers.

  • Bandwidth: 3% of Revenue
    We store over 350 PB of customer data across our data centers. The bulk of that has been uploaded by customers over the Internet (the other option, our Fireball service, is 6 months old and is seeing great adoption). Uploading data over the Internet requires bandwidth – basically, an Internet connection similar to the one running to your home or office. But, for a data center, instead of contracting with Time Warner or Comcast, we go “upstream.” Effectively, we’re buying wholesale.

    Understanding how that dynamic plays out with your customer base is a significant driver of how a cloud provider sets its pricing. Being in business for a decade has explicit advantages here. Because we understand our customer behavior, and have reached a certain scale, we are able to buy bandwidth in sufficient bulk to offer the industry’s best download pricing at $0.02 / Gigabyte (compared to $0.05 from Amazon, Google, and Microsoft).

    Why does optimizing download bandwidth charges matter for customers of a data storage business? Because it has a direct relationship to you being able to retrieve and use your data, which is important.

  • Other Fees: 6% of Revenue
    We have grouped the remaining costs inside of “Other Fees.” This includes fees we pay to our payment processor as well as the costs of running our Restore Return Refund program.

    A payment processor is required for businesses like ours that need to accept credit cards securely over the Internet. The bulk of the money we pay to the payment processor is actually passed through to pay the credit card companies like AmEx, Visa, and Mastercard.

    The Restore Return Refund program is a unique program for our consumer and business backup businesses. Customers can download any of their files directly from our website. We also offer customers the ability to order a hard drive with some or all of their data on it, which we then FedEx to the customer, wherever in the world the customer might be. Any customer can opt to return the drive to us for a full refund. Customers love the program, but it does cost Backblaze money. We choose to subsidize the cost associated with this service in an effort to provide the best customer experience we can.
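
To make the arithmetic in this breakdown concrete, here is a small Python sketch under stated assumptions. It checks that the five components add up to the 47% COGS figure, and it shows the straight-line amortization described in the Hardware section (1/60th of the purchase price per month). The purchase price is a made-up number and the function name is hypothetical; none of this reflects Backblaze's real ledger.

```python
# COGS components as a percent of revenue, taken from the list above.
COGS_BREAKDOWN_PCT = {
    "Hardware": 23,
    "Data Center (Space & Power)": 8,
    "People": 7,
    "Bandwidth": 3,
    "Other Fees": 6,
}

# The post amortizes hardware over an average life of 5 years (60 months).
USEFUL_LIFE_MONTHS = 60


def monthly_hardware_expense(purchase_price, useful_life_months=USEFUL_LIFE_MONTHS):
    """Straight-line amortization: recognize an equal share of the price each month."""
    return purchase_price / useful_life_months


total_cogs_pct = sum(COGS_BREAKDOWN_PCT.values())
print(f"Total COGS: {total_cogs_pct}% of revenue")

# HYPOTHETICAL purchase price, chosen only to make the arithmetic easy to follow.
example_price = 6000.00
print(f"Monthly expense: ${monthly_hardware_expense(example_price):.2f}")
```

Spreading the purchase price over 60 months is what lets a one-time hardware purchase be compared with recurring monthly costs such as power, people, and bandwidth.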

The Big Picture

At the beginning of the post, I mentioned that Backblaze is, effectively, a break-even business. The reality is that our products drive a profitable business, but those profits are invested back into the business to fund product development and growth. That means growing our team as the size and complexity of the business expands; it also means being fortunate enough to have the cash on hand to fund “reserves” of extra hardware, bandwidth, data center space, etc. In our first few years as a bootstrapped business, having sufficient buffer was a challenge. Having weathered that storm, we are particularly proud of being in a financial place where we can afford to make things a bit more predictable.

All this adds up to answer the question of how Backblaze has managed to carve out its slice of the cloud market – a market that is a key focus for some of the largest companies of our time. We have built a novel, purpose-built storage infrastructure with our Vaults and Pods. That infrastructure allows us to keep costs very, very low. Low costs enable us to offer the world’s most affordable, reliable cloud storage.

Does reliable, affordable storage matter? For a company like Vintage Aerial, it enables them to digitize 50 years’ worth of aerial photography of rural America and share that national treasure with the world. Having the best download pricing in the storage industry means Austin City Limits, a PBS show out of Austin, can digitize and preserve over 550 concerts.
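
To put the download prices quoted earlier ($0.02 per gigabyte versus $0.05) into perspective, here is a hedged Python sketch. The 10 TB archive size is purely hypothetical and is not a figure from either customer above. The sketch also assumes a single flat rate and decimal units (1 TB = 1,000 GB), so it ignores any tiering or free allowances a provider may apply.

```python
# Download prices in cents per gigabyte, as quoted in this post (2017 figures).
BACKBLAZE_CENTS_PER_GB = 2
OTHER_CLOUDS_CENTS_PER_GB = 5

# ASSUMPTION: decimal units, 1 TB = 1,000 GB.
GB_PER_TB = 1000


def download_cost_usd(terabytes, cents_per_gb):
    """Cost in dollars to download the given volume at a flat per-gigabyte rate."""
    return terabytes * GB_PER_TB * cents_per_gb / 100


# HYPOTHETICAL archive size, used only for illustration.
archive_tb = 10

backblaze_cost = download_cost_usd(archive_tb, BACKBLAZE_CENTS_PER_GB)
other_cost = download_cost_usd(archive_tb, OTHER_CLOUDS_CENTS_PER_GB)
print(f"Backblaze: ${backblaze_cost:.2f}, others: ${other_cost:.2f}")
```

Because the cost is linear in volume, the 2.5x gap between the two rates holds at any archive size under these assumptions.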

We think offering purpose-built, affordable storage is important. It empowers our customers to monetize existing assets, make sure data is backed up (and not lost), and focus on their core business because we can handle their data storage needs.

Tim Nufire

Chief Cloud Officer at Backblaze
Chief Cloud Officer and co-founder - While Tim stays busy fussing with the Backblaze cloud, designing Storage Pods and managing Operations, he'd much rather be taking Grommit, his Goldendoodle, for a walk.
  • Brian Walter

    This is great, what percentage of revenue do you spend on R&D?

  • iwod

    I am looking at this and thinking: that is a lot of OpEx. Or should BB drive more revenue so that OpEx % is lower?

    I know this is not going to be a popular opinion, but why doesn’t B2 offer something like Dropbox?

  • Piotr Masłowski

    Hi,
    At 1st, sorry for my english. : D
    2. Thank you for video about Reed-Solomon coding.
    3. Backblaze looks very professional, but also people-friendly :)
    4. From this chart, which you shared with us, I see that hardware and data centers are expensive. But what would happen if, instead, you used free space on people’s hard drives? Something like storj.io or sia.tech but done better? (Of course it requires investing in new software.)

    • We’re always on the lookout for new services and platforms, but switching over to those at the moment would mean a lot of development, and for the time being we’re focusing on making sure that our current service is as great as it can be.

  • Manuel Wymann

    Dear Tim, I just wonder, with the recent increase in memory prices (also due to hyperscale server demand), what impact does this have on your purchasing strategy? Can you source cheaper?

    • Ahin Thomas

      Manuel – Ahin f/Backblaze here. Thanks for writing in. Our procurement team is great and has a process of continual optimization. So, where we can, we find efficiencies. Generally speaking, the percentages for COGS are consistent.

      • Manuel Wymann

        Dear Ahin, many thanks for answering my question. Meanwhile, I also had input from other cloud operators, and it seems that the majority of the cost increase is being absorbed by vendors (like Dell, HP, etc.), but I wonder if, with the usual thin margins they have, they are currently making a loss by selling to cloud operators. You cannot really despec, can you?

        • Ahin Thomas

          A good partnership is sustainable for both sides… I couldn’t speak to the relationship between vendors and the “other” cloud providers, but we’re pleased to have long standing partnerships across multiple vendors. I’m not sure what “despec” means, but hope that answers your question! Also, if you’re interested in the cost side of the equation, I’d recommend our blog post on the cost of storage https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/

  • Paul van den Bergen

    SO, one of the thoughts I’ve had for a while (somewhere around a decade now) is how much would you save if you just turned off drives that were no longer being accessed – where real time retrieval of data was not a problem. I was considering this from two perspectives – 1) archival data – virtual tape drives – and 2) disk lifetime savings – presuming disks that are not powered up and spinning might have a longer lifetime.

    I’ve come to realise over time that data more than 5 years old is such a small fraction of your current storage needs (less than 10%, presuming Moore’s law) that you may as well just keep it. So somewhere between now and 5 years old is a point at which you might consider quiescing the drives.

    Now I know that at most that will save you something less than 8% of your costs…. Moving the old data onto new faster disks and saving some floor space is probably a bigger saving…

    • Ahin Thomas

      Hey Paul – Ahin from Backblaze here. It’s an interesting idea, one that other storage players have taken up. They call it “cold” or “warm” storage. The challenge is that when you need that old data, you usually don’t want to wait for it. So we’ve made the choice to offer “alive” storage and the best possible pricing.

      • Anibal Pereira

        Sorry, I’m not a computer geek, just wondering. I have been using hosting services since 2001, and have had my share of experiences dealing with critical crises. Hosts often talk about RAID-1, off-site backups, etc., but many times that is not the case, and you only find out when disks crash.

        When reading about the gigantic disk space you are using to serve clients, and the “all inclusive” strategy, is there a “virtual RAID” service? Instead of having a RAID setup at home with physical disks, the mirror disk would be in the cloud. Would it generate too much traffic, thus making it impossible to manage? Hardware limitations?

        • Ahin Thomas

          Hi Anibal – Depends what you’re trying to achieve… in a sense our computer backup service does what you’re describing: unlimited data backup of your computer into our cloud for $5/month…

          • Anibal Ventura

            Sorry, have to ask.
            When uploading a backup, does it compare file sizes and only update the new ones?
            For example, the first upload is a 35 GB backup of personal files. The second time, I back up with some new files and a few edited ones. Do I have to fully upload the whole 35 GB backup again?

          • Ahin Thomas

            Hey Anibal – once uploaded, any file that is unchanged will not be reuploaded. We wrote a help article describing the policy here https://help.backblaze.com/hc/en-us/articles/217665548-Deduplication

  • xxxMicrobexxx

    Like everyone else said…

  • Dianne A

    Thanks for sharing this info. I continue to be amazed at the great service you provide for the price. Color me a very happy customer who recommends your service to anyone who is even remotely interested.

  • BLE

    Great info – thanks for sharing!

  • Jesse_Bruce_Pinkman

    Now that Amazon has announced that Unlimited Amazon Cloud has become 1 TB storage for 1 year at $60.00, do I really need to state how much better Backblaze B2 is for the exact same price? Amazon’s S3 offering of the same service costs 3X as much.

    • CMS

      Amazon’s and Backblaze’s unlimited services were entirely different products. Backblaze offer unlimited backup but the user must retain a local copy of any backed up data while it is stored on Backblaze. Amazon were offering unlimited cloud storage which meant users could dump any amount of data onto Amazon’s servers – which they did in their droves. There’s a large group over on Reddit going by the name Datahoarders and a growing subset of this group seem to be intent on abusing unlimited cloud storage services. At the time of closure, one user had over 1.5 Petabytes of porn screened from webcam sites dumped on Amazon cloud… 1.5 Petabytes and growing every day. Many of this group had hundreds of TB in storage and were encrypting their data (much of it pirated content) and then uploading it using customised tools. Amazon tried to block these tools from hitting their APIs only for the group to try countless workarounds. The same group are now migrating en masse to Google Drive for Business which – through an unenforced clause – allows them to sign up for a single low-priced G Suite Business account and upload unlimited data to Drive. There is a large element of abuse in what the likes of these folk are doing but they retort with the fact that OneDrive, Amazon, {insert latest company to remove service} shouldn’t have offered unlimited storage if they couldn’t honour it. In a way, they’re correct. Amazon and MS OneDrive before them were silly enough to offer unlimited storage.

      I think it boils down to a difference in business attitudes between the big players like Amazon, MS, Google and co. and the smaller guys like Backblaze. From reading the countless blog postings here, it is ever apparent the thought and effort that goes into each decision Backblaze makes and how keeping costs at a minimum is the goal. Amazon’s unlimited offering was probably dreamt up by the marketing guys only for everyone to come back a year later and realise it was a mistake!

  • bob

    Great writeup, but are you really getting 5 years out of the hard drives? I thought you had migrated disks as the density increased, and I would think that 5-year-old drives were back in the 1 or 2 TB size range.

    • We do indeed assume that we’ll get 5 years out of the drives. We cycle drives out if they get long in the tooth, but we have whole batches that were 7 years and kicking when we decided to remove them.