On Tuesday, August 27th, we announced the opening of our first European data center. This post is part of our three-part series on selecting a new data center.
There’s an old saying, “How do you eat an elephant? One bite at a time.” The best way to tackle big problems is to simplify as much as you can.
In our case, with almost an exabyte of customer data under management and customers in over 160 countries, expanding the geographic footprint of our data centers (DCs) has been a frequently discussed topic. Prior to opening up EU Central, we had three DCs, but all in the western U.S. The topic of opening a DC in Europe is not a new one within Backblaze, but going from idea to storing customer data can be a long journey.
As our team gathered to prioritize the global roadmap, the first question was an obvious one: Why do we want to open a DC in Europe? The answer was simple: Customer demand.
While nearly 15 percent of our existing customer base already resides in Europe, the requests for an EU DC come from citizens around the globe. Why?
- Customers like keeping data in multiple geographies. Doing so is in line with the best practices of backup (long before there was a cloud, there was still 3-2-1).
- Geopolitical/regulatory concerns. For any number of reasons, customers may prefer or be required to store data in certain physical locations.
- Performance concerns. While we enjoy a debate about the effects of latency for most storage use cases, the reality is many customers want a copy of their data as physically close to where it’s being used as possible.
With the need established, the next question was predictably obvious: How are we going to go about this? Our three existing DCs are all in the same time zone as our California headquarters. Logistically, opening and operating a DC that has somewhere around an eight-hour time difference from our headquarters felt like a significant undertaking.
Organizing the Search for the Right Data Center
To help get us organized, our co-founder and Chief Cloud Officer, Tim Nufire, drew the following on a whiteboard.
This basic matrix frames the challenge well. If one were willing to accept infinite risk (have customers write to scrolls and “upload” via sealed bottle transported across the ocean), we’d have low financial and effort outlays to open the data center. However, we’re not in the business of accepting infinite risk. So we wanted to achieve a low-risk environment for data storage while sustaining our cost advantage for our customers.
But things get much more nuanced once you start digging in.
There are multiple risk factors to consider when selecting a DC. Some of the leading ones are:
- Environmental: One could choose a DC in the middle of a floodplain, but, with few exceptions, most DCs don’t work well underwater. We needed to find an area to minimize adverse environmental impact.
- Political: DCs are physical places, and physical places are governed by some form of nation state. Some customers want (or need) their data to be stored within certain regulatory or diplomatic parameters. In the case of the requests for opening a DC in Europe, many of our customers want their data to be inside of the European Union (EU): they want EU, not Europe. That requirement strikes Switzerland off our list. For similar reasons, another requirement we imposed was operating inside of a country that is a NATO member. Regrettably, that eliminated any location inside of Finland.
- Financial: By opening a DC in Europe, we will be conducting business with a partner that expects to be paid in euros. As an American company, we primarily operate in dollars. So now the simple timing of when we pay our bills may change the cost (depending on exchange rate fluctuations).
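To make the currency point concrete, here is a minimal sketch of how payment timing can change the dollar cost of the same euro invoice. The invoice amount and both exchange rates are hypothetical values chosen for illustration, not figures from our contracts.

```python
def invoice_cost_usd(amount_eur: float, usd_per_eur: float) -> float:
    """Dollar cost of a euro-denominated invoice at a given exchange rate."""
    return amount_eur * usd_per_eur


# Hypothetical numbers, for illustration only.
invoice_eur = 100_000.0
rate_early = 1.10  # assumed USD per EUR if the bill is paid early
rate_late = 1.14   # assumed USD per EUR if the bill is paid later

cost_early = invoice_cost_usd(invoice_eur, rate_early)
cost_late = invoice_cost_usd(invoice_eur, rate_late)

# Same invoice, same vendor; only the payment date (and so the rate) differs.
fx_difference = cost_late - cost_early
print(f"Paid early: ${cost_early:,.2f}")
print(f"Paid late:  ${cost_late:,.2f}")
print(f"Difference: ${fx_difference:,.2f}")
```

Nothing about the facility or the contract changes between the two payments; the difference comes entirely from the exchange rate on the day the bill is paid.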
The other dimension on the board was costs, expressed as Affordable to Expensive. Costs can be thought of in terms of both money and effort:
- Operating Efficiency: Generally speaking, the climate of the geography will have an effect on the heating/cooling costs. We needed to understand climate nuances across a broad geographic area.
- Cost of Inputs: Power costs vary widely, often due to fuel sources having different availability at a local level. For example, nuclear power is generally cheaper than fossil fuel, but may not be available in a given region. Complicating things is that power source X may cost one thing in the first country, but something totally different in the next. Our DC negotiations may be for physical space, but we needed to understand our total cost of ownership.
- Staffing: Some DCs provide remote hands (contract labor) while others expect us to provide our own staffing. We needed to get up to speed on labor laws and talent pools in desired regions.
Trying to Push Forward
We’re fortunate to have a great team of Operations people that have earned expertise in the field. So with the desire to find a DC in the EU, a working group formed to explore our options. A little while later, when the internal memo circulated, the summary in the body of the email jumped out:
“It could take 6-12 months from project kick-off to bring a new EU data center online.”
That’s a significant project for any company. In addition, the time range was sufficiently wide to indicate the number of unknowns in play. We were faced with a difficult decision: How can we move forward on a project with so many unknowns?
While this wouldn’t be our first data center search, prior experience told us we had many more unknowns in front of us. Our most recent facility searches mainly involved coordinating with known vendors to obtain facility reports and pricing for comparison. Even with known vendors, this process involved significant resources from Backblaze to relay requirements to various DC sales reps and to take disparate quotes and create some sort of comparison. All DCs will quote you a price in dollars per kilowatt-hour ($/kWh), but there is no standard definition of what is and isn’t included in that. Generally speaking, a DC contract has unit costs that decline as usage goes up. So is the $/kWh in a given quote the blended lifetime cost? Year one? Year five? Adding to this complexity would be all the variables discussed above (and more).
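A small sketch shows why the "which $/kWh?" question matters. It assumes a five-year contract whose unit rate declines as usage grows; the yearly usage and rates are made-up values, and `blended_rate` is a hypothetical helper name, not a term from any vendor's quote.

```python
def blended_rate(yearly_kwh: list[float], yearly_rate: list[float]) -> float:
    """Usage-weighted average $/kWh across the whole contract term."""
    total_cost = sum(kwh * rate for kwh, rate in zip(yearly_kwh, yearly_rate))
    return total_cost / sum(yearly_kwh)


# Hypothetical five-year contract: usage ramps up, unit price steps down.
yearly_kwh = [1_000_000, 2_000_000, 3_000_000, 4_000_000, 5_000_000]
yearly_rate = [0.20, 0.18, 0.16, 0.14, 0.12]  # assumed $/kWh for each year

year_one = yearly_rate[0]
year_five = yearly_rate[-1]
lifetime = blended_rate(yearly_kwh, yearly_rate)

# Three defensible answers to "what is the $/kWh?" for the same contract.
print(f"Year one:         ${year_one:.4f}/kWh")
print(f"Year five:        ${year_five:.4f}/kWh")
print(f"Blended lifetime: ${lifetime:.4f}/kWh")
```

Under these assumptions, one contract yields three different headline numbers, so two quotes can only be compared once both are expressed on the same basis.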
Serendipity Strikes: UpStack
Despite the obstacles in our path, our team committed to finding a location inside the EU that makes sense for both our customers’ needs and our business model. We have an experienced team that has demonstrated the ability to source and vet DCs already. That said, our experienced team was already quite busy with their day jobs. This project looked to come at a significant opportunity cost as it would fully occupy a number of people for an extended period of time.
At the same time as we were trying to work through the internal resource planning, our CEO happened across an interesting article from our friends at Data Center Knowledge; they were covering a startup called UpStack (“Kayak for data center services”). The premise was intriguing — the UpStack platform is designed to gather and normalize quotes from qualified vendors for relevant opportunities. Minimizing friction for bidding DCs and Backblaze would enable both sides to find the right fit. Intrigued, we reached out to their CEO, Chris Trapp.
We were immediately impressed with how easy the user experience was on our side. Knowing how much effort goes into normalizing the data from various DCs, having a DC shopping experience comparable to that of searching for plane tickets was mind blowing. With a plane ticket, you might search for number of stops and layover airports. With UpStack, we were able to search for connectivity to existing bandwidth providers, compliance certifications, and location before asking for pricing.
Once vendors returned pricing, UpStack’s application made it easy to compare specifications and pricing on an apples-to-apples basis. This price normalization was a huge advantage for us as it saved many hours of work usually spent converting quotes into pricing models simply for comparison’s sake. We have the expertise to do what UpStack does, but we also know how much time that takes us. Being able to leverage a trusted partner was a tremendous value add for Backblaze.
Narrowing Down the Options
With the benefit of the UpStack platform, we were able to cast a much wider net than would have been viable hopping on phone calls from California.
We specified our load ramp. There’s a finite amount of data that will flow into the new DC on day one, and it only grows from there. So part of the pricing negotiation is agreeing to deploy a minimum number of racks on day one, a minimum by the end of year one, and so on. In return for the guaranteed revenue, the DCs return pricing based on those deployments. Based on the forecasted storage needs, UpStack’s tool then translates that into estimated power needs so vendors can return bids based on estimated usage. This is an important change from how things are usually done; many quotes otherwise price based on the top estimated usage or a vendor-imposed minimum. By basing quotes on one common forecast, we could get the pricing that fits our needs.
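The difference between pricing on a load ramp and pricing on peak usage can be sketched with a few lines of arithmetic. This is not UpStack’s method, which we don’t describe here; the rack counts, the per-rack power draw, and the flat rate are all assumptions for illustration. As a simplification, each year is billed at its committed rack count for the full year.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760
KW_PER_RACK = 5.0     # assumed average power draw per rack
RATE_PER_KWH = 0.15   # assumed flat $/kWh, kept constant for simplicity


@dataclass
class RampStep:
    year: int
    racks: int  # hypothetical committed minimum racks for that year


def estimated_kwh(racks: int, kw_per_rack: float = KW_PER_RACK) -> float:
    """Translate a rack count into estimated yearly energy use."""
    return racks * kw_per_rack * HOURS_PER_YEAR


# Hypothetical three-year load ramp.
ramp = [RampStep(1, 10), RampStep(2, 25), RampStep(3, 40)]

# Quote based on the forecast: each year is priced at that year's usage.
ramp_cost = sum(estimated_kwh(step.racks) * RATE_PER_KWH for step in ramp)

# Quote based on top estimated usage: every year is priced at the peak.
peak_racks = max(step.racks for step in ramp)
peak_cost = len(ramp) * estimated_kwh(peak_racks) * RATE_PER_KWH

print(f"Priced on the ramp: ${ramp_cost:,.0f}")
print(f"Priced on the peak: ${peak_cost:,.0f}")
```

With these assumed inputs, the peak-based quote charges for capacity that sits unused in the early years, which is why a shared forecast gives bids that sit closer to actual need.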
There are many more efficiencies that UpStack provides us and we’d encourage you to visit their site at https://upstack.com to learn more. The punchline is that we were able to create a shortlist of the DCs that fit our requirements; we received 40 quotes provided by 40 data centers in 10 markets for evaluation. This was a blessing and a curse, as we were able to cast a wider net and learn about more qualified vendors than we thought possible, but a list of 40 needed to be narrowed down.
Based on our cost/risk framework, we narrowed it down to the 10 DCs that we felt gave us our best shot to end up with a low-cost, low-risk partner. With all the legwork done, it was time to go visit. To learn more about our three-country trip to 10 facilities that lasted less than 72 hours, tune in tomorrow. Same bat time, same bat station.