Moving 670 Network Connections

An illustration of server racks and networking cables.

Editor’s Note

We’re constantly upgrading our storage cloud, but we don’t always have ways to tangibly show what multi-exabyte infrastructure looks like. When data center manager Jack Fults shared photos from a recent network switch migration, though, it felt like exactly the kind of thing that makes The Cloud™ real in a physical, visual sense. We figured it was a good opportunity to dig into some of our more recent upgrades.

If your parents ever tried to enforce restrictions on internet time, and in response, you hardwired a secret 120ft Ethernet cable from the router in your basement through the rafters and up into your room so you could game whenever you wanted, this story is for you. 

Replacing 670 network switches in a data center is kind of like that, times 1,000.  And that’s exactly what we did in our Sacramento data center recently. 

Hi, I’m Jack

I’m a data center manager here at Backblaze, and I’m in charge of making sure our hardware can meet our production needs, interfacing with the data center ownership, and generally keeping the building running, all in service of delivering easy cloud storage and backup services to our customers. I lead an intrepid team of data center technicians who, along with our entire Cloud Operations team, deserve a ton of kudos for making this project happen.

An image of a data center manager with decommissioned cables.
Here I am taking a swim in a bunch of decommissioned cable from an older migration of cat 5e out of our racks. Do not be alarmed by the Spaghetti Monster—these cables aren’t connected to anything, and they promptly made their way to a recycling facility.

Why Did We Need to Move 670 Network Connections?

We’re constantly looking for ways to make our infrastructure better, faster, and smarter, and in that effort, we wanted to upgrade to new network switches. The new switches would allow us to consolidate connections and mitigate potential future failures. We have plenty of redundancy and protocols in place in the event a failure does happen, but it was a risk we knew we’d be wise to get ahead of as we continued to grow our data under management.

An image of network cables in a data center rack.
Example of the old cabling connected to the Dell switches. Pretty much everything in this cabinet has been replaced, except for the aggregate switch providing uplinks to our access switches.

Switch Migration Challenges

In order to make the move, we faced a few challenges:

  • Minimizing network loss: How do we rip out all those switches without our Vaults being down for hours and hours?
  • Space for new cabling: In order to minimize network loss, we needed the new cabling in place and connected to the new switches before a cutover, but our original network cabinets were on the smaller side and full of existing cabling.
  • Space for new switches: We wanted to reuse the same rack units for the new Arista switches, so we had to figure out a method that allowed us to slide the old switches straight forward, out of the cabinet, and slide the new switches straight in.
  • Time: Every day we didn’t have the new switches in place was a day we risked a lock-up that would take time away from our ability to roll out standard deployments and prepare for production demands.

Here’s How We Did It

Racking new switches in cabinets that are already fully populated isn’t ideal, but it is totally doable with a little planning (okay, a lot of planning). It’s a good thing I love nothing more than a good Google Sheet, and believe me, we tracked everything down to the length of the cables (3,272ft to be exact, but more on that later). Here’s a breakdown of our process:

  1. Put up a temporary transfer switch in the cabinet and move the connections there. Ports didn’t matter, since it was just temporary, so that sped things up a bit.
  2. Decommission the old switch, pulling the power cabling and unbolting it from the rack.
  3. Ratchet our cables up using a makeshift pulley system in order to pull the switches straight out from the rack and set them aside.
An image of cables connected to network switches in a data center.
Carefully hoisting up the cabling with our makeshift Velcro pulley systems to allow the old switches to come out, and the new ones go in. Although this might look a little jury-rigged, it greatly helped us support the weight of the production management cables and hold them out of the way.
  4. Rack the new Arista switch and connect it to our aggregate switch, which breaks out connections to all of the access switches.
  5. Configure the new switch – many thanks go to our Network Engineering team for their work on this part.
  6. Finally, move the connections from the temporary switch to the new Arista switch.
An image of network switches in a data center rack.
One of the first 2U switches to start receiving new cabling.

Each 1U Dell had 48 connections, which handled two Backblaze Vaults. We were able to upgrade to 2U switches with the new Aristas, which each had 96 connections, fitting four Backblaze Vaults plus 16 core servers. So, every time we moved to the next four vaults, we’d go through this process until we were through the network switch migration for 27 Vaults plus core servers, comprising the 670 network connections.
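As a quick sanity check on those numbers, here’s a back-of-the-envelope sketch in Python. It assumes the standard 20-server Backblaze Vault; the split of the remaining connections is my own arithmetic rather than an official tally.

```python
# Port math for the switch migration described above.
# Assumption: each Backblaze Vault has 20 servers, so it needs 20 connections.
SERVERS_PER_VAULT = 20

# Old 1U Dell: 48 ports handled two Vaults (2 * 20 = 40, with spare ports).
old_ports, old_vaults = 48, 2
assert old_vaults * SERVERS_PER_VAULT <= old_ports

# New 2U Arista: 96 ports fit four Vaults plus 16 core servers.
new_ports, new_vaults, core_servers = 96, 4, 16
assert new_vaults * SERVERS_PER_VAULT + core_servers == new_ports  # 80 + 16 = 96

# Full migration: 27 Vaults plus core servers, 670 connections total.
total_connections = 670
vault_connections = 27 * SERVERS_PER_VAULT  # 540
remaining = total_connections - vault_connections  # core/management links
print(vault_connections, remaining)  # 540 130
```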

An image of a data center technician plugging network connections into servers.
Justin Whisenant, our senior DC tech, realizing that this is the last cable to cutover before all connections have been swapped.

Using the transfer switch allowed us to decommission the old switch, then rack and configure the new switch, so that we only lost a second or two of network connectivity as one of the DC techs moved the connection. That was one of the things we had to plan very carefully: making sure the Vault would remain available, with the exception of one server that would be down for a split second during the swap. Then, our DC techs would confirm that connectivity was back up before moving on to the next server in the Vault.
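The one-server-at-a-time cutover loop can be sketched roughly like this. This is a hypothetical illustration, not our actual tooling: the `ping`-based check and all function names here are my own assumptions.

```python
import subprocess
import time

def is_reachable(host: str) -> bool:
    """One ICMP echo; True if the host answers within 2 seconds (Linux ping)."""
    return subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0

def cutover_vault(servers, move_cable, is_up=is_reachable, retry_delay=1.0):
    """Swap one connection at a time, confirming each server is back up
    before touching the next, so the Vault as a whole stays available."""
    for server in servers:
        move_cable(server)        # the second or two of downtime for this server
        while not is_up(server):  # wait for connectivity to return
            time.sleep(retry_delay)
```

The key design point is the inner wait loop: no cable moves until the previous server is confirmed reachable again, which is what keeps the Vault available throughout the migration.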

Oh, And We Also Ran New Cables

We ran into a wrinkle early on in the project. We had two cabinets side by side where the switches are located, so sometimes we’d rack the temporary switch in one and the new Arista switch in the other. Some of the old cables weren’t long enough to reach the new switches. There’s not much else you can do at that point but run new cables, so we decided to replace all of the cables wholesale—3,272ft of new cable went in. 

We had to fine-tune our plans even more to balance decommissioning with racking the new switches in order to make room for the new cables, but it also ended up solving another issue we hadn’t even set out to address. It allowed us to eliminate a lot of slack from cables that were too long. Over time, with the number of cables we had, the slack made it difficult to work in the racks, so we were happy to see it go away.

An image of cable dressing in a data center.
There’s still decom and cable dressing to do, but it looks so much better.

While we still have some cable management and decommissioning left to do, migrating to the Arista switches was the mission-critical piece to mitigate our risk and plan for ongoing improvements.

As a data center manager, I get to work on the side of tech that takes the abstract internet and makes it tangible, and that’s pretty cool. It can be hard for people to visualize The Cloud, but it’s made up of cables and racks and network switches just like these. Even though my mom loves to bring up that secret Ethernet cable story at family events, I think she’s pretty happy that it led that mischievous kid to a project like this.

One Project Among Many

While not every project has great pictures to go along with it, we’re always upgrading our systems for performance, security, and reliability. Some other projects we’ve completed in the last few months include reconfiguring much of our space to make it more efficient and ready for enterprise-level hardware, moving our physical media operations, and decommissioning 4TB Vaults as we migrate them to larger Vaults with larger drives. Stay tuned for a longer post about that from our very own Andy Klein.


About Jack Fults

Jack is a Data Center Manager at Backblaze where he works with a team of data center techs to maintain our equipment. With over six years of experience working in our data centers, he strives to develop and execute plans to reconfigure space efficiently to best serve our customers. Jack is an Eagle Scout who has survived three of the Boy Scouts of America's high adventure camps—Philmont, Sea Base and Northern Tier—which earned him a Triple Crown award. He also loves cars... Can't forget the cars.