{"id":110674,"date":"2024-01-05T09:43:26","date_gmt":"2024-01-05T17:43:26","guid":{"rendered":"https:\/\/www.backblaze.com\/blog\/?p=110674"},"modified":"2025-12-11T13:29:02","modified_gmt":"2025-12-11T21:29:02","slug":"new-open-source-tool-for-consistency-in-cassandra-migrations","status":"publish","type":"post","link":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/","title":{"rendered":"New Open Source Tool for Consistency in Cassandra Migrations"},"content":{"rendered":"\r\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"583\" class=\"wp-image-110675\" src=\"https:\/\/www.backblaze.com\/blog\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration-1024x583.png\" alt=\"A decorative image showing the Cassandra logo with a function represented by two servers on either side of the logo. \" srcset=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration-1024x583.png 1024w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration-300x171.png 300w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration-768x437.png 768w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png 1440w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\r\n\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 15px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<p class=\"has-drop-cap\">Sometimes you find a problem because something breaks, and other times you find a problem the good way\u2014by thinking it through <em>before <\/em>things break. 
This is a story about one of those bright, shining, lightbulb moments when you find a problem the good way.<\/p>\r\n\r\n\r\n\r\n<p>On the Backblaze Site Reliability Engineering (SRE) team, we were thinking through an upcoming datacenter migration in <a href=\"https:\/\/cassandra.apache.org\/_\/index.html\" target=\"_blank\" rel=\"noreferrer noopener\">Cassandra<\/a>. We were running through all of the various types of queries we would have to do when we had the proverbial \u201caha\u201d moment. We discovered an inconsistency in the way Cassandra handles lightweight transactions (LWTs).<\/p>\r\n\r\n\r\n\r\n<p>If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process but you couldn\u2019t figure out why or how\u2014this might be why. I\u2019m going to walk through a short intro on Cassandra, how we use it, and the issue we ran into. Then, I\u2019ll explain the workaround, which we open sourced.\u00a0<\/p>\r\n\r\n\r\n\r\n<div class=\"abstract\" style=\"line-height: 1.8; margin: 24px 12px; padding: 24px 12px 10px 12px;\">\r\n<h4>Get the Open Source Code<\/h4>\r\n<p>You can download the open source code from <a href=\"https:\/\/github.com\/Backblaze\/cassandra_lwt_migration_tool\" target=\"_blank\" rel=\"noopener\">our Git repository<\/a>. We\u2019d love to know how you\u2019re using it and how it\u2019s working for you\u2014let us know in the comments.<\/p>\r\n<\/div>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">How We Use Cassandra<\/h2>\r\n\r\n\r\n\r\n<p>First, if you\u2019re not a Cassandra dev, I should mention that when we say \u201cdatacenter migration\u201d it means something slightly different in Cassandra than what it sounds like. It doesn\u2019t mean a data center migration in the physical sense (although you can use datacenter migrations in Cassandra when you\u2019re moving data from one physical data center to another). 
In the simplest terms, it involves moving data between two Cassandra or Cassandra-compatible database replica sets within a cluster.<\/p>\r\n\r\n\r\n\r\n<p>And, if you\u2019re not familiar with Cassandra at all, it\u2019s an open-source, NoSQL, distributed database management system. It was created to handle large amounts of data across many commodity servers, so it fits our use case\u2014lots of data, lots of servers.\u00a0<\/p>\r\n\r\n\r\n\r\n<p>At Backblaze, we use Cassandra to index filename to location for data stored in Backblaze B2, for example. Because it\u2019s customer data and not just analytics, we care more about durability and consistency than some other applications of Cassandra. We run with three replicas in a single datacenter and \u201cbatch\u201d mode to require writes to be committed to disk before acknowledgement rather than the default \u201cperiodic.\u201d<\/p>\r\n\r\n\r\n\r\n<p>Datacenter migrations are an important aspect of running Cassandra, especially on <a href=\"\/blog\/how-to-do-bare-metal-backup-and-recovery\/\" target=\"_blank\" rel=\"noreferrer noopener\">bare metal<\/a>. We do a few datacenter migrations per year either for physical data moves, <a href=\"\/blog\/the-storage-pod-story-innovation-to-commodity\/\" target=\"_blank\" rel=\"noreferrer noopener\">hardware refresh<\/a>, or to change certain cluster layout parameters like tokens per host that are otherwise static.\u00a0<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">What Are LWTs and Why Do They Matter for Datacenter Migrations in Cassandra?<\/h2>\r\n\r\n\r\n\r\n<p>First of all, LWTs are neither lightweight nor transactions, but that\u2019s neither here nor there. They <em>are <\/em>an important feature in Cassandra. Here\u2019s why.\u00a0<\/p>\r\n\r\n\r\n\r\n<p>Cassandra is great at scaling. In something like a replicated SQL cluster, you can add additional replicas for read throughput, but not writes. 
Cassandra scales writes (as well as reads) nearly linearly with the number of hosts\u2014into the hundreds. Adding nodes is a fairly straightforward and \u201cautomagic\u201d process as well, with no need to do something like manual token range splits. It also handles individual down nodes with little to no impact on queries. Unfortunately, these properties come with a trade-off: a complex and often nonintuitive consistency model that engineers and operators need to understand well.<\/p>\r\n\r\n\r\n\r\n<p>In a distributed database like Cassandra, data is replicated across multiple nodes for durability and availability.\u00a0<\/p>\r\n\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"523\" height=\"417\" class=\"wp-image-110679\" src=\"https:\/\/www.backblaze.com\/blog\/wp-content\/uploads\/2024\/01\/Cassandra_1_Writes-e1704472892271.png\" alt=\"\" srcset=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_1_Writes-e1704472892271.png 523w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_1_Writes-e1704472892271-300x239.png 300w\" sizes=\"auto, (max-width: 523px) 100vw, 523px\" \/><\/figure>\r\n<\/div>\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 10px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<p>Although databases generally allow multiple reads and writes to be submitted at once, they make it look to the outside world like all the operations are happening in order, one at a time. This property is known as serializability, and Cassandra is not serializable. Although it does have a \u201clast write wins\u201d system, there\u2019s no transaction isolation and timestamps can be identical.\u00a0<\/p>\r\n\r\n\r\n\r\n<p>It\u2019s possible, for example, to have a row that has some columns from one write and other columns from another write. 
It\u2019s safe if you\u2019re only appending additional rows, but mutating existing rows safely requires careful design. Put another way, you can have two transactions with different data that, to the system, appear to have equal priority.\u00a0<\/p>\r\n\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"721\" height=\"459\" class=\"wp-image-110676\" src=\"https:\/\/www.backblaze.com\/blog\/wp-content\/uploads\/2024\/01\/Cassandra_2_Concurrent-Write-Coordinators-e1704473003203.png\" alt=\"\" srcset=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_2_Concurrent-Write-Coordinators-e1704473003203.png 721w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_2_Concurrent-Write-Coordinators-e1704473003203-300x191.png 300w\" sizes=\"auto, (max-width: 721px) 100vw, 721px\" \/><\/figure>\r\n<\/div>\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 10px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">How Do LWTs Solve This Problem?<\/h2>\r\n\r\n\r\n\r\n<p>As a solution for cases where stronger consistency is needed, Cassandra has a feature called \u201cLightweight Transactions\u201d or LWTs. These are not really identical to traditional database transactions, but provide a sort of \u201ccompare and set\u201d operation that also guarantees pending writes are completed before answering a read. This means if you\u2019re trying to change a row\u2019s value from \u201cA\u201d to \u201cB\u201d, a simultaneous attempt to change that row from \u201cA\u201d to \u201cC\u201d will return a failure. 
This is accomplished by doing a full\u2014not at all lightweight\u2014Paxos round complete with multiple round trips and slow, expensive retries in the event of a conflict.<br \/><br \/>In Cassandra, the minimum consistency level for read and write operations is <strong>ONE<\/strong>, meaning that only a single replica needs to acknowledge the operation for it to be considered successful. This is fast, but in a situation where you have one down host, it could mean data loss, and later reads may or may not show the newest write depending on which replicas are involved and whether they\u2019ve received the previous write. For better durability and consistency, Cassandra also provides various quorum levels that require a response from multiple replicas, as well as an <strong>ALL<\/strong> consistency that requires responses from every replica.<\/p>\r\n\r\n\r\n\r\n<div class=\"abstract\" style=\"line-height: 1.8; margin: 24px 12px; padding: 24px 12px 10px 12px;\">\r\n<h4>Cassandra Is My Type of Database<\/h4>\r\n<p>Curious to know more about consistency limitations and LWTs in Cassandra? Christopher Batey\u2019s <a href=\"https:\/\/www.youtube.com\/watch?v=wcxQM3ZN20c\" target=\"_blank\" rel=\"noopener\">presentation at the 2016 Cassandra Summit<\/a> does a good job of explaining the details.<\/p>\r\n<\/div>\r\n\r\n\r\n\r\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\">\r\n<div class=\"wp-block-embed__wrapper\">https:\/\/www.youtube.com\/watch?v=wcxQM3ZN20c<\/div>\r\n<\/figure>\r\n\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 10px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">The Problem We Found With LWTs During Datacenter Migrations<\/h2>\r\n\r\n\r\n\r\n<p>Usually we use one datacenter in Cassandra, but there are circumstances where we sometimes stand up a second datacenter in the cluster and migrate to it, then tear down the original. 
We typically do this to change <code>num_tokens<\/code>, to move data when we\u2019re refreshing hardware, or to physically move to another nearby data center.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">The TL;DR<\/h3>\r\n\r\n\r\n\r\n<p>We reasoned through the interaction between LWTs\/<code>serial<\/code> and datacenter migrations and found a hole\u2014there\u2019s no guarantee of LWT correctness during a topology change (that is, a change to the number of replicas) large enough to change the number of replicas needed to satisfy quorum. It turns out that combining LWTs and datacenter migrations can violate consistency guarantees in subtle ways without some specific steps and tools to work around it.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">The Long Version<\/h3>\r\n\r\n\r\n\r\n<p>Let\u2019s say you are standing up a new datacenter, and you need to copy an existing datacenter to it. So, you have two datacenters\u2014datacenter A, the existing datacenter, and datacenter B, the new datacenter. For simplicity\u2019s sake, let\u2019s say datacenter A has three replicas you need to copy, and you\u2019re using quorum writes to ensure consistency.<\/p>\r\n\r\n\r\n\r\n<div class=\"abstract\" style=\"line-height: 1.8; margin: 24px 12px; padding: 24px 12px 10px 12px;\">\r\n<h4>Refresher: What is Quorum-Based Consistency in Cassandra?<\/h4>\r\n<p>Quorum consistency in Cassandra is based on the concept that a specific number of replicas must participate in a read or write operation to ensure consistency and availability\u2014a majority (n\/2 + 1, rounded down) of the replicas must respond before the operation is considered successful. 
This ensures that the data is durably stored and available even if a minority of replicas are unavailable.<\/p>\r\n<\/div>\r\n\r\n\r\n\r\n<p>You have different types of quorum you can choose from, and here\u2019s how those defaults make a decision:\u00a0<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li><strong>Local quorum:<\/strong> Two out of the three replicas in the datacenter I\u2019m talking to must respond in order to return success. I don\u2019t care about the other datacenter.<\/li>\r\n\r\n\r\n\r\n<li><strong>Global quorum:<\/strong> Four out of the six total replicas must respond in order to return success, and it doesn\u2019t matter which datacenter they come from.<\/li>\r\n\r\n\r\n\r\n<li><strong>Each quorum:<\/strong> Two out of the three replicas in each datacenter must respond in order to return success.<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<p>Most of these quorum types also have a <code>serial<\/code> equivalent for LWTs.<\/p>\r\n\r\n\r\n\n<table id=\"tablepress-58\" class=\"tablepress tablepress-id-58\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\"><strong>Type of Quorum<\/strong><\/th><th class=\"column-2\"><strong>Serial<\/strong><\/th><th class=\"column-3\"><strong>Regular<\/strong><\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">Local<\/td><td class=\"column-2\"><code>LOCAL_SERIAL<\/code><\/td><td class=\"column-3\"><code>LOCAL_QUORUM<\/code><\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">Each<\/td><td class=\"column-2\">unsupported<\/td><td class=\"column-3\"><code>EACH_QUORUM<\/code><\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">Global<\/td><td class=\"column-2\"><code>SERIAL<\/code><\/td><td class=\"column-3\"><code>QUORUM<\/code><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-58 from cache -->\n\r\n\r\n\r\n<p>The problem you might run into, however, is that LWTs do not have an <code>each_serial<\/code> mode. 
They only have <code>local<\/code> and <code>global<\/code>. There\u2019s no way to tell the LWT you want quorum in each datacenter.\u00a0<\/p>\r\n\r\n\r\n\r\n<p><code>local_serial<\/code> is good for performance, but transactions on different datacenters could overlap and be inconsistent. <code>serial<\/code> is more expensive, but normally guarantees correctness as long as all queries agree on cluster size. But what if a query straddles a topology change that changes quorum size?\u00a0<\/p>\r\n\r\n\r\n\r\n<p>Let\u2019s use global quorum to show how this plays out. If a LWT starts when RF=3, at least two hosts must process it.\u00a0<\/p>\r\n\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"358\" height=\"440\" class=\"wp-image-110677\" src=\"https:\/\/www.backblaze.com\/blog\/wp-content\/uploads\/2024\/01\/Cassandra_3_Query-1-e1704475479686.png\" alt=\"\" srcset=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_3_Query-1-e1704475479686.png 358w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_3_Query-1-e1704475479686-244x300.png 244w\" sizes=\"auto, (max-width: 358px) 100vw, 358px\" \/><\/figure>\r\n<\/div>\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 10px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<p>While it\u2019s running, the topology changes to two datacenters (A and B) each with RF=3 (so six replicas total) with a quorum of four. There\u2019s a chance that a query affecting the same partition could then run without overlapping nodes, which means consistency guarantees are not maintained for those queries. 
For that query, quorum is four out of six where those four could be the three replicas in datacenter B and the remaining replica in datacenter A.\u00a0<\/p>\r\n\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"558\" height=\"489\" class=\"wp-image-110678\" src=\"https:\/\/www.backblaze.com\/blog\/wp-content\/uploads\/2024\/01\/Cassandra_4_Query-2-e1704475537774.png\" alt=\"\" srcset=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_4_Query-2-e1704475537774.png 558w, https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/Cassandra_4_Query-2-e1704475537774-300x263.png 300w\" sizes=\"auto, (max-width: 558px) 100vw, 558px\" \/><\/figure>\r\n<\/div>\r\n\r\n\r\n<div class=\"wp-block-spacer\" style=\"height: 10px;\" aria-hidden=\"true\">\u00a0<\/div>\r\n\r\n\r\n\r\n<p>Those two queries are on the same partition, but they\u2019re not overlapping any hosts, so they don\u2019t know about each other. It violates the LWT guarantees.<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">The Solution to LWT Inconsistency<\/h2>\r\n\r\n\r\n\r\n<p>What we needed was a way to make sure that the definition of \u201cquorum\u201d didn\u2019t change too much in the middle of a LWT running. Some change is okay, as long as old and new are guaranteed to overlap.<\/p>\r\n\r\n\r\n\r\n<p>To account for this, you need to change the replication factor one level at a time and make sure there are no transactions still running that started before the previous topology change before you make the next. Three replicas with a quorum of two can only change to four replicas with a quorum of three. That way, at least one replica must overlap. The same thing happens when you go from four to five replicas or five to six replicas. 
This also applies when reducing the replication factor, such as when tearing down the old datacenter after everything has moved to the new one.<\/p>\r\n\r\n\r\n\r\n<p>Then, you just need to make sure no LWT overlaps multiple changes. You <em>could<\/em> just wait long enough that they\u2019ve timed out, but it\u2019s better to be sure. This requires querying the internal-only <code>system.paxos<\/code> table on each host in the cluster between topology changes.<\/p>\r\n\r\n\r\n\r\n<p>We built a tool that checks to see whether there are still transactions running from before we made a topology change. It reads <code>system.paxos<\/code> on each host, ignoring any rows with <code>proposal_ballot=null<\/code>, and records them. Then after a short delay, it re-reads <code>system.paxos<\/code>, ignoring any rows that weren\u2019t present in the previous run, or any with <code>proposal_ballot=null<\/code> in either read, or any where <code>in_progress_ballot<\/code> has changed. Any remaining rows are potentially active transactions.\u00a0<\/p>\r\n\r\n\r\n\r\n<p>This worked well the first few times that we used it, on 3.11.6. To our surprise, when we tried to migrate a cluster running 3.11.10 the tool reported hundreds of thousands of long-running LWTs. After a lot of digging, we found a small (but fortunately well-commented) performance optimization added as part of a correctness fix (CASSANDRA-12126), which means <code>proposal_ballot<\/code> does not get set to null if the proposal is empty\/noop. To work around this, we had to actually parse the proposal field. Fortunately all we need is the <code>is_empty<\/code> flag in the third field, so no need to reimplement the full parsing code. 
A big impact to us for a seemingly small and innocuous change piggy-backed onto a correctness fix, but that\u2019s the risk of directly reading internal-only tables.\u00a0<\/p>\r\n\r\n\r\n\r\n<p>We\u2019ve used the tool several times now for migrations with good results, but it\u2019s still relatively basic and limited. It requires running repeatedly until all transactions are complete, and sometimes manual intervention to deal with incomplete transactions. In some cases we\u2019ve been able to force-commit a long-pending LWT by doing a <code>SERIAL<\/code> read of the partition affected, but in a couple of cases we actually ended up running across LWTs that still didn\u2019t seem to complete. Fortunately in every case so far it was in a temporary table and a little work allowed us to confirm that we no longer needed the partition at all and could just delete it.<\/p>\r\n\r\n\r\n\r\n<p>Most people who use Cassandra may never run across this problem, and most of those who do will likely never track down what caused the small mystery inconsistency around the time they did a datacenter migration. If you rely on LWTs and are doing a datacenter migration, we definitely recommend going through the extra steps to guarantee consistency until and unless Cassandra implements an <code>EACH_SERIAL<\/code> consistency level.<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">Using the Tool<\/h2>\r\n\r\n\r\n\r\n<p>If you want to use the tool for yourself to help maintain consistency through datacenter migrations, you can find it <a href=\"https:\/\/github.com\/Backblaze\/cassandra_lwt_migration_tool\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. 
Drop a note in the comments to let us know how it\u2019s working for you and if you think of any other ways around this problem\u2014we\u2019re all ears!<\/p>\r\n\r\n\r\n\r\n<div class=\"abstract\" style=\"line-height: 1.8; margin: 24px 12px; padding: 24px 12px 10px 12px;\">\r\n<h4>If You\u2019ve Made It This Far<\/h4>\r\n<p>You might be interested in signing up for our <a href=\"https:\/\/info.backblaze.com\/tech-community-sign-up\" target=\"_blank\" rel=\"noopener\">Developer Newsletter<\/a> where our resident Chief Technical Evangelist, Pat Patterson, shares the latest and greatest ways you can use B2 Cloud Storage in your applications.<\/p>\r\n<\/div>\r\n","protected":false},"excerpt":{"rendered":"<p>When considering Cassandra datacenter migrations with lightweight transactions, the Backblaze team discovered an inconsistency in execution. Read about the new, open source tool we&#8217;ve developed and released to solve it. <\/p>\n","protected":false},"author":101,"featured_media":110675,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[7,434,483],"tags":[468],"class_list":["post-110674","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cloud-storage","category-featured-1","category-tech-lab","tag-b2cloud","entry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Open Source Tool for Consistency in Cassandra Migrations<\/title>\n<meta name=\"description\" content=\"If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process this might be why.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Open Source Tool for Consistency in Cassandra Migrations\" \/>\n<meta property=\"og:description\" content=\"If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process this might be why.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/\" \/>\n<meta property=\"og:site_name\" content=\"Backblaze Blog | Cloud Storage &amp; Cloud Backup\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/backblaze\" \/>\n<meta property=\"article:published_time\" content=\"2024-01-05T17:43:26+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-12-11T21:29:02+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1440\" \/>\n\t<meta property=\"og:image:height\" content=\"820\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Elliott Sims\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@backblaze\" \/>\n<meta name=\"twitter:site\" content=\"@backblaze\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Elliott Sims\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Open Source Tool for Consistency in Cassandra Migrations","description":"If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process this might be why.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/","og_locale":"en_US","og_type":"article","og_title":"Open Source Tool for Consistency in Cassandra Migrations","og_description":"If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process this might be why.","og_url":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/","og_site_name":"Backblaze Blog | Cloud Storage &amp; Cloud Backup","article_publisher":"https:\/\/www.facebook.com\/backblaze","article_published_time":"2024-01-05T17:43:26+00:00","article_modified_time":"2025-12-11T21:29:02+00:00","og_image":[{"width":1440,"height":820,"url":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","type":"image\/png"}],"author":"Elliott Sims","twitter_card":"summary_large_image","twitter_creator":"@backblaze","twitter_site":"@backblaze","twitter_misc":{"Written by":"Elliott Sims","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#article","isPartOf":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/"},"author":{"name":"Elliott Sims","@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#\/schema\/person\/e0dec9b544aa70f19772d02bc7d3eb6c"},"headline":"New Open Source Tool for Consistency in Cassandra Migrations","datePublished":"2024-01-05T17:43:26+00:00","dateModified":"2025-12-11T21:29:02+00:00","mainEntityOfPage":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/"},"wordCount":2228,"commentCount":0,"publisher":{"@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#primaryimage"},"thumbnailUrl":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","keywords":["B2Cloud"],"articleSection":["Cloud Storage","Featured","Tech Lab"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/","url":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/","name":"Open Source Tool for Consistency in Cassandra 
Migrations","isPartOf":{"@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#primaryimage"},"image":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#primaryimage"},"thumbnailUrl":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","datePublished":"2024-01-05T17:43:26+00:00","dateModified":"2025-12-11T21:29:02+00:00","description":"If you\u2019ve ever tried to do a datacenter migration in Cassandra and something got corrupted in the process this might be why.","breadcrumb":{"@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#primaryimage","url":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","contentUrl":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","width":1440,"height":820,"caption":"A decorative image showing the Cassandra logo with a function represented by two servers on either side of the logo."},{"@type":"BreadcrumbList","@id":"https:\/\/www.backblaze.com\/blog\/new-open-source-tool-for-consistency-in-cassandra-migrations\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/"},{"@type":"ListItem","position":2,"name":"New Open Source Tool for Consistency in 
Cassandra Migrations"}]},{"@type":"WebSite","@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#website","url":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/","name":"Backblaze Cloud Solutions Blog","description":"Cloud Storage &amp; Cloud Backup","publisher":{"@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#organization","name":"Backblaze","url":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/i0.wp.com\/www.backblaze.com\/blog\/wp-content\/uploads\/2017\/12\/backblaze_icon_transparent.png?fit=512%2C512&ssl=1","contentUrl":"https:\/\/i0.wp.com\/www.backblaze.com\/blog\/wp-content\/uploads\/2017\/12\/backblaze_icon_transparent.png?fit=512%2C512&ssl=1","width":512,"height":512,"caption":"Backblaze"},"image":{"@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/backblaze","https:\/\/x.com\/backblaze","https:\/\/www.youtube.com\/user\/Backblaze","https:\/\/en.wikipedia.org\/wiki\/Backblaze"]},{"@type":"Person","@id":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/#\/schema\/person\/e0dec9b544aa70f19772d02bc7d3eb6c","name":"Elliott 
Sims","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2019\/04\/elliott_sims-150x150.png","url":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2019\/04\/elliott_sims-150x150.png","contentUrl":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2019\/04\/elliott_sims-150x150.png","caption":"Elliott Sims"},"description":"Elliott is a Senior Sysadmin at Backblaze. Previously, he worked at Facebook from 2009-2015, building and maintaining infrastructure through exponential growth.","sameAs":["http:\/\/www.backblaze.com"],"url":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/author\/elliott\/"}]}},"jetpack_featured_media_url":"https:\/\/backblazeprod.wpenginepowered.com\/wp-content\/uploads\/2024\/01\/bb-bh-Cassandra-Consistency-LWTs-and-Migration.png","_links":{"self":[{"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/posts\/110674","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/users\/101"}],"replies":[{"embeddable":true,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/comments?post=110674"}],"version-history":[{"count":0,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/posts\/110674\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/media\/110675"}],"wp:attachment":[{"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/media?parent=110674"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/categories?post=11067
4"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/backblazeprod.wpenginepowered.com\/blog\/wp-json\/wp\/v2\/tags?post=110674"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}