Segmed

Scaling a Business on Medical Image AI Training Datasets and High Efficiency Object Storage


We processed 780 million CT files, 1.4 million studies, on five VMs in four days. Backblaze gave us the throughput to make that happen. The performance surprised even my own team.

Donnie Owen, Chief Product Officer, Segmed

750TB

Migrated and Restructured Easily

1.4M

Studies Processed in 4 Days

75%

Cost Savings

Situation

Segmed operates a proprietary real-world data platform that licenses de-identified medical imaging data to researchers and life sciences organizations. The company manages petabytes of DICOM files sourced through live integrations with health systems and outpatient diagnostic centers. As Segmed scaled, AWS S3 egress fees became an unsustainable drag on margins. The company's transfer model, in which large datasets are physically delivered to customers, made every terabyte moved a direct cost. Multi-region redundancy at petabyte scale on a hyperscaler was simply out of reach financially.

Solution

Segmed migrated approximately 750TB of data from AWS S3 to Backblaze B2, working with the Backblaze team and our Universal Data Migration service to restructure and normalize bucket architecture during the transfer. The S3-compatible API made the migration straightforward, and the production workflow cutover required fewer than four story points of engineering effort. Segmed now stores its full dataset across Backblaze's East and West Coast data centers, gaining active multi-region redundancy at a fraction of hyperscaler pricing.

Result

Backblaze dramatically reduced Segmed's egress costs, improving the margin on every data delivery. Multi-region redundancy, previously unaffordable at Segmed's scale, is now live and operational. The team processed 780 million CT files across five virtual machines on Vultr at sustained gigabit-level performance, completing a full repackaging of 1.4 million studies in four days. The migration completed cleanly with zero errors, and the Backblaze team was able to restructure inconsistent bucket architecture in a single pass during transfer.

How It Works

Segmed's platform integrates live with health system IT infrastructure to extract and de-identify medical imaging data, primarily DICOM files. Ingested data is processed in a secure environment in Segmed's cloud provider and then written directly to Backblaze B2 across East and West Coast data centers.

For customer-specific enrichment and repackaging, data is pulled to Vultr compute instances via the Bandwidth Alliance, enabling egress-free processing. Transformed datasets are written back to Backblaze B2, then delivered to research customers. Bucket-level data isolation enforces security and compliance controls per data provider.
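The pull, transform, and write-back loop described above can be sketched roughly as follows. This is a minimal illustration, not Segmed's actual pipeline: the `repackage` function, its parameters, the worker count, and the in-memory dictionaries standing in for buckets are all assumptions made for the example. A thread pool is used because workloads made of many small objects tend to be limited by per-request latency rather than bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def repackage(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],
    transform: Callable[[bytes], bytes],
    store: Callable[[str, bytes], None],
    workers: int = 32,  # hypothetical default; tune per VM
) -> int:
    """Pull each object, transform it, write it back; return the count."""

    def handle(key: str) -> int:
        store(key, transform(fetch(key)))
        return 1

    # Many small objects are fetched concurrently to hide request latency.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(handle, keys))


# In-memory stand-ins for the source and destination buckets (assumption:
# a real run would call an object storage client here instead).
source: Dict[str, bytes] = {f"study-1/image-{i}.dcm": b"raw" for i in range(100)}
destination: Dict[str, bytes] = {}

processed = repackage(
    list(source),
    fetch=source.__getitem__,
    transform=lambda blob: blob + b"-repackaged",
    store=destination.__setitem__,
)
print(processed, "objects repackaged")
```

In a real deployment, `fetch` and `store` would wrap calls to an S3-compatible client, and `transform` would hold the customer-specific repackaging logic.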


Segmed is a real-world data company that licenses de-identified medical imaging and clinical data to researchers and life sciences organizations. Its proprietary platform integrates directly with health systems and diagnostic imaging centers to extract, de-identify, and package DICOM and clinical datasets at scale.

  • Founded: 2019
  • Industries served: Life sciences, healthcare research, pharmaceutical
  • Data in B2: Scaling past multiple petabytes

Gigabit Performance at Massive File Counts

Segmed processes tens of millions of diagnostic studies comprising billions of DICOM files, many as small as 380 kilobytes. That file profile is punishing for throughput. Despite that, the team achieved sustained gigabit-level transfer-transform-transfer performance between Backblaze B2 and Vultr compute through the Bandwidth Alliance. In one repackaging exercise, five low-cost VMs processed 780 million CT files representing 1.4 million studies, ingesting, transforming, and writing the data back to Backblaze in under four days. The performance surprised even the engineering team. The total cost was less than the cost of one day of storage in AWS S3.
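To put those figures in perspective, the short calculation below derives the implied file rate from the numbers quoted above. It assumes the full four days were used, so the results are lower bounds on the real rate; the variable names are illustrative only.

```python
FILES = 780_000_000   # CT files processed
STUDIES = 1_400_000   # studies those files represent
VMS = 5               # virtual machines used
DAYS = 4              # upper bound on elapsed time

seconds = DAYS * 24 * 60 * 60
files_per_second = FILES / seconds
files_per_second_per_vm = files_per_second / VMS
files_per_study = FILES / STUDIES

print(f"{files_per_second:,.0f} files/s overall")       # about 2,257
print(f"{files_per_second_per_vm:,.0f} files/s per VM")  # about 451
print(f"{files_per_study:,.0f} files per study")         # about 557
```

Each file is read, transformed, and written back, so every VM sustains at least roughly 450 round trips per second for the whole run.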

Most of the team was actually pretty surprised at the performance. We're averaging gigabit scale transfers in and out. That's not what people expect when they leave the hyperscalers.

Donnie Owen, Chief Product Officer, Segmed


Smooth Migration to Backblaze

Segmed migrated approximately 750TB from AWS S3. Working with Backblaze, the team also took the opportunity to normalize bucket architecture that had grown inconsistent over Segmed's seven-year history. Segmed provided a mapping document and the migration team executed the restructure in a single pass during transfer. Every validation came back clean. On the engineering side, adapting the production workflow to write to Backblaze instead of AWS took fewer than four story points. The S3-compatible API meant the team barely had to adjust.
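A mapping document of the kind mentioned above could be applied during transfer along these lines. This is only a sketch: the bucket names, prefixes, and the `remap` helper are hypothetical, and the actual format of Segmed's mapping document is not described in this case study.

```python
from typing import Dict, Tuple

# Hypothetical mapping: (old bucket, old key prefix) -> (new bucket, new prefix).
MAPPING: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("legacy-imaging", "siteA/"): ("provider-a", "dicom/"),
    ("legacy-imaging", "siteB/"): ("provider-b", "dicom/"),
}


def remap(bucket: str, key: str) -> Tuple[str, str]:
    """Return the destination (bucket, key) for one source object."""
    candidates = [
        (old_prefix, target)
        for (old_bucket, old_prefix), target in MAPPING.items()
        if old_bucket == bucket and key.startswith(old_prefix)
    ]
    if not candidates:
        # Fail loudly so unmapped objects are never silently dropped.
        raise KeyError(f"no mapping rule for {bucket}/{key}")
    # The longest matching prefix wins, so specific rules override general ones.
    old_prefix, (new_bucket, new_prefix) = max(candidates, key=lambda c: len(c[0]))
    return new_bucket, new_prefix + key[len(old_prefix):]


print(remap("legacy-imaging", "siteA/study1/img.dcm"))
```

Resolving the destination per object as it is copied is what allows a restructure to happen in the same pass as the transfer, rather than as a second rename step afterward.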

My team's super familiar with the AWS S3 API. It was trivial to adapt our production workflow to stop writing to AWS and start writing to Backblaze. Way less than a sprint.

Donnie Owen, Chief Product Officer, Segmed
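For teams already using the AWS S3 API from Python, the change usually comes down to pointing the client at a different endpoint and supplying Backblaze credentials. The sketch below assumes `boto3` is installed; the helper names, the region shown, and the placeholder credentials are illustrative, not Segmed's configuration.

```python
from typing import Dict


def b2_client_kwargs(region: str, key_id: str, app_key: str) -> Dict[str, str]:
    """Build boto3 client arguments for the Backblaze B2 S3-compatible API."""
    return {
        "service_name": "s3",
        # B2 exposes one S3-compatible endpoint per region.
        "endpoint_url": f"https://s3.{region}.backblazeb2.com",
        # A B2 application key ID and key take the place of AWS credentials.
        "aws_access_key_id": key_id,
        "aws_secret_access_key": app_key,
    }


def make_b2_client(region: str, key_id: str, app_key: str):
    import boto3  # imported lazily so the helper above has no dependencies

    return boto3.client(**b2_client_kwargs(region, key_id, app_key))


# Placeholder values for illustration only.
kwargs = b2_client_kwargs("us-west-004", "EXAMPLE_KEY_ID", "EXAMPLE_APP_KEY")
print(kwargs["endpoint_url"])
```

Existing calls such as `put_object` and `get_object` then run unchanged against the returned client, which is consistent with the small cutover effort described above.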

Egress Costs Were Killing Margins

Segmed operates a transfer model. When a customer licenses data, Segmed moves the bytes. At petabyte scale, AWS S3 egress fees became a significant percentage of the company's margin on every deal. The calculation was straightforward: the business could not continue to grow its data volume while absorbing those costs. Reducing egress was not a nice-to-have. It was a prerequisite for the company's unit economics to work at scale.

Egress fees were brutal as a percentage of our margin opportunity. That was one of the biggest drivers in making the move.

Donnie Owen, Chief Product Officer, Segmed


A Publicly Traded Company (BLZE)
Backblaze © 2024
