Welcome to

Backblaze Drive Stats:

Hard Drive Reliability Test Data

Explore our comprehensive collection of HDD and SSD data and reports

Since 2013, Backblaze has collected, curated, and published the annualized failure rates (AFR) and related statistics from the hard disk drives (HDDs) and solid state drives (SSDs) in our data centers. This collection is the Backblaze Drive Stats dataset. Each quarter we publish updates to the dataset, which is open source and can be downloaded using the links in the “Downloading the Backblaze Drive Stats dataset” section below.

Drive Stats Q2 2025 Snapshot

Drive count: 317,230

Drive failures: 1,061

Drive days: 28,402,627

Drive population by manufacturer

[Chart: drive population by manufacturer. Legend: HGST, Seagate, Toshiba, WDC]

Drive reliability: annualized failure rates (AFR)

Period               Drive days     Drives failed   AFR
Quarterly: Q2 2025   28,402,627     1,061           1.36%
Annual: 2024         101,906,290    4,372           1.57%
Lifetime             498,078,717    17,707          1.30%
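
As a rough check on the figures above, the AFR values can be reproduced from drive days and failures. The sketch below assumes the commonly cited form of the calculation, AFR = failures / (drive days / 365) × 100; the function name and the `SNAPSHOT` dictionary are illustrative and not part of the dataset.

```python
def annualized_failure_rate(drive_days: int, failures: int) -> float:
    """Return the AFR as a percentage.

    Assumes AFR = failures / (drive_days / 365) * 100, i.e. failures
    per drive-year of operation, expressed as a percentage.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# (drive days, drives failed) copied from the table above.
SNAPSHOT = {
    "Quarterly: Q2 2025": (28_402_627, 1_061),
    "Annual: 2024": (101_906_290, 4_372),
    "Lifetime": (498_078_717, 17_707),
}

for period, (drive_days, failures) in SNAPSHOT.items():
    afr = annualized_failure_rate(drive_days, failures)
    print(f"{period}: {afr:.2f}%")
```

Under that assumption the three rows come out at 1.36%, 1.57%, and 1.30%, matching the published values.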

Drive Stats related podcasts and webinars

LinkedIn Live: Backblaze Drive Stats Q1 2025

Webinar (Contains instructions for querying the Drive Stats data in Iceberg table format)

Webinar

Drive Stats quarterly reports and related articles

We regularly publish analyses, observations, and insights based on the Drive Stats dataset on the Backblaze Blog. These include the quarterly Hard Drive Stats reports, the SSD reports, and articles on related topics such as the cost of storage, the bathtub curve, and enterprise-level drive management in our data centers.

View the Hard Drive Stats Archives

Overview of the Drive Stats dataset

How we collect the drive data

Every day in our data centers, we take a snapshot of each operational drive. This snapshot includes basic drive information along with the S.M.A.R.T. statistics reported by that drive. The daily snapshot of one drive is one record, or row, of data. All of the drive snapshots for a given day are collected into a CSV file consisting of one row for each active drive. Each daily file is named in the format YYYY-MM-DD.csv; for example, 2024-03-25.csv.
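
The YYYY-MM-DD.csv naming convention makes it straightforward to move between calendar dates and file names. The following is a minimal sketch using only the Python standard library; the helper names `snapshot_filename` and `snapshot_date` are hypothetical.

```python
from datetime import date


def snapshot_filename(day: date) -> str:
    """Build the daily file name, e.g. 2024-03-25.csv."""
    return day.isoformat() + ".csv"


def snapshot_date(filename: str) -> date:
    """Recover the snapshot date from a daily file name or path."""
    name = filename.rsplit("/", 1)[-1]  # drop any leading folder
    if not name.endswith(".csv"):
        raise ValueError(f"not a daily snapshot file: {filename!r}")
    return date.fromisoformat(name[: -len(".csv")])


print(snapshot_filename(date(2024, 3, 25)))
print(snapshot_date("2024-03-25.csv"))
```

Because the names are ISO dates, sorting the file names alphabetically also sorts them chronologically.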

How the Drive Stats data is organized

The Drive Stats schema comprises the fields Backblaze includes for each drive record, plus the raw and normalized S.M.A.R.T. attributes reported by each drive.

Please note that the schema does change from quarter to quarter, so you should check for changes each quarter and adjust how you align the data accordingly.
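
One way to check for schema changes is to compare the header row of a file from one quarter with the header row of a file from the next. The sketch below assumes each daily CSV file starts with a header row; the function names and the column names in the example are illustrative only.

```python
import csv


def read_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))


def schema_changes(old_header, new_header):
    """List columns added to and removed from the schema, in file order."""
    old, new = set(old_header), set(new_header)
    return {
        "added": [col for col in new_header if col not in old],
        "removed": [col for col in old_header if col not in new],
    }


# Illustrative headers; real files carry many more columns.
previous = ["date", "serial_number", "model", "smart_1_raw"]
current = ["date", "serial_number", "model", "smart_1_raw", "smart_2_raw"]
print(schema_changes(previous, current))
```

In practice you would pass `read_header(...)` for one file from each quarter rather than hard-coded lists.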

How you can use the Drive Stats data

The Drive Stats dataset is open source and available for you to download below. All we ask is the following:

  1. Cite Backblaze as the source if you use the data,
  2. Accept that you are solely responsible for how you use the data,
  3. You may sell derivative works based on the data, but
  4. You cannot sell the data itself to anyone; it is free.

Querying the Drive Stats dataset

As well as the compressed CSV files listed below, we maintain a copy of the Drive Stats dataset in Apache Iceberg table format. You should be able to use the following read-only credentials to query the dataset from any tools that support Apache Iceberg:

Application Key ID (AWS Key ID): 0045f0571db506a0000000017

Application Key (AWS Secret Key): K004Fs/bgmTk5dgo6GAVm2Waj3Ka+TE

Endpoint URL: https://s3.us-west-004.backblazeb2.com

Region: us-west-004

Bucket: drivestats-iceberg

Path Prefix: drivestats

See the blog post Iceberg on Backblaze B2 for comprehensive instructions on how to directly query the Drive Stats dataset from Trino, Snowflake, and DuckDB.
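
The settings listed above can be gathered into a single configuration object before being handed to a tool. The following is a minimal sketch rather than a tested integration: the dictionary keys and helper names are hypothetical, and the keyword names returned by `s3_client_kwargs` follow the conventions of common S3 clients.

```python
# Read-only settings copied from the list above.
DRIVE_STATS = {
    "key_id": "0045f0571db506a0000000017",
    "application_key": "K004Fs/bgmTk5dgo6GAVm2Waj3Ka+TE",
    "endpoint_url": "https://s3.us-west-004.backblazeb2.com",
    "region": "us-west-004",
    "bucket": "drivestats-iceberg",
    "path_prefix": "drivestats",
}


def s3_client_kwargs(settings):
    """Map the settings onto keyword arguments used by S3-style clients."""
    return {
        "endpoint_url": settings["endpoint_url"],
        "region_name": settings["region"],
        "aws_access_key_id": settings["key_id"],
        "aws_secret_access_key": settings["application_key"],
    }


def prefix_uri(settings):
    """Combine the bucket and path prefix into an s3:// style URI."""
    return f"s3://{settings['bucket']}/{settings['path_prefix']}"


print(prefix_uri(DRIVE_STATS))
```

If boto3 is installed, `boto3.client("s3", **s3_client_kwargs(DRIVE_STATS))` should accept these arguments, though that call is not exercised here. For engine-specific setup in Trino, Snowflake, or DuckDB, follow the blog post mentioned above.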

Downloading the Backblaze Drive Stats dataset

Since 2016, we have uploaded the Drive Stats dataset quarterly. Before 2016, the datasets were uploaded annually (2013, 2014, and 2015). Each item listed below is a ZIP file containing the .csv files for the named quarter or year.
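
Because each row in a daily file is one drive on one day, counting the data rows across the files in a ZIP archive gives the drive days for that period. The sketch below reads the archive in place with the Python standard library. It assumes every .csv file begins with a header row; the function name `count_drive_days` and the archive name in the usage note are hypothetical.

```python
import csv
import io
import zipfile


def count_drive_days(zip_source):
    """Return {csv member name: number of data rows} for a ZIP archive.

    zip_source may be a file path or a binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip folders and any non-CSV members
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)
                next(reader, None)  # assumed header row
                counts[name] = sum(1 for _ in reader)
    return counts
```

For example, `sum(count_drive_days("my_download.zip").values())` would give the total drive days in a downloaded archive, where `my_download.zip` stands in for whichever file you downloaded.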


FAQ

Beyond your test data, what steps can I take to keep hard drive issues to a minimum?

Drive health is affected by a variety of factors, including where and how you store the drive and how you use it. Over the years, we’ve seen that even drive models from the same manufacturer vary in reliability.

To minimize hard drive issues and maximize your data durability, regularly back up your data, ensure proper ventilation and stable temperatures for your drives, handle them carefully to avoid physical shocks, and consider monitoring S.M.A.R.T. data for early warning signs of potential problems.

How long do hard drives typically last, according to Backblaze's data?

Our data shows that annualized failure rates (AFR) vary by drive model and age, but a large majority of drives operate reliably for many years, often exceeding four or five years with low failure rates. The lifetime AFR across all drives in our fleet is 1.30% in the Q2 2025 snapshot.

Why does Backblaze make this extensive drive data publicly available?

We make this data publicly available for transparency and to champion the open cloud philosophy. Open-sourcing the data fosters trust, lets the community analyze and use the information, and contributes to a more open and collaborative understanding of cloud infrastructure and hard drive reliability.

How does Backblaze collect this hard drive test data?

We collect this data by taking a daily snapshot of each operational drive in our data centers. Each snapshot includes basic drive information and the S.M.A.R.T. statistics reported by the drive, which cover aspects such as hours running, temperature, and bad sectors.

A Publicly Traded Company (BLZE)
Backblaze © 2024
