Since 2013, Backblaze has published statistics and insights based on the hard drives in our data center. You'll find links to those reports below. We also publish the data underlying these reports, so that anyone can reproduce them. You'll find an overview of this data and the download links further down this page.

Hard Drive Reliability - Annualized Hard Drive Failure Rates

Quarterly Stats Blog Articles

Hard Drive Related Blog Articles

Overview of the Hard Drive Data

Each day in the Backblaze data center, we take a snapshot of each operational hard drive. This snapshot includes basic drive information along with the S.M.A.R.T. statistics reported by that drive. The daily snapshot of one drive is one record, or row, of data. All of the drive snapshots for a given day are collected into a single file with one row for each active hard drive. The file is in CSV (comma-separated values) format and is named YYYY-MM-DD.csv after the day it covers, for example, 2013-04-10.csv.

The first row of each file contains the column names; the remaining rows are the actual data. The columns are as follows:

  • Date – The date of the file in yyyy-mm-dd format.
  • Serial Number – The manufacturer-assigned serial number of the drive.
  • Model – The manufacturer-assigned model number of the drive.
  • Capacity – The drive capacity in bytes.
  • Failure – Contains a “0” if the drive is OK. Contains a “1” if this is the last day the drive was operational before failing.
  • 2013-2014 SMART Stats – 80 columns of data: the Raw and Normalized values for 40 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2015-2017 SMART Stats – 90 columns of data: the Raw and Normalized values for 45 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q1) SMART Stats – 100 columns of data: the Raw and Normalized values for 50 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q2) SMART Stats – 104 columns of data: the Raw and Normalized values for 52 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q4) SMART Stats – 124 columns of data: the Raw and Normalized values for 62 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
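
As a rough illustration of the layout described above, the following Python sketch reads one daily snapshot and reports how many drives it contains and which of them failed that day. It assumes the header uses the names date, serial_number, model, capacity_bytes, and failure for the basic columns. The SMART column names (smart_1_normalized, smart_1_raw), the serial numbers, and the model names in the sample are made up for the example and are not taken from a real file.

```python
import csv
import io


def summarize_snapshot(csv_file):
    """Return (drive_count, failed_serials) for one daily snapshot file."""
    reader = csv.DictReader(csv_file)  # the first row supplies the column names
    drive_count = 0
    failed_serials = []
    for row in reader:
        drive_count += 1
        # failure is "1" on the last day a drive was operational before failing
        if row["failure"] == "1":
            failed_serials.append(row["serial_number"])
    return drive_count, failed_serials


# Hypothetical two-drive sample; every value below is invented for illustration.
SAMPLE = """date,serial_number,model,capacity_bytes,failure,smart_1_normalized,smart_1_raw
2013-04-10,SERIAL0001,EXAMPLE-MODEL-A,3000592982016,0,100,0
2013-04-10,SERIAL0002,EXAMPLE-MODEL-B,4000787030016,1,95,12
"""

drive_count, failed_serials = summarize_snapshot(io.StringIO(SAMPLE))
print(drive_count, failed_serials)
```

To run the same function on a downloaded file, pass it an open file handle instead of the in-memory sample, for example open("2013-04-10.csv", newline="").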

Helpful Hints and Caveats

Schema Changes

The schema may change from quarter to quarter. The basic information (date, serial_number, model, capacity_bytes, and failure) will not change. All of the changes will be in the number of SMART attributes reported for all of the drives in a given quarter. There will never be more than 255 pairs of SMART attributes reported. When you load the CSV files for each quarter, you will need to account for the possibility that the number of SMART attributes differs from the previous quarter.
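
One possible way to cope with a changing schema is to build the column list from the union of the headers you encounter and to fill any attribute a file does not report with an empty value. The Python sketch below does this with the standard csv module. The SMART column names (smart_1_raw, smart_22_raw, and so on) and the sample rows are assumptions invented for the example; only the five basic column names come from the description above.

```python
import csv
import io

BASIC_COLUMNS = ["date", "serial_number", "model", "capacity_bytes", "failure"]


def load_with_unified_schema(csv_files):
    """Read several snapshot files and return (columns, rows) sharing one header."""
    rows = []
    smart_columns = []  # SMART column names in first-seen order
    seen = set()
    for csv_file in csv_files:
        reader = csv.DictReader(csv_file)
        for name in reader.fieldnames or []:
            if name not in BASIC_COLUMNS and name not in seen:
                seen.add(name)
                smart_columns.append(name)
        rows.extend(reader)
    columns = BASIC_COLUMNS + smart_columns
    # Attributes missing from a given file are filled with an empty string.
    unified = [{name: row.get(name, "") for name in columns} for row in rows]
    return columns, unified


# Hypothetical files from two quarters; the newer one reports an extra SMART pair.
OLDER = """date,serial_number,model,capacity_bytes,failure,smart_1_normalized,smart_1_raw
2014-12-31,SERIAL0001,EXAMPLE-MODEL-A,3000592982016,0,100,0
"""
NEWER = """date,serial_number,model,capacity_bytes,failure,smart_1_normalized,smart_1_raw,smart_22_normalized,smart_22_raw
2015-01-01,SERIAL0001,EXAMPLE-MODEL-A,3000592982016,0,100,0,100,100
"""

columns, unified = load_with_unified_schema([io.StringIO(OLDER), io.StringIO(NEWER)])
print(len(columns), repr(unified[0]["smart_22_raw"]), repr(unified[1]["smart_22_raw"]))
```

Holding every row in memory is only practical for small samples. For full quarters you would more likely stream rows into a database or process one file at a time, but the same union-of-headers idea applies.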


How You Can Use the Data

You can download and use this data for free for your own purposes. All we ask is three things:

  1. you cite Backblaze as the source if you use the data,
  2. you accept that you are solely responsible for how you use the data, and
  3. you do not sell this data to anyone; it is free.

Downloading the Raw Hard Drive Test Data

We hope the sections above have given you what you need to access and use the hard drive data we have collected. Here is the data:

Download Documentation Files (77.3 KB ZIP file, 111 KB on disk, 5 files)