Since 2013, Backblaze has published statistics and insights based on the hard drives in our data center. You'll find links to those reports below. We also publish the data underlying these reports, so that anyone can reproduce them. You'll find an overview of this data and the download links further down this page.

Hard Drive Reliability - Annualized Hard Drive Failure Rates


Quarterly Stats Blog Articles


Hard Drive Related Blog Articles


Overview of the Hard Drive Data


Each day in the Backblaze data center, we take a snapshot of each operational hard drive. This snapshot includes basic drive information along with the S.M.A.R.T. statistics reported by that drive. The daily snapshot of one drive is one record or row of data. All of the drive snapshots for a given day are collected into a file consisting of a row for each active hard drive. The file is in CSV (Comma Separated Values) format and is named YYYY-MM-DD.csv after the day it covers, for example, 2013-04-10.csv.
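
If you process many days at once, it can help to generate the daily file names rather than type them. The following is a minimal sketch of the YYYY-MM-DD.csv naming pattern; the function names are illustrative and not part of the data set.

```python
from datetime import date, timedelta


def snapshot_filename(day: date) -> str:
    """Return the daily snapshot file name for a day, e.g. 2013-04-10.csv."""
    return f"{day.isoformat()}.csv"


def snapshot_filenames(first: date, last: date) -> list[str]:
    """List the file names for every day from first to last, inclusive."""
    n_days = (last - first).days + 1
    return [snapshot_filename(first + timedelta(days=i)) for i in range(n_days)]


print(snapshot_filenames(date(2013, 4, 10), date(2013, 4, 12)))
# ['2013-04-10.csv', '2013-04-11.csv', '2013-04-12.csv']
```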

The first row of each file contains the column names; the remaining rows are the actual data. The columns are as follows:

  • Date – The date of the file in yyyy-mm-dd format.
  • Serial Number – The manufacturer-assigned serial number of the drive.
  • Model – The manufacturer-assigned model number of the drive.
  • Capacity – The drive capacity in bytes.
  • Failure – Contains a “0” if the drive is OK. Contains a “1” if this is the last day the drive was operational before failing.
  • 2013-2014 SMART Stats – 80 columns of data: the Raw and Normalized values for 40 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2015-2017 SMART Stats – 90 columns of data: the Raw and Normalized values for 45 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q1) SMART Stats – 100 columns of data: the Raw and Normalized values for 50 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q2) SMART Stats – 104 columns of data: the Raw and Normalized values for 52 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
  • 2018 (Q4) SMART Stats – 124 columns of data: the Raw and Normalized values for 62 different SMART stats as reported by the given drive. Each value is the number reported by the drive.
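
The column layout above can be checked when a file is read. The sketch below parses a tiny in-memory sample with Python's csv module and derives the number of SMART stats from the header (five basic columns plus two columns per stat). The header spellings used here (serial_number, capacity_bytes, smart_1_raw, and so on) and the sample rows are assumptions for illustration; confirm the names against the first row of a downloaded file.

```python
import csv
import io

# Hypothetical two-stat sample; real files carry many more SMART columns.
# The header names are assumptions, so check them against a real file.
SAMPLE = """date,serial_number,model,capacity_bytes,failure,smart_1_normalized,smart_1_raw,smart_9_normalized,smart_9_raw
2013-04-10,SN0001,MODEL-A,3000592982016,0,100,0,98,1200
2013-04-10,SN0002,MODEL-B,4000787030016,1,,,95,4100
"""

BASE_COLUMNS = 5  # date, serial number, model, capacity, failure


def smart_stat_count(header: list[str]) -> int:
    """Number of SMART stats implied by the header (two columns per stat)."""
    smart_columns = len(header) - BASE_COLUMNS
    if smart_columns < 0 or smart_columns % 2:
        raise ValueError(f"unexpected column count: {len(header)}")
    return smart_columns // 2


reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)

# 2 for this sample; a 2013-2014 file with 80 SMART columns would give 40.
print(smart_stat_count(reader.fieldnames))

# Serial numbers of drives marked as failed on this day.
print([row["serial_number"] for row in rows if row["failure"] == "1"])
```

Reading by column name rather than by position keeps the same code working as later schemas add columns.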

Helpful Hints and Caveats


The Q4 2018 versus Q2 2018 Schema

For Q4 2018 we began tracking 10 additional SMART attributes, meaning there are 20 additional fields in the Q4 2018 schema. The new SMART attributes being collected for Q4 2018 are the raw and normalized values for: smart_16, smart_17, smart_168, smart_170, smart_173, smart_174, smart_218, smart_231, smart_232, and smart_233.


The Q2 2018 versus Q1 2018 Schema

For Q2 2018 we began tracking 2 additional SMART attributes, meaning there are 4 additional fields in the Q2 2018 schema. The new SMART attributes being collected for Q2 2018 are the raw and normalized values for: smart_23 and smart_24.


The Q1 2018 versus the 2015-2017 Schema

For Q1 2018 we began tracking 5 additional SMART attributes, meaning there are 10 additional fields in the Q1 2018 schema. The new SMART attributes being collected for Q1 2018 are the raw and normalized values for: smart_177, smart_179, smart_181, smart_182, and smart_235.

The 2015 versus the 2013-2014 Schema

For 2015 we began tracking 5 additional SMART attributes, meaning there are 10 additional fields in the 2015 schema. The new SMART attributes being collected for 2015 are the raw and normalized values for: smart_22, smart_220, smart_222, smart_224, and smart_226.
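
The four schema changes described above can be tallied to confirm the column counts for each period. This is a small sketch: the attribute numbers come from the text, while the column naming pattern (smart_N_normalized, smart_N_raw) is an assumption to verify against a real file header.

```python
# SMART attributes added at each schema change, as listed in the text above.
ADDED = {
    "2015": [22, 220, 222, 224, 226],
    "Q1 2018": [177, 179, 181, 182, 235],
    "Q2 2018": [23, 24],
    "Q4 2018": [16, 17, 168, 170, 173, 174, 218, 231, 232, 233],
}
BASE_STATS = 40  # 2013-2014 schema: 40 SMART stats, 80 columns


def smart_columns_by_schema() -> dict[str, int]:
    """Cumulative number of SMART columns (raw + normalized) per schema."""
    counts = {"2013-2014": BASE_STATS * 2}
    stats = BASE_STATS
    for schema, attributes in ADDED.items():
        stats += len(attributes)
        counts[schema] = stats * 2
    return counts


def added_column_names(schema: str) -> list[str]:
    """Columns a schema introduced, assuming the smart_N_<kind> naming pattern."""
    return [
        f"smart_{number}_{kind}"
        for number in ADDED[schema]
        for kind in ("normalized", "raw")
    ]


print(smart_columns_by_schema())
# {'2013-2014': 80, '2015': 90, 'Q1 2018': 100, 'Q2 2018': 104, 'Q4 2018': 124}
print(added_column_names("Q2 2018"))
```

When combining files from different periods, columns that a schema does not yet include can be treated as blank for the earlier records.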


Blank Fields

The daily snapshots record the SMART stats information reported by the drive. Since most drives do not report values for all SMART stats, there are blank fields in every record. Also, different drives may report different stats based on their model and/or manufacturer.
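
Because a blank field means "not reported" rather than zero, it is worth keeping the two apart when parsing. A minimal sketch, assuming the fields arrive as strings from a CSV reader and that reported values are integers; the record shown is made up for illustration.

```python
from typing import Optional


def parse_smart_value(field: str) -> Optional[int]:
    """Convert a SMART field to int, or None when the drive reported nothing."""
    text = field.strip()
    return int(text) if text else None


# Hypothetical record: smart_22_raw is blank because this drive does not report it.
record = {"smart_1_raw": "0", "smart_9_raw": "4100", "smart_22_raw": ""}

parsed = {name: parse_smart_value(value) for name, value in record.items()}
reported = [name for name, value in parsed.items() if value is not None]

print(parsed)    # a reported 0 stays 0, a blank becomes None
print(reported)  # only the stats this drive actually reported
```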


Inconsistent Fields

The values reported for the same SMART stat can vary in meaning depending on the drive manufacturer and the drive model. Make sure you are comparing apples to apples, as drive manufacturers don't generally disclose what their specific numbers mean.


Out-of-Bounds Values

The values in the files are the values reported by the drives. Sometimes, those values are out of whack. For example, in a few cases the RAW value of SMART 9 (Drive life in hours) reported a value that would make a drive 10+ years old, which was not possible. In other words, it’s a good idea to have bounds checks when you process the data.
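
One way to apply a bounds check to SMART 9 is to compare the reported hours with the longest time the drive could possibly have been running. The sketch below is one such check, not the check we use; the earliest_possible cutoff is a hypothetical value you would choose yourself for a given drive model or fleet.

```python
from datetime import date
from typing import Optional

HOURS_PER_DAY = 24


def plausible_power_on_hours(
    raw_hours: Optional[int], snapshot_day: date, earliest_possible: date
) -> bool:
    """True if SMART 9 raw hours lie between 0 and the hours elapsed since
    earliest_possible (a cutoff chosen by the analyst, not part of the data)."""
    if raw_hours is None:  # blank field: nothing to validate
        return False
    max_hours = (snapshot_day - earliest_possible).days * HOURS_PER_DAY
    return 0 <= raw_hours <= max_hours


snapshot = date(2013, 4, 10)
cutoff = date(2008, 1, 1)  # assumed earliest in-service date, for illustration

print(plausible_power_on_hours(12_000, snapshot, cutoff))  # True
print(plausible_power_on_hours(90_000, snapshot, cutoff))  # False: over 10 years of hours
```

Records that fail the check can be excluded from age-based analysis while still being counted as drive days.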


The Number of Drives Will Change

When a drive fails, the "Failure" field is set to "1" on the day it fails. The next day, the drive is removed from the list and is no longer counted, reducing the overall number of drives. On the other hand, new drives are added on a regular basis, increasing the overall number of drives. In other words, count the number of drives in each daily file rather than assuming a fixed total.
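
A per-day tally makes the changing population visible. The following sketch counts drives and failures for each date from rows that have already been read; the sample rows and the lowercase field names are assumptions for illustration.

```python
from collections import Counter

# Hypothetical rows: drive B fails on the first day and is gone the next,
# while drives C and D are added on the second day.
rows = [
    {"date": "2013-04-10", "serial_number": "A", "failure": "0"},
    {"date": "2013-04-10", "serial_number": "B", "failure": "1"},
    {"date": "2013-04-11", "serial_number": "A", "failure": "0"},
    {"date": "2013-04-11", "serial_number": "C", "failure": "0"},
    {"date": "2013-04-11", "serial_number": "D", "failure": "0"},
]


def drives_per_day(records) -> Counter:
    """Number of drives listed for each date."""
    return Counter(record["date"] for record in records)


def failures_per_day(records) -> Counter:
    """Number of drives whose failure field is 1 for each date."""
    return Counter(record["date"] for record in records if record["failure"] == "1")


print(dict(drives_per_day(rows)))    # {'2013-04-10': 2, '2013-04-11': 3}
print(dict(failures_per_day(rows)))  # {'2013-04-10': 1}
```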


Computing Drive Days

Each day a drive is listed in a daily snapshot file, it counts as one drive day. For example, if there are 35,000 drives listed in a daily snapshot file, that file represents 35,000 drive days. In the docs.zip file you can download below, you’ll find a PDF file named “computing_failure_rates.pdf” which describes how we compute drive days, drive years, and drive failure rates.
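
As a rough illustration, drive days can be summed across daily files and turned into an annualized rate. The formula in this sketch (failures divided by drive years, with a drive year taken as 365 drive days, expressed as a percentage) is a common formulation stated here as an assumption; the PDF mentioned above remains the reference for the exact method.

```python
DAYS_PER_YEAR = 365  # assumption: one drive year is 365 drive days


def total_drive_days(daily_drive_counts) -> int:
    """Sum of the number of drives listed in each daily snapshot file."""
    return sum(daily_drive_counts)


def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Failures per drive year as a percentage (assumed formulation)."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / DAYS_PER_YEAR
    return failures / drive_years * 100


# Hypothetical three-day period with a changing drive count.
counts = [35_000, 34_998, 35_010]
days = total_drive_days(counts)
print(days)  # 105008

# 10 failures over 365,000 drive days is 10 failures per 1,000 drive years.
print(round(annualized_failure_rate(10, 365_000), 6))  # 1.0
```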


Drive Age

As noted, the RAW value of SMART 9 is the number of hours a drive has been in service up to that point. To determine the drive’s age in days, you divide the reported number by 24.
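
The conversion is a single division. A minimal sketch, with an illustrative function name and a made-up reading:

```python
HOURS_PER_DAY = 24


def drive_age_days(smart_9_raw: int) -> float:
    """Drive age in days from the RAW value of SMART 9 (hours in service)."""
    return smart_9_raw / HOURS_PER_DAY


print(drive_age_days(8_760))  # 365.0
```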


Missing Data

In Q1 2017, the SMART stats for some of the hard drives were not recorded from January 28 through 31, 2017. Complete reporting resumed on February 1, 2017. While this had no effect on how we use the data, it may affect your results, depending on how you use the data.
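
If your analysis depends on SMART values, you may want to flag records from this window whose SMART fields are all blank. The sketch below shows one way to do that; the smart_ column prefix and the sample row are assumptions for illustration.

```python
from datetime import date

GAP_START = date(2017, 1, 28)
GAP_END = date(2017, 1, 31)


def in_reporting_gap(day: date) -> bool:
    """True for January 28 through 31, 2017, inclusive."""
    return GAP_START <= day <= GAP_END


def smart_stats_missing(row: dict) -> bool:
    """True if every SMART column in the row is blank.
    Assumes SMART column names start with 'smart_'."""
    smart_values = [value for name, value in row.items() if name.startswith("smart_")]
    return all(value == "" for value in smart_values)


# Hypothetical row from inside the window with no SMART values recorded.
row = {"date": "2017-01-29", "serial_number": "SN0001", "failure": "0",
       "smart_1_raw": "", "smart_9_raw": ""}

affected = in_reporting_gap(date.fromisoformat(row["date"])) and smart_stats_missing(row)
print(affected)  # True
```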


How You Can Use the Data


You can download and use this data for free for your own purpose. All we ask is three things:

  1. you cite Backblaze as the source if you use the data,
  2. you accept that you are solely responsible for how you use the data, and
  3. you do not sell this data to anyone; it is free.

Downloading the Raw Hard Drive Test Data


Hopefully the overview above has given you what you need to access and use the hard drive data we have collected. Here is the data:

Download Documentation Files (77.3 KB ZIP file, 111 KB on disk, 5 files)