Hard Drive Stats for Q1 2017

May 9th, 2017

2017 hard drive stats

In this update, we’ll review the Q1 2017 and lifetime hard drive failure rates for all our current drive models, and we’ll look at a relatively new class of drives for us – “enterprise”. We’ll share our observations and insights, and as always, you can download the hard drive statistics data we use to create these reports.

Our Hard Drive Data Set

Backblaze has now recorded and saved daily hard drive statistics from the drives in our data centers for over 4 years. This data includes the SMART attributes reported by each drive, along with related information such as the drive serial number and failure status. As of March 31, 2017, we had 84,469 operational hard drives. Of those, 1,800 were boot drives and 82,669 were data drives. For our review, we remove drive models for which we have fewer than 45 drives, leaving 82,516 hard drives to analyze for this report. There are currently 17 different hard drive models, ranging in size from 3 TB to 8 TB. All of these models are 3½” drives.

Hard Drive Reliability Statistics for Q1 2017

Since our last report, for Q4 2016, we have added 10,577 hard drives, bringing us to the 82,516 drives we’ll focus on. We’ll start by looking at the statistics for the period of January 1, 2017 through March 31, 2017 – Q1 2017. These are the drives that were operational during that period, ranging in size from 3 TB to 8 TB as listed below.

hard drive failure rates by model

Observations and Notes on the Q1 Review

You’ll notice that some of the drive models have a failure rate of “0” (zero). Here a failure rate of zero means there were no drive failures for that model during Q1 2017. Later, we will cover how these same drive models fared over their lifetime. Why is the quarterly data important? We use it to look for anything unusual. For example, in Q1 the 4 TB Seagate drive model ST4000DX000 has a high failure rate of 35.88%, while the lifetime annualized failure rate for this model is much lower, at 7.50%. In this case, we have only 170 drives of this particular model, so the failure rate is not statistically significant, but such information could be useful if we were running several thousand drives of this model.

There were a total of 375 drive failures in Q1. A drive is considered failed if one or more of the following conditions are met:

  • The drive will not spin up or connect to the OS.
  • The drive will not sync, or stay synced, in a RAID Array (see note below).
  • The SMART stats we use show values above our thresholds (see the sketch after this list).
  • Note: Our stand-alone Storage Pods use RAID-6, our Backblaze Vaults use our own open-sourced implementation of Reed-Solomon erasure coding instead. Both techniques have a concept of a drive not syncing or staying synced with the other member drives in its group.
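
To make the last criterion concrete, here is a minimal sketch of what such a check might look like. The SMART attribute IDs and threshold values below are hypothetical placeholders chosen for illustration; the post does not publish the actual attributes or thresholds Backblaze uses.

```python
# Hypothetical sketch of a SMART threshold check.
# Attribute IDs and threshold values are illustrative only.
SMART_THRESHOLDS = {
    5:   0,   # Reallocated Sectors Count (raw value)
    187: 0,   # Reported Uncorrectable Errors
    197: 0,   # Current Pending Sector Count
    198: 0,   # Offline Uncorrectable Sector Count
}

def smart_flags_failure(smart_raw: dict) -> bool:
    """Return True if any monitored SMART raw value exceeds its threshold."""
    return any(
        smart_raw.get(attr, 0) > limit
        for attr, limit in SMART_THRESHOLDS.items()
    )

# Example: a drive reporting 8 pending sectors would be flagged.
print(smart_flags_failure({5: 0, 187: 0, 197: 8, 198: 0}))  # True
```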

The annualized hard drive failure rate for Q1 in our current population of drives is 2.11%. That’s a bit higher than previous quarters, but might be a function of us adding 10,577 new drives to our count in Q1. We’ve found that there is a slightly higher rate of drive failures early on, before the drives “get comfortable” in their new surroundings. This is seen in the drive failure rate “bathtub curve” we covered in a previous post.
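
For readers who want to reproduce the percentages, here is a minimal sketch of the standard annualized failure rate calculation: drive failures divided by drive-years of service (total drive days divided by 365), expressed as a percentage. The input numbers in the example are made up for illustration and are not taken from our tables.

```python
def annualized_failure_rate(drive_days: int, failures: int) -> float:
    """Annualized failure rate in percent: failures per drive-year of service."""
    drive_years = drive_days / 365.0
    return 100.0 * failures / drive_years

# Hypothetical example: 1,000 drives observed for the ~90 days of a quarter
# is roughly 90,000 drive days; 5 failures in that span annualize to ~2%.
print(round(annualized_failure_rate(drive_days=90_000, failures=5), 2))  # 2.03
```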

10,577 More Drives

The additional 10,577 drives are really the net of 11,002 drives added, less 425 drives removed. The removed drives were in addition to the 375 drives marked as failed, as those were replaced 1 for 1. The 425 drives were primarily removed from service due to migrations to higher density drives.

The table below shows the breakdown of the drives added in Q1 2017 by drive size.

drive counts by size

Lifetime Hard Drive Failure Rates for Current Drives

The table below shows the failure rates for the hard drive models we had in service as of March 31, 2017. This is over the period beginning in April 2013 and ending March 31, 2017. If you are interested in the hard drive failure rates for all the hard drives we’ve used over the years, please refer to our 2016 hard drive review.

lifetime hard drive reliability rates

The annualized failure rate for the drive models listed above is 2.07%. This compares to 2.05% for the same collection of drive models as of the end of Q4 2016. The increase makes sense given the increase in Q1 2017 failure rate over previous quarters noted earlier. No new models were added during the current quarter and no old models exited the collection.

Backblaze is Using Enterprise Drives – Oh My!

Some of you may have noticed we now have a significant number of enterprise drives in our data center, namely 2,459 Seagate 8 TB drives, model ST8000NM055. The HGST 8 TB drives were the first true enterprise drives we used as data drives in our data centers, but we only have 45 of them. So, why did we suddenly decide to purchase 2,400+ of the Seagate 8 TB enterprise drives? There was a very short window, as Seagate was introducing new drive models and phasing out old ones, during which the cost per terabyte of the 8 TB enterprise drives fell within our budget. Previously we had purchased 60 of these drives to test in one Storage Pod and were satisfied they could work in our environment. When the opportunity arose to acquire the enterprise drives at a price we liked, we couldn’t resist.

Here’s a comparison of the 8 TB consumer drives versus the 8 TB enterprise drives to date:

enterprise vs. consumer hard drives

What have we learned so far…

  1. It is too early to compare failure rates – The oldest enterprise drives have only been in service for about 2 months, with most being placed into service just prior to the end of Q1. The Backblaze Vaults the enterprise drives reside in have yet to fill up with data. We’ll need at least 6 months before we can start comparing failure rates, as the data is still too volatile. For example, if the current enterprise drives were to experience just 2 failures in Q2, their lifetime annualized failure rate would be about 0.57%.
  2. The enterprise drives load data faster – The Backblaze Vaults containing the enterprise drives loaded data faster than the Backblaze Vaults containing consumer drives. The vaults with the enterprise drives loaded on average 140 TB per day, while the vaults with the consumer drives loaded on average 100 TB per day.
  3. The enterprise drives use more power – No surprise here: according to the Seagate specifications, the enterprise drives use 9W average at idle and 10W average in operation, while the consumer drives use 7.2W average at idle and 9W average in operation. For a single drive this may seem insignificant, but when you put 60 drives in a 4U Storage Pod chassis and then 10 chassis in a rack, the difference adds up quickly (see the back-of-the-envelope sketch after this list).
  4. Enterprise drives have some nice features – The Seagate enterprise 8 TB drives we used have PowerChoice™ technology that gives us the option to use less power. The data loading times noted above were recorded after we changed to a lower power mode. In short, the enterprise drives in a low power mode still stored 40% more data per day on average than the consumer drives.
  5. Drive speed is not a bottleneck for us – While it is great that the enterprise drives can load data faster, drive speed has never been a bottleneck in our system. A system that can load data faster will just “get in line” more often and fill up faster. There is always extra capacity when it comes to accepting data from customers.
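
As noted in item 3, here is a quick back-of-the-envelope sketch of how the per-drive power difference scales up. It uses only the wattage figures and the 60-drives-per-chassis, 10-chassis-per-rack numbers above; the 24x7 operating assumption and the electricity price are hypothetical.

```python
# Back-of-the-envelope: rack-level power difference, enterprise vs. consumer.
DRIVES_PER_CHASSIS = 60
CHASSIS_PER_RACK = 10
DRIVES_PER_RACK = DRIVES_PER_CHASSIS * CHASSIS_PER_RACK  # 600 drives

ENTERPRISE_OPERATING_W = 10.0  # Seagate spec, per the post
CONSUMER_OPERATING_W = 9.0

extra_watts_per_rack = (ENTERPRISE_OPERATING_W - CONSUMER_OPERATING_W) * DRIVES_PER_RACK
extra_kwh_per_year = extra_watts_per_rack * 24 * 365 / 1000  # assumes 24x7 operation
PRICE_PER_KWH = 0.10  # hypothetical electricity price, USD

print(f"Extra power per rack:  {extra_watts_per_rack:.0f} W")               # 600 W
print(f"Extra energy per year: {extra_kwh_per_year:.0f} kWh")               # 5256 kWh
print(f"Extra cost per year:   ${extra_kwh_per_year * PRICE_PER_KWH:.0f}")  # ~$526
```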

Wrapping Up

We’ll continue to monitor the 8 TB enterprise drives and keep reporting our findings.

If you’d like to hear more about our Hard Drive Stats, Backblaze will be presenting at the 33rd International Conference on Massive Storage Systems and Technology (MSST 2017), being held at Santa Clara University in Santa Clara, California, from May 15th – 19th. The conference will dedicate five days to computer-storage technology, including a day of tutorials, two days of invited papers, two days of peer-reviewed research papers, and a vendor exposition. Come join us.

As a reminder, the hard drive data we use is available on our Hard Drive Test Data page. You can download and use this data for free for your own purposes; all we ask is three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone, it is free.

Good luck and let us know if you find anything interesting.

Andy Klein

Andy has 20+ years of experience in technology marketing. He has shared his expertise in computer security and data backup at the Federal Trade Commission, Rootstech, RSA and over 100 other events. His current passion is to get everyone to back up their data before it's too late.
  • systemBuilder

    I think it would be most useful if you could publish drive reliability after the infant mortality phase (probably the first 3 months of life). That is a real test of the drive design and manufacturing quality. Drives fail all the time from infant mortality, and the customer is usually better equipped to handle the failure in the first few months of life, when the drive can be exchanged for a new one. The most expensive and painful failures are the ones that only last half a year or just a year or two – those made it out of childhood but still failed, indicating not just bad luck, but poor design & manufacturing.

  • 2005OEFArmy .

    I’m sure the desire for Seagate enterprise drives has nothing to do with me bitching about unfair comparisons a few years ago. By the way, most Seagate “Enterprise” drives are made by the Desktop part of the business now, so you still haven’t compared the real Seagate enterprise drives in any test and they used to be very good, 5-7 years ago.

  • Mo

    Thank you again for posting this data and the analysis that helps people with fewer skills make sense of it!

  • Jus’ Sayin’

    A scientist looked at your data and had some really interesting things to say about it.

    It’d be great to see a collaboration between you!

    https://bioinformare.blogspot.com/2016/02/survival-analysis-of-hard-disk-drive.html

    • Ross Lazarus

      Thanks – I think the data speak pretty clearly using KM curves with the caveat that they’re designed for units with similar times under observation – which is definitely a badly broken assumption here. Nevertheless, it may help you understand the data better…
      https://bioinformare.blogspot.com.au/ for the record. Github repository with code is linked there. Enjoy!

      • Jus’ Sayin’

        You added some sharp and unique perspective to the discussion, so thanks to you!

  • geky

    I am trying to derive a failure statistic that takes into account the average age in operation of each drive. I believe that value would be a more accurate performance stat than the annualized projection of failures.
    I downloaded the excel sheet “q4-2016-hard-drive-tables-Backblaze” and tried to calculate the values.
    I noticed a few strange things in the data tables within the excel sheet and I give an example:
    For FY 2016:
    WDC WD40EFRX | 4 TB | 75 drives | 17.16 months average age | 16,790 drive days | 1 failure | 2.17% annualized failure rate

    For All Time (April 2013 – Dec 2016):
    WDC WD40EFRX | 4.0 TB | 46 drives | 17.16 months average age | 46,684 drive days | 3 failures | 2.35% annualized failure rate

    How is it possible that the count of drives for FY 2016 (75) is higher than the all-time count for the period since April 2013 (46)?
    In FY 2016, how is it possible that the 75 + 1 (failed) drives only clocked 16,790 drive days out of a nominal (76 × 365) 27,740 drive days?
    An explanation could be that many drives were added at the very end of the year, but then the average age of the drives wouldn’t be as high as 17.16 months. So how is it possible? Are you sure your data are consistent?

    For all time, 46 + 3 = 49 drives clocked 46,684 drive days, which means each drive clocked about 31 drive months on average (46,684 / 49 / 30). How is this possible with an average drive age of 17.16 months?

    How much confidence do you have for the rest of your reported data?

    Regardless of the inconsistencies in your data, what I am trying to achieve is to put the drives’ age into the failure equation.
    So I calculate the nominal drive days = (Total amount of drives used including failed and migrated) x (average drives’ age)

    then divide it by (recorded actual drive days).
    And derive the % of cumulative failures over the drives’ age.
    The resulting cumulative failure rate is comparable across drives because it takes into account the age of each drive type, rather than comparing annualized failure rates, which do not take the age of the drives under comparison into account.

    Would you agree that this method could be a better way to compare your drives?

  • eltano06

    I have two 3 TB Toshiba drives (DT01ACA300). Things are looking good according to these numbers.

    • Marki

      No, they are not :(
      If you have only one drive (or few drives), it doesn’t matter if the failure rate is 0.1% or 10%. You never know if your 1 drive is the one which will fail.
      Also Backblaze lives with the fact that all those drives will fail. They just want to choose the model which will fail “slowly” so that they have less work replacing them.

  • nand

    I have trouble making sense of these tables. Do you think you could present the data as a survival graph? Basically I’d like the drive age on the X axis and the percentage of drives still operating at that age on the Y axis. That would help figure out which drives fail how quickly in a way that’s independent of their age.

    The current tables basically hide that information because we don’t really know how long the drives have been running for.

    Here’s an example of what they look like: https://0x0.st/SB1.png

  • Heinz Kurtz

    Does Backblaze have any figures/experience/opinions regarding silent corruption, bit error rates, drives returning corrupt data instead of reporting read errors properly? You can find a lot of discussion about this in various blogs and mailing lists, but how often does it actually happen?

  • Logik

    HGST 4tb continues to appear to be the best choice for me. I bought one early last year from Fry’s for under $100 and have been kicking myself for not buying more. I just assumed they would go down in price, not disappear. Wrong. Anyone have suggestions on where to look for best (consumer) price on HGST drives?

    • Mark

      Don’t forget HGST is a brand of and has been manufactured by Toshiba since 2012. If you do a bit of research on the HGST model you have you can find the corresponding Toshiba model. A lot of the drives are identical in specs but with a different sticker on the top.

      • Logik

        Hmm. HGST is a subsidiary of WD. There was some sort of sale of tech to Toshiba, apparently to satisfy “regulators”. In any case, I just looked at images at Newegg of opened Toshiba and HGST 4tb 7200rpm drives and they are clearly different. Different head actuators, head arms, platter hubs, etc etc. So maybe _some_ Toshiba drives are (or were) the same as HGST, but it doesn’t seem to be generally the case. Also there is a clear difference in failure rates at Backblaze. And finally I’m not seeing a price advantage for Toshiba at the moment.
        Which Toshiba and HGST drives are identical under the sticker?

        • Martin Jones

          The deal mostly worked out that WD got the 2.5″ business (and the HGST brand) and Toshiba got the 3.5″ business, more or less. At first, there were Toshiba drives that looked identical to the HGST version, but as time goes on, they seem to be moving forwards with their own designs (incorporating HGST tech). I personally believe that the apparent higher failure rates of Toshibas are merely “growing pains”, and they’ll likely be on par with the indestructible Deskstars and Ultrastars eventually (basing that on nothing other than that the same thing kinda happened in the transition from IBM to HGST)…

  • Thanks for sharing… seems Toshiba is doing very well all round though :)

    • therock

      No, but Hitachi is!

      • Mark

        Hitachi HDDs have been owned and manufactured by Toshiba since 2012; it’s the same product with a different label

        • therock

          But what does ‘same’ mean to you since the reliability metrics are different and the physical size of the drives is different?

          Tosh 4TB/7200/128MB NAS drive is 26.1mm/101.6mm/147.0mm; HGST 4TB/7200/128MB NAS drive is 26.1mm/101.0mm/146.0mm.

        • Csaba Boros

          Wrong. Hitachi’s hard drive business is HGST, which has been a Western Digital brand for years. During that acquisition, WD had to sell Hitachi’s desktop drive assets, including manufacturing plants, to Toshiba to comply with regulations. The only Toshiba drive released with purely Hitachi technology was the DT01 series; those drives were perhaps supposed to be the 7K3000.B, since their specifications are mostly the same as the 7K3000’s, but with 1 TB platters.

          • Mo

            If only someone could work up a current hard-drive-manufacturer corporate-ownership org chart that could accompany Backblaze’s failure tables.

          • Csaba Boros
          • Mo

            Thank you. The diagram was drawn in 2011. Is it still accurate?

          • Sue

            If you scroll down about half way, you’ll see “File History.” The first line says:
            “current 01:07, 4 May 2017” so the diagram is accurate up to that date.

  • EricTheRed

    These articles are amazing but it would be great if we could see your datacenter network stats such as bandwidth usage at different times of the day/week. You guys once did this a while ago.

  • Logik

    Thanks again for this.
    A consideration for home users is that if you have a 4 TB drive and an 8 TB drive with identical failure rates, when there’s a failure you lose twice as much data with the larger drive. The probability of losing 8 TB of data all at once with two 4 TB drives, with say a 2% failure rate, is just 0.04%. (Did I do that right?) So currently, for many of us, 4 TB drives may be a better choice than 8 TB, given that the larger drives don’t seem to be significantly cheaper per TB.
    The Seagate ST4000dx000 looks like a disaster for home users. Any info on the particular failure mode for these?

    • Tim Clevenger

      I noted that the Seagate in question spins faster (7200 RPM vs. 5900 RPM) and has one more platter (5 vs 4) compared to the other 4TB Seagate. Perhaps extra internal heating from having to sling more platters at a higher speed is the culprit here.

      • therock

        The extra platter could be the culprit in other ways as well, motor load, new unproven motors, change in tolerances due to changes on the spindle, modifications to the head mechanism’s height affecting flight height of the head, etc.

        • David Warner

          Very good article. 10 and 12 TB enterprise drives are already available in the market. Why are you not using those drives?

          • therock

            It’s mostly cost analysis, channel availability, and watching metrics among early adopters. TB_per_dollar+(disks*watts)+(cabinet_cost/disks*cabinets)+racks, et al.

            Right now, you can get single-unit pricing on 8TB disks for $25/TB whereas 10TB disks are $36/TB. Those are low-end disks, not disks people would use in Datacenters, btw. It’s less compelling when you add in the power, space, etc, but you still have to be keen on the numbers and get a positive ROI.

            For reference, an 8w load takes 5.25 days to burn through 1Kwh of electricity. The disks have power management, of course, so this is probably close to worst-case. $.08/Kwh is probably high for datacenter bulk-users, so that would be $5.52 to spin a disk for a year. If you could get 4yrs of service out of it, that would be $22 of juice to cover a $160 disk-cost discrepancy. YMMV (a lot!)

      • Logik

        Thanks Tim. These both could be factors.
        I suppose one could mount a heatsink and wee fan on a drive if there were space in, say, a tower case. My HGST 4tb drive runs hotter than other drives I have had recently (right now 93 degrees F compared to 82 for a Seagate, and 75 for an SSD, in the same case).

    • Andrew Foss

      Depends on what your configuration is. JBOD (Just a Bunch/Box Of Disks) or RAID-1 (mirroring) would be 2% of 2% (0.04%; though with JBOD, lose one drive and you lose half of the data). RAID-0 (striping) would be 2% per drive (lose one, lose it all). Of course, this is why you use parity and/or mirroring (various flavors of RAID-5).

    • Murk

      Maybe also check the warranty time. I noticed a few years ago that short warranty times reveal some inside information. I bought a SG drive with a 1 year warranty, which broke down after 13 months, while the other 2 with 2 year warranties are still alive today (after 4 years).

    • Marki

      You can’t rely on statistics when you have 1 or 2 drives. You must rely on luck that your drive is not one of those 0.04%.
