Disk Health

https://help.ubuntu.com/community/Smartmontools

  • Failure Trends in a Large Disk Drive Population (Pinheiro, Weber, Barroso; USENIX FAST 2007)

  • Google field study covering 100,000+ consumer-grade drives, with data collected from December 2005 to August 2006

  • correlations between certain S.M.A.R.T. information and annualized failure rates:
    |Attribute|SMART ID|Correlation with failure|
    |--|--|--|
    |uncorrectable error|198 (0xC6)|after 1st error drives are 39x more likely to fail within 60 days|
    |reallocation count|5 (0x05)|after 1st error drives are 14x more likely to fail within 60 days|
    |offline reallocations|196 (0xC4)|after 1st error drives are 21x more likely to fail within 60 days|
    |probational/current pending sector counts|197 (0xC5)|after 1st error drives are 16x more likely to fail within 60 days|
    |Temperature||little correlation except at the extremes (40-45 °C and >45 °C)|
    |Seek Errors||little correlation|
    |CRC Errors||little correlation|
    |Power Cycles||little correlation|
    |Calibration Retries||little correlation|
    |Spin Retries||little correlation|
    |Power-on hours||little correlation|
    |Vibration||little correlation|
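
The four strongly correlated attributes above can be checked from a script. The following is a minimal sketch, assuming the attribute table layout printed by `smartctl -A` (ten whitespace-separated columns with the raw value in the last one). The function names and the `CRITICAL_ATTRIBUTES` mapping are hypothetical, and the ID-to-name mapping simply follows the table above; vendors do not always use these IDs the same way.

```python
import subprocess

# SMART attribute IDs that the table above associates with elevated failure risk.
# Assumption: the drive reports these IDs with the meanings listed in the table.
CRITICAL_ATTRIBUTES = {
    5: "reallocation count",
    196: "offline reallocations",
    197: "probational/current pending sector count",
    198: "uncorrectable errors",
}


def parse_raw_values(smartctl_output):
    """Map attribute ID -> raw value from the text printed by `smartctl -A`."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]
        # Skip raw values that are not plain integers (vendor-specific formats).
        if raw.isdigit():
            values[int(fields[0])] = int(raw)
    return values


def find_warnings(values):
    """Return (id, name, raw) for each critical attribute with a nonzero raw count."""
    return [
        (attr_id, name, values[attr_id])
        for attr_id, name in CRITICAL_ATTRIBUTES.items()
        if values.get(attr_id, 0) > 0
    ]


def check_device(device):
    """Run `smartctl -A` on a device (usually needs root) and report critical counts."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return find_warnings(parse_raw_values(result.stdout))
```

Calling `check_device("/dev/sda")` returns an empty list when all four raw counts are zero. Given the study's findings, any nonzero entry is worth treating as a prompt to back up and plan a replacement.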

[!warning] A large number of drive failures occur with little or no warning

  • 56% of failed drives: zero counts in the four major SMART signals (scan errors, reallocation count, offline reallocation, probational count)
  • 36% of failed drives: zero counts in all SMART error attributes
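
These two percentages put a ceiling on what any alarm based purely on SMART counts can catch. A small arithmetic sketch (the constant and function names are illustrative only):

```python
# Share of failed drives that showed zero counts, as reported above (percent).
ZERO_IN_MAJOR_SIGNALS = 56
ZERO_IN_ALL_SIGNALS = 36


def max_detectable(zero_count_share):
    """Upper bound (percent) on failures a count-based alarm could flag in advance."""
    return 100 - zero_count_share


print(max_detectable(ZERO_IN_MAJOR_SIGNALS))  # 44
print(max_detectable(ZERO_IN_ALL_SIGNALS))    # 64
```

In other words, watching only the four major signals can flag at most 44% of failures ahead of time, so SMART monitoring complements backups and redundancy but does not replace them.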