They sample the SMART metrics once a day. Would more frequent samples would provide a better signal?
For example, perhaps the read errors number increases about once a week. But in the week before a failure, it increases once a day. Or in the day before a failure, it increases once an hour.
OS-level metrics could also be predictive: write latency, i/o queue length, communication errors, and checksum errors in read data.
For example, perhaps the read errors number increases about once a week. But in the week before a failure, it increases once a day. Or in the day before a failure, it increases once an hour.
OS-level metrics could also be predictive: write latency, i/o queue length, communication errors, and checksum errors in read data.