The low level formatting has ECC, which never leaves the drive. That said, there...

Someone · on Sept 20, 2014

And that is where I stated that drives can report a problem. If they 'seed' their ECC algorithm with the sector number (XOR-ing the result with it would be sufficient), they can (statistically) detect that, when they read sector #X, what they got wasn't what they ever wrote as sector #X.
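
To make the idea concrete, here is a minimal sketch. It is not how any real drive firmware works. `ToyDisk`, `seeded_check` and the `lands_at` parameter are hypothetical names invented for illustration, and CRC-32 stands in for whatever ECC the drive actually uses.

```python
import zlib


def seeded_check(sector_no: int, data: bytes) -> int:
    """CRC-32 of the data, XOR-ed with the sector number it was written as."""
    return zlib.crc32(data) ^ (sector_no & 0xFFFFFFFF)


class ToyDisk:
    """Hypothetical drive model: each physical location holds (data, check)."""

    def __init__(self):
        self.media = {}

    def write(self, sector_no, data, lands_at=None):
        # lands_at simulates a misdirected write: the check is computed for
        # the intended sector, but the data ends up at another location.
        location = sector_no if lands_at is None else lands_at
        self.media[location] = (data, seeded_check(sector_no, data))

    def read(self, sector_no):
        data, check = self.media[sector_no]
        # Recompute the check with the sector number we asked for.
        if seeded_check(sector_no, data) != check:
            raise IOError(f"sector {sector_no}: content was not written as this sector")
        return data


disk = ToyDisk()
disk.write(9, b"old contents of 9")
disk.write(7, b"meant for 7", lands_at=9)  # misdirected write
try:
    disk.read(9)
except IOError as err:
    print("drive reports:", err)
```

Reading sector 9 fails the check, because the stored value was seeded with 7 rather than 9.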

In fact, I guess they already do. If they didn't, there would be misdirected reads, too.

ryao · on Sept 20, 2014

The low level formatting does include a sector number, but it is not part of ECC. I am not sure what your point is. Your theoretical description of how hard drives could work does not reflect reality. Research by CERN and others has confirmed the existence of misdirected writes. Deployed ZFS installations are detecting corruption in situations where the drives report everything is fine. Even if the storage hardware improves, having end-to-end checksums in the filesystem will continue to make sense.

That said, I think you are fixating on one way that things can go wrong. Another way that misdirected writes can occur is a bit-flip in the micro-controller's memory. This also allows for misdirected reads, as well as reading or writing data that has a single bit flipped. These devices' micro-controllers do not have ECC memory. Even if it were added, you would still need to prove via formal verification that there are no programming bugs. Given that these devices are black boxes that cannot be inspected, you cannot rely on the claim of a proof even if one is done, and there would still be the possibility of errata in the micro-controller. It is far easier to just use end-to-end checksums in the filesystem. Even if you think the device is trustworthy, end-to-end checksums give you the ability to check that it is doing what it is supposed to do. You simply do not have that with traditional RAID.
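
A rough sketch of the end-to-end idea follows. It is not ZFS's actual code. `ToyDrive`, `BlockPointer`, `fs_write`, `fs_read` and the `lands_at` parameter are hypothetical names for illustration, and SHA-256 is just one possible checksum. The point is that the checksum is kept by the filesystem alongside the pointer to the block, not next to the data on the drive, so the drive cannot vouch for itself.

```python
import hashlib
from typing import NamedTuple


class ToyDrive:
    """Hypothetical drive that never reports an error."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data, lands_at=None):
        # lands_at simulates a misdirected write inside the black box.
        self.blocks[lba if lands_at is None else lands_at] = data

    def read(self, lba):
        return self.blocks[lba]


class BlockPointer(NamedTuple):
    lba: int
    checksum: bytes  # stored with the pointer, away from the data


def fs_write(drive, lba, data, lands_at=None):
    drive.write(lba, data, lands_at)
    return BlockPointer(lba, hashlib.sha256(data).digest())


def fs_read(drive, ptr):
    data = drive.read(ptr.lba)
    # Verify what came back against what the filesystem knows it wrote.
    if hashlib.sha256(data).digest() != ptr.checksum:
        raise IOError(f"checksum mismatch at block {ptr.lba}")
    return data


drive = ToyDrive()
fs_write(drive, 7, b"old data")
ptr = fs_write(drive, 7, b"new data", lands_at=9)  # drive reports success
try:
    fs_read(drive, ptr)
except IOError as err:
    print("filesystem detects:", err)
```

In this toy model the drive happily returns the stale contents of block 7 with no error. Only the checksum held by the filesystem shows that it is not the data that was last written there.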