A while ago we noticed log messages on one of the servers indicating that a hard drive was going bad. The logs showed errors reading certain sectors of one of the drives in a software RAID-1 volume. Here’s how we diagnosed the faulty drive and replaced it.

Background

The server has several Linux software RAID (mdraid) volumes, and the drives are connected to the server’s onboard Intel AHCI SATA controller. The Intel chipset and the Linux AHCI driver provide...