Minimizing Hard Disk Drive Failure and Data Loss/Data Redundancy

RAID

While RAID can be used to reduce the risk of data loss due to drive failure, it costs more per unit of usable storage, because part of the raw capacity is spent on redundancy. It also requires some technical planning and expertise.

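RAID 0 should not be used for the operating system. It provides no data redundancy, and because data is striped across every drive, the failure of any single drive loses the whole array, so the probability of data loss rises with each drive added. RAID 6 is recommended over RAID 5 for increased redundancy, since it tolerates two simultaneous drive failures rather than one.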
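The difference between these RAID levels can be made concrete with a rough calculation. The sketch below assumes that drives fail independently with the same probability over some period, and it ignores rebuild times, controller faults and correlated failures, so the numbers are illustrative only. The function names and the sample values for the drive count and failure probability are hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that at least k of n independent drives fail."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def array_loss_probability(level, n, p):
    """Probability of losing the array under the simplified model above."""
    # Number of simultaneous drive failures that is fatal for each level.
    fatal_failures = {"RAID 0": 1, "RAID 5": 2, "RAID 6": 3}[level]
    return prob_at_least(fatal_failures, n, p)


if __name__ == "__main__":
    drives = 4            # hypothetical array size
    p_drive_fail = 0.05   # hypothetical per-drive failure probability
    for level in ("RAID 0", "RAID 5", "RAID 6"):
        loss = array_loss_probability(level, drives, p_drive_fail)
        print(f"{level}: {loss:.4%}")
```

Under this model, RAID 0 becomes less reliable with every drive added, while RAID 6 needs three drives to fail before data is lost.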

Backups

Having a backup is an obvious way to reduce the risk of data loss due to drive failure. For many users, however, backing up all of their data can be impractical when that data is large and growing. Nonetheless, it is critical to back up at least the most important data, such as the home directory.

A backup of the files on a hard disk should not be kept on that same disk, but on a different disk or in another location.
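A backup script can guard against the most basic form of this mistake. The following is a minimal sketch, assuming Python 3.8 or later, in which the function names are hypothetical. It compares the device identifiers reported by the operating system, which only shows whether two paths are on the same filesystem. Two partitions on one physical disk would still pass the check, so it is no substitute for knowing where the backup actually lives.

```python
import os
import shutil


def on_same_device(path_a, path_b):
    """True if both existing paths are on the same filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def back_up(source_dir, backup_root):
    """Copy source_dir into backup_root, refusing a same-filesystem target."""
    if on_same_device(source_dir, backup_root):
        raise ValueError("backup target is on the same filesystem as the source")
    name = os.path.basename(os.path.abspath(source_dir))
    target = os.path.join(backup_root, name)
    # dirs_exist_ok lets the same command refresh an earlier backup.
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target
```

It could be called as back_up("/home/user", "/mnt/backup"), where both paths are placeholders for a real home directory and a mounted backup disk.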

Parchives

A parchive (parity archive) can be created and stored for sets of important files. This allows those files to be recovered if they later become corrupted. Parchives are particularly suited to files that are not modified, e.g. digital media. If any file in the set of source files is modified, the parchive has to be regenerated.

A parchive can also be stored alongside a backup, so that the backup itself can be repaired if part of it becomes corrupted.
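The idea of recovering data from stored parity can be illustrated with a deliberately simplified sketch. It uses a single XOR parity block, which can rebuild exactly one lost block of known position. Real PAR2 parchives use Reed-Solomon coding and can repair several damaged blocks, so this is an illustration of the principle and not of the actual format. All names in it are hypothetical.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def make_parity(data_blocks):
    """Compute the parity block to store alongside the data blocks."""
    return xor_blocks(list(data_blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    blocks = [b"AAAA", b"BBBB", b"CCCC"]
    parity = make_parity(blocks)
    # Pretend the middle block was lost, then rebuild it.
    recovered = recover_block([blocks[0], blocks[2]], parity)
    print(recovered == blocks[1])
```

This also shows why the parity data must be regenerated whenever a source file changes: the stored parity only matches the blocks it was computed from.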

Alternatively, checksums can be used to verify the integrity of files. SFV, which records a CRC-32 checksum for each file, is an adequate file format for this purpose. In contrast with a parchive, file verification only allows file corruption to be detected; it does not allow a corrupted file to be repaired.
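Checksum-based verification is simple enough to sketch with the Python standard library. The example below assumes the common SFV layout of one "filename CRC32" pair per line with ";" starting a comment, and it assumes that the listed files sit in the same directory as the SFV file. The function names are hypothetical, and a dedicated SFV tool would handle more edge cases, such as missing files.

```python
import os
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def write_sfv(paths, sfv_path):
    """Record the name and CRC-32 of each file in an SFV-style list."""
    with open(sfv_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(f"{os.path.basename(path)} {crc32_of_file(path):08X}\n")


def verify_sfv(sfv_path):
    """Return the names of files whose CRC-32 no longer matches the list."""
    base = os.path.dirname(os.path.abspath(sfv_path))
    corrupted = []
    with open(sfv_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith(";"):
                continue  # skip blank lines and comments
            name, recorded = line.rsplit(" ", 1)
            if crc32_of_file(os.path.join(base, name)) != int(recorded, 16):
                corrupted.append(name)
    return corrupted
```

A non-empty result from verify_sfv only reports which files have changed. Restoring them still requires a backup or a parchive.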

Sharing

Sharing is the ultimate means of preventing data loss. Everything that may be of common interest to others can be shared. This includes the growing volume of data that individual users acquire from Usenet and file-sharing networks. The philosophy behind sharing is that, in the event of data loss, the shared content can be downloaded again from others who hold a copy.

On Usenet, sharing implies posting files and also filling requests when possible. In file-sharing networks, it means uploading as much as possible of what is downloaded. On university campuses, much can be shared on fast networks such as Direct Connect. To encourage users to download files in such a network, it helps to have files named correctly and categorized in a suitable directory structure. Large amounts of data can also be shared with trusted friends using a designated external hard disk drive.

Continuous sharing puts a continuous strain on the drives containing the shared data. This may shorten the life of those drives, and it is not clear-cut whether the benefit of sharing outweighs that risk.