RAID 5 RECOVERY AND REBUILD
Under correct operation, a RAID 5 system tolerates the failure of a single hard disk drive in its array. However, delays in the automatic rebuild process, or other common events such as power outages, can lead to poor maintenance decisions that ultimately leave the RAID server unable to rebuild.
Recovering a failed RAID 5 involves reconstructing the original data striping and the associated parity fields. Here at Datlabs we have a specialist RAID recovery team on hand to help with all aspects of a RAID 5 failure, rebuild and recovery, and to reunite you with your lost data.
Datlabs RAID 5 Recovery Service
Datlabs RAID Server Data Recovery processes and technical expertise take the risk out of handling failure situations. Our Step-By-Step process features:
Immediate technical support.
Fixed cost: diagnosis, disk repair, config rebuild.
Point-to-point secure courier.
In-Lab hard disk repair.
Hard Disk Drive cloning.
Data recovery costs and timescales.
File system repair.
Data retrieval and storage.
In-Lab data testing (where possible).
On Site Support on request.
RAID 5 Rebuild Explained.
If you are already some way into a failure situation, you will no doubt have discovered what to do and what not to do by happenstance. However, a bit more information never goes amiss, and you may find the following advice of use.
RAID 5 Rebuild Limitations.
A rebuild does not repair the file system or make inaccessible data accessible.
Any data that is missing before a rebuild will still be missing after it.
A rebuild will not:
Fix corrupt files or partitions.
Make your server boot if it wasn’t booting in the first place.
If the array is not mounting, the server is not bootable, or recently updated files are now corrupt or inaccessible, a rebuild WILL render the failure permanent.
RAID 5 Faulty Hard Disk Drive Replacement.
A RAID-5 rebuild takes a degraded array and restores redundancy. It performs XOR calculations on the surviving drives of the degraded set and writes the results onto the new, healthy drive you inserted in place of the failed one. Never run a RAID rebuild unless the array is accessible and all of the important, recently updated data is valid.
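The XOR step can be illustrated with a minimal Python sketch. It models a single stripe on a hypothetical 3 drive array (two data blocks plus one parity block); the block contents and the helper name xor_blocks are assumptions for illustration only, and real controllers work on much larger blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 3 drive array: two data blocks plus parity.
data_a = b"RAID"
data_b = b"five"
parity = xor_blocks([data_a, data_b])

# The drive holding data_b fails. A rebuild recomputes its block
# from the two surviving blocks and writes it to the replacement drive.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b)  # b'five'
```

Note that the calculation reproduces whatever the surviving blocks imply. If those blocks hold a damaged file system, the rebuilt drive faithfully completes the damaged file system, which is why a rebuild cannot repair anything.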
Testing a RAID 5 Back-up using another Volume
The most common mistake made by Datlabs clients is to replace two failed hard disk drives in a RAID-5 array and restore from backup. Having restored hundreds of GB of data from the backup onto the newly rebuilt array, they discover that the backup was corrupted, incomplete, or out of date.
This problem is easily avoided by testing the backup on a storage array other than the original failed array before performing a restore.
Avoid rushed decisions to restore to the available working drives. Simply explain to the client that your game plan is to source a new array, test all the backups, and then deal with the dead array.
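One way to test a restore on separate hardware is to compare checksums, sketched below in Python. This assumes a checksum manifest was recorded from the live data when the backup was taken, which many sites will not have; the function names build_manifest and verify_restore and the example paths are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's relative path to its SHA-256 (run when the backup is made)."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify_restore(manifest, restored_root):
    """Return (missing, mismatched) relative paths for a test restore."""
    restored_root = Path(restored_root)
    missing, mismatched = [], []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file():
            missing.append(rel_path)
        elif sha256_of(candidate) != expected:
            mismatched.append(rel_path)
    return missing, mismatched

# Hypothetical usage, with the test restore on a volume unrelated to the failed array:
# missing, mismatched = verify_restore(manifest, "/mnt/test_array/restore")
```

A clean result only shows that the restore matches the manifest. It cannot tell you whether the backup is out of date, so check the backup date against the time of the failure as well.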
RAID 5 Hot Spares.
Many IT professionals take a hot spare for use in a new storage array, fully confident that it never engaged and is blank. Again, verify that your backups are current and consistent on another volume completely unrelated to the failed array before using any of the failed array’s drives, including hot spares.
Don’t Guess Parity / Rotation / Stripe / Offsets!
If you are not 100% sure then the odds of you guessing correctly are tiny.
Guessing incorrectly can be catastrophic. The O/S may detect what looks like array or file system corruption and start running destructive repairs.
From the O/S point of view the file system really does appear corrupt, but only because you have supplied the wrong configuration.
After these repairs are complete, even if you guess the correct configuration the second time around, it will be too late to salvage any of the file definitions that were repaired.
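The effect of a wrong guess can be shown with a small Python sketch. It stripes a buffer across three in-memory "disks" with rotating parity, then reads it back with the right and the wrong settings. The layout used (parity starting on the last disk and moving one disk to the left each row, data filling the remaining disks in order) is an assumption chosen for illustration; real controllers use a variety of rotations, and the helper names are hypothetical.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def parity_disk_for(row, drives):
    # Assumed rotation: parity on the last disk for row 0, then moving left.
    return (drives - 1 - row) % drives

def write_array(data, drives, chunk):
    """Stripe data across `drives` disks with one parity chunk per row."""
    per_row = drives - 1
    row_bytes = per_row * chunk
    padded = data + bytes(-len(data) % row_bytes)  # zero-pad the final row
    disks = [bytearray() for _ in range(drives)]
    for row, start in enumerate(range(0, len(padded), row_bytes)):
        chunks = [padded[start + i * chunk:start + (i + 1) * chunk]
                  for i in range(per_row)]
        pending = iter(chunks)
        for disk in range(drives):
            if disk == parity_disk_for(row, drives):
                disks[disk] += xor_blocks(chunks)
            else:
                disks[disk] += next(pending)
    return [bytes(d) for d in disks]

def read_array(disks, chunk, length):
    """Reassemble data using a (possibly wrong) chunk size and disk order."""
    drives = len(disks)
    out = bytearray()
    for row in range(len(disks[0]) // chunk):
        for disk in range(drives):
            if disk != parity_disk_for(row, drives):
                out += disks[disk][row * chunk:(row + 1) * chunk]
    return bytes(out[:length])

data = bytes(range(48))
disks = write_array(data, drives=3, chunk=4)

print(read_array(disks, 4, len(data)) == data)        # True: correct settings
print(read_array(disks, 8, len(data)) == data)        # False: wrong stripe size
print(read_array(disks[::-1], 4, len(data)) == data)  # False: wrong drive order
```

With the wrong settings the reader returns a scrambled mix of data and parity. An O/S handed that view will see a damaged file system even though every drive is intact.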
Don’t Force RAID 5 Hard Drives Online!
Until a backup is verified, do not force an offline drive online. It is offline for a reason, and was almost certainly failing!
If a hard disk drive failed many days or months ago and you jam it back into the array, every file that is large enough to span all of the drives and that has changed since the failure will be corrupted.
Say you have a 3 drive array and the stripe size is 64 KB. Now you force a drive that failed months ago online. Any file bigger than 192 KB is guaranteed to have stripes of its binary run list residing across all three drives.
Any file bigger than 192 KB created or updated after the initial drive failure is therefore guaranteed to be incomplete and useless. There is roughly a 30% chance that the file definition of any file updated since the failure will now appear corrupt or missing. Often the O/S will recognize inconsistencies in the file system and run a helpful check-disk subroutine to repair these problems. However, these were not corruptions but inconsistencies, and the O/S repairs will likely permanently destroy data across all drives.
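The 192 KB figure can be checked with a short Python sketch that maps a byte range to the drives it lands on. It assumes the file is stored contiguously and that parity rotates from the last drive towards the first, one row at a time; both are assumptions for illustration, and the function names are hypothetical.

```python
CHUNK = 64 * 1024  # stripe size from the example above
DRIVES = 3

def drive_for_chunk(index, drives=DRIVES):
    """Map a logical data chunk to a drive under the assumed parity rotation."""
    row, position = divmod(index, drives - 1)
    parity_drive = (drives - 1 - row) % drives
    data_drives = [d for d in range(drives) if d != parity_drive]
    return data_drives[position]

def drives_touched(offset, length, chunk=CHUNK, drives=DRIVES):
    """Return the set of drives holding any byte of a contiguous range."""
    first = offset // chunk
    last = (offset + length - 1) // chunk
    return {drive_for_chunk(i, drives) for i in range(first, last + 1)}

# Try every 4 KB start offset across one full parity rotation (6 data chunks).
size = 192 * 1024 + 1
always_all_three = all(
    drives_touched(offset, size) == {0, 1, 2}
    for offset in range(0, 6 * CHUNK, 4096)
)
print(always_all_three)  # True
```

Under these assumptions a file even one byte over 192 KB touches at least four chunks, and any four consecutive chunks land on all three drives, so part of it always sits on the stale drive. Fragmented files and other rotations change the detail but not the risk.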
Never Plug In Drives Individually.
Our techs frequently encounter situations where all hard drives in a RAID 5 array have been removed and plugged into a USB chassis as an interface to a data recovery software suite. Not only is this a waste of time, but it generally leads to subsequent actions that cause further problems for us during our attempts to rebuild the system.
The O/S may automatically attempt to fix what it now recognizes as corruptions in the partition table, indexes or master file table. There is a high probability the drive will show up as unallocated or available space, and some misinformed IT staff will actually initialize the individual drive with a new volume in order to access its data. The drives were not corrupt in the first place, so “fixing” the corruptions will typically lead to massive data loss.
Running off-the-shelf data recovery software on a single drive of a 3 drive RAID-5 will leave 1/3 of the file definitions invalid. None of the run-list entries will be correct (file definitions only make sense in the context of the full partition), and the only data available will be whatever is held resident within the file definitions themselves (“ini” files or log files).
A Brief Summary:
Approach a failure situation with caution and with full knowledge of how a RAID-5 functions under fault conditions. If the RAID configuration utility warns you that you are about to destroy all the data with a particular action, don’t do it.
You MUST read and understand the manufacturer’s manual before doing anything.
You MUST only rebuild to a newly added drive if the volume is good and running degraded.
Do NOT re-use any hard disk drives from a previously failed volume.
ALWAYS verify your backups on a different set of hardware.