RAID Degraded Mode
Is your server RAID in degraded mode?
When a single hard disk drive in a multi-disk RAID server fails, the system it belongs to will attempt to keep the data stored on the failed disk available to the system users.
In normal operation, a record of the data on all drives (including the drive that later fails) is kept in the form of parity records. From these records, the data on any failed drive can be reconstructed using the working drives that remain in operation.
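As a rough illustration of how parity records allow a lost drive to be reconstructed, here is a minimal Python sketch. It assumes simple XOR parity of the kind used by RAID 5; the block values and the `xor_blocks` helper are invented for the example and do not represent any particular controller's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive (illustrative values only).
data = [b"\x0f\xa0\x33", b"\xf0\x0a\x55", b"\x3c\xc3\x99"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Simulate losing drive 1, then rebuild its block from the survivors plus parity.
failed = 1
survivors = [block for i, block in enumerate(data) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

assert rebuilt == data[failed]
```

Because every read of the missing drive's data needs this calculation across all the surviving drives, a system in this state responds more slowly than usual.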
During a system rebuild, users will experience slower response times and/or error reports, usually over an extended period of many hours. This is referred to as the system running in “Degraded Mode”.
For Dell RAID systems, the Dell OpenManage application produces error reports and front panel status displays to show that the RAID system is in trouble and its performance is compromised.
Once the data has been successfully rebuilt, system operation and performance will generally return to an acceptable level. However, the system status display will still show as degraded, usually alongside a disk fail indicator light on the front panel and/or on the failed disk bay.
RAID Degraded Mode Recovery
Firstly, it is important not to fiddle with the system while it is undergoing a rebuild. You must allow the system to complete its work. Interrupting a system rebuild can cause more serious problems that may lead to a system outage.
It is good practice to examine the system's error report log. If you are lucky, the log will give you detailed information about the hard disk that has failed and is out of service. In any event, it is good practice to take a full system backup.
Having backed up your system and data, you can now replace the out-of-service hard drive with a compatible new one and then start the system rebuild.
Any subsequent failed rebuilds are usually the result of errors introduced during the rebuild process into the parity records held on the remaining hard drives: a bit of a “catch twenty-two” situation.
You can find typical Dell RAID troubleshooting information here.
Best practice is to evaluate the system and contact the Datlabs RAID data recovery team. Our team are experts in rebuilding corrupt RAID systems and recovering your stored data.
Active and Passive Servers in Degraded Mode
“Remember, a RAID array is only as good as its last backup.”
A typical setup involving high performance servers is a mirrored pair of systems, one active and the other passive, with each system having an allocated hot spare hard drive.
Hot spare drives are standby drives that can automatically replace a defective drive and repair a degraded storage space. If you have set up a hot spare drive on one server, make sure the other server also has its hot spare drive in the same physical position.
If the degraded storage space is on the active server, the system will immediately perform a hot spare repair, automatically replacing the defective drive with the hot spare. When the issue is resolved on the active server, the passive server will also replace a drive with its hot spare in order to match the drive configuration on the active server.
If the degraded storage space is on the passive server, the system will not perform a hot spare repair, as that may cause a drive mismatch between the two servers. You need to manually replace the defective drive and repair the degraded storage space:
Fig One
Fig Two
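The active and passive hot spare behaviour described above can be summarised as a small decision routine. The Python sketch below is only an illustration of that logic: the function name `plan_repair`, its parameters and the step wording are hypothetical and are not part of any vendor tool.

```python
def plan_repair(degraded_role, active_spare_bay, passive_spare_bay):
    """Return the repair steps for a degraded storage space.

    degraded_role is "active" or "passive"; the bay arguments are the
    physical positions of each server's hot spare (hypothetical inputs).
    """
    # Both servers are expected to keep the hot spare in the same bay.
    if active_spare_bay != passive_spare_bay:
        raise ValueError("hot spares should occupy the same bay on both servers")
    if degraded_role == "active":
        return [
            "active: hot spare automatically replaces the defective drive",
            "passive: swaps in its hot spare to match the active server",
        ]
    if degraded_role == "passive":
        # No automatic repair here, to avoid a drive mismatch between servers.
        return ["passive: replace the defective drive manually, then repair"]
    raise ValueError(f"unknown role: {degraded_role!r}")


for step in plan_repair("active", 4, 4):
    print(step)
```

Checking the spare bay positions first reflects the advice above that both servers should keep their hot spare in the same physical position.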