Resolving RAID System Faults.
Our Data Recovery technical team successfully rebuilds failed multi-disk RAID server systems. We rebuild failed Windows, Linux, Dell, IBM, HP, Fujitsu and many other system platforms. Whether you have a RAID system failure, a failed back-up, a corrupt file system or a corrupt RAID configuration table, our technical team has seen it many times before. Rebuilding and recovering data from faulty RAID configured servers and storage systems is what we do!
RAID Rebuild and Data Recovery.
If your organisation has been disrupted by a failed RAID system and a consequential loss of data, our RAID data recovery experts are on standby 24 x 7 to help get your business back up and running as quickly as possible.
We provide data recovery and system restore solutions for all RAID levels, and we are experienced at dealing with everything from a RAID controller problem to power surges and virus attacks.
- Repair failed hard disk drives to make them operational.
- Clone all your system hard drives and volumes.
- Restore, rebuild and configure system and file structures.
- Extract, check and test your operational data.
- Rebuild server system and restore data.
- Validate operational capability.
RAID Recovery Offices.
When a RAID hard disk drive has failed and dropped out of the array, it is vital that it is repaired in a clean air, ESD compliant workshop, particularly when the drive concerned forms an integral part of a striped volume.
It is neither commercially viable nor practical to have leading edge workshops and hard disk drive repair facilities in all major cities. Datlabs' clean air laboratory and workshop facilities are strategically located in Manchester. This RAID recovery centre of excellence has easy access to the motorway and rail networks and is the home of industry leading technicians. Using our dedicated express courier service, your failed server can be transported and worked on within hours of your first call.
RAID Data Recovery Process:
Datlabs' processes are designed to rebuild failed RAID systems without compromising your stored data. Once the RAID has been rebuilt, our technicians will recover your data and transfer it to an alternative storage system. Our RAID data recovery process requires the use of industry specific tools and test gear and a detailed knowledge of operating system and file system configurations. Datlabs' technical and operational staff have assisted hundreds of businesses in rebuilding all types of damaged RAID systems.
- Planning business operations restoration.
- Fault diagnostics and system imaging.
- Configuration rebuild.
- Data recovery and system restoration.
Each of these listed steps can be undertaken on site or at our service centre depending upon the circumstances.
RAID Server Restores.
RAID disk arrays provide redundancy, capacity and performance over standard single volume disk systems. However, once they have failed they are often complex and difficult to recover, and the consequences can have a severe impact on business operations and finances. Datlabs' first aim is to understand the situation you and your business find yourselves in and then determine a plan of action that guards against further damage and uncertainty.
Datlabs' technical staff are experienced in dealing with all types of RAID configuration recovery and restore operations. Where necessary, they provide on-site emergency support with an initial consultancy and planning service. They will require knowledge of your system, data structures, applications and communications systems, and of the importance of your data and system to your operations. They will also want to understand what backup and contingency services are available and what part these need to play in the overall process of recovery.
Our staff will then devise and agree with you and your technical support team a plan of action to follow and provide clear estimates of costs and timescales.
RAID Fault Diagnosis.
To diagnose and rebuild a RAID system we only require the hard disks making up the RAID volume.
However, we will want to determine how to deliver your data to you in order to restore the system to an operational state. The process begins by looking at the kinds of fault that have contributed to the system failure.
A disk image, that is the low level binary contents of each disk, is then copied out. Next, analysis is performed on the disk images. Once the RAID type, the correct order of the disks forming the RAID volume, the RAID stripe block size and the associated parity location have been confirmed, a process of de-striping is carried out on the extracted disk images. Different manufacturers may have slightly different RAID settings, so additional adjustments may be needed.
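As a rough illustration of de-striping, the sketch below reassembles a logical volume from RAID 5 member images once the disk order, stripe block size and parity rotation are known. It is a minimal sketch, not the Datlabs tooling: the function names are hypothetical, and the parity rotation (last disk first, then one disk to the left per stripe row) is an assumption that real controllers may not share.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (how RAID 5 parity is formed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(row, n_disks):
    # Assumed rotation: parity on the last disk for row 0, then one disk
    # to the left on each following stripe row.
    return (n_disks - 1 - row) % n_disks


def make_test_images(data, n_disks, stripe_size):
    """Build toy member images so that the de-striping can be checked."""
    row_bytes = (n_disks - 1) * stripe_size
    if len(data) % row_bytes:
        raise ValueError("data must fill whole stripe rows")
    images = [bytearray() for _ in range(n_disks)]
    for row in range(len(data) // row_bytes):
        base = row * row_bytes
        chunks = [data[base + i * stripe_size: base + (i + 1) * stripe_size]
                  for i in range(n_disks - 1)]
        chunks.insert(parity_disk(row, n_disks), xor_blocks(chunks))
        for image, chunk in zip(images, chunks):
            image += chunk
    return [bytes(image) for image in images]


def destripe_raid5(images, stripe_size):
    """Concatenate the data blocks of each stripe row, skipping the parity block."""
    n_disks = len(images)
    rows = min(len(image) for image in images) // stripe_size
    volume = bytearray()
    for row in range(rows):
        start = row * stripe_size
        for disk in range(n_disks):
            if disk != parity_disk(row, n_disks):
                volume += images[disk][start:start + stripe_size]
    return bytes(volume)


data = bytes(range(48))
images = make_test_images(data, n_disks=4, stripe_size=4)
recovered = destripe_raid5(images, stripe_size=4)
# The same images in the wrong disk order do not yield the original data.
wrong_order = destripe_raid5([images[0], images[1], images[3], images[2]], stripe_size=4)
```

The comparison with `wrong_order` shows why confirming the disk order comes before any extraction: the same images read in a different order produce a different, unusable volume.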
Data Recovery from RAID Systems needing Config Rebuild.
Once the data layout pattern making up the RAID logical volume has been identified and confirmed, the critical data is extracted to other disk media. Part of the extraction process is to repair any primitive data blocks that were damaged at, or in the period leading up to, the final system failure. This is a very necessary step but it is unfortunately time consuming, and it is the source of much frustration in terms of customer expectations. The data integrity is then evaluated to ensure that the data is of acceptable quality before a file list is finally produced for customer review.
RAID File Recovery.
Once your data and file listings have been recovered, your system will need rebuilding. This will require cooperation between our technical support team and your IT support team to ensure data is accessible to the end user. Once you have a full system restore, the Datlabs technical team will undertake a system review to ensure you have the necessary contingency management and disaster recovery systems in place.
Typical RAID Problems:
RAID Hard Disk Drive Failure
A RAID array is essentially a number of hard drives across which data is stored or replicated for the purpose of improving system performance, security or a combination of both. RAID arrays are usually configured and managed as part of a maintenance regime, either automatic or manual, that guards against the possibility of data loss. The greatest risks to the data stored on a RAID array are hard disk drive failure, malware attack and poor maintenance procedures.
RAID Hard Disk Drive Replacement
For many arrays a drive that is accumulating errors will be forced out of service and its data reconstructed across the remaining good drives available to the array controller.
In this case the data on the failing hard disk drive must be rebuilt from the data and parity on the remaining active drives and written to a hot spare. Post-failure replacement takes considerably longer because of the calculations that must take place in order to rebuild the data. To mitigate the risk of drive failures, you should always ensure that your RAID array is actually capable of performing a rebuild, and that it has compatible hot swap hard drives or replacements to rebuild to! A rigorous data back-up regime is also a must for any server system.
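The calculation behind such a rebuild can be sketched for the single-parity (RAID 5 style) case, where the parity block is the XOR of the data blocks in the same stripe, so any one missing member is the XOR of all the survivors. The function name and the toy three-disk stripe are illustrative assumptions only.

```python
def rebuild_missing_disk(surviving_images):
    """XOR the surviving members byte by byte to recreate the missing one."""
    length = min(len(image) for image in surviving_images)
    rebuilt = bytearray(length)
    for image in surviving_images:
        for i in range(length):
            rebuilt[i] ^= image[i]
    return bytes(rebuilt)


# Toy three-disk stripe: two data blocks and their XOR parity.
d0 = b"RAID"
d1 = b"data"
parity = bytes(a ^ b for a, b in zip(d0, d1))

# Pretend the drive holding d1 has failed and rebuild it from the others.
recovered = rebuild_missing_disk([d0, parity])
```

Every byte of every surviving drive has to be read to do this, which is one reason a post-failure rebuild takes so long on large volumes.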
Scheduled system rebuilds are normally better undertaken when system downtime can be tolerated as they can take a considerable time where the stored data volumes are relatively large. On most RAID configured systems, rebuilds can be prioritised against other system related activities such that the rebuild will occur in preference to operational demands.
RAID Single Hard Disk Drive failure
Following the failure of a hard drive within the RAID array, the system may still be accessible. However, its continued operation without fault tolerance or redundancy leaves it vulnerable to a catastrophic system failure. In this case all current data should be backed up before any rebuild is attempted. It is also probable that the other hard drives of the same age making up the RAID volume are now at increased risk of failure.
You should also be aware that a RAID rebuild is generally I/O intensive and can put a greater workload on potentially problematic hard disks within the volume or volumes. Under these circumstances making other changes, such as re-configuring applications, may not be wise: if your rebuild fails you may end up with more failed hard disk drives than you bargained for.
RAID Controller Problems
RAID controllers manage data storage, access and the maintenance of your multi disk system. Implementations of RAID controllers include Mylex, Adaptec, Compaq, HP and IBM. These implementations can rebuild a failed data volume using a hot standby drive, or a replacement drive fitted through a hot swap. A rebuild will however fail if two disks fail simultaneously or if part of the native configuration is stored only on the failed disk. RAIDs can also fail as a result of the following situations, frequently in combination:
- Malfunctioning Controller
- RAID rebuild error or volume reconstruction problem
- Missing RAID partition
- Multiple disk failure in off-line state resulting in loss of RAID volume
- Wrong replacement of a good disk belonging to a working RAID volume
- Power Surge
- Data Deletion or reformat
- Virus Attack
- Loss of RAID configuration settings or system registry
- Inadvertent reconfiguration of RAID volume
- Loss of RAID disk access after system or application upgrade.
RAID 5 Bungled Rebuild
Datlabs technicians are frequently engaged by customers who have inadvertently bungled the RAID 5 rebuild process. Once the mistake has been made, it is often not obvious that there is no longer a simple means of rebuilding the RAID and restoring the stored data and operating system. The damage occurs if several disks are removed from the RAID 5 array, plugged back in a different order, and a RAID 5 rebuild is then performed. The RAID 5 rebuild, sometimes called a re-synch, re-calculates and rewrites the XOR parity blocks of the array. A rebuild is executed automatically once a drive is removed and re-inserted, or after a power failure. The damage caused can be illustrated as follows, where each group of four blocks is one stripe across the four disks:
Original: 1 2 3 P | 4 5 P 6 | 7 P 8 9 | P 10 11 12
With drives swapped: 1 2 P 3 | 4 5 6 P | 7 P 9 8 | P 10 12 11
After the rebuild: 1 2 P X | 4 5 X P | 7 X 9 8 | X 10 12 11
P is the original parity and X the new parity.
If you have a problem with your RAID server, some of the advice listed below may help you to minimise further loss of data, or at least increase your chance of a successful recovery with the right expert.
Recovering RAID Configuration tables
In the above case a software recovery will not be possible. A manual recovery can be accomplished, but only by very experienced and capable technicians such as those at the Datlabs workshops and laboratories, and only if there is sufficient relevant data available to rebuild the contents of the array.
In the example, a recovery requires knowledge of the original and current block sizes and disk order. Datlabs engineers are able to reverse engineer the configuration by iterative means without compromising the stored data.
Further RAID Problems
By default the RAID controller will instigate the rebuild automatically, which in fact exacerbates the problem. The rebuild in progress will destroy areas of striped data, and by the time the effects are apparent it can be too late for remedial work to be effective.
Restoring Drives in the array:
Without detailed knowledge of the disk drive order it is easy to mistakenly pull out a disk that is not the failed one. When this occurs the failed array will in fact be missing two disks and not just one. A two disk failure situation is beyond the auto recovery capabilities of the RAID 5 configuration.
In the majority of cases it is possible to bring the array back to life by re-inserting drives in a specific order. It is essential, however, that the drives are labelled with their original port in order to avoid further cock-ups, and that the faulty drive is identified, removed and labelled.
Be aware, however, that Datlabs recommends that in ALL cases you submit a failed RAID 5 array for data recovery and rebuild. IF YOU MESS UP the order in which you insert the disks, you will get an enormous number of zeros added and mixed into the data. This sort of damage is generally fatal to subsequent rebuild and recovery attempts.
RAID Data Recovery Advice
- Place a value on your data and consider fully the consequences of losing your business or critical data.
- Estimate the true cost of replicating the inaccessible data and how long data entry will take.
- Assess who will be affected: yourself, your accountant, your customers, your family etc.
- Select an established Data Recovery service provider with clean room facilities, experienced technical support staff and a well organised customer services operation.
- Before you part with your system hardware, you may want to image all the working disks. It is better to play safe so that you always have a backup set to rework in case the original RAID server suffers further corruption of any kind. A good Data Recovery service provider will offer to undertake this for you.
- Carefully take note of the following information where applicable:
  - Stripe block size (normally a multiple of 8 KB) and the order of the disks from which the RAID volume is formed. Such information can normally be found in the RAID BIOS or RAID configuration manager.
  - Description of the problems
  - Description of any recovery attempts already made by the user
  - List of critical data and folders and any special requirements
- Label each disk before taking it out and carefully note its corresponding position.
- Carefully pack your disks or complete system for delivery to your chosen data recovery expert. Both working and damaged disks are needed.
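For the advice above about imaging the working disks before parting with them, the following is a minimal sketch of a chunked copy that also records a SHA-256 checksum, so that the image can be verified later. It assumes the source reads without errors; a drive that is itself failing needs specialist tools and should not be imaged this way. The function name and the file names are hypothetical, and the demonstration copies an ordinary temporary file rather than a real disk device.

```python
import hashlib
import os
import tempfile


def image_disk(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image in chunks and return (bytes copied, SHA-256 hex digest)."""
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


# Demonstration on a temporary file standing in for a disk.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "member0.bin")
    image = os.path.join(tmp, "member0.img")
    with open(source, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024 + 17))
    size, checksum = image_disk(source, image)
    # Re-read the image to confirm it matches the checksum taken during the copy.
    with open(image, "rb") as f:
        verified = hashlib.sha256(f.read()).hexdigest() == checksum
```

Keeping the checksum alongside each labelled image makes it possible to confirm later that the backup set has not changed.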
Frequently Asked Questions about Faulty RAIDs
Here are a few general questions and answers that you may find of interest.
Some answers depend on the capability of your RAID and controller, however you will get the general theme, which is:
“If you haven’t done this before and don’t understand how a RAID system works, then don’t do it!”
Datlabs recommends that any actions on a RAID system are undertaken only by fully trained and competent technicians, and with caution.
Can I delete my RAID array and create a new one without data loss?
Do not delete an array unless it is absolutely necessary. If for whatever crazy reason you are contemplating deleting an array, then back up the data in the array and verify that this back-up can be restored before deleting the array. Datlabs advice: don’t even think about it!
How can I find out what RAID levels are configured on my system?
You can generally identify your configuration using the System Manager. Typically, right-click an array (shown as a “virtual disk” in Array Manager) and select Properties to see what RAID level the array is. You can make RAID arrays easier to identify by naming them based on the RAID level and the physical disks they contain.
Do all drives in a RAID array have to be the same size?
In general, the drives in an array do not all have to be the same size, because every drive in the array will default to the capacity of the smallest drive. It is nevertheless recommended, for continuity and safety purposes, that all drives are of the same capacity and manufacture: drives of different capacities installed in the same array add obvious dangers in a fault situation.
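The effect of mixing drive sizes can be estimated with a short calculation. The sketch below assumes the commonly quoted capacity rules for RAID 0, 1, 5 and 10 and treats every member as truncated to the smallest drive; the function name is hypothetical and real controllers may reserve additional space for metadata.

```python
def usable_capacity_gb(drive_sizes_gb, raid_level):
    """Approximate usable capacity with every member truncated to the smallest drive."""
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if raid_level == 0:
        return smallest * n            # striping only, no redundancy
    if raid_level == 1:
        return smallest                # every drive holds the same mirror copy
    if raid_level == 5:
        if n < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return smallest * (n - 1)      # one drive's worth of parity
    if raid_level == 10:
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of drives, four or more")
        return smallest * n // 2       # half the drives mirror the other half
    raise ValueError("RAID level not covered by this sketch")


sizes = [500, 1000, 1000, 1000]
usable = usable_capacity_gb(sizes, raid_level=5)
unused = sum(sizes) - min(sizes) * len(sizes)
```

In this example one 500 GB drive in a RAID 5 set of 1000 GB drives leaves 1500 GB usable and 1500 GB of the larger drives unused.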
Can I hot swap a drive in a RAID configuration?
If your system supports hot-swappable drives (the ability to replace or insert a drive without powering down the system), you can replace a failed drive in a RAID array with a good drive that is the same size or larger than the other drives in the array. You can also insert spare drives to be configured into arrays or used as hot spares. When you add or replace a drive in an array, the RAID array begins to rebuild using the new drive.
NOTE: Never pull an active drive from an array unless it is placed in a failed state, out of service or prepared for removal.
Can I upgrade controllers without data loss?
Data loss will occur if you initialize a new controller that stores the configuration data differently from the controller it is replacing.
Think about this! This is really not a good idea, is it? Get expert advice and assistance.
How do hot spares work?
A hot spare is a drive that is on standby in case another drive fails. Depending on how the array is configured, the drive is either picked up automatically and the array is rebuilt, or you manually select the drive and rebuild the array. Most systems ship with the automatic rebuild feature enabled, in which case the array rebuilds automatically using the hot spare when a drive fails.
Note: If automatic rebuild is disabled, you must manually start the rebuild process. During a rebuild you may notice degraded performance on the drives.
How do I replace a failed drive?
If you introduce a new drive into the same slot where a bad drive was located, the failback will generally be automatic (assuming that automatic rebuild is enabled on the system). In other words, a new drive inserted into the same slot as a previously bad drive acts as a dedicated hot spare for that array.
What is the rebuild rate?
In RAID 1, 5 and 10 arrays, you can rebuild a failed drive by re-creating the data that was stored on the drive before it failed. The rebuild rate is the percentage of the compute cycles dedicated to rebuilding failed drives. A rebuild rate of 100 per cent means that the system is totally dedicated to rebuilding the failed drive, while a 0 per cent rebuild rate means that the rebuild occurs only when the system is not doing anything else.
What are stripe size and width?
Disk striping, which enables data to be written across multiple hard drives, partitions each drive into stripes that can vary in size. The stripes are interleaved, and the combined storage space consists of stripes from each drive. Stripe width is the number of disks involved in an array where striping is implemented. For example, a four-disk array with disk striping has a stripe width of four. Stripe size is the length of the interleaved data segments that a RAID controller writes across multiple drives. Disk striping enhances performance because multiple drives are accessed simultaneously, but it does not provide data redundancy.
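How stripe size and stripe width together decide where a given byte lives can be shown with a small calculation. This is a sketch for plain striping with no parity, and the function name and the 64 KB stripe size are illustrative assumptions.

```python
def locate_in_stripe_set(logical_offset, stripe_size, stripe_width):
    """Map a byte offset in the striped volume to (disk index, offset on that disk)."""
    stripe_index, within_stripe = divmod(logical_offset, stripe_size)
    row, disk = divmod(stripe_index, stripe_width)
    return disk, row * stripe_size + within_stripe


STRIPE_SIZE = 64 * 1024   # assumed stripe size of 64 KB
STRIPE_WIDTH = 4          # four-disk array, as in the example above

first = locate_in_stripe_set(0, STRIPE_SIZE, STRIPE_WIDTH)
second_stripe = locate_in_stripe_set(STRIPE_SIZE, STRIPE_SIZE, STRIPE_WIDTH)
wrapped = locate_in_stripe_set(4 * STRIPE_SIZE + 10, STRIPE_SIZE, STRIPE_WIDTH)
```

Consecutive stripes land on consecutive disks and wrap back to the first disk after the fourth, which is why a large sequential read keeps all four drives busy at once.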
Is disk spanning the same thing as RAID?
No. Disk spanning combines multiple drives and displays them in the operating system as one drive. For example, four 1 TB hard drives that are spanned appear as one 4 TB drive in the operating system. Disk spanning alone provides no data protection.
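The difference from striping is easiest to see in how an offset is located: a spanned volume simply fills one drive before moving on to the next. The sketch below is illustrative only, with a hypothetical function name and drive sizes expressed in arbitrary units.

```python
def locate_in_span(logical_offset, drive_sizes):
    """Map an offset in a spanned volume to (disk index, offset on that disk)."""
    for disk, size in enumerate(drive_sizes):
        if logical_offset < size:
            return disk, logical_offset
        logical_offset -= size
    raise ValueError("offset beyond end of spanned volume")


drives = [100, 100, 100, 100]   # four equal drives, sizes in arbitrary units
start = locate_in_span(0, drives)
third_drive = locate_in_span(250, drives)
```

Because each region of the volume lives on exactly one drive with no copy or parity elsewhere, losing a drive loses everything stored in its region.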