external raid5 array reporting bad blocks
Folks,
this is the second time in only a couple of weeks that we've had a RAID5 array mysteriously go bad on us, reporting bad blocks to the OS without any discs being tagged as faulty. Yes, I do mean the virtual disc representing the (redundant) array reporting bad blocks :-( I'm really starting to wonder, and would like to know if you've had similar experiences.

The first incident involved a (relative to our IT budget) very expensive Dell PowerVault 220S with two PERC2/DC RAID controllers in a cluster configuration. The NetWare 6 server started crashing with "node castout/fatal san error", and running a bad-blocks scan confirmed the array was indeed reporting bad blocks. After quite some back and forth trying to save the data, we had to replace 6 of the 8 drives in the array, none of which the firmware had reported as faulty. We had to reinitialize the array and restore our data from tape. We've also updated the firmware, and the vendor tells us "it shouldn't happen again with the new firmware".

The second incident is yet to be resolved. We have an EasyRaid II (ca. 4 years old) with 8 discs, also in a RAID5 configuration (with one hot spare), that has started to report bad blocks:

-------------------------- snip ------------------------------
Sep 29 18:33:41 koala10 kernel: scsi1: ERROR on channel 0, id 2, lun 0, CDB: Read (10) 00 28 73 69 f6 00 01 00 00
Sep 29 18:33:41 koala10 kernel: Info fld=0x0, Current sd08:25: sense key Medium Error
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652280
Sep 29 18:33:41 koala10 kernel: scsi1: ERROR on channel 0, id 2, lun 0, CDB: Read (10) 00 28 73 69 fe 00 00 f8 00
Sep 29 18:33:41 koala10 kernel: Info fld=0x0, Current sd08:25: sense key Medium Error
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652288
Sep 29 18:33:41 koala10 kernel: scsi1: ERROR on channel 0, id 2, lun 0, CDB: Read (10) 00 28 73 6a 06 00 00 f0 00
Sep 29 18:33:41 koala10 kernel: Info fld=0x0, Current sd08:25: sense key Medium Error
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652296
-------------------------- snip ------------------------------

Over the years a lot of those 8 discs have been replaced after going bad, and the array always reported them and started rebuilding onto the hot spare. A similar thing happened right before it started reporting bad blocks: a drive went bad, the controller removed it from the array and started rebuilding with the hot spare. The array log reported a couple of remapped blocks during the rebuild, but as I understand it, remaps are not too uncommon. So now, after the rebuild, the virtual array disc is reporting bad blocks, but no drives are being marked as bad by the array and the array log shows no errors.

We're going to replace the older drives in the array over the next few days and hope the problem goes away, but I'd really like to know if you have had similar experiences with RAID5 firmware failing to report bad discs.

thx a lot,
Nils
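One thing worth checking before swapping drives is whether the medium errors cluster in one small region of the virtual disc (which would suggest a single bad stretch, possibly from the rebuild) or are scattered. A minimal sketch of pulling the failing sector numbers out of the kernel messages; the log lines are copied from the snippet above, and the regex is just an assumption about their format, not a general syslog parser:

```python
import re

# Sample kernel messages, copied from the log snippet above.
log = """\
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652280
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652288
Sep 29 18:33:41 koala10 kernel: I/O error: dev 08:25, sector 678652296
"""

# Match the device (hex major:minor) and sector of each reported I/O error.
pattern = re.compile(r"I/O error: dev (\S+), sector (\d+)")

errors = [(dev, int(sector)) for dev, sector in pattern.findall(log)]
sectors = sorted({s for _, s in errors})

print(sectors)                   # the distinct failing sectors
print(sectors[-1] - sectors[0])  # span of the affected region, in 512-byte sectors
```

For the three lines above this prints `[678652280, 678652288, 678652296]` and a span of 16 sectors, i.e. the errors sit within a single 8 KB region of the virtual disc.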