#1
Bad sectors on new IDE drive
I have a brand new IDE drive (250GB Maxtor) that has a sector that consistently triggers a CRC error on read.

I find this puzzling, as I thought IDE disks were supposed to automatically remap bad sectors, and in my past experience they always have. However, this is not the first time in the past few years that I have come across IDE disks that don't remap.

Can someone clue me in as to what's happening?

--
André Majorel URL:http://www.teaser.fr/~amajorel/
"Finally I am becoming stupider no more." -- Paul Erdös' epitaph
#2
Andre Majorel wrote:
> I have a brand new IDE drive (250GB Maxtor) that has a sector that consistently triggers a CRC error on read.
>
> I find this puzzling, as I thought IDE disks were supposed to automatically remap bad sectors, and in my past experience they always have. However, this is not the first time in the past few years that I have come across IDE disks that don't remap.
>
> Can someone clue me in as to what's happening?

They should silently remap bad sectors internally until all of the "spare" sectors are used up. When that point is reached, the bad sectors become visible.

Perhaps the drive in question is actually a refurb, or was mishandled during shipping. Or it just happens to have a manufacturing defect.

-WD
#3
Andre Majorel wrote:
> I have a brand new IDE drive (250GB Maxtor) that has a sector that consistently triggers a CRC error on read.
>
> I find this puzzling, as I thought IDE disks were supposed to automatically remap bad sectors, and in my past experience they always have. However, this is not the first time in the past few years that I have come across IDE disks that don't remap.
>
> Can someone clue me in as to what's happening?

Well, it could be that all the "spare" sectors have been used. If that's happened, it won't remap at all, because there's no place to remap to. If that happens on a NEW disk, get it replaced...

Also, you're talking about CRC errors on read. Remember that remapping on WRITE is trivial, since the data that was lost wasn't interesting anyway (it was going to be overwritten).

For READ it's more complicated. If the drive could recover the data using the ECC codes, it can just remap the sector and be done with it. But if the data is lost, that must be handled specially: the error MUST be reported back to the upper layers, so that they know data has been lost. The drive can either remap the sector immediately but mark it as "temporarily bad", or defer the remapping until the next time the sector is written (marking the sector as needing remapping). In either case it will continue to report CRC errors until the sector has been written, since that's the only way to indicate that the real data has been lost!

It's of course also possible to ignore the data loss and just silently remap unreadable sectors during read, but one could at least HOPE that no IDE manufacturer cares that little about their customers' data! :-)
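The "defer the remapping until the sector is written" variant described above can be sketched as a toy model. This is only an illustration, not how any real firmware is implemented: the `ToyDrive` class, its method names, and the `read_fails` helper are all hypothetical, and the media defect is simulated by hand.

```python
class ToyDrive:
    """Toy model of deferred remapping. Hypothetical; not real firmware."""

    def __init__(self, spare_sectors):
        self.spares = spare_sectors   # remaining spare sectors
        self.data = {}                # LBA -> last payload written
        self.damaged = set()          # LBAs whose media cannot be read back
        self.pending = set()          # LBAs waiting for a write before remap
        self.remapped = set()         # LBAs now served from a spare sector

    def damage(self, lba):
        """Simulate a media defect that makes the stored data unrecoverable."""
        self.damaged.add(lba)

    def read(self, lba):
        if lba in self.damaged:
            # The data is lost: remember the sector, but keep reporting
            # the error so the upper layers know about the loss.
            self.pending.add(lba)
            raise IOError(f"uncorrectable read error at LBA {lba}")
        return self.data.get(lba)

    def write(self, lba, payload):
        if lba in self.damaged:
            # The old contents are being replaced anyway, so remap now.
            if self.spares == 0:
                raise IOError("no spare sectors left")
            self.spares -= 1
            self.damaged.discard(lba)
            self.pending.discard(lba)
            self.remapped.add(lba)
        self.data[lba] = payload


def read_fails(drive, lba):
    """Return True if reading the sector raises an error."""
    try:
        drive.read(lba)
    except IOError:
        return True
    return False


drive = ToyDrive(spare_sectors=1)
drive.write(7, b"old")
drive.damage(7)
print(read_fails(drive, 7), read_fails(drive, 7))  # the error repeats
drive.write(7, b"new")                             # the write triggers the remap
print(drive.read(7), sorted(drive.remapped))
```

In this model the read error persists for as long as nobody writes to the sector, which matches the symptom reported at the start of the thread.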
#4
On 2004-04-18, Torbjorn Lindgren wrote:
> Andre Majorel wrote:
>> I have a brand new IDE drive (250GB Maxtor) that has a sector that consistently triggers a CRC error on read.
>>
>> I find this puzzling, as I thought IDE disks were supposed to automatically remap bad sectors, and in my past experience they always have. However, this is not the first time in the past few years that I have come across IDE disks that don't remap.
>>
>> Can someone clue me in as to what's happening?
>
> Well, it could be that all the "spare" sectors have been used. If that's happened, it won't remap at all, because there's no place to remap to. If that happens on a NEW disk, get it replaced...

This doesn't seem to be the case:

| # smartctl -a /dev/hdc
| Device Model: Maxtor 7Y250P0
| SMART support is: Available - device has SMART capability.
| SMART support is: Enabled
| capabilities: (0x5b) SMART execute Offline immediate.
| Auto Offline data collection on/off support.
| Suspend Offline collection upon new command.
| Offline surface scan supported.
| Self-test supported.
| No Conveyance Self-test supported.
| Selective Self-test supported.
|
| SMART Attributes Data Structure revision number: 16
| Vendor Specific SMART Attributes with Thresholds:
| ID# ATTRIBUTE_NAME          FLAG   VAL WOR THR TYPE     UPDATED RAW_VALUE
|   3 Spin_Up_Time            0x0027 252 252 063 Pre-fail Always  1311
|   4 Start_Stop_Count        0x0032 253 253 000 Old_age  Always  4
|   5 Reallocated_Sector_Ct   0x0033 253 253 063 Pre-fail Always  2

I'm not quite sure how to read SMART values, but 253 seems to mean "very good" or "perfect". The raw value (2) may be the number of remapped sectors. Is there any way to get the details of reallocated sectors from the drive, by the way?

|   6 Read_Channel_Margin     0x0001 253 253 100 Pre-fail Offline 0
|   7 Seek_Error_Rate         0x000a 253 252 000 Old_age  Always  0
|   8 Seek_Time_Performance   0x0027 252 251 187 Pre-fail Always  53547

Brand new:

|   9 Power_On_Minutes        0x0032 253 253 000 Old_age  Always  39h+19m

The drive seems to be in fairly good shape so far:

|  10 Spin_Retry_Count        0x002b 252 252 157 Pre-fail Always  0
|  11 Calibration_Retry_Count 0x002b 252 252 223 Pre-fail Always  0
|  12 Power_Cycle_Count       0x0032 253 253 000 Old_age  Always  6
| 192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age  Always  0
| 193 Load_Cycle_Count        0x0032 253 253 000 Old_age  Always  0
| 194 Temperature_Celsius     0x0032 253 253 000 Old_age  Always  43
| 195 Hardware_ECC_Recovered  0x000a 253 252 000 Old_age  Always  12285
| 196 Reallocated_Event_Count 0x0008 253 253 000 Old_age  Offline 0
| 197 Current_Pending_Sector  0x0008 253 253 000 Old_age  Offline 2
| 198 Offline_Uncorrectable   0x0008 253 253 000 Old_age  Offline 0
| 199 UDMA_CRC_Error_Count    0x0008 199 199 000 Old_age  Offline 0
| 200 Multi_Zone_Error_Rate   0x000a 253 252 000 Old_age  Always  0
| 201 Soft_Read_Error_Rate    0x000a 253 252 000 Old_age  Always  0
| 202 TA_Increase_Count       0x000a 253 252 000 Old_age  Always  0
| 203 Run_Out_Cancel          0x000b 253 252 180 Pre-fail Always  0
| 204 Shock_Count_Write_Opern 0x000a 253 252 000 Old_age  Always  0
| 205 Shock_Rate_Write_Opern  0x000a 253 252 000 Old_age  Always  0
| 207 Spin_High_Current       0x002a 252 252 000 Old_age  Always  0
| 208 Spin_Buzz               0x002a 252 252 000 Old_age  Always  0
| 209 Offline_Seek_Performnce 0x0024 149 149 000 Old_age  Offline 0
|  99 Unknown_Attribute       0x0004 253 253 000 Old_age  Offline 0
| 100 Unknown_Attribute       0x0004 253 253 000 Old_age  Offline 0
| 101 Unknown_Attribute       0x0004 253 253 000 Old_age  Offline 0

The last five errors were uncorrectable read errors like this one:

| Error 352 occurred at disk power-on lifetime: 22 hours
| When the command that caused the error occurred, the device
| was in an unknown state.
|
| After command completion occurred, registers we
| ER ST SC SN CL CH DH
| -- -- -- -- -- -- --
| 40 51 2a d6 bc 43 e0 Error: UNC 42 sectors at LBA = 0x0043bcd6 = 4439254
|
| Commands leading to the command that caused the error we
| CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
| -- -- -- -- -- -- -- -- --------- --------------------
| 25 00 2a d6 bc 43 e0 08 13047.376 READ DMA EXT
| 25 00 2c d4 bc 43 e0 08 12980.816 READ DMA EXT
| 25 00 2e d2 bc 43 e0 08 12979.792 READ DMA EXT
| 25 00 30 d0 bc 43 e0 08 12978.736 READ DMA EXT
| 25 00 32 ce bc 43 e0 08 12977.712 READ DMA EXT

> Also, you're talking about CRC errors on read. Remember that remapping on WRITE is trivial, since the data that was lost wasn't interesting anyway (it was going to be overwritten).
>
> For READ it's more complicated. If the drive could recover the data using the ECC codes, it can just remap the sector and be done with it. But if the data is lost, that must be handled specially: the error MUST be reported back to the upper layers, so that they know data has been lost. The drive can either remap the sector immediately but mark it as "temporarily bad", or defer the remapping until the next time the sector is written (marking the sector as needing remapping). In either case it will continue to report CRC errors until the sector has been written, since that's the only way to indicate that the real data has been lost!

OK, got it. Thank you.

--
André Majorel URL:http://www.teaser.fr/~amajorel/
"Finally I am becoming stupider no more." -- Paul Erdös' epitaph
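The LBA in the error line can be checked by hand against the register dump. Here is a minimal sketch, assuming the usual 28-bit layout of the ATA task file registers (SN = bits 0-7, CL = bits 8-15, CH = bits 16-23, low nibble of DH = bits 24-27); the function name is made up for illustration.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the task file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the error entry: ER ST SC SN CL CH DH
regs = "40 51 2a d6 bc 43 e0"
er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in regs.split())

lba = lba28_from_registers(sn, cl, ch, dh)
print(hex(lba), lba)   # 0x43bcd6 4439254
print(sc)              # 42, the sector count in the error line
```

Both values agree with what smartctl printed: LBA = 0x0043bcd6 = 4439254, and "42 sectors".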
#5
> | 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline 2

Andre, this is a sign of trouble. There are two sectors on the disk that could not be read by the operating system.

> | 40 51 2a d6 bc 43 e0 Error: UNC 42 sectors at LBA = 0x0043bcd6 = 4439254

One of the unreadable sectors is at LBA = 0x0043bcd6 = 4439254. It is uncorrectable, meaning that the ECC bytes are inconsistent.

Have a look at http://smartmontools.sourceforge.net/BadBlockHowTo.txt for some suggestions. If you run an extended self-test '-t long' on the disk, it should fail at these unreadable LBAs.

Bruce
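To pick out the raw counts from attribute lines like the ones quoted in this thread, a rough sketch follows. It assumes the column layout shown above (ID, name, flag, value, worst, threshold, type, updated, raw value); other smartctl versions print extra columns, so the field positions here are an assumption, and `parse_attribute` is a hypothetical helper rather than part of smartmontools.

```python
def parse_attribute(line):
    """Split one attribute line from the smartctl output quoted above."""
    fields = line.lstrip("| ").split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "raw": fields[-1],   # kept as text: some raw values look like 39h+19m
    }


lines = [
    "| 5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always 2",
    "| 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline 2",
    "| 198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline 0",
]

for attr in map(parse_attribute, lines):
    if attr["raw"].isdigit() and int(attr["raw"]) > 0:
        print(attr["id"], attr["name"], "raw =", attr["raw"])
```

With the three sample lines this prints attributes 5 and 197. The normalized value (253) is still far above the threshold in both cases, so it is the non-zero raw count that gives the problem away.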
#6
On 2004-04-19, Bruce Allen wrote:
>> | 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline 2
>
> Andre, this is a sign of trouble. There are two sectors on the disk that could not be read by the operating system.
>
>> | 40 51 2a d6 bc 43 e0 Error: UNC 42 sectors at LBA = 0x0043bcd6 = 4439254
>
> One of the unreadable sectors is at LBA = 0x0043bcd6 = 4439254. It is uncorrectable, meaning that the ECC bytes are inconsistent.
>
> Have a look at http://smartmontools.sourceforge.net/BadBlockHowTo.txt for some suggestions.

Interesting reading, thank you for the link (and for smartmontools, too).

I used a destructive bad blocks scanner and the UNC errors went away. Interestingly, the raw value for Reallocated_Sector_Ct is now zero. This suggests that, after rewriting the sector, the drive found it reliable enough to keep using it.

A bit scary, but at that price, I guess I can't complain.

--
André Majorel URL:http://www.teaser.fr/~amajorel/
"Finally I am becoming stupider no more." -- Paul Erdös' epitaph
#7
>>> | 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline 2
>>
>> Andre, this is a sign of trouble. There are two sectors on the disk that could not be read by the operating system.
>>
>>> | 40 51 2a d6 bc 43 e0 Error: UNC 42 sectors at LBA = 0x0043bcd6 = 4439254
>>
>> One of the unreadable sectors is at LBA = 0x0043bcd6 = 4439254. It is uncorrectable, meaning that the ECC bytes are inconsistent.
>>
>> Have a look at http://smartmontools.sourceforge.net/BadBlockHowTo.txt for some suggestions.
>
> Interesting reading, thank you for the link (and for smartmontools, too).

You're welcome.

> I used a destructive bad blocks scanner and the UNC errors went away.

Good -- problem fixed.

> Interestingly, the raw value for Reallocated_Sector_Ct is now zero. This suggests that, after rewriting the sector, the drive found it reliable enough to keep using it.

That's possible. If the drive was powered off while the sector was being written, that might have made it uncorrectable because a corrupted ECC value was written to the disk. In that case, when the sector is written again, the ECC code gets laid down consistently and all is well.

> A bit scary, but at that price, I guess I can't complain.

There's nothing to be scared about if you back up your data. So don't complain, make backups instead.

Cheers,
Bruce