#11
Is my HDD dying or something else from these errors and symptoms?
In comp.sys.ibm.pc.hardware.storage Ant wrote:
> Hello. In the last two weeks or so, I had two of these incidents (13 days apart):
>
> hdb: dma_timer_expiry: dma status == 0x61
> hdb: DMA timeout error
> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hda: DMA disabled
> hdb: DMA disabled
> ide0: reset: success
> hdb: dma_timer_expiry: dma status == 0x41
> hdb: DMA timeout error
> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdb: DMA disabled
> ide0: reset: success
> hdb: dma_timer_expiry: dma status == 0x41
> hdb: DMA timeout error
> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdb: DMA disabled
> ide0: reset: success
> hdb: dma_timer_expiry: dma status == 0x41
> hdb: DMA timeout error
> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdb: DMA disabled
> ide0: reset: success

These are problems the system has with getting data from/to the HDD. The cause can be the cabling, the HDD, or the controller on the mainboard. Unlikely, but possible, is also a failing PSU.

> From this, my old Linux/Debian system became slow and unresponsive due to high CPU usage (e.g., 7.xx in top). I had to shut down (shutdown -r now) Linux/Debian, reboot, and things are back to normal speed. I doubt it is temperature related because the room is in the 60s and 70s (°F) and the computer wasn't working intensely (e.g., surfing the Web).
>
> Also, I recall that before these problems started, my motherboard (CMOS and BIOS) didn't see either of my primary drives (both HDDs: hda and hdb), but could see my secondary master (DVD-ROM drive = hdc). I had to open the case, but didn't see anything wrong. I wiggled the cable ends for the HDDs. I booted my machine up and it seemed fine for a few days/a week and then these errors came up (not disconnections).
>
> I ran the smartctl utility on both of my HDDs for information and results: http://pastebin.ca/930776 ...

hdb looks fine and the absence of seek errors may indicate your PSU is fine too. For hda, Raw_Read_Error_Rate, Seek_Error_Rate and Hardware_ECC_Recovered look bad. If it were not for the seek error rate, I would have said the read circuitry is going bad. As it is, it looks like the power may be bad, although I would expect a stronger impact on the Spin_Up_Time. Your data cable is fine, as a problem with that would have caused the UDMA_CRC_Error_Count to show something.

As to the tests, you run them and when they are finished, you look at the SMART attributes and the test log.

> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>
> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying? Or is it something else? Thank you in advance.

The Quantum looks fine. If something is dying, it is the Seagate. I think you should do the following:

1. Remove and reseat the power connector on the Seagate, and if you use a splitter cable, connect the drive directly to the PSU.
2. Run a long SMART self-test on the disk to completion and then post the SMART attributes again.

If they are unchanged, then I would say that your hda is likely dying.

Arno
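The status=0x58 value in the quoted log can be checked by hand against the ATA status register bits. The sketch below is a minimal illustration, not the kernel's code: `decode_ata_status` and `ATA_STATUS_BITS` are hypothetical names, the three flag names that appear in the log come from the log itself, and the remaining names are assumed from the standard ATA status register layout.

```python
# Bit masks of the 8-bit ATA status register, highest bit first.
# DriveReady, SeekComplete and DataRequest are the names printed in the
# thread's log; the other names are an assumption based on the ATA layout.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of all bits that are set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x58 = 0x40 + 0x10 + 0x08
    print(decode_ata_status(0x58))
```

Running it prints `['DriveReady', 'SeekComplete', 'DataRequest']`, matching the log. Neither the Busy nor the Error bit is set in 0x58, which fits the reading that the driver saw a timeout rather than an error reported by the drive itself.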
#12
>>> Your hdb is VERY old (over 6 years of actual running hours). It does not seem to support SMART error logging, so you do not know what is wrong with it. I think that it is dying, based on your error messages from the console. I would make sure to run a backup ASAP, like right now.

>> Already did, onto hda, since that's the only HDD I have to back up to. Now, how come before these errors came up my motherboard couldn't see BOTH HDDs? Is that related or just a coincidence?

> You say hda and hdb, so they are on the same controller/cable. You know that one blocking/stalled drive can block the whole IDE bus? You should consider moving the second drive to the second IDE cable; often enough there were incompatibilities between different hard drive brands.

I wasn't aware of that. I am not familiar with PC hardware. So if one drive (even a CD/DVD-ROM drive) has problems, it affects the other drive on the same controller/cable? I never had problems before two weeks ago, even on previous motherboards.
-- 
"All the best work is done the way that ants do things -- by tiny but untiring and regular additions." --Lafcadio Hearn
   /\___/\
  / /\ /\ \      Ant @ http://antfarm.home.dhs.org (Personal Web Site)
 | |o   o| |     Ant's Quality Foraged Links (AQFL): http://aqfl.net
    \ _ /        Please remove ANT if replying by e-mail.
     ( )
#13
>> In the last two weeks or so, I had two of these incidents (13 days apart):
>>
>> hdb: dma_timer_expiry: dma status == 0x61
>> hdb: DMA timeout error
>> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> ide: failed opcode was: unknown
>> hda: DMA disabled
>> hdb: DMA disabled
>> ide0: reset: success
>> [... three more hdb timeout/reset sequences, with dma status == 0x41 ...]

> These are problems the system has with getting data from/to the HDD. The cause can be the cabling, the HDD, or the controller on the mainboard. Unlikely, but possible, is also a failing PSU.

I had to replace the PSU on 5/14/2007 according to my log: http://alpha.zimage.com/~ant/antfarm/about/toys.html ... It's almost ten months old.

>> From this, my old Linux/Debian system became slow and unresponsive due to high CPU usage (e.g., 7.xx in top). I had to shut down (shutdown -r now) Linux/Debian, reboot, and things are back to normal speed. I doubt it is temperature related because the room is in the 60s and 70s (°F) and the computer wasn't working intensely (e.g., surfing the Web).
>>
>> Also, I recall that before these problems started, my motherboard (CMOS and BIOS) didn't see either of my primary drives (both HDDs: hda and hdb), but could see my secondary master (DVD-ROM drive = hdc). I had to open the case, but didn't see anything wrong. I wiggled the cable ends for the HDDs. I booted my machine up and it seemed fine for a few days/a week and then these errors came up (not disconnections).
>>
>> I ran the smartctl utility on both of my HDDs for information and results: http://pastebin.ca/930776 ...

> hdb looks fine and the absence of seek errors may indicate your PSU is fine too. For hda, Raw_Read_Error_Rate, Seek_Error_Rate and Hardware_ECC_Recovered look bad. If it were not for the seek error rate, I would have said the read circuitry is going bad. As it is, it looks like the power may be bad, although I would expect a stronger impact on the Spin_Up_Time. Your data cable is fine, as a problem with that would have caused the UDMA_CRC_Error_Count to show something.
>
> As to the tests, you run them and when they are finished, you look at the SMART attributes and the test log.

Where are these logs?

>> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>>
>> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying? Or is it something else? Thank you in advance.

> The Quantum looks fine. If something is dying, it is the Seagate. I think you should do the following:
>
> 1. Remove and reseat the power connector on the Seagate, and if you use a splitter cable, connect the drive directly to the PSU.
> 2. Run a long SMART self-test on the disk to completion and then post the SMART attributes again.
>
> If they are unchanged, then I would say that your hda is likely dying.

Odd that dmesg and /var/log/messages don't mention hda. Same for SMART.
-- 
"All the best work is done the way that ants do things -- by tiny but untiring and regular additions." --Lafcadio Hearn
   /\___/\
  / /\ /\ \      Ant @ http://antfarm.home.dhs.org (Personal Web Site)
 | |o   o| |     Ant's Quality Foraged Links (AQFL): http://aqfl.net
    \ _ /        Please remove ANT if replying by e-mail.
     ( )
#14
Arno Wagner wrote in
> In comp.sys.ibm.pc.hardware.storage Ant wrote:
>> Hello. In the last two weeks or so, I had two of these incidents (13 days apart):
>>
>> hdb: dma_timer_expiry: dma status == 0x61
>> hdb: DMA timeout error
>> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> ide: failed opcode was: unknown
>> hda: DMA disabled
>> hdb: DMA disabled
>> ide0: reset: success
>> [... three more hdb timeout/reset sequences, with dma status == 0x41 ...]

> These are problems the system has with getting data from/to the HDD. The cause can be the cabling, the HDD, or the controller on the mainboard. Unlikely, but possible, is also a failing PSU.

>> From this, my old Linux/Debian system became slow and unresponsive due to high CPU usage (e.g., 7.xx in top). I had to shut down (shutdown -r now) Linux/Debian, reboot, and things are back to normal speed. I doubt it is temperature related because the room is in the 60s and 70s (°F) and the computer wasn't working intensely (e.g., surfing the Web).
>>
>> Also, I recall that before these problems started, my motherboard (CMOS and BIOS) didn't see either of my primary drives (both HDDs: hda and hdb), but could see my secondary master (DVD-ROM drive = hdc). I had to open the case, but didn't see anything wrong. I wiggled the cable ends for the HDDs. I booted my machine up and it seemed fine for a few days/a week and then these errors came up (not disconnections).
>>
>> I ran the smartctl utility on both of my HDDs for information and results: http://pastebin.ca/930776 ...

> hdb looks fine and the absence of seek errors may indicate your PSU is fine too. For hda, Raw_Read_Error_Rate, Seek_Error_Rate and Hardware_ECC_Recovered look bad. If it were not for the seek error rate, I would have said the read circuitry is going bad. As it is, it looks like the power may be bad, although I would expect a stronger impact on the Spin_Up_Time. Your data cable is fine, as a problem with that would have caused the UDMA_CRC_Error_Count to show something.

Not if the commands that initiate the UDMA data transfers never make it, which apparently they didn't, as a bus reset was needed to get it going again. Conversely, UDMA CRC errors say nothing about cable quality.

> As to the tests, you run them and when they are finished, you look at the SMART attributes and the test log.

>> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>>
>> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying? Or is it something else? Thank you in advance.

> The Quantum looks fine. If something is dying, it is the Seagate.

The Seagate is fine.

> I think you should do the following:
>
> 1. Remove and reseat the power connector on the Seagate, and if you use a splitter cable, connect the drive directly to the PSU.
> 2. Run a long SMART self-test on the disk to completion and then post the SMART attributes again.
>
> If they are unchanged, then I would say that your hda is likely dying.
>
> Arno
#15
In comp.sys.ibm.pc.hardware.storage Ant wrote:
[...]
>> For hda, Raw_Read_Error_Rate, Seek_Error_Rate and Hardware_ECC_Recovered look bad. If it were not for the seek error rate, I would have said the read circuitry is going bad. As it is, it looks like the power may be bad, although I would expect a stronger impact on the Spin_Up_Time. Your data cable is fine, as a problem with that would have caused the UDMA_CRC_Error_Count to show something.
>>
>> As to the tests, you run them and when they are finished, you look at the SMART attributes and the test log.

> Where are these logs?

It seems your hdb cannot log self-test results. For hda, the log starts at line 139 in your SMART attribute list.

>>> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>>>
>>> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying? Or is it something else? Thank you in advance.

>> The Quantum looks fine. If something is dying, it is the Seagate. I think you should do the following:
>>
>> 1. Remove and reseat the power connector on the Seagate, and if you use a splitter cable, connect the drive directly to the PSU.
>> 2. Run a long SMART self-test on the disk to completion and then post the SMART attributes again.
>>
>> If they are unchanged, then I would say that your hda is likely dying.

> Odd that dmesg and /var/log/messages don't mention hda. Same for SMART.

Indeed, for the message log. However, it is possible that hda does things to the bus, like keeping it occupied too long, that cause the error messages for hdb. The error messages in the log are basically just a timeout, and do not necessarily indicate that there is anything wrong with the disk they were reported for.

As for SMART, it is possible that hdb is actually breaking down, but there seems to be something wrong with hda, so my first take would be that hda interferes with hdb in some way. They are on the same cable, after all.

I especially do not like the values of attributes 1, 7, and 195 on hda, as mentioned before. They indicate there is some problem with the seeking mechanism. This can be due to power issues. It can also be a problem with the drive hardware itself.

Admittedly my comments are highly speculative. I have had similar errors in my logs in the past. One turned out to be a shot drive controller (possibly inadequate cooling for this particular chip). Another was a software issue and went away after a kernel update.

Arno
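Arno's suggestion to post the SMART attributes again after a long self-test amounts to comparing two snapshots. Here is a rough sketch of that comparison, assuming the usual ten-column attribute table printed by `smartctl -A`. The function names are hypothetical, and the numbers in `BEFORE` and `AFTER` are made up for illustration; they are not taken from Ant's pastebin.

```python
def parse_smart_attributes(text):
    """Parse a `smartctl -A` style attribute table into {id: fields}."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit() and parts[3].isdigit():
            attrs[int(parts[0])] = {
                "name": parts[1],
                "value": int(parts[3]),
                "worst": int(parts[4]),
                "raw": " ".join(parts[9:]),
            }
    return attrs


def changed_attributes(before, after, ids=(1, 7, 195)):
    """Return {id: (name, raw_before, raw_after)} for watched rows that changed."""
    changes = {}
    for attr_id in ids:
        old, new = before.get(attr_id), after.get(attr_id)
        if old and new and (old["value"], old["raw"]) != (new["value"], new["raw"]):
            changes[attr_id] = (old["name"], old["raw"], new["raw"])
    return changes


# Made-up sample snapshots (illustrative values only).
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   057   049   006    Pre-fail  Always       -       1000
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       2000
195 Hardware_ECC_Recovered  0x001a   057   049   000    Old_age   Always       -       1000
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""
AFTER = BEFORE.replace("081   060   030    Pre-fail  Always       -       2000",
                       "080   060   030    Pre-fail  Always       -       2500")

if __name__ == "__main__":
    diff = changed_attributes(parse_smart_attributes(BEFORE),
                              parse_smart_attributes(AFTER))
    for attr_id, (name, old_raw, new_raw) in diff.items():
        print(f"{attr_id:3d} {name}: raw {old_raw} -> {new_raw}")
```

The watched IDs default to 1, 7 and 195 because those are the attributes singled out in the post above; how to interpret a change (or the lack of one) is still a judgment call, as the post itself admits.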
#16
In comp.sys.ibm.pc.hardware.storage Walter Mautner wrote:
> Ant wrote:
>> In alt.comp.periphs.hdd Ignoramus24341 wrote:
>>> Your hdb is VERY old (over 6 years of actual running hours). It does not seem to support SMART error logging, so you do not know what is wrong with it. I think that it is dying, based on your error messages from the console. I would make sure to run a backup ASAP, like right now.

>> Already did, onto hda, since that's the only HDD I have to back up to. Now, how come before these errors came up my motherboard couldn't see BOTH HDDs? Is that related or just a coincidence?

> You say hda and hdb, so they are on the same controller/cable. You know that one blocking/stalled drive can block the whole IDE bus? You should consider moving the second drive to the second IDE cable; often enough there were incompatibilities between different hard drive brands.

Good idea. Some incompatibility that normally would not show could even have been brought out by the problem on one drive, and could now cause issues that get logged for the other drive.

> -- 
> vista policy violation: Microsoft optical mouse found penguin patterns on mousepad. Partition scan in progress to remove offending incompatible products. Reactivate MS software.
> Linux 2.6.24. [LinuxCounter#295241,ICQ#4918962]

Cool .Sig!

Arno
#17
Previously Ant wrote:
>>>> Your hdb is VERY old (over 6 years of actual running hours). It does not seem to support SMART error logging, so you do not know what is wrong with it. I think that it is dying, based on your error messages from the console. I would make sure to run a backup ASAP, like right now.

>>> Already did, onto hda, since that's the only HDD I have to back up to. Now, how come before these errors came up my motherboard couldn't see BOTH HDDs? Is that related or just a coincidence?

>> You say hda and hdb, so they are on the same controller/cable. You know that one blocking/stalled drive can block the whole IDE bus? You should consider moving the second drive to the second IDE cable; often enough there were incompatibilities between different hard drive brands.

> I wasn't aware of that. I am not familiar with PC hardware. So if one drive (even a CD/DVD-ROM drive) has problems, it affects the other drive on the same controller/cable? I never had problems before two weeks ago, even on previous motherboards.

It can happen. The error you see is a timeout. If the other drive starts to mess with the bus when it has some internal error, the driver can see a timeout on the drive that is actually fine.

Arno
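To see which device each kernel message is logged against, the lines can be tallied. This is a minimal sketch, assuming the message wording seen in this thread; `tally_events` is a hypothetical helper, and `SAMPLE_LOG` is the first two sequences from the log quoted earlier in the thread.

```python
import re
from collections import Counter

# Matches lines such as "hdb: DMA timeout error" or "ide0: reset: success".
LINE_RE = re.compile(r"^(hd[a-z]|ide\d+): (.*)$")


def tally_events(log_text):
    """Count timeouts, DMA-disable events and bus resets per device/interface."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. "ide: failed opcode was: unknown"
        source, message = match.groups()
        if message == "DMA timeout error":
            counts[(source, "timeout")] += 1
        elif message == "DMA disabled":
            counts[(source, "dma_disabled")] += 1
        elif message.startswith("reset:"):
            counts[(source, "reset")] += 1
    return counts


SAMPLE_LOG = """\
hdb: dma_timer_expiry: dma status == 0x61
hdb: DMA timeout error
hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
hdb: DMA disabled
ide0: reset: success
hdb: dma_timer_expiry: dma status == 0x41
hdb: DMA timeout error
hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdb: DMA disabled
ide0: reset: success
"""

if __name__ == "__main__":
    for (source, event), count in sorted(tally_events(SAMPLE_LOG).items()):
        print(source, event, count)
```

In this sample every timeout is counted against hdb, and hda appears only in a "DMA disabled" line. As the post above points out, that shows where the timeout was seen, not necessarily which drive caused it.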
#18
Ant wrote:
>>> In the last two weeks or so, I had two of these incidents (13 days apart):
>>>
>>> hdb: dma_timer_expiry: dma status == 0x61
>>> hdb: DMA timeout error
>>> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>>> ide: failed opcode was: unknown
>>> hda: DMA disabled
>>> hdb: DMA disabled
>>> ide0: reset: success
>>> [... three more hdb timeout/reset sequences, with dma status == 0x41 ...]

>> That's normally just a bad cable. And since the problem is seen with more than one hard drive, it's almost certainly just a bad cable. It can be a bad hard drive controller on the motherboard etc., but that's much less likely.

>>> From this, my old Linux/Debian system became slow and unresponsive due to high CPU usage (e.g., 7.xx in top).

>> Because it's turned the DMA off, as it says.

> I wonder if I can re-enable DMA without rebooting.

Yes, but it's much better to fix whatever is causing the problem instead.

> I think hdparm controls that?

Dunno, I don't bother with Linux at that level myself.

>>> I had to shut down (shutdown -r now) Linux/Debian, reboot, and things are back to normal speed.

>> Because it's turned the DMA on again.

> So why did my DMA go off? As a precaution?

Yes, when it decided that there is a problem with the DMA, because of the timeouts it can see. Windows does the same thing.

>>> I doubt it is temperature related because the room is in the 60s and 70s (°F) and the computer wasn't working intensely (e.g., surfing the Web).

>> Yeah, most likely just a bad cable.

>>> Also, I recall that before these problems started, my motherboard (CMOS and BIOS) didn't see either of my primary drives (both HDDs: hda and hdb), but could see my secondary master (DVD-ROM drive = hdc).

>> More evidence of a bad cable to the hard drives.

>>> I had to open the case, but didn't see anything wrong. I wiggled the cable ends for the HDDs.

>> And that likely got it going again. On those cable-piercing connectors, one of the contacts that bite into the cable can get bent when the cable is made, and they can come loose if you reef the cable off the drive or motherboard end by pulling on the ribbon etc.

> Hmm, I have those old-fashioned flat cables. I guess I will go replace it.

Yep, best thing to try first, they are so cheap.

> I assume replacing the whole ribbon cable is enough?

Yes.

> I didn't see how many there were in my mini-tower case (hard to see and it's crowded).

It will have two.

> I assume that is two in total, for the primary and secondary drives.

Yep.

>> If it's a round cable, it's ****ed by design.

> Yeah, I don't have those. My other PC has a SATA cable, and those are round.

That's why this PC is playing up, it's jealous.

>>> I booted my machine up and it seemed fine for a few days/a week and then these errors came up (not disconnections).
>>>
>>> I ran the smartctl utility on both of my HDDs for information and results: http://pastebin.ca/930776 ...
>>>
>>> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>>>
>>> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying?

>> Nope, just the cable.

> Hmm, OK! I will go try the cable first then!

Yep, that's the best thing to do first.

>>> Or is it something else?

>> Yep, the cable.
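On the question of re-enabling DMA without a reboot: with the old IDE driver (hdX devices) used in this thread, `hdparm -d /dev/hdb` reports the current setting and `hdparm -d1 /dev/hdb` switches DMA on. The sketch below wraps those two calls; the helper names are hypothetical, the sample output format is written from memory and should be treated as an assumption, and the commands need root. As the reply above says, fixing the underlying cause is the better route, so this is only useful as a stopgap.

```python
import re
import subprocess

# Assumed shape of `hdparm -d /dev/hdb` output when DMA has been turned off.
SAMPLE_OFF = "\n/dev/hdb:\n using_dma     =  0 (off)\n"


def parse_using_dma(hdparm_output):
    """Return True/False for the using_dma setting, or None if it is not reported."""
    match = re.search(r"using_dma\s*=\s*(\d)", hdparm_output)
    return None if match is None else match.group(1) == "1"


def dma_command(device, enable):
    """Build the hdparm command line that turns DMA on (-d1) or off (-d0)."""
    return ["hdparm", "-d1" if enable else "-d0", device]


def enable_dma_if_off(device, run=subprocess.run):
    """Query the device and switch DMA back on only if it is currently off.

    Returns True if an enable command was issued. Must be run as root.
    """
    result = run(["hdparm", "-d", device],
                 capture_output=True, text=True, check=True)
    if parse_using_dma(result.stdout) is False:
        run(dma_command(device, True), check=True)
        return True
    return False


if __name__ == "__main__":
    # Only parses the sample text; nothing is executed against real hardware.
    print(parse_using_dma(SAMPLE_OFF))
    print(" ".join(dma_command("/dev/hdb", True)))
```

The `run` parameter exists so the logic can be exercised with a stand-in instead of the real `subprocess.run`, which is handy on a machine without IDE drives.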
#19
On Thu, 06 Mar 2008 14:47:46 -0600, Ant wrote:
> Hello. In the last two weeks or so, I had two of these incidents (13 days apart):
>
> hdb: dma_timer_expiry: dma status == 0x61
> hdb: DMA timeout error
> hdb: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hda: DMA disabled
> hdb: DMA disabled
> ide0: reset: success
> [... three more hdb timeout/reset sequences, with dma status == 0x41 ...]
>
> From this, my old Linux/Debian system became slow and unresponsive due to high CPU usage (e.g., 7.xx in top). I had to shut down (shutdown -r now) Linux/Debian, reboot, and things are back to normal speed. I doubt it is temperature related because the room is in the 60s and 70s (°F) and the computer wasn't working intensely (e.g., surfing the Web).
>
> Also, I recall that before these problems started, my motherboard (CMOS and BIOS) didn't see either of my primary drives (both HDDs: hda and hdb), but could see my secondary master (DVD-ROM drive = hdc). I had to open the case, but didn't see anything wrong. I wiggled the cable ends for the HDDs. I booted my machine up and it seemed fine for a few days/a week and then these errors came up (not disconnections).
>
> I ran the smartctl utility on both of my HDDs for information and results: http://pastebin.ca/930776 ...
>
> My full system specifications can be found here: http://alpha.zimage.com/~ant/antfarm.../computers.txt (secondary/backup machine).
>
> Does that mean my decade-old Quantum 6.4 GB HDD (already made a backup just in case) is finally dying? Or is it something else? Thank you in advance.

I suggest you boot a Live CD and run badblocks on the hard drive.
#20
Ant wrote:
....
>> I would make sure to run a backup ASAP, like right now.

> Already did, onto hda, since that's the only HDD I have to back up to. Now, how come before these errors came up my motherboard couldn't see BOTH HDDs? Is that related or just a coincidence?

Well, you have a Quantum and a Seagate, both on the same cable. Some models from these competing manufacturers (oh, that was a while ago) could not play nicely with each other. That problem need not have shown up right from the start, though .... And one stalling/blocking hard drive (electronics) can block the whole bus.
-- 
vista policy violation: Microsoft optical mouse found penguin patterns on mousepad. Partition scan in progress to remove offending incompatible products. Reactivate MS software.
Linux 2.6.24. [LinuxCounter#295241,ICQ#4918962]