#1
OT?: Major PC hardware failure
I built a pc around a Pentium cpu, Intel D865GVHZ board, and Antec case/PS back in '06. Have been running it about 16 hours/day ever since.

Whilst running a backup Sunday, I heard a constant little tic-tic, tic-tic, tic-tic sound, looked down to find Speedfan registered about 74 C, then watched the W2k system crash. I immediately suspected a cpu fan failure, but all fans were running when I checked. I replaced the cpu fan anyway.

This morn it wouldn't even POST, showed only an "Intel board" screen. A bit later it wouldn't power at all (no fans, lights, nothing).

The Power Supply is an Antec Smart-Power 350w. Could it have gone bad and overjuiced the cpu, causing the 74 C temp? After which the Intel overheated-cpu function started shutting everything down?

I've not had a system fail in this manner. Hope someone has some idea what happened.

Thx,
Puddin'

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."
#2
OT?: Major PC hardware failure
Puddin' Man wrote:
> I built a pc around Pentium cpu, Intel D865GVHZ board, Antec case/PS back in '06. [...]
> This morn it wouldn't even POST, showed only an "Intel board" screen. A bit later it wouldn't power at all (no fans, lights, nothing). [...]

Have a look at your Southbridge. You just had a latchup failure on your ICH5 :-) Because the motherboard won't start, that means you had a major failure.

Your ICH5 or ICH5R should look like this. (Back in that era, the Southbridge didn't have a heatsink on it. The Northbridge usually does. The Southbridge is the chip you can visually inspect without taking anything apart.)

http://onfinite.com/libraries/179057/2ea.jpg

It would be fun to ask the warranty people at Intel how many of those blew. Musta been a few. It's a semiconductor problem, not publicly acknowledged by Intel. Gigabyte is the only company to post a warning about the problem.

Replace the motherboard, and your other components should be OK. I haven't heard of collateral damage from that. It's just the Southbridge which is "toast".

The burn mark in the picture is over top of the contacts underneath, which power USB I/O. A possible explanation is that static electricity enters via a USB port and causes the USB I/O pad to go into latchup. That causes a conducting path to form between VCC and GND.

If the bond wires burn out, it's called a "minor failure" and you lose all your USB interfaces. There would be no burn mark in that case, because the bond wires let go before anything gets fried. Device Manager still shows USB entries (because the logic blocks are intact), but no plugged-in devices are detected. Since the bond wire is burned out, the D+ and D- signals can no longer receive info from USB peripherals.

If the bond wires remain intact during the event, the chip heats up until a hole is burned in the lid. The silicon is ruined, and the board will no longer start.

While Gigabyte claimed, in their warning, that ICH4 and ICH5 were affected, the vast majority of reported failures in newsgroups are ICH5. I think I've only read one report of an ICH4 blowing.

Paul
#3
OT?: Major PC hardware failure
Hi Paul,
I saved your notes re ICH5 problems some time ago.

My Southbridge has markings:

I N T E L FW82801EB F5441A13 Intel@@'02 Korea

and it's square, unlike the one in your link below. And there's NO evidence of burnout. The chip and its peripheral connections look OK.

At the time of failure, I got a RUNDLL message popup. Since then I get:

"The CPU was previously shutdown due to a thermal event(overheating) Service the unit right away to resolve this"

Any/all advice on getting the unit "serviced" much appreciated.

Thx,
P

On Wed, 06 Apr 2011 17:35:35 -0400, Paul wrote:
> Have a look at your Southbridge. You just had a latchup failure on your ICH5 :-) Because the motherboard won't start, that means you had a major failure. [...]
> http://onfinite.com/libraries/179057/2ea.jpg
> [...]

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."
#4
OT?: Major PC hardware failure
Puddin' Man wrote:
> My Southbridge has markings: I N T E L FW82801EB F5441A13 Intel@@'02 Korea and it's square, unlike on your link below. And there's NO evidence of burnout. [...]
> Since then I get: "The CPU was previously shutdown due to a thermal event(overheating) Service the unit right away to resolve this"
> Any/all advice on getting the unit "serviced" much appreciated.

OK, have you pulled the CPU heatsink/fan and checked for damage? Do you still have thermal paste between the CPU and heatsink?

Maybe the Vcore circuit overvolted and ran the CPU at higher than normal voltage? The last datasheet I looked at, Vcore is supposed to have a check for overvoltage: the regulator will shut down if the output is off by a certain percentage or more. So normally you're protected from that. The Vcore circuit doesn't have "total control", and I suppose it would be possible for a fault to occur that the regulator chip cannot stop. Only a certain class of faults can be detected by the regulator chip and stopped relatively quickly.

*******

The processor has two levels of thermal protection. It uses throttling first, to try to stop the overheat. If it gets too hot, it uses THERMTRIP, which should cause the power supply to shut off.

THERMTRIP or a Vcore failure might be latching faults. That means, to clear them, all power must be removed from the computer. The easiest way to do that is to flip the rear switch to OFF for 60 seconds, then flip it back on again.

*******

Relative to other components in the computer, processors are extremely reliable. It is possible for a silicon die to crack if it's under enough mechanical stress. And perhaps a resulting short circuit on the die can result in overheating.

The only reason I'm not blaming your power supply at all is your observation of the ongoing overtemperature indication, which would be coming from the processor.

Vcore can handle excursions on +12V without transferring that through to the processor. So if the 12V shot up to 16V, for example, it might fry all the hard drives, but it wouldn't bother Vcore at all. Onboard regulators help to "insulate" motherboard components from power supply problems. Components directly connected to the supply (PCI cards, hard drives) are more at risk.

So right now, all I can suggest is to check whether the heatsink is firmly attached to the motherboard. Sometimes a plastic latch snaps or the like. Also, visually inspect the capacitors around the CPU socket for bulging or leakage.

http://upload.wikimedia.org/wikipedi...pacitor_01.jpg

*******

On some motherboards, the Northbridge heatsink falls off when the solder joint fails on the steel wire holding the heatsink to the motherboard. Your symptoms don't seem to be consistent with that, but you can check that the Northbridge heatsink is securely in place while you're doing your other inspections. At least one Dell computer electrically checks that the heatsink is secure, and the computer won't boot unless the mechanical wire is soldered back into place.

Paul
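The two-level protection Paul describes (throttle first, then a latching THERMTRIP that only clears when all power is removed) can be sketched as a tiny state model. This is only an illustration: the threshold values and the names `THROTTLE_C`, `THERMTRIP_C` and `ThermalProtection` are made up for the example, and the real trip points are set inside the processor and are not given in this thread.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only.
THROTTLE_C = 70.0    # assumed point where throttling starts
THERMTRIP_C = 90.0   # assumed point where THERMTRIP fires


@dataclass
class ThermalProtection:
    tripped: bool = False  # latched THERMTRIP state

    def sample(self, temp_c: float) -> str:
        """Return the system state for one temperature reading."""
        if self.tripped:
            # Latched: cooling down alone does not bring the system back.
            return "off"
        if temp_c >= THERMTRIP_C:
            self.tripped = True
            return "off"
        if temp_c >= THROTTLE_C:
            return "throttled"
        return "normal"

    def remove_all_power(self) -> None:
        """Model flipping the rear PSU switch to OFF, which clears the latch."""
        self.tripped = False
```

In this model a board that still shuts off after `remove_all_power()` is tripping again for a fresh reason, which is the situation described later in the thread.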
#5
Major PC hardware failure
"Puddin' Man" wrote in message ...
> I built a pc around Pentium cpu, Intel D865GVHZ board, Antec case/PS back in '06. [...]
> The Power Supply is Antec Smart-Power 350w. It could have gone bad and overjuiced the cpu causing the 74 C. temp? [...]

I'd test the PSU. Disconnect all leads from the M/B, drives etc. Short the green (PS_ON#) to black (Com) and check the voltages. See

http://www.smpspowersupply.com/connectors-pinouts.html

Around '06 I built 4 systems based on Antec cases with PSUs. 3 out of the four PSUs malfunctioned and destroyed M/Bs and drives within about a month of each other. I changed the PSU in the fourth and the system is still running.

Eric
#6
OT?: Major PC hardware failure
On Thu, 07 Apr 2011 02:30:09 -0400, Paul wrote:
> OK, have you pulled the CPU heatsink/fan and checked for damage? Do you still have thermal paste between the CPU and heatsink?

I installed a new HS/fan with paste the night it crashed.

> Maybe the Vcore circuit overvolted and ran the CPU at higher than normal voltage?

Sounds likely.

> [...]
> THERMTRIP or a Vcore failure might be latching faults. That means, to clear them, all power must be removed from the computer. The easiest way to do that is to flip the rear switch to OFF for 60 seconds, then flip it back on again.

I've done that numerous times. No help.

> [...]
> The only reason I'm not blaming your power supply at all is your observation of the ongoing overtemperature indication, which would be coming from the processor.

Check. I've installed a new PS and it's made no difference in system behavior. Problem persists.

> [...]
> Components directly connected to the supply (PCI cards, hard drives) are more at risk.

The primary HD didn't fry: it looked OK on the other pc. There were no PCI cards.

> So right now, all I can suggest is to check whether the heatsink is firmly attached to the motherboard. Sometimes a plastic latch snaps or the like. Also, visually inspect the capacitors around the CPU socket for bulging or leakage.

All that has been done. No evident problem.

> [...] you can check that the Northbridge heatsink is securely in place while you're doing your other inspections. [...]

The Northbridge heatsink is fine. Everything on the board looks OK. I power it up for about 20 secs, then it shuts all power down.

Should there be some sort of Vcore reset in bios? Any other way to effect a reset?

Thanks,
P

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."
#7
Major PC hardware failure
On Thu, 7 Apr 2011 11:42:45 +0100, "Eric Parker" wrote:
> I'd test the PSU. Disconnect all leads from the M/B, drives etc. Short the green (PS_ON#) to black (Com) and check the voltages. See http://www.smpspowersupply.com/connectors-pinouts.html
> Around '06 I built 4 systems based on Antec cases with PSUs. 3 out of the four PSUs malfunctioned and destroyed M/Bs and drives within about a month of each other. [...]

I'll assume the Antec PS caused the problem 'till I have evidence to the contrary. I have a new Extreme 460w PS installed: problem persists.

Never again Antec PS for me.

Thanks,
P

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."
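For the "check the voltages" step of the PS_ON# test quoted above, here is a minimal sketch of comparing multimeter readings against nominal rail values. It assumes the commonly cited ATX tolerances (+/-5% on the positive rails, +/-10% on -12 V); the table `ATX_RAILS` and the function `check_rails` are names invented for this example, and some supplies will not regulate properly with no load attached, so an out-of-range reading on a bare PSU is a hint rather than proof.

```python
from typing import Dict, Tuple

# Nominal voltage and allowed fractional deviation per rail.
# Assumption: the usual ATX tolerance figures, not taken from this thread.
ATX_RAILS: Dict[str, Tuple[float, float]] = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}


def check_rails(measured: Dict[str, float]) -> Dict[str, bool]:
    """Map each measured rail to True if it is within tolerance."""
    results = {}
    for rail, volts in measured.items():
        nominal, tolerance = ATX_RAILS[rail]
        results[rail] = abs(volts - nominal) <= abs(nominal) * tolerance
    return results


if __name__ == "__main__":
    # Example readings (made up) from a PSU started with PS_ON# shorted to COM.
    print(check_rails({"+12V": 12.3, "+5V": 4.6, "-12V": -11.2}))
```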
#8
OT?: Major PC hardware failure
Puddin' Man wrote:
> The Northbridge heatsink is fine. Everything on the board looks OK. I power it up for about 20 secs, then it shuts all power down.
> Should there be some sort of Vcore reset in bios? Any other way to effect a reset?

Your D865GVHZ should have THERMTRIP, and shutting off after 20 seconds could be due to that.

The BIOS obviously has the skill set to shut off the computer too, but then the question would be, why. Some BIOSes will check for a minimum rotation speed on the CPU fan, and shut off the computer if they don't detect it. That would be for the CPU fan, while other fans are allowed to die if they want. Usually, only the CPU fan would be considered a reason to shut off the computer.

Vcore is programmable a couple of ways. On a non-enthusiast motherboard, they would use the facilities provided by default. The processor has VID signals on the bottom of it. The bit pattern specifies a voltage. Those can be connected directly to a similar interface on the Vcore regulator chip. Modern processors even adjust that voltage dynamically, as a function of computing load (a feature of Intel SpeedStep, otherwise known as EIST).

On an enthusiast motherboard, the Vcore may have an additional input feature that allows adding an offset to the "regular" voltage value. That is used when overclocking. On my last motherboard, I found a pin on the Vcore regulator by which I could offset the voltage by 0.1 volts, so I could try overclocking (even though the motherboard didn't support it). When a BIOS supports such a feature, there is an actual Vcore setting in the BIOS screen (sometimes it'll be "+0.1", implying a bump of 0.1 volts, rather than the BIOS screen giving an absolute voltage as such). Your Intel board probably doesn't have that, and just uses the VID code as it comes from the processor and goes to Vcore.

Since you can't boot the system right now, there is no way to get into Windows and use CPU-Z to check what it sees as the current Vcore value. I also checked the user manual for the motherboard, and I don't see a hardware monitor page in there. One option would have been to use the 20 seconds to visit the hardware monitor page and check a Vcore reading there. (Some SuperI/O chips have eight channels of ADC, or analog to digital conversion, for making crude 8 bit voltage measurements. They use that to measure the three main power supply rails and Vcore, and present that in a BIOS hardware monitor page.) Without that, you'd have to get out a multimeter and figure out where to probe, if you thought it was actually a Vcore problem.

Note that a motherboard can "run" without a processor installed in it. You would remove the memory and the processor, and operate with just the bare board. Pressing the switch on the front of the computer, with the panel wiring attached, should allow the power supply to be turned on and off. In this case, the purpose of the test would be to verify the control circuit is working. If the thing still died after 20 seconds, then you'd know it likely wasn't the processor.

In such a scenario, with the motherboard running without a processor installed, the VID signals are "floating": they float to an all 1's logic value, and VCore gets set to zero volt output when that happens. That means, effectively, the VCore regulator is not loaded. So the motherboard is pretty well neutered. You would expect the power to remain running, because many of the "switch-off" mechanisms have been removed.

I don't particularly see a reason to try a CMOS reset or the like in this case. You can try it, but at the very least, remove all power from the system before doing it. Intel tends to do things differently than other manufacturers, so I don't even know what the gotchas are with their procedure.

I don't think the BIOS can set the voltage to anything dangerous. It doesn't give the impression of an enthusiast board when I read the manual. I mean, not finding a hardware monitor page seemed pretty bad. You'd find that feature available on a lot of other brands.

Paul
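To make the "bit pattern specifies a voltage" point concrete, here is a rough sketch of decoding a VID code. It assumes the 5-bit VRM 9.0-style table (1.850 V at code 00000, stepping down 25 mV per code, with all 1's meaning "output off"), which is an assumption on my part: the thread does not say which VID table this processor and regulator use, and later Pentium 4 parts use a different 6-bit scheme. The function name `decode_vid_vrm9` is invented for the example.

```python
from typing import Optional


def decode_vid_vrm9(vid: int) -> Optional[float]:
    """Return Vcore in volts for a 5-bit VID code, or None for 'output off'.

    Assumes a VRM 9.0-style table: 0b00000 -> 1.850 V, 25 mV lower per step,
    and 0b11111 (what floating VID pins read as) -> regulator output off.
    """
    if not 0 <= vid <= 0b11111:
        raise ValueError("VID must be a 5-bit value")
    if vid == 0b11111:
        return None
    return round(1.850 - 0.025 * vid, 3)
```

With no processor in the socket the pins float to `0b11111`, so the function returns `None`, matching the zero volt output described above.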
#9
OT?: Major PC hardware failure
Problem persists. Clear CMOS doesn't help.
I strongly suspect that the "thermal event" kicked THERMTRIP in, and that THERMTRIP isn't allowing the default reset that would enable resumption of operation (deficiency in THERMTRIP design).

IIRC, the system has a Winbond chip, and there is a temp, etc. monitor program (that doesn't function properly). Runs under Windoze, of course.

Thanks for the cpu-less test info: I didn't know about that. I have an old Celeron S-478 cpu that I can sub. I guess I'll wait until tomorrow, do the cpu-less test, and if it keeps "running", install the Celeron, which is not THERMTRIP-ed.

Many thanks, more info tomorrow.

Prost,
P

On Thu, 07 Apr 2011 18:59:28 -0400, Paul wrote:
> Your D865GVHZ should have THERMTRIP, and shutting off after 20 seconds could be due to that. [...]
> Note that a motherboard can "run" without a processor installed in it. You would remove the memory and the processor, and operate with just the bare board. [...]
> I don't particularly see a reason to try a CMOS reset or the like in this case. [...]

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."
#10
OT?: Major PC hardware failure
Nothin' works!

I meant to wait 'till tomorrow, but these thangs bother and bother me, and I went ahead and fiddled the damn thang.

I cleared CMOS again, then pulled cpu and mem. Powered up and ... nuthin'! No fans, no anything.

Installed old Celeron (which ran in this system years ago), HS/fan (w paste), mem. Powered up and ... it spun the fans for a second or 2, then shut down everything. This is what it has been doing lately: a sec or 2, then shut down.

Beats me. I was hoping at least the board would spin the fans w/o cpu, mem. Gawd, whotta headache!

Any ideas? Mobo is now highly suspect??

Thx,
P

"Law Without Equity Is No Law At All. It Is A Form Of Jungle Rule."