#1
Defective RAM module - or is it?
Hi all.
This is not really OC related, but I assume you have experience with the topic.

Here's my problem in short: I'm pretty sure I have a defective SO-DIMM DDR2 RAM module, but memtest86+ ran a complete pass without any error. The module came with the notebook, which is still under warranty, but I'd like to have some kind of proof that this was really the source of the problem.

And here it is again, with a bit of added history: A few days ago, my Windows XP SP2 system started behaving unstably. At that time, I was using two 1 GiB DDR2 modules, which the motherboard automatically ran in dual-channel configuration. The modules were not exactly the same: one had come with the notebook (bought 7 months ago), the other I had added recently. The motherboard used the slower of the two timings.

The problems began: Firefox would crash (which is not unusual, since I use the latest nightly builds), twice I got the same bluescreen (Page fault in nonpaged area), and upon reboot I was notified that the registry had had to be restored. This was all within half a day. I tried a few things (fsck, registry cleaning etc.) but the problems persisted, so I wiped the partition and began reinstalling the system.

While copying from CD, the installer claimed it was unable to copy some files to the HD, but the CD was in good condition. I tried another CD, also basically unscratched, but got read errors there as well. After a few attempts, the files were finally copied without errors, but in the next part of the installation I got a bluescreen: Page fault in nonpaged area. It became clear then that I had a hardware problem.

I ran memtest86+ for a full pass, and had the HD run a SMART extended self-test, all without indication of a problem. Linux ran stable all the time, though I didn't use it much, so it could be coincidence. I removed the new RAM module and reinstalled, and this time it went cleanly. But while installing the drivers, odd errors kept occurring again, just a little too frequently, and then I got the messages about restoring the registry again after reboot, and the soundcard driver kept crashing... and Firefox too. I wiped the system again, took out the old RAM module which came with the notebook, and replaced it with the new one.

This time, the installation of the system, drivers, updates and everything was as smooth as it could be. I haven't had a single crash or strange error in three days now.

Sorry for the long story, but I wanted to give as much background info as possible. Now I want to return the probably defective module, but I'd like to have some kind of proof first that it really is at fault. If all I can say is "Windows crashed", then they'll probably look at me as if I'd said water was wet. Also, I'd like to know for sure so I can stop worrying about whether my system really is good again. So, what do you suggest? 24-hour memtest? Statistical crash analysis? I'm open to all your suggestions.

Simeon
#2
Memtest86+ (and a tester that Microsoft provides) are good in the sense that both work without an OS, which means the maximum amount of memory gets tested.

A more strenuous test is Prime95 or Orthos. Both do a calculation with a known answer, so they can check for calculation errors. An error could be due to a bad CPU, a bad Northbridge (memory controller), or bad memory. Since the test is a bit more strenuous than Memtest86+, stability problems can be detected a bit better.

Prime95 (use the Torture Test option; available for Linux and Windows)
http://www.mersenne.org/freesoft.htm

Orthos (basically multiple copies of Prime95, designed for dual core)
http://sp2004.fre3.com/beta/beta2.htm

It is possible that Prime95 will make it easier for your warranty repair people to see the problem. A computer in good working order should be able to run Prime95 for hours and hours without it detecting an error.

HTH,
Paul
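The "calculation with a known answer" idea can be illustrated with a toy sketch. This is not Prime95's or Orthos's code; it only assumes the general principle Paul describes: repeat a deterministic computation whose result is known in advance and flag any mismatch. The Lucas-Lehmer test and the small table of known results below are chosen for illustration, and the function names are hypothetical. A pure-Python loop like this will not load the hardware anywhere near as heavily as the real tools.

```python
# Toy known-answer check (illustrative only, not Prime95's implementation).
# KNOWN maps an exponent p to whether the Mersenne number 2**p - 1 is prime.
KNOWN = {
    3: True, 5: True, 7: True, 11: False, 13: True, 17: True, 19: True,
    23: False, 29: False, 31: True, 61: True, 89: True, 107: True, 127: True,
}


def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime (valid for odd prime p)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def torture(rounds: int = 100) -> list:
    """Repeat the computation and collect (round, exponent) mismatches."""
    mismatches = []
    for r in range(rounds):
        for p, expected in KNOWN.items():
            if lucas_lehmer(p) != expected:
                mismatches.append((r, p))
    return mismatches


if __name__ == "__main__":
    bad = torture()
    print("no mismatches" if not bad else f"mismatches: {bad}")
```

On healthy hardware the list of mismatches should stay empty; any entry means the machine computed a different result from the known one.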
#3
'Simeon Maxein' wrote, in part:
| Hi all.
|
| This is not really OC related, but I assume you have experience with the
| topic.
|
| Here's my problem in short: I'm pretty sure I have a defective SO-DIMM
| DDR2 RAM module, but memtest86+ ran a complete pass w/o any error. The
| module came with the Notebook which still has warranty on it, but I'd
| like to have some kind of proof that this was really the source of the
| problem.
_____

I agree with the post from 'Paul'. There are many problems that could cause the symptoms you report. At the moment, you have only a coincidence, and only a meagre one at that, if the problem did not start IMMEDIATELY after installing the new memory. You did cause mechanical stress when installing the new memory module, so that is another possible explanation for the association in time, and another indication of possible motherboard mechanical problems.

Motherboard problem; perhaps it only appears when TWO modules are installed; controller problems (CD read problem, I/O errors when copying files).

You don't really have the responsibility of diagnosing the problem; your warranty guarantor does. Knowing the exact diagnosis merely helps you get faster service.

Since you have multiple kinds of error, mainly associated with data transfer, I'd suspect the motherboard. A mechanical fault in the motherboard is far more likely than an intermittent memory problem. The failure rate of notebook computers is several percent in the first year of operation; the failure rate of memory modules is orders of magnitude lower.

Things you can easily do for differential diagnosis:
1. Try to recreate the problem with just the original memory module.
2. Try to recreate the problem with just the new memory module.
3. Swap the positions of the memory modules.
4. RMA the new memory module, then try to recreate the problem with the replacement.

Use Orthos (http://sp2004.fre3.com/beta/beta2.htm) as Paul suggested, but be sure to pick the 'Blend - stress CPU and Memory' option; otherwise very little of the installed memory will be used. Orthos will stress the system, but it and programs like Prime95 are not really the correct kind of test, because they make no attempt to test all of memory and are mainly useful for CPU stability tests.

Remove the new memory module and get warranty service on your notebook. You could just skip to this step, as it is the likely solution.

Phil Weldon
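To make the point about testing "all of memory" concrete, here is a minimal sketch of what a pattern test does: write a known bit pattern into a buffer, read it back, and count bytes that differ. Everything here (the function name, the buffer size, the four patterns) is an assumption for illustration. A user-space script like this only touches whatever memory the OS hands it, so it is far weaker than a standalone tester such as memtest86+ and is no substitute for one.

```python
# Minimal user-space pattern check (illustrative; much weaker than memtest86+).
def pattern_test(size_mb: int = 16, patterns=(0x00, 0xFF, 0x55, 0xAA)) -> int:
    """Fill a buffer with each pattern, read it back, return mismatched bytes."""
    size = size_mb * 1024 * 1024
    buf = bytearray(size)
    errors = 0
    for pat in patterns:
        buf[:] = bytes([pat]) * size      # write the pattern everywhere
        errors += size - buf.count(pat)   # count bytes that did not hold it
    return errors


if __name__ == "__main__":
    print("mismatched bytes:", pattern_test())
```

The alternating patterns 0x55 and 0xAA flip every bit between passes, which is the usual reason testers include them.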
#4
Hello again.
Thanks so far. I am just now running Orthos with only the old memory module inserted. It's been running in Blend mode for 40 minutes now, without reporting an error. However, I had an error installing the JRE for Firefox just now (again, could be coincidence; that's the trouble with problems you can't reproduce).

And I've had another idea. Most errors occurred when large amounts of data were transferred from/to the HD. The disk itself claims to be innocent (by SMART data and self-test), but I thought recreating similar stress should produce some result. And it worked, too: I just tried to create QuickPar ECC data for a large file (700 MB), and it failed with a checksum error. This should exclude dual-channel problems, at least as the single source of the trouble.

I'll try to repeat the test a few times once Orthos finishes one round, both on the internal HD and on my USB drive (which is also a 2.5" HD). Then I'll swap the memory modules again and repeat the tests. 10 repetitions per configuration should already give statistically significant results, and I can probably finish this today.

Simeon
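Whether 10 repetitions per configuration are enough depends on how often the failure actually shows up. One way to check is a one-sided Fisher exact test on the two failure counts; the sketch below is an assumption about how one might do it, not something from the thread, and the function name and example counts are hypothetical.

```python
from math import comb


def fisher_one_sided(fail_a: int, n_a: int, fail_b: int, n_b: int) -> float:
    """P-value for seeing at least fail_a failures in configuration A,
    given the total number of failures, if both configurations were
    equally likely to fail (hypergeometric tail)."""
    total_fail = fail_a + fail_b
    denom = comb(n_a + n_b, total_fail)
    p = 0.0
    for k in range(fail_a, min(n_a, total_fail) + 1):
        p += comb(n_a, k) * comb(n_b, total_fail - k) / denom
    return p


if __name__ == "__main__":
    # Hypothetical outcomes: failures with the old module vs. the new one.
    for fails in (10, 5, 3):
        p = fisher_one_sided(fails, 10, 0, 10)
        print(f"{fails}/10 vs 0/10 -> p = {p:.6f}")
```

With these made-up numbers, 10 failures out of 10 against 0 out of 10 is overwhelming evidence, 5 against 0 still falls below the conventional 0.05 threshold, but 3 against 0 does not. An intermittent fault may therefore need more than ten runs per configuration.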
#5
HD seems to be the culprit. The Abit NF7-S was notorious for corrupting data during SATA RAID, but not PATA RAID. Timings on the HD controller had to be increased to 1 ms to prevent corruption. Man, that was annoying. Maybe your mobo has the same problem.

-- Phil
#6
Phil wrote:
| HD seems to be the culprit. The Abit NF7-S was notorious for corrupting
| data during SATA RAID, but not PATA RAID. Timings on the HD controller
| had to be increased to 1ms to prevent corruption. Man that was annoying.
| Maybe your mobo has the same problem.

I've just excluded that. My external USB drive gave the same problem. I was able to recreate this error ten times in a row, five times on my internal and five times on my external HD. The verification failed at different points in the test each time. However, after that the problem didn't show up anymore. I'm still testing, and have a new suspect (I need more testing before I tell you something misleading), but the HD is quite safely marked OK now.

Simeon
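The repeated QuickPar runs amount to a write, read-back, and checksum-compare loop. A rough equivalent that can be pointed at different drives is sketched below. It is not what QuickPar does internally; the function names, file size, and trial count are assumptions. Note also that the operating system may serve the read-back from its cache in RAM instead of from the disk, so a mismatch points at the data path as a whole and not at the drive alone.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory: str, size_mb: int = 64, chunk_mb: int = 1) -> bool:
    """Write random data to a temp file in `directory`, read it back,
    and return True if the SHA-256 of both passes matches."""
    chunk = chunk_mb * 1024 * 1024
    expected = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb // chunk_mb):
                block = os.urandom(chunk)
                expected.update(block)   # hash what we intended to write
                f.write(block)
            f.flush()
            os.fsync(f.fileno())         # ask the OS to push data to the device
        actual = hashlib.sha256()
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                actual.update(block)     # hash what we got back
        return expected.hexdigest() == actual.hexdigest()
    finally:
        os.remove(path)


def run_trials(directory: str, trials: int = 10, size_mb: int = 64) -> int:
    """Return the number of trials whose read-back checksum did not match."""
    return sum(1 for _ in range(trials) if not write_and_verify(directory, size_mb))


if __name__ == "__main__":
    print("failed trials:", run_trials(tempfile.gettempdir()))
```

Running the same loop against a directory on the internal disk and one on the USB drive gives failure counts per configuration that can be compared directly.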
#7
And hello again.
I've meanwhile ruled out my new potential culprit (CPU voltage), but was unable to recreate the problem after the QuickPar test stopped failing. I'm running on the new memory module again (which is stable, assuming a single point of failure), and should the problem occur again, I'll know it's either the mainboard or the CPU.

In fact, I agree with Phil now that the motherboard is most likely the defective part (Southbridge?), because several devices were making trouble. I think QuickPar was already getting wrong data from the HD, because Orthos, which was running while some of the QuickPar tests failed, never showed a problem at all. Also, my WLAN card failed once when I booted up.

I've already called Toshiba today, and when the problem next occurs, I will send the device in for warranty service. If it doesn't happen anymore, I'll send it in anyway after my exams, with the old memory installed. Something IS wrong with it, after all.

Simeon
#8
You're persistent.

-- Phil