#1
Memtest86+ is always right? So it must be a software problem
On rare occasions, using a pirate version of Windows I got for $5 but am too lazy to change, I get a BSOD. The other day, while using Google Chrome on YouTube, I got one. I think it's software related (I got rid of Google Updater and Google Chrome Frame, which I think are unstable, and I'm also thinking of getting rid of Google Earth, which seems to act like some sort of memory leak).

But to test the hardware, I ran the latest Memtest86+, version 5.01, for 8 hours straight with no errors. I am using 4 GB of Kingston DDR3 memory. BTW an early "beta" version of Memtest showed numerous false positive errors in a certain test involving random test patterns, but running the latest version found no such errors. These errors were clearly some sort of programming error. So by definition it cannot be a hardware problem? Paul once mentioned a loose connection causing an occasional problem on and off, but this is not a laptop and I doubt the PC moves around enough to cause a loose connection. So it must be software.

I use msconfig.exe to check startup services, and disable stuff I think is unstable (see above). Anything else I can do? Any sort of TSR-type program I can run to check for memory leaks over time while I work? It's rare that I ever got a BSOD with a Genuine Advantage-validated copy of Windows, so maybe it's the pirate copy? But it had not given me any problems until the last year or so. So again that points to a software problem, unless some rare alpha particle has corrupted my memory but somehow fails to be detected by Memtest86+?

RL
#2
RayLopez99 wrote:
> [...] But to test the hardware, I ran for 8 hours straight, with no errors, the latest Memtest86+, version 5.01. [...] BTW an early "beta" version of Memtest showed numerous false positive errors in a certain test involving random test patterns, but running the latest version found no such errors. [...] So again that points to a software problem, unless some rare alpha particle has corrupted my memory but somehow fails to be detected by Memtest86+?

I think there were some problems created at first, when the memtest author attempted to do multithreaded testing. I don't know all the details on that. See if there is an option in the Memtest interface to turn that off and do all the testing with one core. The earlier versions only tested with one core.

And that makes sense, as the processor is normally faster than the memory subsystem and can keep it pretty busy. I don't know if multithreading is all that necessary. It could be that multithreaded testing was an attempt to simulate the thoroughness you get from Prime95 testing.

*******

As a programmer, you should be setting up your system for debugging. Make sure that when the system BSODs, it creates a memory dump, and that you have a set of symbol files for the OS, so you can debug what shows in the crash. It's possible windbg can read a large dump file for you.

http://en.wikipedia.org/wiki/WinDbg

When an application crashes, you can configure a system to not report to Microsoft, and instead create a .dmp file. That's a minidump, and can be read with BlueScreenView. That's a relatively small file, with a stack trace in it.

When the OS crashes, I think it uses the pagefile as a place to dump, so the pagefile has to be big enough to hold all of memory. Something like that. I haven't attempted to run a debugger in some time, so have conveniently forgotten all the details :-) I may have a copy of windbg loaded in a VM here, from when I was trying to get debug information when compiling a debug version of Firefox in Win2K.

On other platforms, usually a different debugger is used for kernel debug (like kdb) versus program debugging (maybe gdb). It's possible Windows does both with the one tool suite. But don't quote me on that. Any time I have to do this stuff, I have to research it all over again.

*******

Memtest86+ (memtest.org) is not the final authority on stable memory. There is a tradeoff between good coverage (testing all the memory) and being thorough.

Memtest86+ tests almost all of the memory. It misses the low 1MB of memory, which contains the 640K area Bill Gates was so proud of. Any area like that which is marked as "reserved", memtest cannot touch.
Thorough, 100% testing requires putting the memory under test into single channel mode, with two sticks installed. That makes one DIMM the "high DIMM" and the other the "low DIMM". Memtest86+ then misses testing 1MB of space on the "low DIMM". By then swapping DIMMs and doing another test, the moved DIMM gets 100% tested. The high DIMM is completely tested when you run in that special single channel setup. You must inspect the color of the motherboard RAM sockets to figure out how to get single channel mode. On some rare platforms (LGA2011 microATX), there may not be the slots needed to do single channel mode, and proper testing then is not possible. That is only an issue if you suspect a problem in the 640K area.

The final authority on memory is the Prime95 Torture Test (or for that matter, any test constructed to do similar things that came after Prime95). It doesn't cover all the memory, as the OS "owns" about 300MB. I think there is some Intel test that is used as a thorough test like that. These tests tend to be multithreaded, and avoid some level of self-synchronization. When you run Prime95, the individual threads can get ahead of one another in terms of progress, so they may not keep the same time relationship with each other while running. From a hardware noise perspective, it's good if the threads tend not to "lock" with one another in the time domain.

You can get Prime95 from mersenne.org/freesoft. Versions are available for Windows and Linux. On Windows, for my own systems here, I play a DirectX 3D game at the same time as Prime95 has a thread per core running. That seems to be a good test case for proving the system is stable.

In Prime95, a thread of execution stops any time a "roundoff error" is detected. And one presumes a significant "roundoff error" is being caused by memory corruption. It could also be caused by a flaky FPU on a processor, which may happen once in a blue moon (there was a bad batch of Intel processors with a problem like that). Prime95 knows what the answer of any of the FFTs it runs should be, and that's how it knows what to expect and how it can claim a "roundoff error".

My acceptance test is a 4 hour run with Prime95, where no thread of execution stops on an error. Other people like to run it all night.

Paul
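For anyone curious what a round-off check looks like in principle, here is a minimal Python sketch. It is not Prime95's code, and the function names, the base-10 digit splitting, and the 0.4 threshold are assumptions picked for illustration. The idea: multiply two integers with a floating-point FFT, then confirm every result coefficient lands very close to a whole number. On healthy hardware the distance is tiny; a flipped bit somewhere in the data would typically push a coefficient far from an integer.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply_with_roundoff_check(a, b, max_error=0.4):
    """Multiply non-negative ints via FFT and report the worst round-off seen.

    Returns (product, worst_error). Raises ArithmeticError if any coefficient
    lands further than `max_error` from an integer.
    """
    da = [int(ch) for ch in reversed(str(a))]   # least significant digit first
    db = [int(ch) for ch in reversed(str(b))]
    size = 1
    while size < len(da) + len(db):
        size *= 2
    fa = fft(da + [0] * (size - len(da)))
    fb = fft(db + [0] * (size - len(db)))
    raw = fft([x * y for x, y in zip(fa, fb)], invert=True)

    worst = 0.0
    product = 0
    for i, c in enumerate(raw):
        value = c.real / size                  # undo the FFT scaling
        nearest = round(value)
        worst = max(worst, abs(value - nearest))
        if worst > max_error:
            raise ArithmeticError(f"round-off error {worst:.3f} at coefficient {i}")
        product += nearest * 10 ** i           # place value absorbs the carries
    return product, worst


if __name__ == "__main__":
    p, err = multiply_with_roundoff_check(123456789, 987654321)
    print(p == 123456789 * 987654321, f"worst round-off: {err:.2e}")
```

Because the exact answer is known to be made of integers, the check needs no second copy of the result to compare against, which is what makes it cheap enough to run on every iteration.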
#3
On Sat, 16 Nov 2013 21:59:53 -0800 (PST), RayLopez99 wrote:
> I use msconfig.exe to check startup services, and disable stuff I think is unstable (see above). Anything else I can do? Any sort of TSR type program I can run to check for memory leaks over time while I work? It's rare that I ever got a BSOD with Microsoft Genuine Advantage version Windows, so maybe it's the pirate copy? But to date it has not given me any problems, until the last year or so.

I never get any problems until the last year or so, usually being the last years, or so, and potentially hardware related;- I'm going to cut the front out of my cases, those I need to, to expose the front fans for when the next one with frozen bearings burns up.

When I do, however, get problems that need to be addressed, although they may surface fast for as early as within a week, it's usually intimate because I've been using the same ghosted OS, in binary images, longer than I care to reveal. (Everything, btw, is migrated to a SSD).

One of my recent additions along the lines of process monitoring is PL*, although I can't offhand recall the differences to its free and paid incarnations. . .

My oldest browser, btw, I do use frequently, I keep its processes heavily contained within both process and filtering rules -- I'm less comfortable engaging upon a few newer browsers more apt to be shadowed (from their original install configurations) -- through such as batched preprocessing calls designed to determine an incremental backup point based on changes (mutations) the browser engages, upon being and while connected, for synchronization purposes by another program, within the batch call, as in essence to restorative measure subsequent, or each time the browser is run. (With TOR exit nodes and potential NSA middleman hijacks, it can get more involved than that...although not terribly so.)

* https://en.wikipedia.org/wiki/Process_Lasso
#4
On 17/11/2013 12:59 AM, RayLopez99 wrote:
> On rare occasions, using a pirate version of Windows I got for $5 but am too lazy to change, I get a BSOD. The other day, while using Google Chrome on Youtube, I got such a problem. I think it's software related [...]

I'd first try to find out what the BSOD is all about before I even start to blame it on bad memory. Not all BSODs are caused by bad memory. Run a crash dump analysis program. Here are two good ones:

Resplendence Software - WhoCrashed, automatic crash dump analyzer
http://www.resplendence.com/whocrashed

Blue screen of death (STOP error) information in dump files.
http://www.nirsoft.net/utils/blue_screen_view.html

Yousuf Khan
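Before reaching for either analyzer, it can help to confirm that Windows is actually writing dump files. The Python sketch below simply lists the newest `.dmp` files in a folder. The function name is made up for this example, and `C:\Windows\Minidump` is assumed to be the dump location (it is the usual default for small memory dumps, but it can be changed in the startup and recovery settings).

```python
from datetime import datetime
from pathlib import Path


def recent_minidumps(folder, limit=5):
    """Return up to `limit` (name, size_bytes, modified) tuples, newest first."""
    dumps = sorted(
        Path(folder).glob("*.dmp"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return [
        (p.name, p.stat().st_size, datetime.fromtimestamp(p.stat().st_mtime))
        for p in dumps[:limit]
    ]


if __name__ == "__main__":
    # Assumed default location; adjust if the dump path was changed.
    for name, size, when in recent_minidumps(r"C:\Windows\Minidump"):
        print(f"{when:%Y-%m-%d %H:%M}  {size:>10,d} bytes  {name}")
```

An empty listing after a known BSOD suggests dump creation is turned off or the pagefile is too small, which would explain an analyzer showing nothing.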
#5
On Monday, November 18, 2013 11:56:05 PM UTC+8, Yousuf Khan wrote:
> I'd first try to find out what the BSOD is all about before I even start to blame it on bad memory. Not all BSOD's are caused by bad memory. Run a crash dump analysis program. Here are two good ones: [...]

Thanks Khan! Thanks Paul and Flasherly. As for debugging the BSOD, I tried to use Visual Studio 2010's debugger, but for some strange reason could not get it to work. Next time I will try Khan's solution, having seen this caveat:

"Note that WhoCrashed cannot always be exactly sure about the root cause of a system crash. Because all kernel modules run in the same address space, any driver or other kernel module can potentially corrupt another. Also, any driver may be able to cause problems to any other driver that runs in the same device stack. This is to say this software is not guaranteed to identify the culprit in every scenario."

Also, the system crashed while I was connected to the internet through a proxy server using Google Chrome, having just run Google Earth and CCleaner to clean unwanted files. Perhaps this combination was too much for Google's software? I've since removed Google Earth, which seems buggy to me. Anyway, for now the problem has gone away, and it was rare to begin with, but with my other systems I never got a BSOD.

RL
#6
On 18/11/2013 12:21 PM, RayLopez99 wrote:
> Thanks Khan! Thanks Paul and Flasherly. As for debugging the BSOD, I tried to use Visual Studio 2010's debugger, but for some strange reason could not get it to work. Next time I will try Khan's solution, having seen this caveat: "Note that WhoCrashed cannot always be exactly sure about the root cause of a system crash. [...] This is to say this software is not guaranteed to identify the culprit in every scenario."

I started out using the Microsoft Visual Studio debugger, but once I found BlueScreenView and, to a lesser extent, WhoCrashed, there was never a reason to use the manual method any longer. It was a lot of fun using the VS debugger: you really got to learn how to follow the path of your programs, and even look at the assembly code involved in the crash, but it was a lot of unnecessary work. These automatic crash dump analysis programs do the same job, just a lot faster and with less hassle.

Yes, it's true that one driver could corrupt another driver, but that's always the case; you'd be just as fooled whether you were doing the manual debug or the automatic one.

Did the auto debuggers tell you what the cause of the previous crashes was?

Yousuf Khan
#7
On Monday, December 2, 2013 8:14:01 AM UTC+8, Yousuf Khan wrote:
> Did the auto debuggers tell you what the cause of the previous crashes were?

I used the freeware BlueScreenView and I thought it indicated the AMD Radeon graphics card was at fault (that was the last driver that hung, says the program, or so it seemed to say). Currently I am using DriverMax to replace all old drivers (see another thread here) and I've replaced half of them (two a day, the maximum allowed by the freeware version), including the Radeon drivers. So far, fingers crossed, no BSOD after a week, but it's too early to tell, since even before, the BSOD would happen only once a week or so. Worst case I'll do a clean reinstall using a licensed copy of Windows 7 (this is a pirate copy), but I'm too lazy and this workaround is working for now.

I wonder if several bad drivers can affect each other, that is, somehow, depending on the sequence in which they are loaded into memory, they can corrupt each other. I take it that this is unlikely, and more likely one badly written driver has a memory leak. It also seems that disabling Daemon Tools Lite at startup (which is what I was doing) and viewing embedded YouTube videos through a website (that is, not going directly to YouTube but viewing the videos at another website) may have triggered the bad drivers, though that's not clear. Anyway, figuring out the cause of a BSOD is like working backwards in a game of retrograde chess analysis.

RL
#8
On Sunday, December 1, 2013 9:35:48 PM UTC-6, RayLopez99 wrote:
> [...]

I have an older version of Memtest86+ and always saw some errors on test 7. Recently I added another 2GB to that machine and now it never detects any memory errors.
#9
Davej wrote:
> I have an older version of Memtest86+ and always saw some errors on test 7. Recently I added another 2GB to that machine and now it never detects any memory errors.

That could be a memory reservation issue. Memtest relies on some tables the BIOS provides, to warn it about areas it should not use. There might typically be around 1MB of memory that cannot be tested by Memtest86+. And there is some standard BIOS call for getting that information (E820?). You can download the source code and find a reference to that. The memsize.c module does some stuff with that E820 info.

It could be that when you had the lesser amount of memory, the BIOS was not able to correctly report high memory reservations. Then BIOS usage of the affected area conflicted with memtest86+ trying to read/write test there.

Using SMM, I think it is possible for the BIOS to interrupt the execution of memtest86+, so that the BIOS SMM code can run. And there is nothing memtest86+ can do about it (it's an interruption that cannot be blocked, and it also upsets audio workstation users when SMM runs for too long). It could be some SMM code which writes to a reserved area, or does something to upset an area that memtest86+ just wrote.

A typical SMM application might run 30 or 60 times a second. SMM might have been used by the Asus iPanel to drive the display on the iPanel, without the OS knowing what was going on. Later motherboards used SMM to adjust the Vcore regulator running phases (turning off extra regulator phases when they're not needed, to improve efficiency).

http://en.wikipedia.org/wiki/System_Management_Mode
"Control power management operations, such as managing the voltage regulator modules"

(Example of an iPanel, a display device connected to the SMI interrupt to gain attention. The display is updated by BIOS code, on motherboards with support for it. The idea seems silly now, but the point was to avoid users needing an application running to do this instead. So they made the BIOS do it, in the background.)
http://ixbtlabs.com/articles/asusipanelbasic/

The description here suggests the SMI used to trigger SMM may support a roughly 60Hz rate.
http://www.google.com/patents/US5606713

And it could be one of those kinds of undocumented routines or activities in your hardware that is interfering. A properly coded E820 table would have prevented that (by avoiding conflicts between BIOS usage and other usages).

*******

An interesting test you can run is as follows.

Say you have a four DIMM slot motherboard. Two DIMM slots are occupied. Most people install the RAM in dual channel mode, for best performance.

If you have memtest86+ failures, you can move one of the DIMMs around until you're in "Single Channel Mode". That causes one DIMM to be the "High Memory DIMM" and the other DIMM to be the "Low Memory DIMM". In dual channel mode they're interleaved, and it's pretty difficult for a human doing hex in their head to convert a failure address into a particular DIMM fault.

Now, run memtest86+. Note the failure address (assuming a failure is still observed).

Now swap the two DIMMs into each other's slot. The Low Memory DIMM becomes the High Memory DIMM, and vice versa. If you really have a memory problem, the address of the fault will move in proportion to the new physical address of the module.

If you find the faults haven't moved, and the faults are still at the same address, then that's an SMM conflict. Or, it could be.

Other things running on your computer include the Intel Active Management Technology (AMT). But that's only running on Q series chipsets, and the microcontroller located somewhere inside the chipset shares a portion of system memory for its execution. Presumably, the BIOS E820 table is updated, so AMT activity won't upset things like memtest86+. At one time, AMT required that a certain Intel chipset DIMM slot be populated first when fitting the RAM.

*******

Of course it could always be bad RAM :-)

Just a guess,
Paul
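The swap test described in this post boils down to a small decision rule, sketched here in Python. Everything in it is an assumption made for illustration: two equal-size DIMMs, single channel mode with no interleaving (so the low DIMM simply owns the addresses below its size), and the hypothetical function names `owning_dimm` and `interpret_swap`. Real chipsets can map addresses less neatly, so treat the verdicts as hints only.

```python
def owning_dimm(address, low_dimm_bytes):
    """In single channel mode (no interleave), addresses below the size of the
    low DIMM belong to it; everything above belongs to the high DIMM."""
    return "low" if address < low_dimm_bytes else "high"


def interpret_swap(fail_before, fail_after, dimm_bytes):
    """Compare failure addresses from two runs, with the two DIMMs swapped between them.

    fail_before / fail_after: sets of failing physical addresses from each run.
    dimm_bytes: size of each DIMM (both assumed equal).
    Returns a short verdict string.
    """
    if not fail_before or not fail_after:
        return "inconclusive: need a failure in both runs"
    if fail_before & fail_after:
        return "same address after swap: suspect a reserved-area/SMM conflict, not the DIMM"
    halves_before = {owning_dimm(a, dimm_bytes) for a in fail_before}
    halves_after = {owning_dimm(a, dimm_bytes) for a in fail_after}
    if len(halves_before) == 1 and len(halves_after) == 1 and halves_before != halves_after:
        return "fault moved with the module: suspect the DIMM"
    return "inconclusive: fault stayed in the same half but at a different address"


if __name__ == "__main__":
    GIB = 1 << 30
    # Example: a fault at ~305 MB moves up by one DIMM size after the swap.
    print(interpret_swap({0x1234_5678}, {0x1234_5678 + 2 * GIB}, 2 * GIB))
```

The design choice is to report "same address" first, because a fault that ignores the swap is the strongest single hint that the modules themselves are fine.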
#10
On Sun, 1 Dec 2013 19:35:48 -0800 (PST), RayLopez99 wrote:
> Worse case I'll do a clean reinstall using a licensed copy of Windows 7 (this is a pirate copy), but I'm too lazy and this workaround is working for now. I wonder if several bad drivers can affect each other, that is, somehow, depending on how they are loaded (what sequence) into memory, they can corrupt each other. I take it that this is unlikely, and more likely one badly written driver has a memory leak. [...] Anyway it's like working backwards in a game of retrograde chess analysis to figure out the cause of BSOD.

That's what I call going hard core on it, reinstalling the Microsoft OS. Only I may add another layer of successive binary OS images to the strategy of advancing installs [over and upon the initial install].

Seems ATI started out a Canadian company and is now absorbed by AMD;- Hence quite a long history, near since the beginnings of PCs, actually. The one abiding credit I'd most certainly ascribe to ATI to a slang PC dictionary of distinctions and notables, is first, (MS doesn't really count), right up there alongside commercial virus security suites, for having developed an install routine, few, if any, can break down, afterwards, for restorative analysis.

Regardless whether the binaries are for troubleshooting, they do, nevertheless, form a routine integral and periodic application, personally, for wiping off MSOS partition, weekly on average, with a clean binary MSOS;- a backup image never, on certain principle, exposed to the WWW during its creation. I've seen one too many instances of what you're describing, now for the greater part to summarily dismiss what by binary copies expedite within character of anomalies I occasionally do encounter for no other apparent reason than being connected.

(Chess is very patterned for its openings, which evinces at higher levels of play corresponding parity in safety, often declared draws, if not a parity, then, at an earlier axis of extant resources within a leeway subsequent, thereby to tax its masters. I play chess at the edge of its excellent rankings, above average, good players, when I take time, a few weeks of preparatory studies before entering its arena. The "game" as so applied to computers is within patterning absolutes to an established parity, de facto, Microsoft garners every time it adds large venues, e.g., HP and Dell, to its corporate harem of capitalistic hegemony of PCs, certifiably approved, with a pre-installed OS;- a game for latecomers to lose, by default, from the challenge of codework upon deviating from those same preestablished rules, certifiable MSOS code (sic), subsequent in form and contingency to corrupt the "game" environment. The analogy would then seem be given reason for as much use, there cannot be in a contradiction, whereby gain is at all apropos to fix what is upon principle broken to begin with.)