A computer components & hardware forum. HardwareBanter



Memtest86+ is always right? So it must be a software problem



 
 
  #1  
Old November 17th 13, 06:59 AM posted to alt.comp.hardware.pc-homebuilt
RayLopez99
external usenet poster
 
Posts: 897
Default Memtest86+ is always right? So it must be a software problem

On rare occasions, using a pirate version of Windows I got for $5 but am too lazy to change, I get a BSOD. The other day, while using Google Chrome on YouTube, I got one. I think it's software related (I got rid of Google Updater and Google Chrome Frame, which I think are unstable, and I'm also thinking of getting rid of Google Earth, which I think also behaves as if it has some sort of memory leak, or so it seems).

But to test the hardware, I ran the latest Memtest86+, version 5.01, for 8 hours straight with no errors. I am using DDR3 memory by Kingston, 4 GB. BTW an early "beta" version of Memtest showed numerous false positive errors in a certain test involving random test patterns, but running the latest version found no such errors. These errors were clearly some sort of programming error.

So by definition it cannot be a hardware problem? Paul once mentioned a loose connection causing an occasional on-and-off problem, but this is not a laptop and I doubt the PC moves around enough to cause a loose connection.

So it must be software.

I use msconfig.exe to check startup services, and disable stuff I think is unstable (see above). Anything else I can do? Any sort of TSR type program I can run to check for memory leaks over time while I work? It's rare that I ever got a BSOD with Microsoft Genuine Advantage version Windows, so maybe it's the pirate copy? But to date it has not given me any problems, until the last year or so. So again that points to a software problem, unless some rare alpha particle has corrupted my memory but somehow fails to be detected by Memtest86+?

RL
  #2  
Old November 17th 13, 10:39 AM posted to alt.comp.hardware.pc-homebuilt
Paul
external usenet poster
 
Posts: 13,364

RayLopez99 wrote:
On rare occasions, using a pirate version of Windows I got for $5 but
am too lazy to change, I get a BSOD. The other day, while using Google
Chrome on Youtube, I got such a problem. I think it's software related
(I got rid of Google Updater and Google Chrome Frame, which I think are
unstable, and I'm also thinking of getting rid of Google Earth, which also
I think acts like some sort of memory leak, or so it seems).

But to test the hardware, I ran the latest Memtest86+, version 5.01,
for 8 hours straight with no errors. I am using DDR3 memory by Kingston,
4 GB. BTW an early "beta" version of Memtest showed numerous false positive
errors in a certain test involving random test patterns, but running the
latest version found no such errors. These errors were clearly some sort
of programming error.

So by definition it cannot be a hardware problem? Paul once mentioned
a loose connection causing an occasional on-and-off problem, but this is
not a laptop and I doubt the PC moves around enough to cause a loose connection.

So it must be software.

I use msconfig.exe to check startup services, and disable stuff I think is
unstable (see above). Anything else I can do? Any sort of TSR type program
I can run to check for memory leaks over time while I work? It's rare that
I ever got a BSOD with Microsoft Genuine Advantage version Windows, so maybe
it's the pirate copy? But to date it has not given me any problems, until
the last year or so. So again that points to a software problem, unless
some rare alpha particle has corrupted my memory but somehow fails to be
detected by Memtest86+?

RL


I think there were some problems created at first,
when the memtest author attempted to do multithreaded
testing. I don't know all the details on that. See if there
is an option in the interface of Memtest, to turn that
off and do all the testing with one core. The earlier
versions were only testing with the one core. And that
makes sense, as the processor is normally faster than
the memory subsystem, and can keep it pretty busy. I
don't know if multithreading is all that necessary.

It could be that multithreaded testing, was an attempt
to simulate the thoroughness you get from Prime95 testing.

*******

As a programmer, you should be setting up your system
for debugging. Make sure that when the system BSODs, it
creates a memory dump. And, that you have a set of
symbol files for the OS, so you can debug what
shows in the crash. It's possible windbg can
read a large dump file for you.

http://en.wikipedia.org/wiki/WinDbg

When an application crashes, you can configure a system
to not report to Microsoft, and instead create a .dmp file.
That's a minidump, and can be read with BlueScreenView.
That's a relatively small file, with a stack trace in it.

When the OS crashes, I think it uses the pagefile as a place
to dump, so the pagefile has to be big enough to hold all
of memory. Something like that.
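
(Editor's sketch, not part of the original post.) The dump behaviour described above is controlled on Windows by the `CrashDumpEnabled` value under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`. The Python below is only a rough illustration: the helper names are made up for this example, and the "RAM plus about 1 MiB" pagefile figure for a complete dump is a rule of thumb assumed here, not a guarantee for every Windows version.

```python
# Meanings of the CrashDumpEnabled registry value.
# Value 7 (automatic) only exists on newer Windows versions.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
    7: "automatic memory dump",
}

MIB = 1024 * 1024


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, "unknown")


def pagefile_ok_for_complete_dump(ram_bytes, pagefile_bytes):
    """Assumed rule of thumb: a complete dump needs RAM plus about 1 MiB."""
    return pagefile_bytes >= ram_bytes + MIB


def read_crash_dump_setting():
    """Read the live setting; returns None off Windows or if unreadable."""
    try:
        import winreg  # Windows-only standard library module
    except ImportError:
        return None
    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            r"SYSTEM\CurrentControlSet\Control\CrashControl",
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
            return value
    except OSError:
        return None
```

On a Windows box, `describe_dump_setting(read_crash_dump_setting())` would show which kind of dump the next BSOD should leave behind. A small memory dump needs far less pagefile space than a complete one, which is why minidumps are the usual choice for casual troubleshooting.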

I haven't attempted to run a debugger in some time, so
have conveniently forgotten all the details :-) I may
have a copy of windbg loaded in a VM here, for when
I was trying to get debug information when compiling
a debug version of Firefox in Win2K.

On other platforms, usually a different debugger is
used for kernel debug (like, kdb), versus program
debugging (maybe, gdb). It's possible Windows does
both with the one tool suite. But don't quote me on
that. Any time I have to do this stuff, I have to
research it all over again.

*******

Memtest86+ (memtest.org) is not the final authority
on stable memory. This is a tradeoff, between good
coverage (testing all the memory), versus being thorough.

Memtest86+ tests most all of the memory. It misses
the low 1MB of memory, which contains the 640K area
Bill Gates was so proud of. Any area like that which is
marked as "reserved", memtest cannot touch it.

Thorough, 100% testing, requires configuring the memory
under test, into single channel mode, installing two
sticks. That makes one DIMM the "high DIMM", the other
dimm the "low DIMM". Memtest86+ then misses testing
1MB of space on the "low DIMM". By then swapping DIMMs
and doing another test, the moved DIMM then gets 100%
tested. The high DIMM is completely tested, when you
run in that special test setup of using single channel
mode. You must inspect the color of the motherboard RAM
sockets, to figure out how to get single channel. On
some rare platforms (LGA2011 microATX), there may not
be the slots needed to do single channel mode. And proper
testing then is not possible.

That is only an issue, if you suspect a problem in the
640K area.

The final authority on memory, is Prime95 Torture Test
(or for that matter, any test constructed to do similar
things, that came after Prime95). It doesn't cover
all the memory, as the OS "owns" about 300MB.

I think there is some Intel test, that is used as a
thorough test like that.

These tests tend to be multithreaded, and avoid some
level of self-synchronization. When you run Prime95, the
individual threads can get ahead of one another in terms
of progress, so they may not have the same time relationship
between each other when running. From a hardware noise
perspective, it's good if the threads tend not to
"lock" with one another in the time domain.

You can get Prime95 from mersenne.org/freesoft. Versions
are available for Windows and Linux. On Windows, for my
own systems here, I play a DirectX 3D game, at the same
time as Prime95 has a thread per core running. That seems
to be a good test case for proving the system is stable.

In Prime95, a thread of execution stops, any time a "roundoff
error" is detected. And one presumes, a significant "roundoff
error", is being caused by a memory corruption. It could
also be caused by a flaky FPU on a processor, which may
happen once in a blue moon (there was a bad batch of
Intel processors with a problem like that). Prime95 knows
what the answer of any of the FFTs it runs should be, and
that's how it knows what to expect and how it can claim a
"roundoff error".

My acceptance test, is a 4 hour run with Prime95, where
no thread of execution stops on an error. Other people
like to run it all night.
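
(Editor's sketch, not part of the original post.) To make the "roundoff error" idea concrete: when integers are multiplied with a floating-point FFT, every output coefficient should land very close to a whole number, and a large distance from the nearest integer means something disturbed the arithmetic. The Python below is a toy version of that check, not Prime95's actual code. The function names, the 0.4 limit, and the way corruption is injected are all assumptions made for illustration.

```python
import cmath

ROUNDOFF_LIMIT = 0.4  # illustrative threshold, an assumption for this sketch


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply_with_roundoff(a, b, injected_error=0.0):
    """Multiply two non-negative ints via FFT.

    Returns (product, worst distance of any coefficient from an integer).
    """
    digits_a = [int(d) for d in reversed(str(a))]
    digits_b = [int(d) for d in reversed(str(b))]
    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    fa = fft(digits_a + [0] * (size - len(digits_a)))
    fb = fft(digits_b + [0] * (size - len(digits_b)))
    spectrum = [x * y for x, y in zip(fa, fb)]
    # Simulated corruption: shifting bin 0 by injected_error * size moves
    # every output coefficient by injected_error after the inverse FFT.
    spectrum[0] += injected_error * size
    raw = [v.real / size for v in fft(spectrum, invert=True)]
    roundoff = max(abs(x - round(x)) for x in raw)
    product = sum(round(x) * 10 ** i for i, x in enumerate(raw))
    return product, roundoff
```

A clean run of `multiply_with_roundoff(123456789, 987654321)` returns the exact product with a tiny roundoff. Passing `injected_error=0.45` pushes the roundoff past `ROUNDOFF_LIMIT`, which is the point at which a stress test of this kind would stop the thread and report an error.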

Paul
  #3  
Old November 17th 13, 09:20 PM posted to alt.comp.hardware.pc-homebuilt
Flasherly[_2_]
external usenet poster
 
Posts: 2,407

On Sat, 16 Nov 2013 21:59:53 -0800 (PST), RayLopez99
wrote:


I use msconfig.exe to check startup services, and disable stuff I
think is unstable (see above). Anything else I can do? Any sort of
TSR type program I can run to check for memory leaks over time while I
work? It's rare that I ever got a BSOD with Microsoft Genuine
Advantage version Windows, so maybe it's the pirate copy? But to date
it has not given me any problems, until the last year or so.

-

I never get any problems until the last year or so, usually being the
last years, or so, and potentially hardware related;- I'm going to cut
the front out of my cases, those I need to, to expose the front fans
for when the next one with frozen bearings burns up.

When I do, however, get problems that need to be addressed, although
they may surface fast for as early as within a week, it's usually
intimate because I've been using the same ghosted OS, in binary
images, longer than I care to reveal. (Everything, btw, is migrated
to a SSD).

One of my recent additions along the lines of process monitoring is
PL*, although I can't offhand recall the differences to its free and
paid incarnations. . . My oldest browser, btw, I do use frequently, I
keep its processes heavily contained within both process and filtering
rules -- I'm less comfortable engaging upon a few newer browsers more
apt to be shadowed (from their original install configurations) --
through such as batched preprocessing calls designed to determine an
incremental backup point based on changes (mutations) the browser
engages, upon being and while connected, for synchronization purposes
by another program, within the batch call, as in essence to
restorative measure subsequent, or each time the browser is run. (With
TOR exit nodes and potential NSA middleman hijacks, it can get more
involved than that...although not terribly so.)

* https://en.wikipedia.org/wiki/Process_Lasso
  #4  
Old November 18th 13, 04:56 PM posted to alt.comp.hardware.pc-homebuilt
Yousuf Khan[_2_]
external usenet poster
 
Posts: 1,296

On 17/11/2013 12:59 AM, RayLopez99 wrote:
On rare occasions, using a pirate version of Windows I got for $5 but
am too lazy to change, I get a BSOD. The other day, while using
Google Chrome on Youtube, I got such a problem. I think it's
software related (I got rid of Google Updater and Google Chrome
Frame, which I think are unstable, and I'm also thinking of getting
rid of Google Earth, which also I think acts like some sort of memory
leak, or so it seems).


I'd first try to find out what the BSOD is all about before I even start
to blame it on bad memory. Not all BSOD's are caused by bad memory. Run
a crash dump analysis program. Here are two good ones:

Resplendence Software - WhoCrashed, automatic crash dump analyzer
http://www.resplendence.com/whocrashed

Blue screen of death (STOP error) information in dump files.
http://www.nirsoft.net/utils/blue_screen_view.html
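
(Editor's sketch, not part of the original post.) Before pointing one of these tools at a file, it can help to check what kind of dump it is. The Python below assumes the commonly seen file signatures: kernel dumps from a BSOD start with `PAGEDUMP` (32-bit) or `PAGEDU64` (64-bit), while user-mode application minidumps start with `MDMP`. The function names are invented for this example, and it does not parse anything beyond the first 8 bytes.

```python
import os


def dump_kind(path):
    """Classify a .dmp file by its leading signature bytes."""
    with open(path, "rb") as handle:
        signature = handle.read(8)
    if signature == b"PAGEDUMP":
        return "32-bit kernel dump"
    if signature == b"PAGEDU64":
        return "64-bit kernel dump"
    if signature[:4] == b"MDMP":
        return "user-mode minidump"
    return "unrecognized"


def list_dumps(folder):
    """Return (file name, kind) for every .dmp file in a folder."""
    results = []
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".dmp"):
            results.append((name, dump_kind(os.path.join(folder, name))))
    return results
```

On a typical Windows install, `list_dumps(r"C:\Windows\Minidump")` would list the BSOD minidumps that the crash analyzers read. Anything reported as a user-mode minidump came from an application crash rather than a blue screen.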

Yousuf Khan
  #5  
Old November 18th 13, 06:21 PM posted to alt.comp.hardware.pc-homebuilt
RayLopez99
external usenet poster
 
Posts: 897

On Monday, November 18, 2013 11:56:05 PM UTC+8, Yousuf Khan wrote:
On 17/11/2013 12:59 AM, RayLopez99 wrote:
On rare occasions, using a pirate version of Windows I got for $5 but
am too lazy to change, I get a BSOD. The other day, while using
Google Chrome on Youtube, I got such a problem. I think it's
software related (I got rid of Google Updater and Google Chrome
Frame, which I think are unstable, and I'm also thinking of getting
rid of Google Earth, which also I think acts like some sort of memory
leak, or so it seems).

I'd first try to find out what the BSOD is all about before I even start
to blame it on bad memory. Not all BSOD's are caused by bad memory. Run
a crash dump analysis program. Here are two good ones:

Resplendence Software - WhoCrashed, automatic crash dump analyzer
http://www.resplendence.com/whocrashed

Blue screen of death (STOP error) information in dump files.
http://www.nirsoft.net/utils/blue_screen_view.html

Yousuf Khan


Thanks Khan! Thanks Paul and Flasherly.

As for debugging the BSOD, I tried to use Visual Studio 2010's debugger, but for some strange reason could not get it to work. Next time I will try Khan's solution, having seen this caveat: "Note that WhoCrashed cannot always be exactly sure about the root cause of a system crash. Because all kernel modules run in the same address space, any driver or other kernel module can potentially corrupt another. Also, any driver may be able to cause problems to any other driver that runs in the same device stack. This is to say this software is not guaranteed to identify the culprit in every scenario. "

Also the system crashed while I was hooked up to the internet using a proxy server and Google's Chrome, having just run Google Earth and CCleaner to clean unwanted files. Perhaps this combination was too much for Google's software? I've since removed Google Earth which seems to me to be buggy.

Anyway for now the problem has gone away, and anyway it's rare, but with my other systems I never got a BSOD.

RL
  #6  
Old December 2nd 13, 01:14 AM posted to alt.comp.hardware.pc-homebuilt
Yousuf Khan[_2_]
external usenet poster
 
Posts: 1,296

On 18/11/2013 12:21 PM, RayLopez99 wrote:
Thanks Khan! Thanks Paul and Flasherly.

As for debugging the BSOD, I tried to use Visual Studio 2010's debugger, but for some strange reason could not get it to work. Next time I will try Khan's solution, having seen this caveat: "Note that WhoCrashed cannot always be exactly sure about the root cause of a system crash. Because all kernel modules run in the same address space, any driver or other kernel module can potentially corrupt another. Also, any driver may be able to cause problems to any other driver that runs in the same device stack. This is to say this software is not guaranteed to identify the culprit in every scenario. "


I started out using the Microsoft Visual Studio debugger, but once I
found the BlueScreenView and to a lesser extent, WhoCrashed, there was
never a reason to use the manual method any longer. It was a lot of fun
using the VS debugger, you got to really learn how to follow the path of
your programs, and even look at the assembly code involved in the crash,
but it was a lot of unnecessary work. These automatic crash dump
analysis programs do the same job, just a lot faster and with less
hassle to you. Yes, it's true that one driver could corrupt another
driver, but that's always the case, you'd be just as fooled whether you
were doing the manual debug or the automatic one.

Did the auto debuggers tell you what the causes of the previous crashes were?

Yousuf Khan
  #7  
Old December 2nd 13, 04:35 AM posted to alt.comp.hardware.pc-homebuilt
RayLopez99
external usenet poster
 
Posts: 897

On Monday, December 2, 2013 8:14:01 AM UTC+8, Yousuf Khan wrote:
Did the auto debuggers tell you what the cause of the previous
crashes were?


I used the freeware BlueScreenView, and I thought it indicated the AMD Radeon graphics card was at fault (that was the last driver that hung, or so the program seemed to say). Currently I am using DriverMax to replace all old drivers (see another thread here). I've replaced half of them (two a day, the maximum allowed by the freeware version), including the Radeon drivers, and so far, fingers crossed, no BSOD after a week. It's too early to tell, though, since even before this the BSOD would happen only once a week or so. Worst case I'll do a clean reinstall using a licensed copy of Windows 7 (this is a pirate copy), but I'm too lazy and this workaround is working for now.

I wonder if several bad drivers can affect each other, that is, somehow, depending on the sequence in which they are loaded into memory, they can corrupt each other. I take it that this is unlikely, and more likely one badly written driver has a memory leak. It also seems that disabling Daemon Tools Lite at Startup (which is what I was doing) and viewing embedded YouTube videos through a website (that is, not going directly to YouTube but viewing the videos at a website) may have triggered the bad drivers, though that's not clear. Anyway, it's like working backwards in a game of retrograde chess analysis to figure out the cause of a BSOD.

RL
  #8  
Old December 2nd 13, 07:00 AM posted to alt.comp.hardware.pc-homebuilt
Davej
external usenet poster
 
Posts: 273

On Sunday, December 1, 2013 9:35:48 PM UTC-6, RayLopez99 wrote:
[...]


I have an older version of Memtest86+ and always saw some errors on test 7. Recently I added another 2GB to that machine and now it never detects any memory errors.
  #9  
Old December 2nd 13, 07:59 AM posted to alt.comp.hardware.pc-homebuilt
Paul
external usenet poster
 
Posts: 13,364

Davej wrote:
On Sunday, December 1, 2013 9:35:48 PM UTC-6, RayLopez99 wrote:
[...]


I have an older version of Memtest86+ and always saw some errors on test 7. Recently I added another 2GB to that machine and now it never detects any memory errors.


That could be a memory reservation issue.

Memtest relies on some tables the BIOS provides, to warn
it about areas it should not use. There might typically
be around 1MB of memory that cannot be tested by Memtest86+.
And there is some standard BIOS call for getting that information (E820?).

You can download the source code, and find a reference to that.
The memsize.c module does some stuff with that E820 info.

It could be, that when you had the lesser amount of memory,
the BIOS was not able to correctly report about high memory
reservations. Then, BIOS usage of the affected area,
conflicted with memtest86+ trying to read/write test there.

Using SMM, I think it is possible for the BIOS to
interrupt the execution of memtest86+, so that the
BIOS SMM code can run. And there is nothing memtest86+
can do about it (it's an interruption that cannot be
blocked, and it also upsets audio workstation users when
SMM runs for too long).

It could be some SMM code, which writes to a reserved area,
or does something to upset an area that memtest86+ just wrote.
A typical SMM application, might run 30 or 60 times a second.

SMM might have been used by the Asus iPanel, to
drive the display on the iPanel, without the OS
knowing what was going on. Later motherboards used
SMM to adjust the Vcore regulator running phases
(turn off extra regulator phases when they're not needed,
to improve efficiency).

http://en.wikipedia.org/wiki/System_Management_Mode

"Control power management operations, such as managing
the voltage regulator modules"

(Example of an iPanel, a display device connected to SMI
interrupt to gain attention. The display is updated by
BIOS code, on motherboards with support for it. The idea
seems silly now, but the idea was to avoid users needing
an application running to do this instead. So they made
the BIOS do it, in the background.)

http://ixbtlabs.com/articles/asusipanelbasic/

The description here, suggests the SMI to trigger SMM,
may support a roughly 60Hz rate.

http://www.google.com/patents/US5606713

And it could be one of those kinds of undocumented
routines or activities in your hardware, that
is interfering. A properly coded E820 table
would have prevented that (avoid conflicts between
BIOS usage and other usages).

*******

An interesting test you can run, is as follows.

Say you have a four DIMM slot motherboard. Two
DIMM slots are occupied. Most people install the
RAM in dual channel mode, for best performance.

If you have memtest86+ failures, you can move
one of the DIMMs around, until you're in
"Single Channel Mode". That causes one DIMM to
be the "High Memory DIMM" and the other DIMM
to be the "Low Memory DIMM". In dual channel
mode, they're interleaved, and it's pretty
difficult for a human doing hex in their head,
to convert a failure address, into a particular
DIMM fault.

Now, run memtest86+. Note the failure address
(assuming a failure is still observed). Now
swap the two DIMMs into each other's slot.
Now, the Low Memory DIMM becomes the High
Memory DIMM, and vice versa. If you really have
a memory problem, the address of the fault will
move in proportion to the new physical address
of the module. If you find the faults haven't
moved, and the faults are still at the same address,
then that's an SMM conflict. Or, it could be.
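
(Editor's sketch, not part of the original post.) The swap test above boils down to simple arithmetic. The Python below assumes a simplified model: two equal-size DIMMs in single channel mode mapped back to back, with no memory hole remapping. The function names are made up for this example.

```python
GIB = 1024 ** 3


def dimm_for_address(address, dimm_size):
    """Low DIMM covers [0, size); high DIMM covers [size, 2 * size)."""
    return "low" if address < dimm_size else "high"


def classify_fault(address_before, address_after, dimm_size):
    """Compare the failing address before and after swapping the DIMMs."""
    if address_before == address_after:
        # The fault stayed put even though the modules moved.
        return "address unchanged: possible reservation or SMM conflict"
    if abs(address_after - address_before) == dimm_size:
        # The fault shifted by exactly one module's worth of address space.
        return "fault followed the module: suspect that DIMM"
    return "inconclusive"
```

For two 2 GiB sticks, a failure at `0x10000000` that reappears at `0x10000000 + 2 * GIB` after the swap has followed the module. A failure that shows up at `0x10000000` both times has not, which points away from the RAM itself.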

Other things running on your computer, include
the Intel Active Management Technology (AMT).
But that's only running on Q series chipsets,
and the microcontroller located somewhere
inside the chipset, shares a portion of
system memory for its execution. Presumably,
the BIOS E820 table is updated, so AMT activity
won't upset things like memtest86+. At one time,
AMT required that a certain Intel chipset DIMM
slot, had to be populated first when fitting the
RAM.

*******

Of course it could always be bad RAM :-)

Just a guess,
Paul
  #10  
Old December 3rd 13, 02:06 AM posted to alt.comp.hardware.pc-homebuilt
Flasherly[_2_]
external usenet poster
 
Posts: 2,407

On Sun, 1 Dec 2013 19:35:48 -0800 (PST), RayLopez99
wrote:

Worst case I'll do a clean reinstall using a licensed copy of Windows 7
(this is a pirate copy), but I'm too lazy and this workaround is
working for now.

I wonder if several bad drivers can affect each other, that is, somehow,
depending on how they are loaded (what sequence) into memory, they can
corrupt each other. I take it that this is unlikely, and more likely
one badly written driver has a memory leak. It seems also that
disabling Daemon Tools lite at Startup (which is what I was doing) and
viewing embedded Youtube videos through a website (that is, not going
directly to Youtube but viewing the videos at a website) may (but not
clear) have triggered the bad drivers. Anyway it's like working
backwards in a game of retrograde chess analysis to figure out the
cause of BSOD.

--
That's what I call going hard core on it, reinstalling the MicroSoft
OS. Only I may add another layer of successive binary OS images to
the strategy of advancing installs [over and upon the initial
install]. Seems ATI started out a Canadian company and is now
absorbed by AMD;- Hence quite a long history, near since the
beginnings of PCs, actually. The one abiding credit I'd most
certainly ascribe to ATI to a slang PC dictionary of distinctions and
notables, is first, (MS doesn't really count), right up there
alongside commercial virus security suites, for having developed an
install routine, few, if any, can break down, afterwards, for
restorative analysis.

Regardless whether the binaries are for troubleshooting, they do,
nevertheless, form a routine integral and periodic application,
personally, for wiping off MSOS partition, weekly on average, with a
clean binary MSOS;- a backup image never, on certain principle,
exposed to the WWW during its creation.

I've seen one too many instances of what you're describing, now for
the greater part to summarily dismiss what by binary copies expedite
within character of anomalies I occasionally do encounter for no other
apparent reason than being connected. (Chess is very patterned for
its openings, which evinces at higher levels of play corresponding
parity in safety, often declared draws, if not a parity, then, at an
earlier axis of extant resources within a leeway subsequent, thereby
to tax its masters. I play chess at the edge of its excellent
rankings, above average, good players, when I take time, a few weeks
of preparatory studies before entering its arena. The "game" as so
applied to computers is within patterning absolutes to an established
parity, de facto, MicroSoft garners every time it adds large venues,
e.g., HP and Dell, to its corporate harem of capitalistic hegemony of
PCs, certifiably approved, with a pre-installed OS;- a game for
latecomers to lose, by default, from the challenge of codework upon
deviating from those same preestablished rules, certifiable MSOS code
(sic), subsequent in form and contingency to corrupt the "game"
environment. The analogy would then seem be given reason for as much
use, there cannot be in a contradiction, whereby gain is at all
apropos to fix what is upon principle broken to begin with.)
 



