December 28th 08, 06:22 AM, posted to comp.os.linux.hardware,alt.comp.hardware.overclocking
Aragorn
New release of sys_basher

On Sunday 28 December 2008 05:24, someone identifying as *General
Schvantzkoph* wrote in /comp.os.linux.hardwa/

> I've put a new release of sys_basher on the web,
>
> http://www.polybus.com/sys_basher_web/
>
> sys_basher is a multi-threaded hardware exerciser, memory tester and
> benchmarking tool. It will run on any Linux or Unix.


Does it also diagnose possibly failing hardware components other than
memory? I'm just asking because I've got a machine that has been sitting
here idle for quite some time now due to strange lockups and ECC error
entries in the BIOS log.

I had the machine checked by a tech but he couldn't find anything. He had
run some benchmarking thing on it using an SQL database and kept it
running like that for several days without - again, according to him - any
errors.

It was a rather expensive machine at the time, and I've only recently put in
a brand-new Adaptec 2130 SLP PCI-X U320 SCSI RAID controller and two equally
brand-new Hitachi 73 GB U320 SCSI disks. (The errors and crashes predate
that "transplant", though.)

The motherboard is an Intel server board - I think a 7500CW - with two Intel
Xeon 2.2 GHz (400 MHz FSB) 32-bit processors with hyperthreading. The
memory is 4 GB (4x 1 GB) Transcend ECC registered DDR-266, running at 200
MHz. /memtest86/ shows no errors whatsoever. The power supply is rated at
350 Watts, but the machine doesn't pull more than 220 Watts during boot-up,
and the supply appears to check out fine. The BIOS is a Phoenix, but don't
ask me what release. ;-)

One of the strange things is that often during the Linux kernel boot
process, only three of the four hyperthreaded virtual CPUs are found -
occasionally only two, even. This "failure" announces itself before the
kernel actually starts displaying its boot messages, through a delay in
switching from standard VGA resolution to the higher resolution
framebuffer. There is a higher chance of this oddness occurring when
you press the /Enter/ key in the GRUB or LILO boot menu before the timeout
has expired. The kernel then also shows strange messages like "booting
processor 3/7", suggesting that it sees eight processors, while this is a
two-socket motherboard with two hyperthreaded Xeons.
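
To track how often this happens, the logical CPU count can be checked after
each boot. The following is only a minimal sketch in Python, assuming the
usual x86 /proc/cpuinfo layout with one "processor : N" line per logical
CPU. The function names and the expected count of 4 (two sockets times two
hyperthreads, as described above) are illustrative assumptions, not part of
sys_basher or any other existing tool.

```python
import re

# Assumption: 2 sockets x 2 hyperthreads = 4 logical CPUs on this machine.
EXPECTED_LOGICAL_CPUS = 4


def count_logical_cpus(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    pattern = r"^processor[ \t]*:[ \t]*\d+[ \t]*$"
    return len(re.findall(pattern, cpuinfo_text, flags=re.MULTILINE))


def check_cpus(cpuinfo_text, expected=EXPECTED_LOGICAL_CPUS):
    """Return a one-line verdict comparing found and expected CPU counts."""
    found = count_logical_cpus(cpuinfo_text)
    if found == expected:
        return "OK: found %d logical CPUs, expected %d" % (found, expected)
    return "WARNING: found %d logical CPUs, expected %d" % (found, expected)


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/cpuinfo exists.
    with open("/proc/cpuinfo") as f:
        print(check_cpus(f.read()))
```

Run from a boot script and appended to a log file, the output would give a
record of which boots came up short, which might help correlate the problem
with the early /Enter/ keypress.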

The machine has had this flaw from the beginning, but back then it was
still the exception, while by now the exception is that it still
recognizes all four virtual CPUs. It also used to lock up completely
without anything serious running, at an unpredictable rate. One time it
would run for a whole week, another time it would lock up after only a few
hours, or even earlier.

For many years the machine ran Mandrake 10.0 PowerPack with a custom-built
vanilla 2.6.x kernel - various releases, starting with 2.6.5 and ending
with 2.6.17 or something. Since it got the newer SCSI disks, I've
installed CentOS 5.1 on it, as it was intended to be used for our still very
preliminary webhosting, and our hosting software (DirectAdmin) only works
with CentOS (or is only supported by the developers to work with CentOS,
anyway).

I'm mentioning all this because quite obviously everyone appears to be
stumped with regard to what could be wrong with this machine, and you come
across as somewhat of an Intel /connoisseur./ So maybe you've got some
clues? ;-)

--
*Aragorn*
(registered GNU/Linux user #223157)