HardwareBanter


[email protected] March 9th 20 01:26 PM

CPU temp hits 80C then cools back to 55/60 [memtest]
 
I was running memtest on an LGA2011 system that seemed to have memory issues.
I noticed that when memtest starts, the core temperature climbs to 80 Celsius.
Then it drifts back to 60 (with all 8 DIMMs installed). Testing only 2 DIMMs at
a time, it drops further, to 55 Celsius. This is a single-threaded test. So what
is memtest reporting? Just the active core, rather than an average across the die?
This PC has a water cooling system. There is no model number on it, so I
can't determine what the thermal capacity of this cooler is. The radiator
fans seem to be running at a fixed speed.
CPU is i7-3820. The RAM speed is 1600 MHz with CAS 10. No overclocking is
set in BIOS.

Paul[_28_] March 9th 20 05:44 PM

It's hard to say what core it is.

The code does have the notion of the "boot processor" which might
be CPU 0, but I can't find the selection logic, like what happens
if this is a multicore run. I can't imagine it runs multiple
instances of the code, but maybe it does.

The following code "just runs" in a sense. Normally, on a P4 without HT,
it would be obvious what core it runs on. It's the boot processor
core. But I haven't located what happens here when it is running SMP.
It "smells" like it runs an instance on each core, but then some
test address must be assigned to a running test (it probably cannot
afford to overlap in memory).

There's no OS running. There's no scheduler. The code should
run 100%, 100% of the time. The code talks of a "barrier", which
means if it runs SMP, one copy of code is the master, and
it does something to run the slaves (on other cores). When a
slave is finished, presumably the master copy then advances
to the next test or test step. There would be an assumption
the cores are all "equally productive" and that the master
doesn't wait an extraordinary time for a slow slave to finish.
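
As a rough illustration of the "barrier" idea described above, here is a minimal spin-barrier sketch using C11 atomics. It is not the memtest86+ implementation; the type and function names (spin_barrier, spin_barrier_arrive, spin_barrier_wait) are hypothetical. With no OS and no scheduler, a core that arrives early can only busy-wait until the last core shows up.

```c
#include <stdatomic.h>

/* Hypothetical spin barrier, for illustration only. */
typedef struct {
    atomic_int waiting;     /* cores that have arrived in this round */
    atomic_int generation;  /* bumped each time the barrier opens */
    int        ncores;      /* cores that must arrive before it opens */
} spin_barrier;

void spin_barrier_init(spin_barrier *b, int ncores)
{
    atomic_init(&b->waiting, 0);
    atomic_init(&b->generation, 0);
    b->ncores = ncores;
}

/* Record one arrival. Returns 1 if this was the last core in (the barrier
 * opens), 0 if other cores are still outstanding. */
int spin_barrier_arrive(spin_barrier *b)
{
    if (atomic_fetch_add(&b->waiting, 1) + 1 == b->ncores) {
        atomic_store(&b->waiting, 0);         /* reset for the next round */
        atomic_fetch_add(&b->generation, 1);  /* release the spinning cores */
        return 1;
    }
    return 0;
}

/* Arrive, then busy-wait until every core has arrived. */
void spin_barrier_wait(spin_barrier *b)
{
    int gen = atomic_load(&b->generation);

    if (spin_barrier_arrive(b))
        return;
    while (atomic_load(&b->generation) == gen)
        ;  /* spin: nothing else to run, so the core stays fully busy */
}
```

Each core would call spin_barrier_wait() at the end of a test step. A core stuck in the spin loop is running flat out the whole time, which fits the guess that a slow core elsewhere could push the waiting core's temperature up.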

A comment in the code mentions "we don't have a timer", which
means delays are done by brute force. And if a delay was
being used, that would likely make a core hot. When actually
testing memory, the core should cool off, because the memory
cannot keep up with the CPU (the CPU core would stall until
the memory fetch comes back, no delay loop or anything).
Many steps in the memory test are spent waiting for the memory subsystem
to come back. Running multiple cores is not likely necessary to
get full performance from the memory subsystem. Normal
program code relies on cache hits for performance,
not on main memory being available instantly. Since
memtest is a "cache buster" (cache hits are undesired when
the goal is to exercise the DIMMs), the CPU cools its heels
waiting for each memory fetch to come back. The temp should
drop to some extent when this happens.
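
To make the "waiting on memory" point concrete, here is a simplified write-then-verify pass of the kind a memory tester performs. It is a sketch only, not one of the memtest86+ test patterns, and the function name pattern_pass is hypothetical. When the buffer is far larger than the CPU caches, nearly every access in these loops has to go out to DRAM, so the core spends most of its time stalled rather than computing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill a buffer with a pattern, then read it back.
 * Returns the number of words that did not hold the pattern.
 * 'volatile' keeps the compiler from optimizing the accesses away. */
size_t pattern_pass(volatile uint32_t *buf, size_t nwords, uint32_t pattern)
{
    size_t errors = 0;

    for (size_t i = 0; i < nwords; i++)   /* write phase */
        buf[i] = pattern;

    for (size_t i = 0; i < nwords; i++)   /* verify phase */
        if (buf[i] != pattern)
            errors++;

    return errors;
}
```

A real tester would run passes like this with several patterns and their complements over all of physical memory; on healthy RAM every pass returns 0.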

If for any reason, an SMP thread was slow to complete, then
the main thread would perhaps busy_wait and the temperature
might go up. Just a guess.

*******

memtest.org version 5.01

http://memtest.org/download/5.01/memtest86+-5.01.tar.gz

void coretemp(void)
{
    unsigned int msrl, msrh;
    unsigned int tjunc, tabs, tnow;
    unsigned long rtcr;
    double amd_raw_temp;

    // Only enable coretemp if IMC is known
    if (imc_type == 0) {
        return;
    }

    tnow = 0;

    // Intel CPU
    if (cpu_id.vend_id.char_array[0] == 'G' && cpu_id.max_cpuid >= 6) {
        if (cpu_id.dts_pmp & 1) {
            rdmsr(MSR_IA32_THERM_STATUS, msrl, msrh);       // <=== Core #, running this privileged instruction
            tabs = ((msrl >> 16) & 0x7F);
            rdmsr(MSR_IA32_TEMPERATURE_TARGET, msrl, msrh); // <=== Tjmax value of newer processors (May 2010)
            tjunc = ((msrl >> 16) & 0x7F);
            if (tjunc < 50 || tjunc > 125) {
                tjunc = 90;
            } // assume Tjunc = 90°C if bogus value received.
            tnow = tjunc - tabs;
            dprint(LINE_CPU + 1, 30, v->check_temp, 3, 0);
            v->check_temp = tnow;
        }
        return;
    }

    // AMD CPU
    if (cpu_id.vend_id.char_array[0] == 'A' && cpu_id.vers.bits.extendedFamily > 0) {
        pci_conf_read(0, 24, 3, 0xA4, 4, &rtcr);
        amd_raw_temp = ((rtcr >> 21) & 0x7FF);
        v->check_temp = (int)(amd_raw_temp / 8);
        dprint(LINE_CPU + 1, 30, v->check_temp, 3, 0);
    }
}

*******
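
The Intel branch of coretemp() boils down to one subtraction: the thermal status register holds "degrees below Tjmax", and the displayed temperature is Tjmax minus that reading. The sketch below repeats that arithmetic as a standalone function so it can be tried without the privileged rdmsr instruction. The function name core_temp_celsius and the clamp to zero are my own additions, not part of memtest86+, and the register values in the usage note are made-up illustrations rather than readings from an i7-3820.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the Intel branch of coretemp().
 * therm_status_lo: low 32 bits of IA32_THERM_STATUS
 * temp_target_lo:  low 32 bits of IA32_TEMPERATURE_TARGET
 * In the real code both come from rdmsr, which reads the registers of
 * whichever core executes the instruction. */
unsigned int core_temp_celsius(uint32_t therm_status_lo,
                               uint32_t temp_target_lo)
{
    /* Digital readout: degrees Celsius below Tjmax. */
    unsigned int below_tjmax = (therm_status_lo >> 16) & 0x7F;
    /* Tjmax, extracted with the same 7-bit mask the quoted code uses. */
    unsigned int tjmax = (temp_target_lo >> 16) & 0x7F;

    /* Same fallback as the quoted code: assume 90 C on a bogus Tjmax. */
    if (tjmax < 50 || tjmax > 125)
        tjmax = 90;

    /* Added guard (not in the quoted code): avoid unsigned wrap-around. */
    if (below_tjmax > tjmax)
        return 0;

    return tjmax - below_tjmax;
}
```

For example, with an assumed Tjmax field of 100 and a readout of 20, core_temp_celsius(20u << 16, 100u << 16) returns 80. Since the status register is read on the core running the code, the figure on screen would be that core's temperature, not a die average.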

Paul

[email protected] March 10th 20 10:13 AM

On Tuesday, March 10, 2020 at 12:44:17 AM UTC+8, Paul wrote:
It's hard to say what core it is.


I noticed this happens when the PC started cold, i.e. after it was off for at
least 3 hours. So I speculate that the no-name Alibaba cooler has a sticky
pump, which only breaks free after the CPU got hot and cooked it?

Paul[_28_] March 10th 20 11:11 AM

Doesn't the pump have an RPM signal ???

There should be one on it. Hook it up.

Paul

Flasherly[_2_] March 10th 20 07:36 PM

On Tue, 10 Mar 2020 02:13:26 -0700 (PDT), wrote:

So I speculate that the no-name Alibaba cooler has a sticky
pump, which only breaks free after the CPU got hot and cooked it?


The pump should still run even when the board halts with a no-CPU error,
so hang it over the sink with a hose disconnected from the reservoir
output and watch for flow (assuming it's not a sealed, "self-contained"
water cooler). Check that any replacement fluid lubricates the pump and
resists breaking down or getting contaminated over time. Or get a cheap
CoolerMaster "Hyper" grapefruit-sized (heat-wicking) air cooler, well
known for beating most of the field on cost-to-performance ratio.

