December 21st 18, 08:12 PM, posted to alt.comp.hardware.pc-homebuilt
Char Jackson
New system build - reboot loop when attempting to boot from SATA HDD

On Wed, 19 Dec 2018 17:18:25 -0500, Paul wrote:

Char Jackson wrote:
On Wed, 19 Dec 2018 14:22:03 -0600, Char Jackson wrote:
I just got off the phone with a technical guy from ASRock in California.
He says the default setting of Auto for VCore should absolutely work. If
it didn't, he says they'd have gotten a flood of calls and they haven't.
He strongly suggested that I not set a fixed VCore value, since that
voltage is intended to rise or fall, as needed, rising for performance
and falling to reduce heat.

At idle, he says 0.80v is common, and the voltage should ramp up, as
needed, but the CPU has to tell the motherboard what it needs. If not,
the motherboard just happily keeps the voltage at idle, which is fine
for the UEFI screens but too low for actual booting. I typically see
0.96v to 0.97v at idle. I lose visibility during boot attempts, but I
assume the voltage is not ramping up like it should.

When I explained the symptoms, he said I should RMA the board. When I
told him this is the second board with this identical behavior, he said
it's probably the CPU. So off I go to Newegg to see if they'll be as
nice about an RMA for the CPU as they were for the motherboard.


Just finished a nice online chat with Intel tech support. After some
back and forth, he says he doesn't think it's the CPU but he agrees that
I should try to RMA it, just in case. Next stop, Newegg.


Processors are tested along the curve.

Multiplier N          0.6V
Multiplier N+1        0.7V
...
Multiplier NMAX       1.2V                  Locked processor, say

Multiplier NMAX+1              Via offsets  Unlocked processor
Multiplier NMAX+...   1.43V    Via offsets  Unlocked processor
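
As a rough illustration of the curve above, here is a minimal Python sketch that estimates a voltage for a multiplier lying between the listed points. The multiplier numbers (8, 9, 36) are hypothetical stand-ins for N, N+1 and NMAX; only the voltages come from the table. Straight-line interpolation is an assumption made for illustration, not how a real processor stores its voltage table.

```python
from bisect import bisect_right

# Hypothetical multipliers standing in for N, N+1 and NMAX;
# the voltages are the ones listed in the table above.
CURVE = [(8, 0.6), (9, 0.7), (36, 1.2)]

def vcore_for(multiplier):
    """Estimate VCore by linear interpolation between listed points.

    Linear interpolation is an assumption for illustration only.
    Values outside the listed range are clamped to the end points.
    """
    mults = [m for m, _ in CURVE]
    if multiplier <= mults[0]:
        return CURVE[0][1]
    if multiplier >= mults[-1]:
        return CURVE[-1][1]
    i = bisect_right(mults, multiplier)
    (m0, v0), (m1, v1) = CURVE[i - 1], CURVE[i]
    return v0 + (v1 - v0) * (multiplier - m0) / (m1 - m0)

for m in (8, 9, 22.5, 36):
    print(m, round(vcore_for(m), 3))
```

The offsets used on unlocked processors would sit on top of a curve like this and are not modeled here.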

For the first three rows, not every VCore setting is
run through a full set of test vectors. In the past,
there were two stops, with 500MHz margin on each run.
That's because running that set of vectors is quite
thorough and costs test time. It might take seconds to
run, instead of microseconds. Tester time could be a
rate-limiting step in production.

If I had to guess:

Intel knows what they're doing

Award/AMI/Phoenix/Insyde are pretty good (bringup code)
Complete source is not given to the motherboard maker.

Motherboard manufacturer is the "unknown variable".
Some tweaking and tuning of all the Intel controls
goes on.

It's probably not hardware, but some problem with the
adjustment of some dynamic control. Load line calibration,
for example.

The latest Intel processors have some extremely quick
power state change capabilities. Previous generations
would ramp over 100us or so. The time constants have
changed quite significantly.

*******

I have only one datapoint to offer. The BIOS for my
ASRock board had the *wrong* clock generator control
code in it. They changed brands of clock generator
in mid-production, without adding if-then-else
handling in the BIOS.

Clock generators don't *need* code at canonical frequencies.
Back at that time, BSEL could select 100MHz or 133MHz
(and multiply up), without invoking BIOS code. But
if you selected 101MHz, the board would crash at
BIOS level, because that caused the wrong registers
to be written in the ClockGen. It's like running
Marvell code to program a Broadcom chip.

That's to give you some idea how clueless ASRock is.
The only overclocks I could do were via a BSEL mod.
That means insulating a pin in the LGA775 socket,
and soldering a wire to the correct BSEL pin on the
bottom of the motherboard. The board actually ran
with a 33% overclock but wasn't stable, so I backed
it out by leaving the wire floating in the computer
case. I used an offset mod on VCore, but elected to
not drive the voltage into the stratosphere. It was
just meant to get some "value" out of a $65 motherboard.
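
To make the 33% figure concrete, here is a small Python sketch of the arithmetic, using the 100MHz and 133MHz base clocks mentioned above. Whether those were the exact two BSEL settings on this particular board is an assumption, and the multiplier of 10 is purely hypothetical.

```python
def overclock_percent(stock_mhz, modded_mhz):
    """Percentage increase in base clock from a BSEL change."""
    return (modded_mhz - stock_mhz) / stock_mhz * 100

def core_clock_mhz(base_mhz, multiplier):
    """Core clock is the base clock times the CPU multiplier."""
    return base_mhz * multiplier

STOCK_MHZ = 100.0
MODDED_MHZ = 400.0 / 3   # the nominal "133MHz" setting
MULTIPLIER = 10          # hypothetical, for illustration only

print(round(overclock_percent(STOCK_MHZ, MODDED_MHZ)))   # 33
print(core_clock_mhz(STOCK_MHZ, MULTIPLIER))             # 1000.0
print(round(core_clock_mhz(MODDED_MHZ, MULTIPLIER)))     # 1333
```

Because the multiplier is untouched, everything derived from the base clock rises by the same third, which is why a BSEL mod is an all-or-nothing overclock.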

The VCore design on that $65 board was excellent.
I can't fault the mobo designer, nor find fault
with the factory part change (you have to do that
if the previous clockgen has too long a lead time).
But the BIOS guys screwed the pooch in multiple
ways, such that I was using a hacked BIOS a guy in
Germany did instead. Even that couldn't fix the clockgen
issue, which requires more than just exposing
settings in the BIOS. People who hack BIOSes have
the ability to change the GUI items and expose
hidden ones, and the hacked BIOS allowed the
board to have "normal EIST", while ASRock kept
releasing the BIOS with EIST broken. One stinking
"non-improved BIOS" after another. Like a
bunch of idiots. Just changing release numbers on
the BIOS, without doing anything with the BIOS, isn't
"service" in my book.

I was willing, at the time, to give them the
benefit of the doubt. But I've not had a
chance to test a second board and discover
a "theme of incompetence".

The chipset actually supported 1GB and 2GB DIMMs.
The board lacked 2GB tuning (Tsu/Th, delay taps etc).
But I can't really fault them, because the *chipset*
web site said it specifically could not work with
2GB DIMMs. Yet I plugged them in, and it *did*
work. Again, not stable. Using 1GB DIMMs, the
memory was perilously close to bulletproof.
All the board needed... was BIOS work to get 2GB working.

I think you can see some sort of theme here. It's just
one datapoint, but it does give you pause.


From a troubleshooting perspective, I'm in a pretty weak position
because I don't have two of anything at any given time. Amazon was out
of stock on the CPU so I went to Newegg, plus Newegg was a little less
expensive on the Motherboard and RAM. Those are all pluses for Newegg,
but unlike Amazon where replacement items get 2-day shipping, which for
me means 1 day or sometimes same day, Newegg switches over to FedEx
Ground for returns and replacements, and they require the returned item
to be inspected at their end before they ship the replacement. So if I
had ordered from Amazon, I'd have two motherboards and two CPUs by now
that I could mix and match, trying to find a good combo. As it is, I'll
be sitting here now for 10-14 days while I wait for a replacement CPU,
and if that doesn't resolve the 'no boot' issue, I'll be back at square
one.
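
For what it's worth, the mix-and-match idea can be sketched in a few lines of Python. The part labels and the pass/fail results below are hypothetical, made up only to show the logic: a part is suspect if it shows up in every failing combination and in no passing one.

```python
from itertools import product

# Hypothetical labels for two boards and two CPUs.
BOARDS = ["board_A", "board_B"]
CPUS = ["cpu_A", "cpu_B"]

def suspects(results):
    """Return parts present in every failing combo and in no passing combo.

    `results` maps (board, cpu) tuples to True (boots) or False (fails).
    """
    failing = [combo for combo, ok in results.items() if not ok]
    passing = [combo for combo, ok in results.items() if ok]
    if not failing:
        return set()
    common = set(failing[0]).intersection(*map(set, failing[1:]))
    known_good = set().union(*map(set, passing))
    return common - known_good

# Made-up outcome: every combo with cpu_A fails to boot.
example = {(b, c): c != "cpu_A" for b, c in product(BOARDS, CPUS)}
print(suspects(example))   # {'cpu_A'}
```

With only one board and one CPU on hand there is a single combination to test, so a failure can't be pinned on either part.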

Also from a troubleshooting perspective, the issue seems to be related
to VCore, and both ASRock and Intel told me the same story about how the
CPU tells the motherboard how much voltage it wants, and the motherboard
either complies or it doesn't. Just like everyone else, I don't have
test equipment that I can insert between the CPU and its socket to
monitor those requests and related responses, so I'm stuck doing
substitution.

As for setting a static VCore voltage, both support guys stressed that
that would be a very bad idea and was highly discouraged. Set it to less
than 'max' and the CPU will be hamstrung when it needs more power. OTOH,
set it to more than 'min' and the CPU will be converting any unused
power directly into heat. This CPU came with a stock cooler that would
be woefully inadequate for that job. That would call for water cooling,
at least. It would also affect the electric bill. Bottom line, it needs
to work properly. Anything less isn't good enough. The good news, I
guess, is that no one is pointing fingers at the RAM or any peripherals.
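
For a rough sense of how much extra heat a fixed VCore could mean, here is a back-of-envelope Python sketch. It uses the usual CMOS approximation that dynamic power scales with voltage squared times frequency. The 0.80v idle figure is the one the ASRock tech quoted. The 1.20v fixed value is borrowed from Paul's illustrative table, not measured on this board, and leakage (which also rises with voltage) is ignored.

```python
def dynamic_power_ratio(v_fixed, v_idle, f_fixed=1.0, f_idle=1.0):
    """Ratio of dynamic power using P ~ C * V^2 * f (C cancels out)."""
    return (v_fixed ** 2 * f_fixed) / (v_idle ** 2 * f_idle)

V_IDLE = 0.80    # idle VCore quoted by ASRock support
V_FIXED = 1.20   # assumed fixed setting, for illustration only

ratio = dynamic_power_ratio(V_FIXED, V_IDLE)
print(f"{ratio:.2f}x the dynamic power at the same clock")
```

Under those assumptions, holding 1.20v while the CPU idles works out to roughly 2.25 times the dynamic power of letting it drop to 0.80v at the same clock.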