#1
Intel's Skylake Prime Number Bug.
Hello,

Apparently Intel's Skylake processors can freeze up when calculating certain prime numbers.

I am investigating this story further; for now, here is a link about it:

https://communities.intel.com/mobile...nts%2F52 4553

Enjoy! =D

Bye,
Skybuck =D
#2
On Mon, 11 Jan 2016 16:44:49 +0100, "Skybuck Flying" wrote:

| Hello,
|
| Apparently Intel's Skylake processors can freeze up when calculating certain
| prime numbers.
|
| I am investigating this story further; for now, here is a link about it:
|
| https://communities.intel.com/mobile...nts%2F52 4553

Intel is apparently aware of this and is working with its partners to distribute a fix in the form of a BIOS update.

http://arstechnica.com/gadgets/2016/...lex-workloads/

Larc
#3
On Mon, 11 Jan 2016 11:10:44 -0500, Larc wrote:

> On Mon, 11 Jan 2016 16:44:49 +0100, "Skybuck Flying" wrote:
>
> | Hello,
> |
> | Apparently Intel's Skylake processors can freeze up when calculating certain
> | prime numbers.
> |
> | I am investigating this story further; for now, here is a link about it:
> |
> | https://communities.intel.com/mobile...nts%2F52 4553
>
> Intel is apparently aware of this and is working with its partners to
> distribute a fix in the form of a BIOS update.
>
> http://arstechnica.com/gadgets/2016/...lex-workloads/
>
> Larc

I wonder how the BIOS can fix an FPU error. Trap exceptions? Change some firmware?

--
John Larkin
Highland Technology, Inc
picosecond timing precision measurement
jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
#4
On 1/11/2016 1:22 PM, John Larkin wrote:

> On Mon, 11 Jan 2016 11:10:44 -0500, Larc wrote:
>
>> On Mon, 11 Jan 2016 16:44:49 +0100, "Skybuck Flying" wrote:
>>
>> | Hello,
>> |
>> | Apparently Intel's Skylake processors can freeze up when calculating certain
>> | prime numbers.
>> |
>> | I am investigating this story further; for now, here is a link about it:
>> |
>> | https://communities.intel.com/mobile...nts%2F52 4553
>>
>> Intel is apparently aware of this and is working with its partners to
>> distribute a fix in the form of a BIOS update.
>>
>> http://arstechnica.com/gadgets/2016/...lex-workloads/
>>
>> Larc
>
> I wonder how the BIOS can fix an FPU error. Trap exceptions? Change some
> firmware?

Nowadays processors (from micro to mainframe) are run by microcode. A problem like this can be in the circuitry (as was the case with the FPU problem in the early-to-mid '90s) or it can be in the microcode. IOW, it can be a hardware bug or a software bug.

Obviously, if it's a microcode bug, an update should be able to fix it. Even if it's a hardware bug, there might be a way around the bug in the microcode.

Since this is a hang, my guess would be that it's a microcode bug. But obviously I don't know.

--
==================
Remove the "x" from my email address
Jerry Stuckle
==================
#5
On Mon, 11 Jan 2016 13:42:57 -0500, Jerry Stuckle wrote:

> On 1/11/2016 1:22 PM, John Larkin wrote:
>
>> On Mon, 11 Jan 2016 11:10:44 -0500, Larc wrote:
>>
>>> On Mon, 11 Jan 2016 16:44:49 +0100, "Skybuck Flying" wrote:
>>>
>>> | Hello,
>>> |
>>> | Apparently Intel's Skylake processors can freeze up when calculating certain
>>> | prime numbers.
>>> |
>>> | I am investigating this story further; for now, here is a link about it:
>>> |
>>> | https://communities.intel.com/mobile...nts%2F52 4553
>>>
>>> Intel is apparently aware of this and is working with its partners to
>>> distribute a fix in the form of a BIOS update.
>>>
>>> http://arstechnica.com/gadgets/2016/...lex-workloads/
>>>
>>> Larc
>>
>> I wonder how the BIOS can fix an FPU error. Trap exceptions? Change some
>> firmware?
>
> Nowadays processors (from micro to mainframe) are run by microcode.

Not all. Some RISC machines are pure logic. ARM, Coldfire, maybe MIPS? Intels are still microcode based.

> A problem like this can be in the circuitry (as was the case with the FPU
> problem in the early-to-mid '90s) or it can be in the microcode. IOW, it can
> be a hardware bug or a software bug.
>
> Obviously, if it's a microcode bug, an update should be able to fix it. Even
> if it's a hardware bug, there might be a way around the bug in the microcode.
>
> Since this is a hang, my guess would be that it's a microcode bug. But
> obviously I don't know.

--
John Larkin
Highland Technology, Inc
picosecond timing precision measurement
jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
#6
Compute well this processor does not - Yoda.

Wrong is much with this processor - Yoda.

Breath I would not hold - Yoda.

Bye,
Skybuck =D
#7
Skybuck Flying wrote:

> Compute well this processor does not - Yoda.
>
> Wrong is much with this processor - Yoda.
>
> Breath I would not hold - Yoda.
>
> Bye,
> Skybuck =D

Some bugs on the processor can be fixed by microcode.

A typical retail motherboard can have as many as eight microcode files, covering the compatible CPU table on the motherboard maker's site, and those are stored in the BIOS flash chip. Microcode is tiny and variable length. The last time I took apart that file, the segments were in multiples of 2KB or so.

Microcode releases have a revision number, and a patcher loading a microcode is allowed to install a patch which has a higher release number than the one currently in the processor.

The BIOS has its own microcode patcher. The microcode must be good enough to allow the system to boot into the OS. So no storage bugs can exist with the shipped BIOS microcode. All it has to do is get the system booted.

Windows and Linux also have microcode patchers. The Windows one does its job early after boot, and then the service exits, so you don't really see it. The Windows one allows deployment of updates. It's unclear whether a BIOS update would deploy a new version any faster than Microsoft could push a new file via Windows Update.

If you have a copy of the Intel Processor Identification Utility (PIU), the field "revision" is actually the release number of the microcode. There was one incident where no microcode was getting loaded, and the number was zero. Most of the time, you will find a small finite number for that field. In some cases, the utility mistakenly masks the value read out, and some digits may not belong there. (Maybe you see F07 instead of 07.)

Some bugs in processors are fixed by actual code. When AMD had a TLB bug in the 9500, they distributed maybe a 15KB or so code module, to be added to the BIOS. This code disabled the TLB, or a portion of it, costing a small amount of performance. A fixed version of the processor, for the same family, had "50" added to the lower digits, so if you bought a 9550 you knew it was fixed, whereas a 9500 wasn't. So that fix wasn't microcode based, because it wasn't an actual instruction problem. It was a problem with virtual-to-physical address translation of some sort.

The average processor has 100 errata. Some of the errata are discovered a year or two after the first batch is distributed for sale. Testing continues after release. Many bugs are repaired via microcode updates. Some are labeled "won't fix", meaning even if a new mask revision was in the pipe, they had no plan to patch out the problem. Some issues are innocuous enough that they don't need fixing.

In the case of the Prime95 issue above, the hand-coded FFTs are perfect material for uncovering bugs. Frequently, compilers produce "lame" code that doesn't give particularly good fault coverage. So you don't see bugs, because the instruction sequences aren't that challenging.

One AMD processor had an FPU bug caused by actual electrical noise. It was discovered after release. It took assembler code to trigger it. The assembler code consisted of a nonsensical continuous sequence of one FPU instruction after another. This drew enough current to cause a noise problem in the substrate. Errata like that receive a "will not fix" rating, because it is not expected that anyone will be coding in assembler and using that stupid a sequence of instructions. Real FPU code needs an occasional bit test or branch condition, and so isn't solid 100% FPU instructions one after another. And when a HLL is used, the compiler/assembler wouldn't even get close to the FPU code density required to break that processor. (If I owned such a processor though, I'd be ****ed. For that not being caught in testing, or recognized as a potential issue during design.)

When it comes to test benches for hardware design, you run the important ones first (and try to finish them by design close). The ridiculous tests are saved for later, after production has begun. And that's when the AMD testers carried out their artificial 100% density test and discovered a problem. For our chip designs, some staff were running simulation test cases a year after we had hardware in hand. (And ours didn't have microcode to patch with either. We had another feeble mechanism for emergencies :-) )

The level of bugs is rather constant. I don't recollect ever looking at an errata sheet for a CPU and seeing zero bugs. It just doesn't happen. I expect in some cases, staff already know of multiple errata, even before design close, but the boss says "ship it". I doubt they would hold up a mask release, chasing every possible bug and making the CPU two years late. That just isn't going to happen, especially when the "good ole microcode" can pull your bacon out of the fire.

So while it's sad that this "bendable" processor also has an erratum, it probably has another 99 errata to keep that one company. Most of those errata are invisible to end users. The janitorial staff already cleaned up the mess :-)

It's the ones that cost you performance that make the fanbois crazy. The AMD TLB bug certainly upset a few happy owners of the affected silicon. And the Intel FDIV bug certainly cost Intel (all those Excel spreadsheet jokes). Intel has had a few wakeup calls, and I like to think that Skylake is another wakeup call ("staff getting sloppy, poor decision making"). So far it hasn't cost them any "big money". I don't know if users are returning their "bent" processors or not.

Paul
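
[Note: a rough sketch of the revision rule described in the post above, where a patcher only loads microcode whose revision is higher than the one already in the processor. It assumes a Linux system whose /proc/cpuinfo exposes a hexadecimal "microcode" field. The function names are made up for illustration, and this is not how the BIOS, Windows, or Linux loaders are actually written.]

```python
import re
from typing import Iterable, Optional


def parse_microcode_revision(cpuinfo_text: str) -> Optional[int]:
    """Return the first 'microcode' revision in /proc/cpuinfo-style text.

    Assumes the field is printed in hexadecimal, e.g. 'microcode : 0x74'.
    Returns None when no such field is present.
    """
    match = re.search(
        r"^microcode\s*:\s*0x([0-9a-fA-F]+)\s*$", cpuinfo_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1), 16)


def should_load(current: int, candidate: int) -> bool:
    """Hypothetical patcher rule: load only a strictly higher revision."""
    return candidate > current


def pick_update(current: int, available: Iterable[int]) -> Optional[int]:
    """Pick the highest available revision that passes should_load, if any."""
    newer = [rev for rev in available if should_load(current, rev)]
    return max(newer) if newer else None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            revision = parse_microcode_revision(handle.read())
    except FileNotFoundError:
        revision = None  # not Linux, or /proc is unavailable

    if revision is None:
        print("No microcode revision found")
    else:
        print(f"Running microcode revision: {revision:#x}")
```

Under this rule, a BIOS file and an OS-supplied file do not fight each other: whichever loader runs later only acts if it holds something newer than what is already loaded.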
#8
On Mon, 11 Jan 2016 10:22:55 -0800, John Larkin wrote:

> On Mon, 11 Jan 2016 11:10:44 -0500, Larc wrote:
>
>> On Mon, 11 Jan 2016 16:44:49 +0100, "Skybuck Flying" wrote:
>>
>> | Hello,
>> |
>> | Apparently Intel's Skylake processors can freeze up when calculating certain
>> | prime numbers.
>> |
>> | I am investigating this story further; for now, here is a link about it:
>> |
>> | https://communities.intel.com/mobile...nts%2F52 4553
>>
>> Intel is apparently aware of this and is working with its partners to
>> distribute a fix in the form of a BIOS update.
>>
>> http://arstechnica.com/gadgets/2016/...lex-workloads/
>>
>> Larc
>
> I wonder how the BIOS can fix an FPU error. Trap exceptions? Change some
> firmware?

Change microcode?
#9
Lol, I read about that bug on a few sites; I was SURE Skybuck was going to push it to the group before I could ;-)

On 2016-01-11 16:55, Paul wrote:

> Skybuck Flying wrote:
>
>> Compute well this processor does not - Yoda.
>>
>> Wrong is much with this processor - Yoda.
>>
>> Breath I would not hold - Yoda.

I decided to wait for the next CPU a while ago. Not because of the Bit Manipulation bug you posted on this group a while ago (as far as I know, all you need is AND, OR and XOR, so what if some esoteric bit instruction is buggy). I decided to wait because I am not impressed with the "Turbo" boosts (0/0/0/2 - since Windoze never runs only 1 core, you will never see 4.2GHz). I decided I'd wait for the Skylake equivalent of Devil's Canyon...

[big snip, sorry, all good stuff but no need to repeat it]

> It's the ones that cost you performance that make the fanbois crazy.
> The AMD TLB bug certainly upset a few happy owners of the affected
> silicon. And the Intel FDIV bug certainly cost Intel

Ya, I have a tagline or two around that:

- According 2 Intel, 1+1 equals 3, for very large values of 1.
- A bad random number generator: 1, 1, 1, 4.33e+67, 1, 1, 1...
- Hitchhiker's Pentium Ed: The meaning of life is 41.9815...

> (all those Excel spreadsheet jokes). Intel has had a few wakeup calls,
> and I like to think that Skylake is another wakeup call ("staff getting
> sloppy, poor decision making"). So far it hasn't cost them any
> "big money". I don't know if users are returning their "bent"
> processors or not.

Well, they managed to make it 5-10% faster than the previous model, even tho it doesn't turbo as well and even tho they dropped the 8-way set associative L2 cache; it's still pretty impressive...

Best Regards,
--
!   _\|/_   Sylvain /
!   (o o)   Member-+-David-Suzuki-Fdn/EFF/Red+Cross/Planetary-Society-+-
oO-( )-Oo   "What's that?" -Arthur, "Something blue." -Ford
#10
On Mon, 11 Jan 2016 13:42:57 -0500, Jerry Stuckle wrote:

> Nowadays processors (from micro to mainframe) are run by microcode. A
> problem like this can be in the circuitry (as was the case with the FPU
> problem in the early-to-mid '90s) or it can be in the microcode. IOW, it can
> be a hardware bug or a software bug.
>
> Obviously, if it's a microcode bug, an update should be able to fix it. Even
> if it's a hardware bug, there might be a way around the bug in the microcode.
>
> Since this is a hang, my guess would be that it's a microcode bug. But
> obviously I don't know.

Was there any information on whether the CPU froze because of an internal infinite loop or because it stopped executing? E.g. if it's due to an internal infinite loop, the CPU temperature won't have decreased several minutes after it freezes.

And if the CPU stops executing, could its state be an untrappable exception (i.e. a hardware crash), a deadlock, or a component which is neither 0/false nor 1/true (i.e. an undetermined state; a kind of deadlock when the component is checked)?