Floating point format for Intel math coprocessors

#1 June 27th 03, 04:01 PM

On Fri, 27 Jun 2003 14:23:22 GMT, Jack Crenshaw
wrote:

I've run across a peculiarity concerning the format used inside the
Intel math coprocessor. I have always
thought that the format used was in accordance with the IEEE 794

Unless IEEE 794 is something new, I think you mean IEEE 754.

standard, and every reference I've seen
on the web seems to imply that. But, as nearly as I can tell, it's not
the same.

The IEEE standard for 32-bit floats says the format should be

sign -- 1 bit
exponent -- 8 bits, power of 2, split on 127

With the proviso that the values of 0 and 255 for the exponent are
special cases reserved for 0, Inf, denormals, and NaN.

mantissa -- 23 bits + phantom bit in bit 24.

The Intel processor seems to use the following:

sign -- 1 bit
exponent -- _SEVEN_ bits, power of _FOUR_
mantissa -- sometimes 23 bits, sometimes 24. Sometimes phantom bit,
sometimes not.
When it's there, it's in bit _TWENTY_THREE_ !

I don't think so. Are you mistaking the lsb of the exponent for the
"visible" phantom bit?

[...]

You'll see that 1 -- 3f800000 (high bit is visible)

seee eeee emmm mmmm mmmm mmmm mmmm mmmm
0011 1111 1000 0000 0000 0000 0000 0000

s = 0, e = 127, m = 0

(-1)^s * 2^(e-127) * (1+m/(2^23)) = 1*1*1 = 1.0

but 2 -- 40000000 (high bit is not)

seee eeee emmm mmmm mmmm mmmm mmmm mmmm
0100 0000 0000 0000 0000 0000 0000 0000

s = 0, e = 128, m = 0

(-1)^s * 2^(e-127) * (1+m/(2^23)) = 1*2*1 = 2.0

Try a few others and see what you get. Some will surprise you.

I'm not finding any surprises.

1.5 - 3fc00000
seee eeee emmm mmmm mmmm mmmm mmmm mmmm
0011 1111 1100 0000 0000 0000 0000 0000
s = 0, e = 127, m = 0x400000
(-1)^s * 2^(e-127) * (1+m/(2^23)) = 1*1*1.5 = 1.5

2.5 - 40200000
seee eeee emmm mmmm mmmm mmmm mmmm mmmm
0100 0000 0010 0000 0000 0000 0000 0000
s = 0, e = 128, m = 0x200000
(-1)^s * 2^(e-127) * (1+m/(2^23)) = 1*2*1.25 = 2.5
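
The decoding used in these examples can be sketched in a few lines of C++. This is only an illustration of the formula (-1)^s * 2^(e-127) * (1+m/(2^23)); the names FloatFields, split_fields and value_of are made up for the sketch, and it handles normalized numbers only (e between 1 and 254).

```cpp
#include <cmath>
#include <cstdint>

// Fields of an IEEE 754 single-precision bit pattern.
struct FloatFields {
    unsigned s;       // sign, 1 bit
    unsigned e;       // biased exponent, 8 bits
    std::uint32_t m;  // stored mantissa, 23 bits (hidden bit not included)
};

// Split a raw 32-bit pattern into sign, exponent and mantissa.
FloatFields split_fields(std::uint32_t bits) {
    FloatFields f;
    f.s = bits >> 31;
    f.e = (bits >> 23) & 0xFFu;
    f.m = bits & 0x7FFFFFu;
    return f;
}

// Evaluate (-1)^s * 2^(e-127) * (1 + m/2^23).
// Only meaningful for normalized numbers (e between 1 and 254).
double value_of(const FloatFields& f) {
    double sign = f.s ? -1.0 : 1.0;
    double significand = 1.0 + static_cast<double>(f.m) / 8388608.0;  // 2^23
    return sign * std::ldexp(significand, static_cast<int>(f.e) - 127);
}
```

Calling value_of(split_fields(0x3fc00000)) gives 1.5, and the patterns 3f800000, 40000000 and 40200000 decode to 1.0, 2.0 and 2.5 the same way.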

Perhaps I'm misunderstanding your point?

Regards,

-=Dave
--
Change is inevitable, progress is not.

#2 June 27th 03, 04:34 PM

Jack Crenshaw wrote:

I've run across a peculiarity concerning the format used inside
the Intel math coprocessor.

If you're talking about the format in which IA32 FPUs store
values in memory, then I doubt it. If you're talking about the 80-bit
internal format, I don't know. I've never tried to use that
format externally.

I've been exchanging float data between IA32 systems and at
least a half-dozen other architectures since the 8086/8087
days. I never saw any format problems.

I have always thought that the format used was in accordance
with the IEEE 794 standard,

It is IEEE something though 794 doesn't sound right...

--
Grant Edwards
grante at visi.com
Yow! .. this must be what it's like to be a COLLEGE GRADUATE!!

#3 June 27th 03, 06:28 PM

Jack Crenshaw wrote:

.... snip ...

The IEEE standard for 32-bit floats says the format should be

sign -- 1 bit
exponent -- 8 bits, power of 2, split on 127
mantissa -- 23 bits + phantom bit in bit 24.

The Intel processor seems to use the following:

sign -- 1 bit
exponent -- _SEVEN_ bits, power of _FOUR_
mantissa -- sometimes 23 bits, sometimes 24. Sometimes
phantom bit, sometimes not. When it's there, it's in bit
_TWENTY_THREE_ !

.... snip ...

You'll see that 1 -- 3f800000 (high bit is visible)
but 2 -- 40000000 (high bit is not)

Try a few others and see what you get. Some will surprise you.

I know that there must be people out there to whom this is old,
old news. Even so, I've never seen a word about it, and didn't
find anything in a Google search. I'd appreciate any comments.

What chips does this format appear in? I expect the presence or
absence of normalization depends on the oddness of the exponent
byte. It makes sense for byte addressed memory based systems,
since zero (ignoring denormalization) can be detected in a single
byte.

--
Chuck F
Available for consulting/temporary embedded and systems.
http://cbfalconer.home.att.net USE worldnet address!

#4 June 27th 03, 08:14 PM

On Fri, 27 Jun 2003 17:58:50 GMT, Jonathan Kirwan
wrote:

Hmm. Are you *THE* Jack Crenshaw? The "Let's Build A Compiler"
and "Math Toolkit for Real-Time Programming" Jack _W._ Crenshaw?

Actually, I think I've answered my own question, here. You
really *are* that Jack.

What clues me in is your use of "phantom" here:

mantissa -- 23 bits + phantom bit in bit 24.

The same term used on page 50 in "Math toolkit..."

In my own experience, even that predating the Intel 8087 or the
IEEE standardization, it was called a "hidden bit" notation. I
don't know where "phantom" comes from, as my own reading managed
to completely miss it.

So, a hearty "Hello" from me!

Jon

#5 June 27th 03, 08:15 PM

On Fri, 27 Jun 2003 14:23:22 GMT, Jack Crenshaw
wrote:

I've run across a peculiarity concerning the format used inside the
Intel math coprocessor. I have always thought that the format used
was in accordance with the IEEE 794 standard, and every reference
I've seen on the web seems to imply that. But, as nearly as I can
tell, it's not the same.

The IEEE standard for 32-bit floats says the format should be

sign -- 1 bit
exponent -- 8 bits, power of 2, split on 127
mantissa -- 23 bits + phantom bit in bit 24.

The Intel processor seems to use the following:

sign -- 1 bit
exponent -- _SEVEN_ bits, power of _FOUR_
mantissa -- sometimes 23 bits, sometimes 24. Sometimes phantom bit,
sometimes not.

The sign bit is there, as the highest order bit -- just as you
note. This is followed, working left to right, by the exponent
which is in "excess 127" format. It's not a signed format, but
an excess 127 format. A couple of special values, 0 and 255,
are reserved for the fancy stuff. But those correspond to
exponent values of -127 and +128 and no one is supposed to miss
them. The mantissa is quite simply *always* associated with a
hidden bit, which always leads the value.

Okay, the exception made to the above is for the exact value of
zero, where the exponent is 0 and the mantissa is 0 and the
hidden bit is assumed 0, as well.
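
As a rough illustration of how the two reserved exponent values are used, here is a small C++ sketch. The function name classify is invented for the example; the categories are the ones mentioned earlier in the thread (zero, denormals, Inf and NaN).

```cpp
#include <cstdint>
#include <string>

// Classify a single-precision bit pattern by its exponent and mantissa fields.
std::string classify(std::uint32_t bits) {
    std::uint32_t e = (bits >> 23) & 0xFFu;  // biased exponent
    std::uint32_t m = bits & 0x7FFFFFu;      // stored mantissa
    if (e == 0)   return m == 0 ? "zero" : "denormal";  // hidden bit is 0 here
    if (e == 255) return m == 0 ? "infinity" : "NaN";
    return "normal";                         // hidden bit is 1
}
```

Every pattern with an exponent field from 1 to 254 falls through to "normal", which is the case where the hidden bit is always 1.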

When it's there, it's in bit _TWENTY_THREE_ !

I don't agree. At least, not yet in my experience. And I
haven't seen an example below which makes your point.

The way this works is, if you have an exponent other than 2, you must
shift by more than one bit
to normalize.

Well, I know how to normalize. After writing a few complex-FP
FFT routines for integer processors, it gets to be kind of
routine and hum-drum. So I'll skip the explanation.

The old IBM 360 used base 16, so had to shift by four
bits. That's why its f.p.
precision was so awful.

Interesting note about the 360. I only had a few opportunities
to program in BAL and never got into the floating point formats.

The Intel 32-bit format shifts by two, so sometimes the high bit is a
one, sometimes a zero.

High bit is *always* a 1, after normalizing (except for zero.)
But, as you know, it is thrown away. Never kept.

That causes the format to look really funky. Try this:

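#include <iostream>
using namespace std;

// Overlay a float and an integer so the raw bit pattern can be read back.
// Assumes float and unsigned int are both 32 bits wide.
typedef union {
    float x;
    unsigned int n;
} float_hack;

int main(void) {
    float_hack num;
    while (cin >> num.x) {             // read floats until input ends
        cout << hex << num.n << endl;  // print the bit pattern in hex
    }
    return 0;
}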

You'll see that 1 -- 3f800000 (high bit is visible)
but 2 -- 40000000 (high bit is not)

Which doesn't make your point, because it's quite correct to use
those two values to represent 1 and 2.

3F800000 is:

1 -- hidden bit
0 01111111 00000000000000000000000
- -------- -----------------------
S exponent mantissa

40000000 is:

1 -- hidden bit
0 10000000 00000000000000000000000
- -------- -----------------------
S exponent mantissa

In those two cases, the only difference is that the exponents
are 1 apart from each other. Which is exactly what you'd expect
for 1.0 and 2.0. The mantissa is the same for both.
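
The field layout in these diagrams can be reproduced with a short C++ sketch. The helper name field_string is made up for the example; it simply splits the 32 bits into groups of 1, 8 and 23.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Render a 32-bit pattern as "S exponent mantissa" (1, 8 and 23 bits).
std::string field_string(std::uint32_t bits) {
    std::string all = std::bitset<32>(bits).to_string();  // most significant bit first
    return all.substr(0, 1) + " " + all.substr(1, 8) + " " + all.substr(9, 23);
}
```

For 3F800000 and 40000000 the two strings differ only in the exponent group, 01111111 versus 10000000.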

Try a few others and see what you get. Some will surprise you.

I have, believe me.

I know that there must be people out there to whom this is old, old
news. Even so, I've never seen
a word about it, and didn't find anything in a Google search. I'd
appreciate any comments.

Well, I hope that helps some.

Jon

#6 June 27th 03, 09:26 PM

Jonathan Kirwan wrote:

The old IBM 360 used base 16, so had to shift by four bits.
That's why its f.p. precision was so awful.

Interesting note about the 360. I only had a few opportunities
to program in BAL and never got into the floating point formats.

IIRC, the Navy's UYK-44 processor (probably UYK-20 as well,
though I'm not sure it did FP) also used base 16 for the
exponent, so increasing the exponent by 1 shifted the mantissa
by 4. I dare anybody to claim that's a useful bit of
information to have retained for 15+ years....

--
Grant Edwards
grante at visi.com
Yow! .. bleakness.... desolation.... plastic forks...

#7 June 27th 03, 11:05 PM

Jonathan Kirwan wrote:

Interesting note about the 360. I only had a few opportunities
to program in BAL and never got into the floating point formats.

IIRC, the Navy's UYK-44 processor (probably UYK-20 as well,
though I'm not sure it did FP) also used base 16 for the
exponent, so increasing the exponent by 1 shifted the mantissa
by 4. I dare anybody to claim that's a useful bit of
information to have retained for 15+ years....

I think I've actually read about this, once. Been a while,
though. And... I'm glad I was able to forget it. Now, you've
gone and forced those poor brain cells to re-align on this and
I'm probably going to forget something else important.

Could be worse, I could've explained what BAM variables were...

--
Grant Edwards
grante at visi.com
Yow! YOW!! I am having FUN!!

#8 June 29th 03, 05:24 AM

Everett M. Greene wrote:

IIRC, the Navy's UYK-44 processor (probably UYK-20 as well,
though I'm not sure it did FP) also used base 16 for the
exponent, so increasing the exponent by 1 shifted the mantissa
by 4. I dare anybody to claim that's a useful bit of
information to have retained for 15+ years....

I think I've actually read about this, once. Been a while,
though. And... I'm glad I was able to forget it. Now, you've
gone and forced those poor brain cells to re-align on this and
I'm probably going to forget something else important.

Could be worse, I could've explained what BAM variables were...

BAM is neat!

BTW: Wasn't it AYK-20?

-20 didn't have any FP. I've never been near a -44.

I _think_ it was UYK, since everybody pronounced it "yuck".

The 44 was a small version of the same architecture that was
done by, um, Sperry (I think). Originally it was designed for
use on submarines (A 44 chassis would fit (barely) through a
submarine's loading hatch). A '20, OTOH, was more of a
standard computer-room VAX-sized thing -- you'd have to build a
sub hull around it.

A '44 consisted of a backplane full of very expensive little
boards (about 3x6 inches). It took several of the boards for
the CPU, and then there were memory and I/O modules. The whole
thing, including power supply was the size of a small suitcase.
The CPU was built out of AM2901 bit-slice processors, and
executed a superset of the 20's instruction set.

The '44 was "standardized" as the Navy's official embedded
computer. It was about as powerful as a decent 8086
single-board-computer, only 100X larger and 1000X more
expensive. It did have plug in cards for all the oddball
USN-specific serial/parallel interfaces, which gave it a leg-up
on commercial stuff. The '44 had FP, and the ones I played with
used EEPROM/RAM instead of core (though core memory was
available for it, IIRC). It was sort of cool that it could do
polar-rectangular coordinate transforms in a single machine
instruction.

For the project I worked on, we would have embedded a couple
8086's and done C programs given our 'druthers, but NAVSEA
insisted that we use '44s and write in CMS/2 or CMS-2 or
whatever it was called. They also wanted us to use some OS or
other from the '20. But, there was no way it could deal with
the real-time requirements we had, so they let us write our own
simple kernel.

The whole project was cancelled after a couple years (never
even got a prototype working). A few years later it was revived
and redesigned using "commercial" processors before being
cancelled again.

Sure glad I'm out of defense work...

--
Grant Edwards
grante at visi.com
Yow! My forehead feels like a PACKAGE of moist CRANBERRIES in a
remote FRENCH OUTPOST!!

#9 July 1st 03, 02:03 PM

Jonathan Kirwan wrote:

On Fri, 27 Jun 2003 17:58:50 GMT, Jonathan Kirwan
wrote:

Hmm. Are you *THE* Jack Crenshaw? The "Let's Build A Compiler"
and "Math Toolkit for Real-Time Programming" Jack _W._ Crenshaw?

Actually, I think I've answered my own question, here. You
really *are* that Jack.

Grin! Yep, I really am.

What clues me in is your use of "phantom" here:

mantissa -- 23 bits + phantom bit in bit 24.

The same term used on page 50 in "Math toolkit..."

In my own experience, even that predating the Intel 8087 or the
IEEE standardization, it was called a "hidden bit" notation. I
don't know where "phantom" comes from, as my own reading managed
to completely miss it.

So, a hearty "Hello" from me!

Hello. Re the term, phantom bit: I've been using that term since I can
remember -- and that's
a looooonnnngggg time. Then again, I still sometimes catch myself
saying "cycles" or "kilocycles,"
or "B+". I first heard the term in 1975. Not sure when it became
Politically Incorrect. Maybe
someone objected to the implied occult nature of the term, "phantom"?
Who knows?
but as far as I'm concerned the term "hidden bit" is a
Johnny-come-lately on the scene.

Back to the point. I want to thank you and everyone else who responded
(except the guy who said
"stop it") for helping to straighten out my warped brain.

It's nice that you have my book. Thanks for buying it. As a matter of
fact, I first ran across this
"peculiarity" three years ago, when I was writing it. I needed to
twiddle the components of the
floating-point number -- separate the exponent from mantissa -- to write
the fp_hack structure for
the square root algorithm. I looked at the formats for float, double,
and long double, and found the
second two formats easy enough to grok. But when I looked at the format
for floats, I sort of went,
"Gag!" and quickly decided to use doubles for the book.

It's funny how an idea, once formed, can persist. Lo those many years
ago, I didn't have a lot of time
to think about it -- had to get the chapter done. I just managed to
convince myself that the format
used this peculiar convention, what with base-4 exponents, and all. I
had no more need of it at the time,
so never went back and revisited the impression. It's persisted ever
since.

All of the folks who responded are absolutely right. Once I got my head
screwed on straight, it was
quite obvious that the format has no mysteries. It is indeed the IEEE
754 format, plain and simple.
The thing that had me confused was the exponents: 3f8, 400, 408, etc.
With one bit for the sign and
eight for the exponent, it's perfectly obvious that the exponent has to
bleed down one bit into the next
lower hex digit. That's what I was seeing, but somehow in my haste, I
didn't recognize it as such, and
formed this "theory" that it was using a base-4 exponent.
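
The "bleed" into the next hex digit can be seen in a small C++ sketch. The helper names are invented for the example, and a 32-bit IEEE 754 float is assumed. The top three hex digits hold 12 bits: the sign, the 8 exponent bits, and the top 3 mantissa bits.

```cpp
#include <cstdint>
#include <cstring>

// Copy the bits of a float into an integer (32-bit float assumed).
std::uint32_t bits_of(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float assumed");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// The leading three hex digits: sign, 8 exponent bits, top 3 mantissa bits.
unsigned top_three_hex_digits(float x) { return bits_of(x) >> 20; }

// The 8-bit biased exponent on its own.
unsigned biased_exponent(float x) { return (bits_of(x) >> 23) & 0xFFu; }
```

For 1.0, 2.0 and 4.0 the top three digits are 3f8, 400 and 408, while the exponent field itself is simply 127, 128 and 129. The lowest exponent bit lands in the high bit of the third hex digit, which is why that digit alternates between 8 and 0.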

Wanna hear the funny part? After tinkering with it for awhile, I worked
out the rules for my imagined
format, that worked just fine. At work, I've got a Mathcad file that
takes the hex number, shifts it
two bits at a time, diddles the "phantom" bit, and produces the right
results. I can go from integer to
float and back nicely, using this cockamamie scheme.

Needless to say, the conversion is a whole lot easier if one uses the
real format! My Mathcad file just
got a lot shorter.
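
For what it's worth, an integer-to-float conversion using the real format might look like the C++ sketch below. It is not the Mathcad file, just an illustration under simplifying assumptions: positive integers below 2^24 only (so no rounding is needed), and function names invented for the example.

```cpp
#include <cstdint>

// Build the single-precision bit pattern for an integer n, 0 <= n < 2^24.
std::uint32_t int_to_float_bits(std::uint32_t n) {
    if (n == 0) return 0;                     // zero is all bits clear
    int top = 23;
    while (((n >> top) & 1u) == 0) --top;     // position of the leading 1
    std::uint32_t e = 127u + static_cast<std::uint32_t>(top);  // excess-127 exponent
    std::uint32_t m = (n << (23 - top)) & 0x7FFFFFu;  // normalize, drop the hidden bit
    return (e << 23) | m;
}

// Recover the integer part from a positive pattern with value below 2^24.
std::uint32_t float_bits_to_int(std::uint32_t bits) {
    std::uint32_t e = (bits >> 23) & 0xFFu;
    if (e < 127) return 0;                    // magnitude below 1 truncates to 0
    std::uint32_t significand = (bits & 0x7FFFFFu) | 0x800000u;  // restore hidden bit
    int shift = static_cast<int>(e) - 127;    // assumed to be at most 23
    return significand >> (23 - shift);
}
```

The shifts here are one bit at a time, and the hidden bit is always dropped on the way in and restored on the way out. For example, 5 becomes 40a00000 and converts back to 5.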

Thanks again to everyone who responded, and my apologies for bothering
y'all with this imaginary problem.

Jack

#10 July 1st 03, 11:57 PM

On Tue, 01 Jul 2003 13:03:19 GMT, Jack Crenshaw
wrote:

Jonathan Kirwan wrote:

On Fri, 27 Jun 2003 17:58:50 GMT, Jonathan Kirwan
wrote:

Hmm. Are you *THE* Jack Crenshaw? The "Let's Build A Compiler"
and "Math Toolkit for Real-Time Programming" Jack _W._ Crenshaw?

Actually, I think I've answered my own question, here. You
really *are* that Jack.

Grin! Yep, I really am.

Hehe. Nice to know one of my antennas is still sharp.

What clues me in is your use of "phantom" here:

mantissa -- 23 bits + phantom bit in bit 24.

The same term used on page 50 in "Math toolkit..."

In my own experience, even that predating the Intel 8087 or the
IEEE standardization, it was called a "hidden bit" notation. I
don't know where "phantom" comes from, as my own reading managed
to completely miss it.

So, a hearty "Hello" from me!

Hello. Re the term, phantom bit: I've been using that term since I can
remember -- and that's a looooonnnngggg time.

I think my first exposure to hidden-bit as a term dates to about
1974. But I could be off, by a year, either way.

Then again, I still sometimes catch myself saying "cycles" or
"kilocycles," or "B+".

Hehe. Now those terms aren't so "hidden" to me. I learned my
early electronics on tube design manuals. One sticking point I
remember bugging me for a long time was exactly, "How do they
size those darned grid leak resistors?" I just couldn't figure
out where they got the current from which to figure their
magnitude. So even B+ is old hat to me.

I first heard the term in 1975.

Well, that's about the time for "hidden bit," too. Probably, at
that time the term was still in a state of flux. I just got my
hands on different docs, I imagine.

Not sure when it became Politically Incorrect.

Oh, it's fine to me, anyway. I knew what was meant the moment I
saw the term. It's pretty clear. I just think time has settled
more on one term than another.

But to take your allusion and run with it a bit... I don't know
of anyone part of some conspiracy to set the term -- in any
case, setting terms usually is propagandistic, designed for
setting agendas in peoples' minds and here is a case where
everyone would want the same agenda.

Maybe someone objected to the implied occult nature of the term,
"phantom"?

Oh, geez. I've never known a geek to care about such things. I
suppose they must exist, somewhere. I've just never met one
willing to let me know they thought like that. But that's an
interesting thought. It would fit the weird times in the US we
live in, with about 30% aligning themselves as fundamentalists.

Nah... it just can't be.

Who knows?

I really think it was more the IEEE settling on a term. But
then, this isn't my area so I could be dead wrong about that --
I'm only guessing.

but as far as I'm concerned the term "hidden bit" is a
Johnny-come-lately on the scene.

Hehe. I've no problem if that's true.

Back to the point. I want to thank you and everyone else who responded
(except the guy who said "stop it") for helping to straighten out my
warped brain.

No problem. It was really pretty easy to recall the details.
Like learning to ride a bicycle, I suppose.

It's nice that you have my book. Thanks for buying it.

Oh, there was no question. I've a kindred interest in physics
and engineering, I imagine. I'm currently struggling through
Robert Gilmore's books, one on lie groups and algebras and the
other on catastrophe theory for engineers as well as polytropes,
packing spheres, and other delights. There were some nice
insights in your book, which helped wind me on just enough of a
different path to stretch me without losing me.

By the way!! I completely agree with you about MathCad! What a
piece of *&!@&$^%$^ it is, now. I went through several
iterations, loved at first the slant or approach in using it,
but absolutely hate it now because, frankly, I can't run it for
more than an hour before I don't have any memory left and it
crashes out. Reboot time every hour is not my idea of a good
thing. And that's only if I don't type and change things too
fast. When I work quick on it, I can go through what's left
with Win98 on a 256Mb RAM machine in a half hour! No help from
them and two versions later I've simply stopped using it. I
don't even want to hear from them, again. Hopefully, I'll be
able to find an old version somewhere. For now, I'm doing
without.

As a matter of fact, I first ran across this
"peculiarity" three years ago, when I was writing it. I needed to
twiddle the components of the
floating-point number -- separate the exponent from mantissa -- to write
the fp_hack structure for
the square root algorithm. I looked at the formats for float, double,
and long double, and found the
second two formats easy enough to grok. But when I looked at the format
for floats, I sort of went,
"Gag!" and quickly decided to use doubles for the book.

Yes. But that's fine, I suspect. I've taught undergrad classes
and most folks just go "barf" when confronted with learning
floating point. In class evaluations, I think having to learn
floating point was the bigger source of complaints about the
classes. You probably addressed everything anyone "normal"
could reasonably care about and more.

It's funny how an idea, once formed, can persist. Lo those many years
ago, I didn't have a lot of time
to think about it -- had to get the chapter done. I just managed to
convince myself that the format
used this peculiar convention, what with base-4 exponents, and all. I
had no more need of it at the time,
so never went back and revisited the impression. It's persisted ever
since.

No problem.

All of the folks who responded are absolutely right. Once I got my head
screwed on straight, it was
quite obvious that the format has no mysteries. It is indeed the IEEE
754 format, plain and simple.
The thing that had me confused was the exponents: 3f8, 400, 408, etc.
With one bit for the sign and
eight for the exponent, it's perfectly obvious that the exponent has to
bleed down one bit into the next
lower hex digit. That's what I was seeing, but somehow in my haste, I
didn't recognize it as such, and
formed this "theory" that it was using a base-4 exponent.

In any case, it's clear that your imagination is able to work
overtime, here! Maybe that's a good thing.

Wanna hear the funny part? After tinkering with it for awhile, I worked
out the rules for my imagined
format, that worked just fine. At work, I've got a Mathcad file that
takes the hex number, shifts it
two bits at a time, diddles the "phantom" bit, and produces the right
results. I can go from integer to
float and back nicely, using this cockamamie scheme.

Hmm. Then you should be able to construct a function to map
between these, proving the consistent results. I've a hard time
believing there is one. But who knows? Maybe this is the
beginning of a new facet of mathematics, like the investigation
into fractals or something!

Needless to say, the conversion is a whole lot easier if one uses the
real format! My Mathcad file just
got a lot shorter.

Hehe!! When you get things right, they *do* tend to become a
little more prosaic, too. Good thing for those of us with
feeble minds, too.

Thanks again to everyone who responded, and my apologies for bothering
y'all with this imaginary problem.

hehe. Best of luck. In the process, I did notice that you are
entertaining thoughts on a revised "Let's build a compiler."
Best of luck on that and if you feel the desire for unloading
some of the work, I might could help a little. I've written a
toy C compiler before, an assembler, several linkers, and a
not-so-toy BASIC interpreter. I can, at least, be a little bit
dangerous. Might be able to shoulder something, if it helps.

Jon
