Survey of Floating-Point Formats
range (overflow value)  precision  bits  B  W_{e}  W_{m}  what 
14  0.6  6  2  3  2  Used in university courses^{21,22} 
240  0.9  8  2  4  3  Used in university courses^{21,22} 
65504 = 2^{15}×(2-2^{-10})  3.3  16  2  5  10  2-byte excess-15^{24,25,27}, nVidia NV3x GPUs. Also called "half", "s10e5" or "fp16"; largest minifloat. Can approximate any 16-bit unsigned integer or its reciprocal to 3 decimal places. 
2.81×10^{14}  3.6  18  8  5  12  excess-15 octal, 4-digit mantissa. A fairly decent radix-8 format in an 18-bit PDP-10 halfword 
9.22×10^{18} = 2^{2^{6}-1}  4.8  24  2  7  16  3-byte excess-63^{17}, ATI R3x0 and Rv350 GPUs. Also called "s16e7" or "fp24". 
1.84×10^{19} = 2^{2^{6}}  6.9  30  2  7  23  AMD 9511 (1979)^{5} 
9.90×10^{27} = 8^{31}  5.1  24  8  6  17  Octal excess-32^{12} 
1.70×10^{38} = 2^{2^{7}-1}  8.1  36  2  8  27  Digital PDP-10^{1,18}, VAX (F and D formats)^{1}; Honeywell 600, 6000^{1,16}; Univac 110x single^{1}; IBM 709x, 704x^{1} 
3.40×10^{38} = 2^{2^{7}}  7.2  32  2  8  1+23  IEEE 754 single 
3.40×10^{38} = 2^{2^{7}}  7.2  32  2  8  1+23  Digital PDP-11^{19}, PDP-16^{6}, VAX 
9.99×10^{49} = 10^{10^{2}/2}  8.0  44  10  2d  8d  Burroughs B220^{7} 
4.31×10^{68} = 8^{76}  11.7  ?  8  7  39  Burroughs 5700, 6700, 7700 single^{1,14,16,17} 
7.24×10^{75} = 16^{63}  7.2  32  16  7  24  IBM 360, 370^{6}; Amdahl^{1}; DG Eclipse M/600^{1} 
7.24×10^{75} = 16^{63}  16.8  64  16  7  56  IBM 360 double^{15} 
5.79×10^{76} = 2^{255}  7.2  ?  2  9  24  Burroughs 1700 single^{16} 
1.16×10^{77} = 16^{64}  7.2  32  16  7  24  HP 3000^{1} 
9.99×10^{96} = 10^{3×2^{5}+1}  7.0  32  10  8  7d  IEEE 754r decimal32^{3,4} 
9.99×10^{99} = 10^{10^{2}}  10.0  ?  10  2d  10d  Most scientific calculators 
4.9×10^{114} = 8^{127}  12.0  48  8  8  40  Burroughs 7700^{6} 
8.9×10^{307} = 2^{2^{10}-1}  14.7  60  2  11  1+48  CDC 6000, 6600^{6}, 7000 CYBER 
8.9×10^{307} = 2^{2^{10}-1}  ?  ?  ?  ?  ?  DEC VAX G format; UNIVAC 110x double^{1} 
1.8×10^{308} = 2^{2^{10}}  15.9  64  2  11  1+52  IEEE 754 double 
1.27×10^{322} = 2^{1070}  ?  ?  ?  ?  ?  CDC 6x00, 7x00, Cyber^{1} 
9.99×10^{384} = 10^{3×2^{7}+1}  16.0  64  10  10  16d  IEEE 754r decimal64^{3,4} 
9.99×10^{499} = 10^{10^{3}/2}  12.0  ?  10  3d  12d  HP 71B^{13}, 85^{1} calculators 
9.99×10^{999} = 10^{10^{3}}  12.0  ?  10  3d  12d  Texas Instruments 85, 92 calculators 
9.99×10^{999} = 10^{10^{3}}  14.0  ?  10  3d  14d  Texas Instruments 89 calculator^{13} 
9.99×10^{999} = 10^{10^{3}}  17.0  82  10  3d  17d  68881 Packed Decimal Real (3 BCD digits for exponent, 17 for mantissa, and two sign bits) 
1.4×10^{2465} = 2^{2^{13}-3}  7.2  38?  2  14  24  Cray C90 half^{8} 
1.4×10^{2465} = 2^{2^{13}-3}  14.1  61?  2  14  47  Cray C90 single^{8} 
1.4×10^{2465} = 2^{2^{13}-3}  28.8  110?  2  14  96  Cray C90 double^{8} 
1.1×10^{2466} = 2^{2^{13}}  ?  ?  ?  ?  ?  Cray-1^{1} 
5.9×10^{4931} = 2^{2^{14}-1}  ?  ?  ?  ?  ?  DEC VAX H format^{1} 
1.2×10^{4932} = 2^{2^{14}}  19.2  80  2  15  64  The minimum IEEE 754 double extended size (Pentium; HP/Intel Itanium; Motorola 68040, 68881, 88110) 
1.2×10^{4932} = 2^{2^{14}}  34.0  128  2  15  1+112  IEEE 754r quad^{2,3} (DEC Alpha^{9}; IBM S/390 G5^{10}) 
9.99×10^{6144} = 10^{3×2^{11}+1}  34.0  128  10  14  34d  IEEE 754r decimal128^{3,4} 
5.2×10^{9824} = 2^{2^{15}-131}  16.0  ?  2  16  47  PRIME 50^{16} 
1.9×10^{29603} = 8^{2^{15}+12}  ?  ?  8  16  ?  Burroughs 6700, 7700 double^{1,16} 
4.3×10^{2525222} = 2^{2^{23}}  ?  ?  2  24  ?  PARI 
1.4×10^{323228010} = 2^{2^{30}-1616}  ?  ?  2  31  ?  Mathematica® 
≅10^{2147483646} = 10^{2^{31}-2}  ?  ?  ?  ?  ?  Maple® 
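Most of the binary overflow thresholds in the table follow from just two field widths: for an IEEE-style format with W_e exponent bits and W_m stored mantissa bits, the largest finite value is (2-2^{-W_m})×2^{emax} with emax = 2^{W_e-1}-1. A minimal sketch of that computation, assuming the IEEE conventions (hidden leading bit, top exponent code reserved for infinities and NANs); formats with other biases or no reserved codes will differ:

```python
# Largest finite value of an IEEE-style binary format with We exponent bits
# and Wm mantissa bits (excluding the hidden bit).  Assumes the IEEE
# conventions: excess-(2^(We-1)-1) exponent, top exponent code reserved
# for Inf/NaN, hidden leading 1 bit.
def max_finite(We, Wm):
    emax = 2**(We - 1) - 1           # largest usable unbiased exponent
    mant = 2 - 2.0**(-Wm)            # largest mantissa, binary 1.111...1
    return mant * 2.0**emax

print(max_finite(5, 10))    # fp16:   65504.0
print(max_finite(8, 23))    # single: ~3.40e38
print(max_finite(11, 52))   # double: ~1.80e308
```

This reproduces the fp16, IEEE single and IEEE double rows above; rows for octal, hexadecimal and decimal machines use the analogous formula in their own radix.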
value = s × 2^{k+1-N} × n
For normalized values, 2-2^{K} < k < 2^{K}-1 and 2^{N-1}-1 < n < 2^{N}; for denormalized values, k = 2-2^{K} and 0 < n < 2^{N-1}. The tests "if x>y" and "if x-y>0" yield different results, unless you use denorms.
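The formula can be exercised directly on IEEE single-precision bit patterns. A sketch using N = 24 (23 stored mantissa bits plus the hidden leading 1), cross-checked against the machine's own reading of the same bits (Inf/NaN codes are not handled here):

```python
import struct

# Decode an IEEE single-precision bit pattern with the formula above,
# value = s * 2^(k+1-N) * n, using N = 24 mantissa bits (23 stored
# bits plus the hidden leading 1).  Inf/NaN (e == 255) not handled.
def decode_single(bits):
    s = -1.0 if bits >> 31 else 1.0
    e = (bits >> 23) & 0xFF          # biased exponent field
    frac = bits & 0x7FFFFF           # 23 stored mantissa bits
    N = 24
    if e == 0:                       # denormalized: k pinned at the minimum
        k, n = -126, frac
    else:                            # normalized: restore the hidden bit
        k, n = e - 127, frac | (1 << 23)
    return s * 2.0**(k + 1 - N) * n

# Cross-check against the machine's own reading of the same bit patterns:
for bits in (0x3F800000, 0x40400000, 0x00800000, 0x00000001):
    assert decode_single(bits) == struct.unpack('>f', bits.to_bytes(4, 'big'))[0]
print(decode_single(0x00000001))     # ~1.401e-45, the smallest denormal
```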
s  exponent  mantissa  value(s) 
0  111.1111.1  111.1111.1111.1111.1111.1111  Quiet NANs 
0  111.1111.1  100.0000.0000.0000.0000.0000  Indeterminate 
0  111.1111.1  0xx.xxxx.xxxx.xxxx.xxxx.xxxx  Signaling NANs 
0  111.1111.1  000.0000.0000.0000.0000.0000  Infinity 
0  111.1111.0  111.1111.1111.1111.1111.1111  3.402×10^{38} 
0  100.0000.1  000.0000.0000.0000.0000.0000  4.0 
0  100.0000.0  100.0000.0000.0000.0000.0000  3.0 
0  100.0000.0  000.0000.0000.0000.0000.0000  2.0 
0  011.1111.1  000.0000.0000.0000.0000.0000  1.0 
0  011.1111.0  000.0000.0000.0000.0000.0000  0.5 
0  000.0000.1  000.0000.0000.0000.0000.0000  1.175×10^{-38} (Smallest normalized value) 
0  000.0000.0  111.1111.1111.1111.1111.1111  1.175×10^{-38} (Largest denormalized value) 
0  000.0000.0  000.0000.0000.0000.0000.0001  1.401×10^{-45} (Smallest denormalized value) 
0  000.0000.0  000.0000.0000.0000.0000.0000  0 
1  000.0000.0  000.0000.0000.0000.0000.0000  -0 
1  000.0000.0  000.0000.0000.0000.0000.0001  -1.401×10^{-45} (Smallest denormalized value) 
1  000.0000.0  111.1111.1111.1111.1111.1111  -1.175×10^{-38} (Largest denormalized value) 
1  000.0000.1  000.0000.0000.0000.0000.0000  -1.175×10^{-38} (Smallest normalized value) 
1  011.1111.0  000.0000.0000.0000.0000.0000  -0.5 
1  011.1111.1  000.0000.0000.0000.0000.0000  -1.0 
1  100.0000.0  000.0000.0000.0000.0000.0000  -2.0 
1  100.0000.0  100.0000.0000.0000.0000.0000  -3.0 
1  100.0000.1  000.0000.0000.0000.0000.0000  -4.0 
1  111.1111.0  111.1111.1111.1111.1111.1111  -3.402×10^{38} 
1  111.1111.1  000.0000.0000.0000.0000.0000  Negative infinity 
1  111.1111.1  0xx.xxxx.xxxx.xxxx.xxxx.xxxx  Signaling NANs 
1  111.1111.1  100.0000.0000.0000.0000.0000  Indeterminate 
1  111.1111.1  111.1111.1111.1111.1111.1111  Quiet NANs
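The bit patterns in the table can be reproduced mechanically; a small sketch using Python's struct module to print the sign, exponent and mantissa fields of a few of the rows:

```python
import struct

# Show the s | exponent | mantissa fields of an IEEE single-precision
# value, in the same order as the table above.
def bits_of(x):
    b, = struct.unpack('>I', struct.pack('>f', x))
    return f"{b >> 31} {(b >> 23) & 0xFF:08b} {b & 0x7FFFFF:023b}"

print(bits_of(1.0))        # 0 01111111 00000000000000000000000
print(bits_of(3.0))        # 0 10000000 10000000000000000000000
print(bits_of(-0.5))       # 1 01111110 00000000000000000000000
print(bits_of(1.401e-45))  # smallest denormal: mantissa ends ...0001
```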

s  exponent  mantissa  value(s) 
0  111.11  xx.xxxx.xxxx  various NANs 
0  111.11  00.0000.0000  Infinity 
0  111.10  11.1111.1111  65504 (Largest finite value) 
0  100.11  10.1100.0000  27.0 
0  100.01  11.0000.0000  7.0 
0  100.00  10.0000.0000  3.0 
0  011.11  00.0000.0000  1.0 
0  011.10  00.0000.0000  0.5 
0  000.01  00.0000.0000  6.104×10^{-5} (Smallest normalized value) 
0  000.00  11.1111.1111  6.098×10^{-5} (Largest denormalized value) 
0  000.00  00.0000.0001  6×10^{-8} (Smallest denormalized value) 
0  000.00  00.0000.0000  0 
1  011.11  00.0000.0000  -1.0 (other negative values are analogous) 
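A small decoder can verify the half-precision rows; this sketch assumes the s10e5 layout described above (5 exponent bits, excess-15, 10 mantissa bits, hidden leading bit, top exponent code reserved for Infinity and NANs):

```python
# Decode an fp16 (s10e5) bit pattern.  Assumes the layout described above:
# 1 sign bit, 5 exponent bits (excess-15), 10 mantissa bits, hidden bit,
# top exponent code reserved for Infinity and NANs.
def decode_fp16(bits):
    s = -1.0 if bits >> 15 else 1.0
    e = (bits >> 10) & 0x1F
    frac = bits & 0x3FF
    if e == 0x1F:                    # reserved code: Infinity or NAN
        return s * float('inf') if frac == 0 else float('nan')
    if e == 0:                       # denormalized
        return s * frac * 2.0**-24
    return s * (1 + frac / 1024) * 2.0**(e - 15)

print(decode_fp16(0x7BFF))   # 65504.0, the largest finite value
print(decode_fp16(0x4EC0))   # 27.0
print(decode_fp16(0x0001))   # ~5.96e-8, the smallest denormal
```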
s.e.m  excess  range  comments 
1.2.2  2  1 to 24  The smallest format worth considering; anything narrower has a range no greater than the same number of bits interpreted as a signed-magnitude integer 
1.3.2  3  0.0625 to 14  Used in university courses^{21,22} 
1.4.3  3  1 to 229376  About the best compromise for a 1-byte format 
1.4.3  7  0.002 to 240  Used in university courses^{21,22} 
1.4.7  7  1 to 4161536  One option for 12 bits 
1.5.6  6  1 to 1.35×10^{11}  Another option for 12 bits 
1.5.10  10  1 to 2.20×10^{12}  Largest unbalanced format; range exceeds 32-bit unsigned 
1.5.10  15  0.000061 to 65504  2-byte excess-15^{24,25,27}, aka "fp16", "s10e5", "half". Can approximate any 16-bit unsigned integer or its reciprocal to 3 decimal places. 
1.5.12  15  1/M to 2.81×10^{14}  A fairly decent radix-8 format in an 18-bit PDP-10 word 
1.7.16  63  1/M to 9.22×10^{18}  3-byte excess-63^{17}, aka "fp24", "s16e7" 
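The endpoints of several of these rows follow from the same arithmetic; a sketch computing the smallest positive denormalized, smallest normalized and largest finite values of an s.e.m format with a given excess, assuming IEEE-like conventions (hidden leading bit, top exponent code reserved). It reproduces the 1.3.2 excess-3, 1.4.3 excess-7 and 1.5.10 excess-15 rows; note the table's lower endpoints sometimes quote the smallest denormal and sometimes the smallest normal, and rows with other conventions will differ:

```python
# Endpoints of a 1.e.m minifloat with the given excess (bias), assuming
# IEEE-like conventions: hidden leading 1, top exponent code reserved
# for Inf/NaN, denormals at exponent code 0.  m_bits excludes the
# hidden bit.  Returns (smallest denormal, smallest normal, largest).
def minifloat_range(e_bits, m_bits, excess):
    emax = (2**e_bits - 2) - excess          # largest unbiased exponent
    largest = (2 - 2.0**(-m_bits)) * 2.0**emax
    smallest_normal = 2.0**(1 - excess)
    smallest_denormal = 2.0**(1 - excess - m_bits)
    return smallest_denormal, smallest_normal, largest

print(minifloat_range(3, 2, 3))    # denormal min 0.0625, largest 14.0
print(minifloat_range(4, 3, 7))    # denormal min ~0.002, largest 240.0
print(minifloat_range(5, 10, 15))  # normal min ~0.000061, largest 65504.0
```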
Footnotes and Sources
1 : http://http.cs.berkeley.edu/~wkahan/ieee754status/whyieee.pdf W. Kahan, "Why do we need a floating-point arithmetic standard?", 1981
2 : http://http.cs.berkeley.edu/~wkahan/ieee754status/Names.pdf W. Kahan, "Names for Standardized Floating-Point Formats", 2002 (work in progress)
3 : http://754r.ucbtest.org/ "Some Proposals for Revising ANSI/IEEE Std 7541985"
4 : http://www2.hursley.ibm.com/decimal/DPDecimal.html "A Summary of Densely Packed Decimal encoding" (web page)
5 : http://www3.sk.sympatico.ca/jbayko/cpu1.html
6 : http://twins.pmf.ukim.edu.mk/predava/DSM/procesor/float.htm
7 : http://www.cc.gatech.edu/gvu/people/randy.carpenter/folklore/v5n2.html
8 : http://www.usm.unimuenchen.de/people/puls/f77to90/cray.html
9 : http://www.usm.unimuenchen.de/people/puls/f77to90/alpha.html
10 : http://www.research.ibm.com/journal/rd/435/schwarz.html and http://www.research.ibm.com/journal/rd/435/schwa1.gif
11 : http://babbage.cs.qc.edu/courses/cs341/IEEE754references.html
12 : One source gave 8^{31} as the range for the Burroughs B5500. (I forgot to save my source for this. I have sources for other Burroughs systems, giving 8^{76} as the highest value (and 8^{-50} as the lowest, for a field width of 7 bits); and Cowlishaw^{13} describes its (unrelated) decimal floating-point format. I might have inferred it from http://www.cs.science.cmu.ac.th/panutson/433.htm which only gives a field width of 6 bits, and no bias.) The Burroughs 5000 manual says the mantissa is 39 bits, but does not talk about exponent range. Did some models have a 6-bit exponent field? Since these are the folks who simplified things by storing all integers as floating-point numbers with an exponent of 0^{17}, I suspect anything is possible.
13 : http://www2.hursley.ibm.com/decimal/IEEEcowlishawarith16.pdf Michael F. Cowlishaw, "Decimal Floating-Point: Algorism for Computers", Proceedings of the 16^{th} IEEE Symposium on Computer Arithmetic, 2003; ISSN 1063-6889/03
14 : http://research.microsoft.com/users/GBell/Computer_Structures_Principles_and_Examples/csp0146.htm D. Siewiorek, C. Gordon Bell and Allen Newell, "Computer Structures: Principles and Examples", 1982, p. 130
15 : http://research.microsoft.com/~gbell/Computer_Structures__Readings_and_Examples/00000612.htm C. Gordon Bell and Allen Newell, "Computer Structures: Readings and Examples", 1971, p. 592
16 : http://www.csit.fsu.edu/~burkardt/f_src/slap/slap.f90 FORTRAN90 implementation of a linear algebra package, which curiously begins with a table of machine floatingpoint register parameters for lots of old mainframes. See also http://interval.louisiana.edu/pub/intervalmath/Fortran90_software/d1i1mach.for , which gives actual binary values of the smallest and largest values for many systems.
17 : http://grouper.ieee.org/groups/754/meetingminutes/020418.html Includes this brief description of the key design feature of the Burroughs B5500: "ints and floats with the same value have the same strings in registers and memory. The octal point at the right, zero exponent." This shows why the exponent range is quoted as 8^{-50} (or 8^{-51}) to 8^{76}: The exponent ranged from 8^{-63} to 8^{63}, and the (for floating-point, always normalized) 13-digit mantissa held any value from 8^{12} up to nearly 8^{13}, shifting both ends of the range up by that amount.
18 : http://www.inwap.com/pdp10/hbaker/pdp10/FloatingPoint.html
19 : http://nssdc.gsfc.nasa.gov/nssdc/formats/PDP11.htm
20 : This format would be easy to implement on an 8-bit microprocessor. It has the sign and exponent in one byte, and a 16-bit mantissa with an explicit leading 1 bit (if the leading 1 is hidden/implied, we get twice the range). With only 4-5 decimal digits it isn't too useful, but it's what you could expect to see on a really small early home computer.
21 : http://turing.cs.plymouth.edu/~wjt/Architecture/CSAPP/L05FloatingPoint.pdf This lecture presentation (or a variation of it) appears at clarkson.edu, plymouth.edu, sc.edu, ucar.edu, umd.edu, umn.edu, utah.edu, utexas.edu and vancouver.wsu.edu. Good discussion of floating-point representations, subnormals, rounding modes and various other issues. Pages 14-16 use the 1.4.3 microfloat format as an example to illustrate in a very concrete way how the subnormals, normals and NANs are related; pages 17-18 use the even smaller 1.3.2 format to show the range of representable values on a number line. Make sure to see page 30 — this alone is worth the effort of downloading and viewing the document!
22 : http://www2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213f98/H3/H3.pdf Homework assignment that uses the microfloat formats 1.4.3 and 1.3.2. Another similar one is here.
23 : http://www.arl.wustl.edu/~lockwood/class/cse306s04/lecture/l11.html Lecture notes that use the 1.3.5 minifloat format for inclass examples.
24 : http://developer.nvidia.com/docs/IO/8230/D3DTutorial1_Shaders.ppt nVidia presentation describing their fp16 format (starting on slide 75).
25 : http://developer.nvidia.com/attach/6655 nVidia language specification including definition of fp16 format (page 175).
26 : http://www.cs.unc.edu/Events/Conferences/GP2/slides/hanrahan.pdf describes the nVidia GeForce 6800 and ATI Radeon 9800 graphics cards as general-purpose pipelined vector floating-point processors, and shows a rough design for a supercomputer employing 16384 of the GPU chips to achieve a theoretical throughput of 2 petaflops (2×10^{15} floating-point operations per second). (The rackmount pictured came from here.)
27 : http://www.digitlife.com/articles2/psprecision/ This is the only source I have found that describes all of the current hardware standard formats, from IEEE binary128 all the way down to nVIDIA s10e5.
© 1996-2004 Robert P. Munafo.