
Commit 3b264df

Misc improvements to the float tutorial (GH-102052)
1 parent b513c46 commit 3b264df

File tree

1 file changed: +106 -33 lines changed

Doc/tutorial/floatingpoint.rst

Lines changed: 106 additions & 33 deletions
@@ -1,6 +1,7 @@
 .. testsetup::
 
    import math
+   from fractions import Fraction
 
 .. _tut-fp-issues:
 
@@ -9,12 +10,13 @@ Floating Point Arithmetic: Issues and Limitations
 **************************************************
 
 .. sectionauthor:: Tim Peters <[email protected]>
+.. sectionauthor:: Raymond Hettinger <python at rcn dot com>
 
 
 Floating-point numbers are represented in computer hardware as base 2 (binary)
-fractions. For example, the **decimal** fraction ``0.125``
-has value 1/10 + 2/100 + 5/1000, and in the same way the **binary** fraction ``0.001``
-has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
+fractions. For example, the **decimal** fraction ``0.625``
+has value 6/10 + 2/100 + 5/1000, and in the same way the **binary** fraction ``0.101``
+has value 1/2 + 0/4 + 1/8. These two fractions have identical values, the only
 real difference being that the first is written in base 10 fractional notation,
 and the second in base 2.
 
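As a quick cross-check of the revised worked example (illustrative only, not part of the patch), ``fractions.Fraction`` confirms that the decimal expansion 6/10 + 2/100 + 5/1000 and the binary expansion 1/2 + 0/4 + 1/8 name the same value, 0.625:

   >>> from fractions import Fraction
   >>> Fraction(6, 10) + Fraction(2, 100) + Fraction(5, 1000)   # decimal digits of 0.625
   Fraction(5, 8)
   >>> Fraction(1, 2) + Fraction(0, 4) + Fraction(1, 8)         # binary digits of 0.101
   Fraction(5, 8)
   >>> float(Fraction(5, 8))
   0.625
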
@@ -57,13 +59,15 @@ Many users are not aware of the approximation because of the way values are
 displayed. Python only prints a decimal approximation to the true decimal
 value of the binary approximation stored by the machine. On most machines, if
 Python were to print the true decimal value of the binary approximation stored
-for 0.1, it would have to display ::
+for 0.1, it would have to display::
 
    >>> 0.1
    0.1000000000000000055511151231257827021181583404541015625
 
 That is more digits than most people find useful, so Python keeps the number
-of digits manageable by displaying a rounded value instead ::
+of digits manageable by displaying a rounded value instead:
+
+.. doctest::
 
    >>> 1 / 10
    0.1
@@ -90,7 +94,10 @@ thing in all languages that support your hardware's floating-point arithmetic
 (although some languages may not *display* the difference by default, or in all
 output modes).
 
-For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits::
+For more pleasant output, you may wish to use string formatting to produce a
+limited number of significant digits:
+
+.. doctest::
 
    >>> format(math.pi, '.12g')  # give 12 significant digits
    '3.14159265359'
@@ -101,33 +108,49 @@ For more pleasant output, you may wish to use string formatting to produce a lim
    >>> repr(math.pi)
    '3.141592653589793'
 
-
 It's important to realize that this is, in a real sense, an illusion: you're
 simply rounding the *display* of the true machine value.
 
 One illusion may beget another. For example, since 0.1 is not exactly 1/10,
-summing three values of 0.1 may not yield exactly 0.3, either::
+summing three values of 0.1 may not yield exactly 0.3, either:
+
+.. doctest::
 
-   >>> .1 + .1 + .1 == .3
+   >>> 0.1 + 0.1 + 0.1 == 0.3
    False
 
 Also, since the 0.1 cannot get any closer to the exact value of 1/10 and
 0.3 cannot get any closer to the exact value of 3/10, then pre-rounding with
-:func:`round` function cannot help::
+:func:`round` function cannot help:
 
-   >>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
+.. doctest::
+
+   >>> round(0.1, 1) + round(0.1, 1) + round(0.1, 1) == round(0.3, 1)
    False
 
 Though the numbers cannot be made closer to their intended exact values,
-the :func:`round` function can be useful for post-rounding so that results
-with inexact values become comparable to one another::
+the :func:`math.isclose` function can be useful for comparing inexact values:
 
-   >>> round(.1 + .1 + .1, 10) == round(.3, 10)
-   True
+.. doctest::
+
+   >>> math.isclose(0.1 + 0.1 + 0.1, 0.3)
+   True
+
+Alternatively, the :func:`round` function can be used to compare rough
+approximations::
+
+.. doctest::
+
+   >>> round(math.pi, ndigits=2) == round(22 / 7, ndigits=2)
+   True
 
 Binary floating-point arithmetic holds many surprises like this. The problem
 with "0.1" is explained in precise detail below, in the "Representation Error"
-section. See `The Perils of Floating Point <https://www.lahey.com/float.htm>`_
+section. See `Examples of Floating Point Problems
+<https://jvns.ca/blog/2023/01/13/examples-of-floating-point-problems/>`_ for
+a pleasant summary of how binary floating point works and the kinds of
+problems commonly encountered in practice. Also see
+`The Perils of Floating Point <https://www.lahey.com/float.htm>`_
 for a more complete account of other common surprises.
 
 As that says near the end, "there are no easy answers." Still, don't be unduly
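The new ``math.isclose`` example above relies on the function's default tolerances (``rel_tol=1e-09``, ``abs_tol=0.0``). A small illustrative check, not part of the patch, of how the tolerance relates to the actual rounding error:

   >>> import math
   >>> abs((0.1 + 0.1 + 0.1) - 0.3)                 # the error is tiny but nonzero
   5.551115123125783e-17
   >>> math.isclose(0.1 + 0.1 + 0.1, 0.3)           # well within the default rel_tol
   True
   >>> math.isclose(0.1 + 0.1 + 0.1, 0.3, rel_tol=1e-17)   # tolerance tighter than the error
   False
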
@@ -158,26 +181,34 @@ statistical operations supplied by the SciPy project. See <https://scipy.org>.
 Python provides tools that may help on those rare occasions when you really
 *do* want to know the exact value of a float. The
 :meth:`float.as_integer_ratio` method expresses the value of a float as a
-fraction::
+fraction:
+
+.. doctest::
 
    >>> x = 3.14159
    >>> x.as_integer_ratio()
    (3537115888337719, 1125899906842624)
 
 Since the ratio is exact, it can be used to losslessly recreate the
-original value::
+original value:
+
+.. doctest::
 
    >>> x == 3537115888337719 / 1125899906842624
    True
 
 The :meth:`float.hex` method expresses a float in hexadecimal (base
-16), again giving the exact value stored by your computer::
+16), again giving the exact value stored by your computer:
+
+.. doctest::
 
    >>> x.hex()
    '0x1.921f9f01b866ep+1'
 
 This precise hexadecimal representation can be used to reconstruct
-the float value exactly::
+the float value exactly:
+
+.. doctest::
 
    >>> x == float.fromhex('0x1.921f9f01b866ep+1')
    True
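A related check, not in this diff: ``fractions.Fraction`` accepts a float directly and exposes the same exact ratio that ``float.as_integer_ratio`` reports, which is another way to read back the value stored for ``x`` above:

   >>> from fractions import Fraction
   >>> x = 3.14159
   >>> Fraction(x)                                  # exact value of the stored double
   Fraction(3537115888337719, 1125899906842624)
   >>> Fraction(x) == Fraction(*x.as_integer_ratio())
   True
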
@@ -186,17 +217,43 @@ Since the representation is exact, it is useful for reliably porting values
 across different versions of Python (platform independence) and exchanging
 data with other languages that support the same format (such as Java and C99).
 
-Another helpful tool is the :func:`math.fsum` function which helps mitigate
-loss-of-precision during summation. It tracks "lost digits" as values are
-added onto a running total. That can make a difference in overall accuracy
-so that the errors do not accumulate to the point where they affect the
-final total:
+Another helpful tool is the :func:`sum` function which helps mitigate
+loss-of-precision during summation. It uses extended precision for
+intermediate rounding steps as values are added onto a running total.
+That can make a difference in overall accuracy so that the errors do not
+accumulate to the point where they affect the final total:
+
+.. doctest::
 
    >>> 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 == 1.0
    False
-   >>> math.fsum([0.1] * 10) == 1.0
+   >>> sum([0.1] * 10) == 1.0
    True
 
+The :func:`math.fsum()` goes further and tracks all of the "lost digits"
+as values are added onto a running total so that the result has only a
+single rounding. This is slower than :func:`sum` but will be more
+accurate in uncommon cases where large magnitude inputs mostly cancel
+each other out leaving a final sum near zero:
+
+.. doctest::
+
+   >>> arr = [-0.10430216751806065, -266310978.67179024, 143401161448607.16,
+   ...        -143401161400469.7, 266262841.31058735, -0.003244936839808227]
+   >>> float(sum(map(Fraction, arr)))   # Exact summation with single rounding
+   8.042173697819788e-13
+   >>> math.fsum(arr)                   # Single rounding
+   8.042173697819788e-13
+   >>> sum(arr)                         # Multiple roundings in extended precision
+   8.042178034628478e-13
+   >>> total = 0.0
+   >>> for x in arr:
+   ...     total += x                   # Multiple roundings in standard precision
+   ...
+   >>> total                            # Straight addition has no correct digits!
+   -0.0051575902860057365
+
+
 .. _tut-fp-error:
 
 Representation Error
@@ -225,20 +282,28 @@ as ::
    J ~= 2**N / 10
 
 and recalling that *J* has exactly 53 bits (is ``>= 2**52`` but ``< 2**53``),
-the best value for *N* is 56::
+the best value for *N* is 56:
+
+.. doctest::
 
    >>> 2**52 <= 2**56 // 10 < 2**53
    True
 
 That is, 56 is the only value for *N* that leaves *J* with exactly 53 bits. The
-best possible value for *J* is then that quotient rounded::
+best possible value for *J* is then that quotient rounded:
+
+.. doctest::
 
    >>> q, r = divmod(2**56, 10)
    >>> r
    6
 
 Since the remainder is more than half of 10, the best approximation is obtained
-by rounding up::
+by rounding up:
+
+.. doctest::
+
+
 
    >>> q+1
    7205759403792794
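As a sanity check on the arithmetic in this hunk (illustrative, not part of the patch), the rounded-up quotient divided by 2**56 reproduces exactly the double that Python stores for 0.1:

   >>> q, r = divmod(2**56, 10)
   >>> (q + 1) / 2 ** 56 == 0.1                     # the best 53-bit approximation is 0.1 itself
   True
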
@@ -256,27 +321,35 @@ if we had not rounded up, the quotient would have been a little bit smaller than
 1/10. But in no case can it be *exactly* 1/10!
 
 So the computer never "sees" 1/10: what it sees is the exact fraction given
-above, the best 754 double approximation it can get::
+above, the best 754 double approximation it can get:
+
+.. doctest::
 
    >>> 0.1 * 2 ** 55
    3602879701896397.0
 
 If we multiply that fraction by 10\*\*55, we can see the value out to
-55 decimal digits::
+55 decimal digits:
+
+.. doctest::
 
    >>> 3602879701896397 * 10 ** 55 // 2 ** 55
    1000000000000000055511151231257827021181583404541015625
 
 meaning that the exact number stored in the computer is equal to
 the decimal value 0.1000000000000000055511151231257827021181583404541015625.
 Instead of displaying the full decimal value, many languages (including
-older versions of Python), round the result to 17 significant digits::
+older versions of Python), round the result to 17 significant digits:
+
+.. doctest::
 
    >>> format(0.1, '.17f')
    '0.10000000000000001'
 
 The :mod:`fractions` and :mod:`decimal` modules make these calculations
-easy::
+easy:
+
+.. doctest::
 
    >>> from decimal import Decimal
    >>> from fractions import Fraction
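The hunk is cut off here. As a supplementary illustration (not shown in the diff), the two modules just imported can both surface the exact value stored for 0.1 that was computed above:

   >>> from decimal import Decimal
   >>> from fractions import Fraction
   >>> Fraction(0.1)                                # exact ratio behind the stored 0.1
   Fraction(3602879701896397, 36028797018963968)
   >>> Decimal(0.1)                                 # the same value written out in decimal
   Decimal('0.1000000000000000055511151231257827021181583404541015625')
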
