I was writing some tests today and I ran into a peculiar floating point issue. I had generated a sequence of numbers using numpy.linspace:
>>> np.linspace(0.1, 1, 10)
array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
Part of the code I was testing ended up checking whether the value 0.3 was in the range 0.3 to 0.8, including the end points. The answer should of course be yes, but there is a twist due to the actual values in the array returned by linspace:
>>> a = np.linspace(0.1, 1, 10)
>>> 0.3 in a
False
>>> 0.3 < a[2]
True
What’s happening is that the 0.3 returned by linspace is really 0.30000000000000004, while the literal 0.3 that I type is really 0.29999999999999999. It’s not clear whether this situation would ever actually arise in the normal usage of the code I was testing, but I wanted to make sure it wouldn’t cause problems. My solution was to write a function that tests whether a value is in a given range with a tiny bit of fuzziness at the edges.
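Recent Python and NumPy versions print the shortest string that round-trips, so both values may simply display as 0.3 or 0.30000000000000004 depending on the version. A minimal sketch for seeing the two values at the 17 significant digits quoted above (the variable names here are just for illustration):

```python
import numpy as np

a = np.linspace(0.1, 1, 10)

# Format both values with 17 significant digits, enough to tell
# neighbouring doubles apart.
typed = format(0.3, ".17g")
from_linspace = format(float(a[2]), ".17g")

print(typed)
print(from_linspace)
```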
NumPy has a useful function for comparing floating point values within tolerances called allclose. But that’s for testing equality; I need fuzzy (but not very fuzzy) less than / greater than comparisons. To provide just that little bit of fuzziness I turned to the numpy.nextafter function.
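For the equality side of the problem, here is a small sketch of tolerance-based membership using numpy.isclose, the element-wise companion of allclose. It relies on the default tolerances (rtol=1e-05, atol=1e-08), which are far looser than one ULP, so treat it as an illustration rather than a drop-in replacement for `in`:

```python
import numpy as np

a = np.linspace(0.1, 1, 10)

# Exact membership compares bit-for-bit, so a one-ULP difference counts
# as a miss.
exact = 0.3 in a

# np.isclose compares element-wise within rtol/atol; any() turns that
# into a "close enough to some element" membership test.
close = bool(np.isclose(a, 0.3).any())

print(exact, close)
```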
nextafter gives the next representable floating point number after the first input value. The second input value controls the direction so you can get the next value either up or down. It turns out that the two numbers that are tripping me up are right next to each other in their floating point representation:
>>> np.nextafter(0.29999999999999999, 1)
0.30000000000000004
>>> np.nextafter(0.30000000000000004, 0)
0.29999999999999999
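To put a number on how small one of these steps is, the sketch below compares the gap found with nextafter against numpy.spacing, which reports the distance from a value to its nearest adjacent representable number. Using infinity as the direction argument is a choice made here, not something from the session above:

```python
import numpy as np

x = 0.3

# Step one representable value in each direction.
up = np.nextafter(x, np.inf)
down = np.nextafter(x, -np.inf)

# The gap to the next value up is the spacing at x.
gap = up - x
spacing = np.spacing(x)

print(down < x < up)
print(gap == spacing)
```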
So to catch this case my range checking function only needs one ULP (unit in the last place) of fuzziness, which is not much at all, to handle this floating point error. To allow for this I wrote a function called fuzzy_between that takes a value and the lower and upper bounds of the test range and expands the test range by a couple of ULPs before doing a simple minval <= val <= maxval comparison:
import numpy as np


def fuzzy_between(val, minval, maxval, fuzz=2, inclusive=True):
    """
    Test whether a value is within some range with some fuzziness at the
    edges to allow for floating point noise.

    The fuzziness is implemented by expanding the range at each end `fuzz`
    steps using the numpy.nextafter function. For example, with the inputs
    minval = 1, maxval = 2, and fuzz = 2; the range would be expanded to
    minval = 0.99999999999999978 and maxval = 2.0000000000000009 before
    doing comparisons.

    Parameters
    ----------
    val : float
        Value being tested.
    minval : float
        Lower bound of range. Must be lower than `maxval`.
    maxval : float
        Upper bound of range. Must be higher than `minval`.
    fuzz : int, optional
        Number of times to expand bounds using numpy.nextafter.
    inclusive : bool, optional
        Set whether endpoints are within the range.

    Returns
    -------
    is_between : bool
        True if `val` is between `minval` and `maxval`, False otherwise.

    """
    # Expand each bound outward by `fuzz` representable steps. Stepping
    # toward -inf/+inf keeps the direction unambiguous at any magnitude.
    for _ in range(fuzz):
        minval = np.nextafter(minval, -np.inf)
        maxval = np.nextafter(maxval, np.inf)

    if inclusive:
        return minval <= val <= maxval
    else:
        return minval < val < maxval
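A short usage sketch on the case that started all this, with the range bounds taken from the linspace array and the literal 0.3 as the value under test. The function body is a condensed copy of the one above so the snippet runs on its own; stepping toward infinity rather than toward an offset bound is an assumption of this sketch:

```python
import numpy as np


def fuzzy_between(val, minval, maxval, fuzz=2, inclusive=True):
    # Condensed copy of the function above, repeated so this runs alone.
    for _ in range(fuzz):
        minval = np.nextafter(minval, -np.inf)
        maxval = np.nextafter(maxval, np.inf)
    if inclusive:
        return minval <= val <= maxval
    return minval < val < maxval


a = np.linspace(0.1, 1, 10)

# A plain comparison rejects the literal 0.3 because it sits just below a[2].
plain = bool(a[2] <= 0.3 <= a[7])

# Expanding the bounds by two steps brings the literal 0.3 inside the range.
fuzzy = bool(fuzzy_between(0.3, a[2], a[7]))

# Values well outside the range are still rejected.
outside = bool(fuzzy_between(0.2, a[2], a[7]))

print(plain, fuzzy, outside)
```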
For a great discussion on comparing floating point numbers see this randomascii post, and for some interesting discussion on the fallibility of range functions see this post on Google+ by Guido van Rossum. Guido actually calls out numpy.linspace as a range function not susceptible to floating point drift (since it’s calculating intervals, not adding numbers), but it’s always possible to get surprises with floating point numbers.
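To make the drift point concrete, here is a rough sketch comparing a hand-rolled range that repeatedly adds the step against linspace over the same interval. The accumulating loop is only an illustration of the "adding numbers" approach, not code from either of the posts mentioned above:

```python
import numpy as np

# Naive range: keep adding the step, so rounding error accumulates.
step = 0.1
acc = 0.1
added = [acc]
for _ in range(9):
    acc += step
    added.append(acc)
added = np.array(added)

# linspace computes each point from the interval and pins the endpoint.
spaced = np.linspace(0.1, 1, 10)

print(format(float(added[-1]), ".17g"))
print(format(float(spaced[-1]), ".17g"))
```

The accumulated version ends short of 1.0 while linspace lands on it exactly, yet the individual interior points from linspace can still differ from the literals you type by an ULP.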