Fuzzy Floating Point Comparison

I was writing some tests today and I ran into a peculiar floating point issue. I had generated a sequence of numbers using numpy.linspace:

>>> np.linspace(0.1, 1, 10) 
array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

Part of the code I was testing ended up testing whether the value 0.3 was in the range 0.3 – 0.8, including the end points. The answer should of course be yes, but there is a twist due to the actual values in the array returned by linspace:

>>> a = np.linspace(0.1, 1, 10)
>>> 0.3 in a
False
>>> 0.3 < a[2]
True

What’s happening is that the 0.3 returned by linspace is really 0.30000000000000004, but the 0.3 when I type 0.3 is really 0.29999999999999999. It’s not clear whether this situation would ever actually arise in the normal usage of the code I was testing, but I wanted to make sure this wouldn’t cause problems. My solution was to make a function which would test whether a value was in a given range with a tiny bit of fuzziness at the edges.

NumPy has a useful function for comparing floating point values within tolerances called allclose. But that’s for comparing equality, I need fuzzy (but not very fuzzy) less than / greater than comparisons. To provide just that little bit of fuzziness I turned to the numpy.nextafter function.

nextafter gives the next representable floating point number after the first input value. The second input value controls the direction so you can get the next value either up or down. It turns out that the two numbers that are tripping me up are right next to each other in their floating point representation:

>>> np.nextafter(0.29999999999999999, 1)
0.30000000000000004
>>> np.nextafter(0.30000000000000004, 0)
0.29999999999999999

So to catch this case my range checking function only needs one ULP of fuzziness (which is not much at all) to handle this floating point error. To allow for this I wrote a function called fuzzy_between that takes a value and the lower and upper bounds of the test range and expands the test range by a couple ULP before doing a simple minval <= val <= maxval comparison:


import numpy as np

def fuzzy_between(val, minval, maxval, fuzz=2, inclusive=True):
    """
    Test whether a value is within some range with some fuzziness at the edges
    to allow for floating point noise.

    The fuzziness is implemented by expanding the range at each end `fuzz` steps
    using the numpy.nextafter function. For example, with the inputs
    minval = 1, maxval = 2, and fuzz = 2; the range would be expanded to
    minval = 0.99999999999999978 and maxval = 2.0000000000000009 before doing
    comparisons.

    Parameters
    ----------
    val : float
        Value being tested.

    minval : float
        Lower bound of range. Must be lower than `maxval`.

    maxval : float
        Upper bound of range. Must be higher than `minval`.

    fuzz : int, optional
        Number of times to expand bounds using numpy.nextafter.

    inclusive : bool, optional
        Set whether endpoints are within the range.

    Returns
    -------
    is_between : bool
        True if `val` is between `minval` and `maxval`, False otherwise.

    """
    # expand bounds
    for _ in xrange(fuzz):
        minval = np.nextafter(minval, minval - 1e6)
        maxval = np.nextafter(maxval, maxval + 1e6)

    if inclusive:
        return minval <= val <= maxval

    else:
        return minval < val < maxval

For a great discussion on comparing floating point numbers see this randomascii post, and for some interesting discussion on the fallibility of range functions see this post on Google+ by Guido van Rossum. Guido actually calls out numpy.linspace as a range function not susceptible to floating point drift (since it’s calculating intervals, not adding numbers), but it’s always possible to get surprises with floating point numbers.

Pen and Pants

Don't Leave Home Without Them

Fuzzy Floating Point Comparison

Leave a comment Cancel reply

Share this:

Related

Leave a comment Cancel reply