Tuesday, January 10, 2012

Ruby Float quirks

Working with Float values can lead to unexpected behavior. Take the following example:

def greater_than_sum?(float1, float2)
  difference = float1 - float2
  difference + float2 > float1
end

You would expect this method to return false all the time. But try calling greater_than_sum?(214.04, 74.79) and it will return true!

The problem

Open up an irb shell and type 139.25 + 74.79, then you’ll see the reason for this quirk: The result is 214.04000000000002, which is apparently slightly greater than 214.04.

Floats cannot store most decimal fractions exactly. The reason is that Float is a binary number format, so every decimal value has to be converted to binary when it is stored and back to decimal when it is displayed.

As you probably know, not all numbers can be written in decimal format with a finite number of digits. When you calculate 1/3, the result is 0.33333333333333…, an endless chain of 3s.

The same holds true for binary numbers. When a decimal number is converted to binary format, the resulting binary number might be endless. But since we don’t have endless memory on our computers to represent the number, the computer cuts off that number at some point. This produces a rounding error and that’s exactly what we have to deal with here.
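To make the cut-off visible, here is a small sketch that prints more digits than Ruby shows by default. The value 0.1 is just an illustrative pick and not part of the original example; the 1.2 line repeats what one of the commenters below observed.

```ruby
# Print more digits than Ruby shows by default to reveal the stored binary value.
puts format('%.20f', 0.1)        # => 0.10000000000000000555
puts format('%.18f', 1.2)        # => 1.199999999999999956

# The tiny errors surface as soon as results are added or compared.
puts(0.1 + 0.2)                  # => 0.30000000000000004
puts(0.1 + 0.2 == 0.3)           # => false
puts(139.25 + 74.79 > 214.04)    # => true
```

Neither 0.1 nor 1.2 has a finite binary expansion, so what is stored is the nearest value a Float can hold.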

The solution

In our previous example you can require the sum to exceed the original number by more than a small tolerance, called delta, so that tiny rounding errors no longer count as "greater":

def greater_than_sum?(float1, float2)
  delta = 0.0001
  difference = float1 - float2
  # Only report "greater" when the excess is larger than the tolerated rounding error.
  (difference + float2) - float1 > delta
end
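The same idea can be pulled out into a small standalone helper for equality checks. This is only a sketch: the name nearly_equal? and the default delta are assumptions made for illustration, not something Ruby provides.

```ruby
# Hypothetical helper: treats two Floats as equal when they differ
# by no more than delta (an absolute tolerance picked for this example).
def nearly_equal?(a, b, delta = 0.0001)
  (a - b).abs <= delta
end

sum = 139.25 + 74.79

puts(sum == 214.04)              # => false, the rounding error shows up
puts(nearly_equal?(sum, 214.04)) # => true, the error is far below delta
puts(nearly_equal?(1.0, 1.1))    # => false, a real difference still counts
```

An absolute delta like this fits values of roughly the same magnitude, such as prices; for very large or very small numbers a tolerance relative to the operands may be a better match.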

What you choose as delta is pretty much up to you. It depends on the precision you need in your comparisons and on the expected error. Unfortunately, rounding errors accumulate: the more calculations you perform on a number, the larger the error can become.

Decimal solution

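If you need an exact comparison of two decimal values, you might wanna go with BigDecimal. It stores numbers as decimal digits, so decimal values such as 74.79 are represented exactly, and adding, subtracting or multiplying them introduces no binary rounding error.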
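Here is a minimal sketch of the earlier calculation done with BigDecimal. The values are built from strings on purpose: converting an existing Float would carry its rounding error along, which is also what one of the commenters below points out.

```ruby
require 'bigdecimal'

# Build the values from strings so no Float is involved at any point.
a = BigDecimal('139.25')
b = BigDecimal('74.79')
sum = a + b

puts sum.to_s('F')                 # => 214.04
puts(sum == BigDecimal('214.04'))  # => true
puts(139.25 + 74.79 == 214.04)     # => false, the same sum with Floats
```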

So I just use BigDecimal instead of Float from now on!

Of course you can do that, but there is one big disadvantage: Calculations on BigDecimals are considerably slower. The floating point logic that Floats use is implemented directly in hardware on your computer’s processor already. This makes it blazingly fast. BigDecimals on the other hand are just a software implementation to cope with Float’s issues with no direct hardware support. More on this on Wikipedia.
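If you want to see the difference on your own machine, a rough sketch like the following times repeated additions with Ruby's Benchmark module. The method names and the iteration count are made up for this example, and the numbers will vary with your Ruby version and hardware.

```ruby
require 'benchmark'
require 'bigdecimal'

# Hypothetical workloads: add the same value n times, once per number type.
def sum_floats(n)
  total = 0.0
  n.times { total += 74.79 }
  total
end

def sum_big_decimals(n)
  step = BigDecimal('74.79')
  total = BigDecimal('0')
  n.times { total += step }
  total
end

iterations = 100_000 # arbitrary size, large enough to measure

float_seconds = Benchmark.realtime { sum_floats(iterations) }
decimal_seconds = Benchmark.realtime { sum_big_decimals(iterations) }

puts format('Float:      %.4f s', float_seconds)
puts format('BigDecimal: %.4f s', decimal_seconds)
```

The point of measuring actual arithmetic, rather than just object creation, is that the arithmetic is where hardware support makes the difference.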

So use Floats whenever possible; that is why they are the default. When you need precise decimal numbers, e.g., because you are working with currencies, go for BigDecimal.

If you want to dig a little deeper check out What Every Computer Scientist Should Know About Floating-Point Arithmetic. Enjoy ;)

Update: Jarmo Pertman compares Floating point arithmetics in 19 programming languages and proposes a general solution for Ruby using BigDecimals for Float operations. However, be aware of the performance concerns of his solution.

Update 2: Kyrylo Silin created a Russian version of this article.

24 comments:

  1. this problem affects only newer versions of ruby than 1.8.7 ...

    $ irb
    ruby-1.8.7-p174 :001 > 139.25 + 74.79
    => 214.04

    Replies
    1. It affects 1.8.7, also. This is a problem with floating point math, not a Ruby 1.x problem.


      ruby-1.8.7-p249> 214.04 - 74.79 + 74.79 > 214.04
      => true

  2. That example might not affect 1.8.7 but I assure you the issue is real. I've had to deal with this for years. Normally I do my own rounding to 2 or 3 decimal points for comparing floats with == or > <

  3. At first glance it seems like the problem doesn’t affect older ruby versions. But try the following:

    $ irb

    > float1 = 214.04
    => 214.04

    > float2 = 74.79
    => 74.79

    > difference = float1 - float2
    => 139.25

    > difference + float2 > float1
    => true

    Apparently Ruby 1.8.7 rounds imprecise numbers for output, but still uses them for calculations. The reason why they changed this behavior may be that now it is easier to find bugs like mine.

  4. This comment has been removed by a blog administrator.

  5. This comment has been removed by the author.

  6. And it's not a Ruby-specific problem. This problem exists in any language. I wrote a blog post about it some time ago: http://itreallymatters.net/post/386327451/floating-point-arithmetics-in-19-programming-languages

  7. @Jarmo: You are right, thanks for your amendment. I like the comparison, but I don’t recommend the fix you suggested for the reasons named above.
    Floats work the way they do for good reasons, you just have to be aware of their quirks.

    Replies
    1. I didn't actually mean that using BD instead of Float should be the case, but just provided one solution if anyone really needs to do it. I wouldn't use it in real life scenario. I'm sorry for my posts' obscurity.

  8. You could also use http://flt.rubyforge.org/ BCD number representation and not floats

  9. This is nothing to do with Ruby per se. Every computer language that uses an IEEE floating point number representation has the same issues.

    See http://floating-point-gui.de/, http://en.wikipedia.org/wiki/Floating_point or http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html for more background on floating point numbers.

  10. @Cafonso: Muito obrigado! Do you have any experience on how it performs in comparison to "native" Floats? I imagine that you could run into the same performance issues as with BigDecimals or even worse, since it’s only written in Ruby.

    @Sean: Right, maybe I didn’t point this out enough. Anyway, since this is a Ruby blog I guess it doesn’t make much difference to the readers. But thanks for the clarification and the links!

  11. @Cafonso what's the difference between that and "Rational" class?

  12. Looks like since today it’s possible to reply to comments directly on Blogger. Awesome!

  13. Without trying to compare accuracy, on the surface I find BigDecimal to be faster than float. Thoughts?

    https://gist.github.com/1607539

    Replies
    1. You have just tested the speed of two cars by standing them side-by-side and just starting the engine.

      If you wanted to "test" BigDecimal vs. Float, why did you not do any basic mathematical operation at all? (You just created a bunch of objects; nowhere is there a single addition, multiplication or division, which would really TEST speed.)

  14. Always fun to see people trip over a fundamental computer science issue that has been around as long as computers. not.

  15. Had a problem like that once and ONLY at the prod server, it didn't happen in any of the dev/stg machines. Yes, I used the same values for testing.

    If you're working with fixed decimal places, I suggest working with Integers internally and only converting them for display purposes. It will be faster than BigDecimal.

  16. Thanks a lot for the clarification – just went nuts because something like

    (703.0 / 2449.0).round(2) * 100

    results in 28.999999999999996 – I now know why ;)

  17. You can see the problem without doing any operation at all on a Float; just show enough digits of the fractional part:

    > "%.18f" % 1.2
    => "1.199999999999999956"

  18. Jarmo Pertman's article was an interesting collection but he was just collecting the obvious, because this precision problem is not a problem but the default behavior.

    This precision error is built into the system (therefore it traverses down all the way to the programmer), because it cannot be avoided. Ruby (and every language) has to behave like this if it wants to be IEEE 754 compliant. That is, a program written in another language (and working properly) could be ported to Ruby without rethinking and rewriting the mathematical operations in the code.

  19. Maddening. Stick with BigDecimal. Really. And don't convert your float to BigDecimal. Start with the right type.

    # Ruby 2.0
    > 8.63.to_d.to_s
    => "8.630000000000001" # surprise, surprise!
    > BigDecimal('8.63').to_s
    => "8.63" # back to sanity


 