Tuesday, February 1, 2011

Bugs or no bugs in awk, Perl, ...? Floating point arithmetic issues after all

A couple of years ago I ran into one of the biggest 'blushing red' moments of my professional career.

The starting point was some misbehaviour of awk and Perl which I reported as a bug. There weren't any bugs, though, just normal behaviour that I hadn't known about (when I should have; that's the blushing red). Here's the story.

You don't have to be a math wizard to tell that the following equation is true:
1.2/3 - 0.4 = 0
but awk/nawk and Perl get it wrong:
nawk 'END{ print 1.2/3-0.4}' </dev/null
-5.55112e-17

perl -e 'print 1.2/3-.4'
-5.55111512312578e-17
and even stranger
nawk 'END{ x=1.2/3; print x,x-0.4}' </dev/null
0.4 -5.55112e-17
i.e. x prints as .4 but when you subtract .4 the result is not zero.
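
To see what is going on here, it helps to print more digits than awk shows by default. The following is a minimal sketch in C, assuming IEEE 754 doubles (which is what practically every current machine uses); the function names residual and show_digits are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: what is left over after a division that
   "should" come out exactly at the expected value. */
double residual(double numerator, double divisor, double expected) {
    return numerator / divisor - expected;
}

/* Print a value at 6 significant digits (awk's default output format)
   and at 17 digits, which is enough to tell two different doubles apart. */
void show_digits(const char *label, double value) {
    printf("%-8s %%.6g: %.6g   %%.17g: %.17g\n", label, value, value);
}
```

Calling show_digits("1.2/3", 1.2 / 3) and show_digits("0.4", 0.4) should show 0.4 for both at 6 digits, but 0.39999999999999997 and 0.40000000000000002 at 17 digits. The two values are neighbours, not the same number, and residual(1.2, 3, 0.4) is the small negative gap between them.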

I found more examples like 1.4/7-0.2. At first I thought it boiled down to division by odd prime numbers and how they are stored internally (strange thoughts one can have), and I also thought it must be a bug in some common C library.

The impact can be severe: if you set x=1.2/3-0.4 and your code then depends on whether x is less than, equal to, or greater than zero, you might, metaphorically speaking, miss the moon.
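
A common way to protect such a decision is to compare with a tolerance instead of testing for exact equality. The sketch below is one possible shape for that, not a universal recipe: the names nearly_equal and sign_with_tolerance are hypothetical, and suitable tolerances depend on the magnitude of your data.

```c
#include <stdbool.h>

/* Absolute value without pulling in <math.h>. */
static double magnitude(double v) {
    return v < 0 ? -v : v;
}

/* Hypothetical helper: true when a and b differ by no more than
   rel_tol times the larger magnitude, or by no more than abs_tol. */
bool nearly_equal(double a, double b, double rel_tol, double abs_tol) {
    double diff = magnitude(a - b);
    double scale = magnitude(a) > magnitude(b) ? magnitude(a) : magnitude(b);
    double limit = rel_tol * scale;
    if (limit < abs_tol) limit = abs_tol;
    return diff <= limit;
}

/* Three-way sign test that treats anything within abs_tol of zero as zero.
   Returns -1, 0 or 1. */
int sign_with_tolerance(double x, double abs_tol) {
    if (magnitude(x) <= abs_tol) return 0;
    return x < 0 ? -1 : 1;
}
```

With this, sign_with_tolerance(1.2 / 3 - 0.4, 1e-12) reports zero, while a plain comparison against 0 would report a negative number.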

Eventually I created a bug report here, but some experts and further reading about floating point arithmetic educated me on why the described behaviour is correct.

The bugs described above aren't bugs but a feature of how floating point works on today's computers, an area which I have neglected and never had any serious issue with in my working life.

The issue manifests in numerous ways, not just in division (my starting point). Look at these examples where a simple multiplication goes wrong, and pay attention to the 600.93 example:
awk
awk 'BEGIN{printf "%d %d %d\n", 500.93*100, 600.93*100, 700.93*100 }'
50093 60092 70093
Perl
perl -e 'printf("%d %d %d\n", 500.93*100, 600.93*100, 700.93*100)'
50093 60092 70093
C
#include <stdio.h>

int main(void) {
  int x = 600.93*100;   /* converting to int drops the fractional part */
  printf("%d\n",x);
  return 0;
}

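results in 60092.

Why? In short, because floating point variables are stored in binary format, in which many decimal fractions (0.93, for instance) cannot be stored exactly. The stored value of 600.93 is slightly too small, so 600.93*100 comes out just below 60093, and converting it to an integer with %d cuts off the fraction instead of rounding. Any subsequent operation can lose more information, and thus wrong decimal results appear. Unfortunately, when and how this information loss happens is hard to foresee.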
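
For the integer conversion specifically, rounding before converting avoids the off-by-one. The following is a small sketch of that idea; round_to_long and truncate_to_long are names invented here, and the standard lround() from <math.h> does the same job more carefully.

```c
/* Hypothetical helper: round to the nearest whole number before
   converting, instead of letting the conversion cut off the fraction.
   Simple sketch: it ignores overflow and the rare edge cases that
   lround() from <math.h> handles. */
long round_to_long(double value) {
    return (long)(value < 0 ? value - 0.5 : value + 0.5);
}

/* What a plain conversion (or %d in awk and Perl) does:
   drop the fractional part. */
long truncate_to_long(double value) {
    return (long)value;
}
```

Assuming IEEE 754 doubles, truncate_to_long(600.93 * 100) gives 60092 as in the examples above, while round_to_long(600.93 * 100) gives 60093.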

Is there a way around this? That depends on the programming language. Perl offers modules like Math::BigFloat which seem to resolve this, but I don't know of a general way.

The issue I'm facing is how to write safe arithmetic in awk and Perl (and who is going to look after existing scripts and rewrite them?).
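
For values with a fixed number of decimals, such as amounts of money, one option is to avoid floating point altogether and keep everything as whole numbers of hundredths. This is only a sketch of that idea and not a general answer: parse_hundredths is a hypothetical function, it accepts only non-negative input with at most two decimals, and it does not check for overflow.

```c
/* Hypothetical helper: parse a non-negative decimal string with up to
   two fractional digits into a whole number of hundredths, so that
   "600.93" becomes 60093 without ever passing through a double.
   Returns -1 for malformed input. Overflow is not checked. */
long parse_hundredths(const char *s) {
    long value = 0;
    int frac_digits = 0;
    int seen_point = 0;

    if (*s == '\0') return -1;
    for (; *s != '\0'; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
            continue;
        }
        if (*s < '0' || *s > '9') return -1;          /* not a digit */
        if (seen_point && ++frac_digits > 2) return -1; /* too many decimals */
        value = value * 10 + (*s - '0');
    }
    /* Pad missing decimals: "600.9" -> 60090, "600" -> 60000. */
    while (frac_digits++ < 2) value *= 10;
    return value;
}
```

Sums, differences and comparisons on the returned values are then exact integer operations, and the decimal point only has to be put back when printing.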

Look at the awk example again replacing printf by print:
awk 'BEGIN{print 500.93*100, 600.93*100, 700.93*100; }'
50093 60093 70093
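so this is correct now. The difference is that print formats non-integer values with awk's default output format (OFMT, "%.6g"), which rounds to six significant digits, whereas %d cuts off the fraction. My starting point, though, was an example where print produced a wrong result, so the issue can manifest itself anywhere; it seems there is no guaranteed way out of it.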
