Tuesday, August 09, 2011

Beware when comparing floating point numbers

I have been writing a lot of test code verifying some of the short segments of code in my assembly book.  I ran into a bizarre state of affairs with floating point comparisons.  I had read the Intel instruction manual too quickly and ASSUMED that the floating point ucomiss instruction, being fairly new, would be designed for easy programming.  Imagine my shock when I compared two values with ucomiss, used jle to jump on less than or equal, and the jump did not behave properly.

I wrote a C program with all 5 arithmetic comparisons using floats to study more carefully what gcc does to cope with ucomiss.  Here's what gcc did for a less than comparison on 2 registers:

   ucomiss    %xmm1, %xmm0
   seta       %al
   testb      %al, %al
   je        .L3
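
A test file along the following lines is enough to repeat the experiment. It is only a sketch: the function names are made up for illustration and are not taken from the original program. Compiling it with gcc -S writes the generated assembly to a .s file; the exact instructions will vary with the gcc version and options.

```c
/* Hypothetical test functions, one per comparison operator.
   Compile with "gcc -S" and read the resulting .s file to see
   what the compiler emits around ucomiss for each one. */
int float_lt(float x, float y) { if (x <  y) return 1; return 0; }
int float_le(float x, float y) { if (x <= y) return 1; return 0; }
int float_gt(float x, float y) { if (x >  y) return 1; return 0; }
int float_ge(float x, float y) { if (x >= y) return 1; return 0; }
int float_eq(float x, float y) { if (x == y) return 1; return 0; }
int float_ne(float x, float y) { if (x != y) return 1; return 0; }
```

Each comparison sits in an if statement so that the compiler has a branch to generate, as in the listing above.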


That's some of the most unreadable code I could ever imagine for such a simple goal.  So I wrote some assembly code to determine for myself which flags are set when operands satisfy less than, greater than and equal comparisons:

        segment .data
a       dd      1.5
b       dd      2.5
c       dd      7.5
        segment .text
        global  main
main:
        push    rbp
        mov     rbp, rsp
        movss   xmm0, [b]       ; xmm0 = 2.5
.lt     ucomiss xmm0, [c]       ; 2.5 vs 7.5: less than
.gt     ucomiss xmm0, [a]       ; 2.5 vs 1.5: greater than
.eq     ucomiss xmm0, [b]       ; 2.5 vs 2.5: equal
done    leave
        ret

Armed with this code, I printed eflags in the debugger after executing each of the ucomiss instructions.  (IF is the interrupt-enable flag and has nothing to do with the comparison.)  Here is what I got:

For less than:     CF IF
For greater than:  IF
For equal:         ZF
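
The same behavior can be written down as a small C model. This is a sketch, not the processor's definition: the struct and function names are invented for illustration. The unordered case (either operand is NaN) is not exercised by the test program above; it is included here from the documented behavior of ucomiss, which sets ZF, PF and CF all to 1 in that case.

```c
#include <stdbool.h>

/* The three flags in which ucomiss reports its result (hypothetical struct). */
struct cmp_flags { bool zf, pf, cf; };

/* Model of "ucomiss x, y" in Intel operand order: x is compared with y. */
struct cmp_flags ucomiss_model(float x, float y)
{
    struct cmp_flags f = { false, false, false };  /* x > y: all three clear */

    if (x != x || y != y) {          /* a NaN is the only value unequal to itself */
        f.zf = f.pf = f.cf = true;   /* unordered */
    } else if (x < y) {
        f.cf = true;                 /* less than */
    } else if (x == y) {
        f.zf = true;                 /* equal */
    }
    return f;
}
```

Calling ucomiss_model(2.5f, 7.5f) sets only cf, ucomiss_model(2.5f, 1.5f) sets nothing, and ucomiss_model(2.5f, 2.5f) sets only zf, matching the three debugger results.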

Thankfully, testing for equal or not equal can be done using je or jne.  After that it looked pretty ugly.  This is close to insane: it is generally good practice not to test floating point values for equality, so the one comparison that worked as I expected is the one I should rarely use.  The really useful comparisons are quite convoluted if gcc has the right strategy.  Fortunately for me I have only a short section in the book about floating point comparisons, so my error is minor though my ignorance was great.

The method used by gcc is almost pointless to try to teach to beginners.  So I searched the Jcc instructions for ones whose conditions match the flags produced by ucomiss.  I found them:

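jb     CF = 1               (below: less than)
jbe    CF = 1 or ZF = 1     (below or equal: less than or equal)
ja     CF = 0 and ZF = 0    (above: greater than)
jae    CF = 0               (above or equal: greater than or equal)
je     ZF = 1               (equal)
jne    ZF = 0               (not equal)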
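
These conditions also explain why jle failed. The signed jumps test SF and OF, and ucomiss sets both of those flags to 0, so jl can never be taken and jle is taken only when ZF is set. The sketch below writes the conditions as C predicates over a made-up flag struct; all names are hypothetical, and the three constants hold the flag states observed in the debugger above.

```c
#include <stdbool.h>

/* Only the flags these jumps look at (hypothetical struct). */
struct flag_bits { bool cf, zf, sf, of; };

/* Flag states after ucomiss as observed above; SF and OF are always 0. */
static const struct flag_bits AFTER_LESS    = { true,  false, false, false };
static const struct flag_bits AFTER_GREATER = { false, false, false, false };
static const struct flag_bits AFTER_EQUAL   = { false, true,  false, false };

/* Jumps whose conditions match what ucomiss produces. */
bool jb_taken (struct flag_bits f) { return f.cf; }
bool jbe_taken(struct flag_bits f) { return f.cf || f.zf; }
bool ja_taken (struct flag_bits f) { return !f.cf && !f.zf; }
bool jae_taken(struct flag_bits f) { return !f.cf; }
bool je_taken (struct flag_bits f) { return f.zf; }
bool jne_taken(struct flag_bits f) { return !f.zf; }

/* Signed jumps, for contrast: they depend on SF and OF. */
bool jl_taken (struct flag_bits f) { return f.sf != f.of; }
bool jle_taken(struct flag_bits f) { return f.zf || f.sf != f.of; }
```

With these definitions jbe_taken(AFTER_LESS) is true while jle_taken(AFTER_LESS) is false, which is the failure described at the top of this post.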

That made me wonder whether gcc does better with -O3.  It does: with optimization it generated appropriate code.  It is hard to imagine how a compiler ever comes to generate the non-optimized sequences.

So now there is more rationality than it first seemed, though I still wonder what made anyone think it would be cool to use different jump instructions after floating point comparisons (the ones normally used for unsigned integers) than after signed integer comparisons.



