[ILUG] a quick gcc microoptimisation question

Paul Jakma paul at clubi.ie
Fri Dec 17 11:55:29 GMT 2004


On Fri, 17 Dec 2004 P at draigBrady.com wrote:

> I don't know what the default is but you can give branch prediction 
> hints with __builtin_expect() in gcc >= 3, and playing with that 
> and looking at the resulting assembly should show the default 
> behaviour.

Interesting..

$ cat test.c
int main (int argc, char **argv)
{
   int ret = 1;
   if (__builtin_expect((argc > 1),LIKELYNESS))
     ret = atoi(argv[2]);
   else
     ret = atoi(argv[1]);

   return ret;
}

$ gcc -DLIKELYNESS=1 -S -O2 -mcpu=x86-64 -o test.likely ./test.c
$ gcc -DLIKELYNESS=0 -S -O2 -mcpu=x86-64 -o test.unlikely ./test.c
$ diff -u test.likely test.unlikely
--- test.likely 2004-12-17 11:35:19.885634219 +0000
+++ test.unlikely       2004-12-17 11:35:27.613726954 +0000
@@ -8,15 +8,15 @@
         subq    $8, %rsp
  .LCFI0:
         decl    %edi
-       jle     .L2
-       movq    16(%rsi), %rdi
+       jg      .L6
+       movq    8(%rsi), %rdi
  .L4:
         xorl    %eax, %eax
         call    atoi
         addq    $8, %rsp
         ret
-.L2:
-       movq    8(%rsi), %rdi
+.L6:
+       movq    16(%rsi), %rdi
         jmp     .L4
  .LFE2:
         .size   main, .-main

So the likely branch is placed immediately after the conditional jump, 
as the fall-through path, and the unlikely branch is moved to after 
the common code of the function. The same holds true for 
-mcpu=pentiumpro and -mcpu=athlon. Essentially, this makes the best 
use of instruction prefetching and the I-cache. Sort of obvious once 
you've seen the asm ;). Interestingly, this may be at odds with 
Brian's advice on style.
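
For what it's worth, the layout effect above is usually wrapped up in a 
pair of macros rather than written out longhand. A minimal sketch, 
assuming gcc >= 3; the names likely(), unlikely() and 
parse_or_default() are made up for illustration and are not from the 
test above:

```c
#include <stdlib.h>

/* Hypothetical convenience macros (names are an assumption).
 * The !! normalises any non-zero value to 1, so the condition
 * compares cleanly against the expected value 1 or 0. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: convert a decimal string to an int, returning
 * `fallback` when the string is missing or empty. */
int parse_or_default(const char *s, int fallback)
{
    /* Hinted as the rare case, so the compiler is free to move this
     * branch out of the straight-line path. */
    if (unlikely(s == NULL || *s == '\0'))
        return fallback;

    /* Hinted as the common case: expected to be the fall-through. */
    return atoi(s);
}
```

The hint only influences code layout, not the result, so 
parse_or_default("42", 7) gives 42 either way. Whether the branches 
really get reordered depends on the compiler version and -O level, so 
it's worth checking the -S output as above.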

Though, the gcc manual notes that using profiling information is 
preferred to annotating with __builtin_expect as:

      In general, you should prefer to use
      actual profile feedback for this (`-fprofile-arcs'), as
      programmers are notoriously bad at predicting how their programs
      actually perform.

;)

The annoying thing about such profiling information, though, is that 
it is utterly specific to the compiled binary. Recompile with slightly 
different options or code and that profiling information becomes 
worthless, which means you have to redo the profiling each time. :(

I.e., I wish there were a way for this profiling information to be 
collected in a more high-level format, such that it could remain 
relevant across different compiles.

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
Fortune:
May a Misguided Platypus lay its Eggs in your Jockey Shorts.


