On double and or


A small remark on programming conditionals. This is true for C and C++, but it might be true also for Java and other languages. I haven’t checked all of them.

It is about conditionals.

I’m seeing lots and lots of double OR and double AND in conditionals. In fact, I find it rare to see a single single operator these days. (Sorry, I couldn’t resist the pun). Almost without exception, every multiple conditional I see has the form:

if ((x>0) && (x<MAX_VAL))
{
   do something....
}

Which is kind of silly, really. The double operator && is the right thing to use when you have a complex conditional with function calls. But when all you do is compare arithmetical conditions? What one should do is use the simple single-operator

if ((x>0) & (x<MAX_VAL))
{
   do something....
}

When you use a double-operator, what the compiler actually generates is this:

if (x>0)
   if (x<MAX_VAL)
   {
      do something....
   }

Which is double-branching. While this has it’s uses, it is usually not what you want the computer to do. Now, I admit that using the double logical operator is safer. But it can come with a cost. Use it when you need double branching.

Modern day computers have “pipelines” in the CPU, which enable several instructions to run almost in parallel. All of the instructions you send to the CPU are translated to a multiple-stage process. For example, the check

x<0

means that the variable ‘x’ is copied into a register on the CPU, the value in the register is then processed by an arithmetic unit to check if it has a non-zero bit, and the result is stored in another register and then sent over to a specific location in memory. While the arithmetic unit is busy checking the register’s bits, the CPU can manage other instructions. So, modern day machines always load a whole bunch of instructions together. By the time instruction number 1200 is complete, instructions 1201-1220 are already half-way through. Instruction 1221 is now loaded into the pipeline, and it will start processing while 1201 is being completed. If you want a complete explanation of pipe-lining, check out:

http://en.wikibooks.org/wiki/Microprocessor_Design/Print_Version

Now, suppose that instruction 1210 is a branch (e.g. an if statement). The compiler had to guess which instructions will follow, but it might have guessed wrong. If it guessed wrong, then the entire contents of the pipe-line following instruction 1210 needs to be discarded and replaced. So, what seems to be a single instruction (branch) may take x10 time because it wasted an entire CPU clock and forced a rewrite of the entire pipe. This is why branches are evil. Minimize them.

So, you need to think of a branching statement as if it’s about x10 longer than any other statement in your program. This means that double-branches (or triple branches) will be even worse. When you do:

if ((x>0) & (x<MAX_VAL))

You have 1 branch and 3 arithmetic operations. Those arithmetic operations are very fast, very efficient, and there’s nothing you should do about them. When you do this:

if ((x>0) && (x<MAX_VAL))

You have double branch, which might cost you 20 operations or more, depending on the hardware.

When to use double branches

This doesn’t mean that you should never use double branch. Use them when you want to save on something costly, like a function call, or when you need to check that a pointer is non empty. Then the branch won’t cost you more than what you have to pay anyway. So, these are perfectly reasonable uses of the double- AND:

if ( (p != null) && (*p >0 ) )
if ( (x > 0) && ( y=some_function(x) ) )

As a general rule, if you’re not sure, using the double logical operator is safer. The compiler transforms the conditions to boolean and might issue a warning if you made an error.

Why should you care?

But what’s the cost of doing redundant branches? 10-20 CPU operations per branch? Not much, is it? After all, modern CPU clock is GHz. Negligible, isn’t it?

Of course, if you’re writing heavy-duty, or time-critical applications, this might be important. If you’re writing stock market bots, 10 microseconds might be the difference between profit and loss, which is critical. Same if you’re analyzing Big Data. 10 microseconds multiplied by 10 Giga will be make a big impact.

But there’s another issue:

Know what you’re doing

I think it’s mighty important that you know why you’re doing things this way and not that way. Understanding the details will make you a better programmer. Always doing double-&& or double-|| without understanding why is a bad habit. Don’t.

compiler optimization

As commentator Michael has noted, modern compilers can optimize past this issue.

I compiled it with “gcc -O2 -S”, and lo and behold at the assembly code:
f:
.LFB11:
.cfi_startproc
subl    $100, %edi
cmpl    $99, %edi
jbe     .L4
rep
ret
.p2align 4,,10
.p2align 3
.L4:
jmp     g
.cfi_endproc

which is a single branch.

Indeed. Still, I don’t like trusting the optimizer on issues which are neither cumbersome nor complicated. Loop unrolling is not something you want to do by hand. This is.

Advertisements

14 thoughts on “On double and or

  1. ליבוביץ

    I think many programmers knows && and || are short-circuit operations, and it is usually required, since for instance if you have

    if (cheap_test() && expensive_test()) {}

    Your micro-optimization is not as efficient as not doing expensive_test is.

    Using & in general is a pitfall, especially in C. Functions that returns boolean true in C, can return arbitrary values greater than 0, and you might end up by mistake with:

    if (four_is_true() & two_is_true()) oy_vey();

    This is not about knowing the internal of the compiler, it’s about having a single standard with a large codebase. If you’re always using && in conditionals, it’s easier to review code, and not to worry about what exactly this if statement is using. Everyone is using boolean logic in conditionals.

    Want to optimize? No problem

    int cond = first() *& sec;
    if (cond != 0) …

    Here, you didn’t break the rule, and got the exact same results.

    Reply
  2. mousomer Post author

    I should have emphasized more the last paragraphs – the double && is useful when you do want a double branch – like with function calls. You’re right, also, that as a general rule, it’s usualy better than dumb use of single &.

    Reply
  3. Michael

    1. IMHO, the example you intended to give is “if ((x>0) && (x0) || (x=100) && (x<=200))
    g();
    }

    compiled it with "gcc -O2 -S", and lo and behold at the assembly code:
    f:
    .LFB11:
    .cfi_startproc
    subl $100, %edi
    cmpl $99, %edi
    jbe .L4
    rep
    ret
    .p2align 4,,10
    .p2align 3
    .L4:
    jmp g
    .cfi_endproc

    clearly, there's only one conditional jump here…

    If you're writing code which should be readable (that is, most code you'll write), strive for readability. Only optimize functions and jumps when you really have too… (based on measurements, the least)

    Reply
    1. mousomer Post author

      I’m an idiot it took me so long to check. But – what you see here is a special case following the fact this specific condition was lame to start with.

      Try gcc -S -O3
      on:

      int main(int argc, char* argv[])
      {
      int a=atoi(argv[1]);
      int b=atoi(argv[2]);
      int c=atoi(argv[3]);
      int d=atoi(argv[4]);
      int e=atoi(argv[5]);

      if ((a<b) &&(c<d))
      printf("true 1/n");

      if ((a<c) && (d<e))
      printf("true 2/n");

      if ((a<d) || (c<b))
      printf("true 3/n");

      return a;
      }

      And count the occurrences of cmpl. It's double branch. Double branch all the way down.

      Reply
  4. Yoed

    Most code written has no measurable effect on performance and the logical operators have a much more intuitive precedence than the bitwise operators (fewer subtle bugs).

    Not to mention the implicit conversion to bool type when you use the logical operator that you’re not getting when you use the bitwise (fewer subtle bugs again).

    Modern CPUs have very good branch predictors which renders the performance argument moot.

    This advice sounds clever but I’m afraid it fails to consider many things.

    Reply
    1. mousomer Post author

      Well, I agree that, as a general rule, if you don’t want to bother with thinking, it is better to err on the safe side and use the double && always.
      Still, it seems to me somewhat of a slack.

      Reply
      1. Yoed

        Honestly, how long would it take you to figure out the bug here:
        if( foo() & bar() ) { /* do crtitical thing */ }

        foo returns 2 while bar returns sometimes 3 and very rarely 4.

        Reply
        1. Yoed

          I’m sure YOU are responsible enough to always write it like this:
          if( (0 != foo()) & (0 != bar()) ) { /* do critical thing */ }

          But what about that colleague of yours that’s not to bright and hates verbosity as much as you hate sub-optimal code?

          Reply
          1. mousomer Post author

            I hope I made it clearer this time.
            When you have a complex conditional with function calls, do use the double operator, by all means.
            if( (0 != foo()) && (0 != bar()) )
            is far better – not only is it safer, here you do wand double branching.

            But for SIMPLE ARTHRITIC? Really?

  5. Yoed

    Even with simple arithmetic, the benefits of more intuitive precedence and implicit conversion to a Boolean expression greatly outweigh the potential performance gain.

    Using & when you mean && ,just because it may be quicker, is poor style.

    Proving that it’s equivalent is a burden that’s much more appropriate for a smart optimizing compiler than a human being (which is inherently error prone).

    Reply
    1. mousomer Post author

      I’ll try to summarize the points so far:

      1. Readability: C/C++ can be used to generate very clean code, but they are not usually very readable. Try and read a boost lib, and talk to me then. I don’t see the big difference in readability between
      (g==5) & (f>6)
      versus
      (g==5) && (f>6).
      I stay unconvinced.

      2. There are simple things which should be done to prevent errors in C conditionals. S.a.
      Always put the literal before the variable – “if (g==5)” could kill you if you accidentally erase one of the equality signs. if “(5==g)” is safer.

      3. If you always do the same, sure && is better than single-&. Better to err on the safe side.

      4. && is absolutely the right thing to do when you have a function call or a nullptr check.

      5. Compilers should Micro-optimize whenever possible.

      But you go further – you claim, implicitly, that there is no place, ever, to using single-& on a conditional. And I find it rather harsh. I mean, OK, most of us are using just a convenient subset of the C++ language anyway. And you might as well choose a subset not containing single logical operators. But I really don’t see why it’s such a horrible burden thinking about the conditional when you write it. There are so many caveats to fall into, one should anyway spend a few seconds per code line.

      And – I have bumped into optimization issues. I can’t always compile with -O3. Sometimes I do need to debug / run valgrind, and saving a few branches on a 30MB array isn’t negligible.

      Reply
      1. mousomer Post author

        I’ll add one more example. Compare:
        if ( (false == init1(…) ) && (false == init2(…) ) )
        … // fix something

        if ( (false == init1(…) ) & (false == init2(…) ) )
        … // fix something

        Those 2 conditionals will do very different things. The second conditional will run both init procedures, while the 1st conditional will, occasionally, scrap the second init.

        I don’t know about you, guys, but to me the 1st one seems non-intuitive. All shortcuts are error-prone, and the double && is, originally, a dangerous optimization in itself. I have nothing against it, but it’s certainly not obvious or trivial to use it. And it should NOT be used as a trivial replacement of the single operator &;.

        A good programmer must know both.

        Reply
  6. Yoed

    I said nothing about readability, I was very careful about that. it’s something that can’t be measured accurately and is therefore not worth discussing.

    Also, I completely agree that every programmer has to be very aware of the semantics of short-circuit evaluation and their absence when using bitwise operators.
    I go further and claim that good programmers who critique other people’s code must also be aware of the implicit conversion to bool and the difference in precedence that logical operators have vs. bitwise operators.

    The part where we disagree, is where I say that preferring & over && and/or | over || as a *performance optimization* when the semantics call for a logical operator and not a bitwise operator is terrible style, dangerous and often a redundant over-thought when using an optimizing compiler.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s