Onno Garms wrote:
Dann Corbit wrote:
You did it the right way.
Well, I'm right to use a workaround, because things like a<<-1 are undefined.
I'd like to know if it's possible to write a branchless workaround, i.e. well-defined code that leads to the same result as my workaround.
What about the assembler instructions for shift and rot? Are they also undefined?
Even if they are undefined, I might use one rotl, one rotr and one mask operation to implement x<<=(a-b) without a branch. But this looks complicated and might be even slower than the solution with a branch.
Yes, today's branch predictors are quite sophisticated.
http://wbforum.vpittlik.org/viewtopic.php?t=6108
On x64, 64-bit (32-bit) shifts and rotates implicitly "and" the count with 63 (31).
Thus shr rax,-1 acts like shr rax,63, a count of -32 is the same as +32, and so on. Besides rotate plus mask, there are some other options for doing generalized shifts without a branch.
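The implicit masking of the count can be spelled out in portable C, where a negative or too-large shift count is undefined. This is only a minimal sketch, assuming a 64-bit unsigned operand; the helper name `shr_masked` is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: treats the count the way x64 `shr` with a
   64-bit operand does, i.e. only the low 6 bits of the count are used. */
static uint64_t shr_masked(uint64_t x, int count)
{
    /* The conversion to unsigned is well defined for negative counts;
       after "& 63u" the count is always in the range 0..63. */
    return x >> ((unsigned)count & 63u);
}
```

With this, a count of -1 behaves like 63 and a count of -32 behaves like +32, matching the hardware behaviour described above.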
One is to perform both shifts "speculatively" and to use cmov on the condition. A compiler may even be able to produce that assembly from a conditional assignment.
Code:
x = (a-b > 0) ? x >> (a-b) : x << -(a-b);
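As written, the unselected arm of the conditional would have a negative shift count if it were actually evaluated. A possible way to make both shifts safe to compute up front is to mask both counts first and then select. The sketch below assumes a 64-bit unsigned x and |a-b| < 64; `shift_select` is a hypothetical name, and whether a compiler really turns the final selection into cmov is not guaranteed.

```c
#include <stdint.h>

/* Hypothetical helper: shifts x right by (a - b) if that is positive,
   otherwise left by (b - a). Assumes -64 < a - b < 64. */
static uint64_t shift_select(uint64_t x, int a, int b)
{
    int d = a - b;
    unsigned ud = (unsigned)d;

    /* Both counts are forced into 0..63, so both shifts are well
       defined even though only one result is meaningful. */
    uint64_t right = x >> (ud & 63u);
    uint64_t left  = x << ((0u - ud) & 63u);

    /* Plain conditional assignment; a candidate for cmov. */
    return (d > 0) ? right : left;
}
```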
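Alternatively, one may use a {0,-1} mask built from the sign of the difference by shifting it arithmetically right (right-shifting a negative signed value is implementation-defined in C, so this is not strictly portable either).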
Code:
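int diff = b-a;
int mask = diff >> 31; // 31 == 8*sizeof(int) - 1; mask is -1 if diff < 0, else 0
x >>= -diff & mask;    // shifts right by a-b if diff < 0, else by 0
x <<= diff & ~mask;    // shifts left by b-a if diff >= 0, else by 0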
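The same idea can be written without relying on an arithmetic right shift of a negative int, by deriving the mask from a comparison and doing the count arithmetic in unsigned. This is a sketch under the assumptions that x is a 64-bit unsigned value and |b-a| < 64; the name `shift_by_diff` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: shifts x left by (b - a) if that is non-negative,
   otherwise right by (a - b). Assumes -64 < b - a < 64. */
static uint64_t shift_by_diff(uint64_t x, int a, int b)
{
    int diff = b - a;
    unsigned u = (unsigned)diff;

    /* (diff < 0) is 0 or 1, so mask is 0 or all ones. */
    unsigned mask = 0u - (unsigned)(diff < 0);

    x >>= (0u - u) & mask;   /* count is a-b if diff < 0, else 0  */
    x <<= u & ~mask;         /* count is b-a if diff >= 0, else 0 */
    return x;
}
```

Exactly one of the two shifts has a nonzero count, so the other is a no-op, which is what makes the sequence branchless.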
But the additional instructions and register pressure will likely only pay off if the branch is difficult to predict.
Or what about this?
Code:
x >>= a;
x <<= b;
Oops, that won't work, of course: the right shift has already discarded the low bits before the left shift happens.
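A small check makes the problem with the two-step version concrete. The helper name `shift_twice` is hypothetical, and the sketch assumes a 64-bit unsigned x with both counts in 0..63.

```c
#include <stdint.h>

/* Hypothetical helper reproducing the two-step attempt. */
static uint64_t shift_twice(uint64_t x, int a, int b)
{
    x >>= a;   /* drops the low a bits for good */
    x <<= b;   /* zeros come back in, not the lost bits */
    return x;
}
```

For a == b the net shift should be zero, yet shift_twice(0xFF, 4, 4) yields 0xF0 rather than 0xFF.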