Page 1 of 1

Negative shift values

PostPosted: 06 Sep 2007, 23:40
by Onno Garms
Hello,

is it possible to handle negative shift values without a branch? For example, a<<-1 should equal a>>1

Currently I use
Code: Select all
if (a>b)
  x>>=(a-b);
else
  x<<=(b-a);


Can I get rid of the "if"?

Re: Negative shift values

PostPosted: 07 Sep 2007, 02:32
by Dann Corbit
You did it the right way.

6.5.7 Bitwise shift operators
Syntax
1 shift-expression:
additive-expression
shift-expression << additive-expression
shift-expression >> additive-expression
Constraints
2 Each of the operands shall have integer type.
Semantics
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

Re: Negative shift values

PostPosted: 07 Sep 2007, 08:14
by Onno Garms
Dann Corbit wrote:You did it the right way.


Well, I'm right to use a workaround, because things like a<<-1 are undefined.

I'd like to know if it's possible to write a branchless workaround, i.e. well-defined code that leads to the same result like my workaround.

What about the assembler instructions for shift and rot? Are they also undefined?

Even if they are undefined, I might use one rotl, one rotr and one mask operation to implement x<<=(a-b) without a branch. But this looks compicated and might be even slower than the solution with a branch.

Re: Negative shift values

PostPosted: 07 Sep 2007, 10:07
by Gerd Isenberg
Onno Garms wrote:
Dann Corbit wrote:You did it the right way.


Well, I'm right to use a workaround, because things like a<<-1 are undefined.

I'd like to know if it's possible to write a branchless workaround, i.e. well-defined code that leads to the same result like my workaround.

What about the assembler instructions for shift and rot? Are they also undefined?

Even if they are undefined, I might use one rotl, one rotr and one mask operation to implement x<<=(a-b) without a branch. But this looks compicated and might be even slower than the solution with a branch.


Yes, todays branch predictors are quite sophisticated.

http://wbforum.vpittlik.org/viewtopic.php?t=6108

x64 64(32)-bit shift (and rotate) do an implicite "and" 63(31).
Thus shr rax,-1 is like shr rax,63, -32 == +32 etc. Beside rotate plus mask there are some other options to do generalized shifts branchless.

To "speculative" perform both shifts and to use cmove on the condition. May be even compiler where able to produce that assembly by conditional assignment.
Code: Select all
x = (a-b > 0) ? x >> (a-b) : x << -(a-b);

One may use a {0,-1}-mask on the sign of the difference, by shifting it arithmetical right (also not strictly defined in C).
Code: Select all
int diff = b-a;
int mask = diff >> 31; // 31 == 8*sizeof(int) - 1
x >>= -diff &  mask;
x <<=  diff & ~mask;

But likely the additional instructions and register pressure will only pay off, if the branch is difficult to predict.

Or what about?
Code: Select all
x >>= a;
x <<= b;
oups, wont work or course ;-)

Re: Negative shift values

PostPosted: 07 Sep 2007, 10:40
by H.G.Muller
Note, however, that in cases where you know in advance that the shift is not going to clip off any 1 bits, the rotates are a valid alternative.

Re: Negative shift values

PostPosted: 09 Sep 2007, 11:04
by rjgibert
Using a 32 bit word and limiting oneself to 16 bit values (x = 0 to 65535), you can do a shift right by s = -16 to 16 in both directions instead of 0 to 32 in one direction with:

((x << 16) >> (s + 16)) & 65535

I have assumed here that the (x << 16) and (s + 16) parts can execute in parallel.

You may find the limitations I've imposed unsuitable for your purposes. However this can be generalized to larger values.

The above was informally tested in Python where it appeared to work. As for C, I don't know.

Re: Negative shift values

PostPosted: 09 Sep 2007, 12:29
by Onno Garms
rjgibert wrote:Using a 32 bit word and limiting oneself to 16 bit values (x = 0 to 65535),


The latter limitation will not be acceptable in most cases.