The Nuances of Bit Shifting
This is Part 2 in a three-part series about binary representations and bit shifting. Follow these links for Part 1 and Part 2.
While the content in the previous two posts are good general knowledge, I want to cover one more somewhat advanced topic here: bit shifting.
The book covers bit shifting in some depth, but with the stuff we’ve learned recently, we have a lot more information and can discuss bit shifting in more depth.
C# has three bit shifting operators.
All three of these operators take the bits in a number and scoots them all to the left or right.
The left shift operator, <<
shifts the bits to the left:
int number = 0b00001010;
int shiftedNumber = number << 2; // Results in the bit pattern 00101000
Remember that starting a number literal with 0b
means the rest of it will be written out in binary.
(That’s not quite the same as specifying the exact bits to use, but it is close enough for this.)
Writing it out that way makes it easy to see what the bits will look like, and therefore, also to predict what the final result will look like after being shifted.
The above code shifts everything to the left (the <<
operator looks like little arrows to tell you which direction to move the bits) by two spots.
The bits on the left side will just drop off the end and disappear, while zeroes will be placed on the right side in the newly open spots.
The right shift operator does the same thing, just to the right:
int number 0b00001010;
int shiftedNumber = number >> 2; // Results in the bit pattern 00000010
The 10
on the far right end are just gone, while zeroes are filled in on the left side.
However, right shifting with >>
does not always result in values being filled in with zeroes.
Here’s how it works:
- If you right shift an unsigned type–
ulong
,uint
,ushort
, orbyte
–then the left end will always be filled in with zeroes. - If you right shift a signed type–
long
,int
,short
, orsbyte
–and the number is positive, then the left end will be filled in with zeroes. - If you right shift a signed type and the number is negative, then the left end will be filled in with ones.
Remember from the last post that all negative numbers in signed integer types have a 1
in the leftmost bit.
In short, with a signed type and the >>
operator, the left edge will be filled in with more of whatever was in the leftmost bit.
Why does it work that way?
It comes down to a trick: shifting is an efficient way to multiply or divide integers by powers of two.
The bit pattern for three is 00000011
.
If you shift this left by one spot, it results in doubling the number: 00000110
is six.
If you shift left by two spots, it results in quadrupling the number (multiplying by 4): 00001100
is twelve.
The same is true for right shift, only it divides by multiples of two. If you shift the bit pattern 00000110
(6) to the right one spot, you get 00000011
, which is 3.
Shift that to the right one spot and you get the bit pattern 00000001
, which is 1.
3/2 is, mathematically, 1.5, but with integer division, since we shed the fractional part and keep the whole number, so 3/2 should be 1 in integer division.
The thing is, sometimes it is convenient for this to also work for negative numbers.
With eight bits and the number -8, which is the bit pattern 11111000
, it would be convenient to be able to use the same “right sifting once is division by two” trick.
For that to happen, we’d need to end up with the bit pattern 11111100
, which is the bit pattern for -4.
So that’s why right shifting fills in the top bits with whatever was already there–to preserve this trick of being able to multiply and divide by two, regardless of the sign of the number, by shifting.
(It is also why unsigned types work differently, because for them, even if the topmost/leftmost bit is a 1
, it is still a positive number.)
Because of all of this, the >>
operator is referred to as an arithmetic right shift.
It isn’t just about moving bits, but achieving some arithmetic benefit.
Whether that is good or bad depends on what you were hoping to achieve.
If you wanted to treat the bits as just a pile of bits, then it is annoying that it doesn’t just fill the vacated by the shift with zeroes.
But if you’re hoping to use >>
to divide by two, then you need it to fill it with whatever was already in the topmost/leftmost bit to achieve that result.
However, if you’re using C# 11 or newer, then you’re in luck!
In addition to the arithmetic right shift operator, >>
, there’s a bitwise right shift operator, >>>
, which will treat it as a pile of bits and fill in with zeroes in all circumstances.
This is the tool to use if you are using a signed type like int
or long
but still are thinking of it as bits, not numbers.
Consider this:
int positive = 0b0010; // 2
int shifted1 = positive >> 1; // 1
int shifted2 = positive >>> 1; // Also 1.
For unsigned types and for positive numbers, there was never a difference. But for negative values in signed types:
int negative = int.MinValue; // 0b10000000_00000000_00000000_00000000
int shifted1 = negative >> 1; // 0b11000000_00000000_00000000_00000000
int shifted2 = negative >>> 2; // 0b01000000_00000000_00000000_00000000
Use >>
when you want an arithmetic shift, such as an optimization for dividing by two.
Use >>>
when it is just a blob of bits masquerading as an int
, and you just want to shift them all down and fill in the new gap with zeroes in all cases.
As a sidenote, you honestly don’t usually want to write out x >> 1
to mean x / 2
, or x >> 2
to mean x / 4
, even though it works and even though bit shifting is faster than division.
The downside is that your code no longer obviously reflects intent.
Sure, some programmers know the trick that shifting accomplishes multiplying and dividing by powers of two (2, 4, 8, etc.).
But most don’t.
Writing x >> 1
instead of x / 2
has a severe impact on readability.
This is especially true if the one reading it doesn’t know the trick.
But even without it, programmers deal with division far more often than we do with bit shifting, so x / 2
is just more natural.
The only time it might make sense to write out the x >> 1
to divide by two is when the performance is so critical that it must be done, and that is rare.
Besides, the JIT compiler will likely already generate a right shift instruction in these cases anyway.
It is almost certainly doing the optimization so that you don’t need to.
(I checked on Windows-x64 and saw the JIT compiler turn it into a right shift instruction. I can’t make a promise that all flavors of .NET on all operating systems on all hardware instruction sets will always do the same thing, but it is probably a reasonable assumption.)