Counting in Other Bases
This is Part 1 in a three-part series about binary representations and bit shifting. Follow these links for Part 2 and Part 3.
We’re going to talk about how to count today. You may have learned from Sesame Street that you count like this: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
That is not always the right way to count!
Most of us are quite familiar with this number system. It is referred to as a base-10 number system because there are 10 different symbols that can be used to represent a digit. It is also referred to as a decimal number system, from the Latin decem, meaning “ten”.
But mathematicians have long pondered number systems with a different number of symbols. Indeed, over the course of human history, different cultures have counted in different bases. The Babylonians used a base-60 number system, while the Maya used a base-20 number system. And in computing, base-2 (binary), base-8 (octal), and base-16 (hexadecimal) are all quite common.
Given the significance in computing, it is worth taking some time to analyze how these alternative systems work a little more carefully.
While you can do arithmetic in these other systems, our main focus today will be on counting in them. To do that, we need to forget everything we learned about “normal” base-10 counting and start fresh. (You’ve been doing it so long that it has become second nature, and you may never have put much thought into why we count the way we do.)
The first thing we must do is identify the symbols we’ll use. The number of symbols will tell us what base our system is using. While we could use any symbol, such as triangles, circles, and smiley face emojis, it is more conventional to stick with the Arabic digits that we know and love (0, 1, 2, 3, …) until we run out of them, and then use letters, starting with A, until we have the full set that we want.
For example, in a base-2 system, we’ll use the symbols 0 and 1. In a base-4 system, we’ll use the symbols 0, 1, 2, and 3. In a base-8 system, we’ll use the symbols 0, 1, 2, 3, 4, 5, 6, and 7 (but not 8 or 9). In a base-10 system (our long-time friend), we’ll use the symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. In a base-16 system, we’ll use the symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, followed by A, B, C, D, E, and F. In a base-20 system, we’ll use 0 through 9, then A through J.
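The convention above (digits first, then letters) can be sketched in a few lines of Python. The names `ALPHABET` and `symbols_for` are made up for this illustration, and the sketch assumes we stop at base-36, where the digits and uppercase letters run out.

```python
import string

# Digits first, then uppercase letters: the convention described above.
ALPHABET = string.digits + string.ascii_uppercase  # 36 symbols in total

def symbols_for(base):
    """Return the symbols used to write numbers in the given base."""
    if not 2 <= base <= len(ALPHABET):
        raise ValueError("this sketch only supports bases 2 through 36")
    return ALPHABET[:base]

for base in (2, 4, 8, 10, 16, 20):
    print(base, symbols_for(base))
```

Anything beyond base-36 would need extra symbols that this sketch does not try to invent.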
Of course, we could have a base-240 system or a base-147345 system; we’d just need to come up with enough symbols for it.
Second, the symbols must be ordered from smallest to largest. We’ll just use the “natural” ordering shown above, with the digits and letters in the order we have come to expect. There’s nothing inherently wrong with using a different order. If you were crazy enough, you could define a base-4 system that uses the symbols 2, 0, *, and 9, in that order. But the order of the symbols 0 through 9 is so ingrained in our minds that such a system would be extremely difficult to use.
For the sake of our own sanity, we’ll just stick with the natural order we’ve grown used to.
Third, we can begin counting.
We start at the symbol for zero, which is the first symbol we’ve identified.
In all of the schemes above except the crazy one, that will be just a plain 0.
(To be fair, most people don’t start counting at 0, but we’re going to here.)
To continue counting, we run through all of the symbols in order until we run out. For a small base like base-2 (binary), we’re going to run out fast. For a big base like base-16 (hexadecimal), it will be a while.
So to begin counting in base-10 (decimal), we start with 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, at which point, we’re out of unique symbols.
To begin counting in base-2 (binary), we start with 0 and 1, at which point, we’re out of unique symbols.
To begin counting in base-16 (hexadecimal), we start with 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F, at which point, we’re out of symbols.
(Pop quiz: what do the first few numbers in base-8 look like?)
At this point, we’re out of symbols to use on their own, and we need to start combining multiple symbols. What do we do in base-10 (decimal)? We move up to the “tens place” and reset the “ones place”. After 9 comes 10. There’s a new digit to represent a larger bundle while the lower digit goes back to zero.
We can apply the same trick in other bases, though we wouldn’t call it the “tens place” anymore.
If we’re counting in base-2, after 0 and 1, we’d move up to the next spot (which we could call the “twos spot” if we wanted): 10. It looks like a ten, and it is. But it is a 10 in base-2, rather than a 10 in base-10. It is the third number in the counting sequence (starting at 0), so it is numerically equivalent to the number 2 in base-10. It just looks different.
While knowing how to convert between numbers in different bases is a useful skill, you can always find a converter online. So we won’t worry too much about conversions today, just counting.
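If you happen to have Python handy, its built-in `int` and `format` functions can stand in for an online converter. This is just a quick illustration of the point that the same numeral 10 means something different in each base, not a full treatment of conversion.

```python
# int(text, base) reads a string of digits written in the given base.
print(int("10", 2))    # the base-2 numeral 10 is two
print(int("10", 16))   # the base-16 numeral 10 is sixteen
print(int("10", 10))   # the base-10 numeral 10 is ten

# format() goes the other way for the bases common in computing.
print(format(2, "b"))    # binary: 10
print(format(8, "o"))    # octal: 10
print(format(255, "X"))  # hexadecimal with uppercase letters: FF
```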
Anyway…
In base-10, once we reach 9, we go to 10. In base-2 (binary), once we reach 1, we go to 10. In base-16, once we reach F, we do the same thing by rolling to the next digit and resetting, and also end up at 10.
From there, we continue counting by incrementing the lowest digit until it overflows, at which point we reset it and bump up the next digit to its left.
So in binary, we count with 0, 1, 10, 11, 100, 101, 110, 111, 1000.
In base-4, we count with 0, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 100.
In base-8, we count with 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, … 17, 20, 21, 22, … 76, 77, 100.
In base-16, we count with 0, 1, 2, 3, … D, E, F, 10, 11, 12, … 1F, 20, 21, … FC, FD, FE, FF, 100.
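The “increment the lowest digit, reset and carry when it overflows” rule can be written down as a small Python sketch. The helper names `increment` and `count_in_base` are invented for this example, and it assumes the natural symbol order (digits, then letters) for bases up to 36.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase

def increment(digits, base):
    """Add one to a number stored as a list of digit values, highest place first."""
    digits = list(digits)
    position = len(digits) - 1          # start at the lowest digit
    while position >= 0:
        if digits[position] < base - 1:
            digits[position] += 1       # room left: bump this digit and stop
            return digits
        digits[position] = 0            # out of symbols: reset and carry left
        position -= 1
    return [1] + digits                 # every digit overflowed: add a new place

def count_in_base(base, how_many):
    """Return the first `how_many` numerals in `base`, starting from zero."""
    digits = [0]
    numerals = []
    for _ in range(how_many):
        numerals.append("".join(ALPHABET[d] for d in digits))
        digits = increment(digits, base)
    return numerals

print(count_in_base(2, 9))
print(count_in_base(4, 17))
```

Calling `count_in_base(8, 65)` or `count_in_base(16, 257)` counts all the way up to 100 in base-8 and base-16, respectively.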
It’s all a bit weird and takes a bit of time to wrap your mind around it, but the exercise is worth the time for programmers, who often need to work with numbers in different bases.
Let’s talk about two other things: negative numbers and numbers with a “decimal” point (or decimal comma/separator in some cultures).
In a purely mathematical world, completely separated from computers, it is completely reasonable to simply use the - symbol to represent negative numbers in any base, and to use . (or ,) to mark the beginning of the fractional part of a number.
-100 is a valid base-2 number, as is 10.1011.
-A30B.88E is a valid base-16 (hexadecimal) number.
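To see that these numerals really do name ordinary numbers, here is a rough Python sketch that evaluates them. The function `parse_number` is hypothetical (it is not a built-in), and it assumes a leading - for the sign, a . for the fractional part, and the digits-then-letters symbol order.

```python
from fractions import Fraction
import string

ALPHABET = string.digits + string.ascii_uppercase

def parse_number(text, base):
    """Evaluate a numeral with an optional leading '-' and an optional '.' point."""
    symbols = ALPHABET[:base]           # only these symbols are valid in this base
    negative = text.startswith("-")
    if negative:
        text = text[1:]
    whole, _, fraction = text.partition(".")
    value = Fraction(0)
    for symbol in whole:
        # Shift what we have one place to the left, then add the new digit.
        value = value * base + symbols.index(symbol)
    place = Fraction(1, base)           # value of the first place after the point
    for symbol in fraction:
        value += symbols.index(symbol) * place
        place /= base                   # each further place is worth base times less
    return -value if negative else value

print(parse_number("-100", 2))            # -4
print(float(parse_number("10.1011", 2)))  # 2.6875
print(float(parse_number("-A30B.88E", 16)))
```

The sketch uses `Fraction` so that the fractional places are added exactly; a symbol that does not belong to the base raises a `ValueError`.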
It is, perhaps, not fair to call the . a decimal point, given that “decimal” refers to the base-10 nature. The more generic term is radix point, but unless you’re a math geek, that term might not stick in your mind.
In future blog posts, we’ll look into how this fits in to the programming world. Click here to go on to Part 2.