This node is an introduction to common number systems used in computers. It is designed to help the uninitiated get familiar with how they work.
Binary numbers are composed of only 0's and 1's. The 1 means "on" or True, 0 means "off" or False. A single binary digit is called a bit.
The most common use of binary is with fixed length strings of bits. Common examples are 8-bit, 16-bit, and 32-bit. An 8-bit example would look like this:
00000000
In our example, we'll assume the bit on the left end <- is the "highest order bit" and the bit farthest right -> is the "lowest order bit". This refers to the magnitude each bit carries. The sequence starts on the right with 1, and doubles with each position as you traverse to the left. Here is a graphical translation of that for 8 bits:
  0   0   0   0   0   0   0   0
128  64  32  16   8   4   2   1
So the binary number
01000001
is 65. The 7th-bit is 64, the 1st-bit is 1, add them together to get 65. Another way to compute the total value if you know the position of the bits (m,n,.., 7 and 1 in this case) is 2^(m-1) + 2^(n-1) + 2^(..-1). In this case 2^(7-1) + 2^(1-1) = 2^6 + 2^0 = 64 + 1 = 65.
Now to find what the total range of a fixed length of bits is, you could set all the bits to 1 and add them up:
11111111
That will give you 255, the largest value 8 bits can hold. The actual range is 256 values, because 0 is also a possible value. Note that 255 - 128 = 127, so the bits below the highest order bit add up to one less than the highest order bit. The easier way to calculate the total range is just (2^m), where m is the number of bits, 8 in this case. 2^8 = 256.
Time for some real world examples of range. Your graphical display interface has a setting that specifies the total number of colors available. This setting is usually 8-bit, 16-bit, 24-bit, or 32-bit. So if you're on an 8-bit display you have (2^8) colors, or 256. If you're on a 24-bit display, you have (2^24) colors, or 16777216. This is the often quoted "16 million colors". Another example is the text you read or write. Plain text is dictated by the ASCII standard, which declares 128 number-to-character relations. Since they used only 128 characters, only 7 bits are needed (we'll add the eighth bit here anyway). The above example of 65 translates to a capital A. Here is a snippet from 'man ascii' showing a few of the assignments:
Oct  Dec  Hex  Char      Oct  Dec  Hex  Char
001    1   01  SOH       101   65   41  A
002    2   02  STX       102   66   42  B
003    3   03  ETX       103   67   43  C
004    4   04  EOT       104   68   44  D
...
040   32   20  SPACE     140   96   60  `
041   33   21  !         141   97   61  a
042   34   22  "         142   98   62  b
043   35   23  #         143   99   63  c
044   36   24  $         144  100   64  d
The decimal value (Dec column) is the one you're used to. Try to spell out AaBbCc in binary using the table above (really, TRY IT, it's the best way to learn. Pencil and paper are good for this, but whatever tickles your fancy).
01000001 A
01100001 a
01000010 B
01100010 b
01000011 C
01100011 c
It should be obvious that to get from A to B to C or a to b to c you just add one. What you might not have noticed was that to get from A to a, B to b, and C to c, you only had to turn one extra bit on, the 6th-bit. The 6th-bit is 2^(6-1) = 32. Looking at the decimal values for the letters, this makes sense. 97 - 65 = 32, 98 - 66 = 32, 99 - 67 = 32. According to 'man ascii', only having to change one bit for a lower to upper case conversion made it easier for manual encoders (people) to encode manually (what manual encoders do best).
A final note on text conversions. Other standards, such as those in ISO 8859, use 8 bits (256 possible values). For the values that fit in the lower order 7 bits (0-127), they use the ASCII conversions. These are called character sets, and there are many different ones to accommodate different characters used in parts of the world. The current list from 'man iso_8859-1':
ISO 8859-1 west European languages (Latin-1)
ISO 8859-2 east European languages (Latin-2)
ISO 8859-3 southeast European and miscellaneous languages (Latin-3)
ISO 8859-4 Scandinavian/Baltic languages (Latin-4)
ISO 8859-5 Latin/Cyrillic
ISO 8859-6 Latin/Arabic
ISO 8859-7 Latin/Greek
ISO 8859-8 Latin/Hebrew
ISO 8859-9 Latin-1 modification for Turkish (Latin-5)
ISO 8859-10 Lappish/Nordic/Eskimo languages (Latin-6)
ISO 8859-11 Thai
ISO 8859-13 Baltic Rim languages (Latin-7)
ISO 8859-14 Celtic (Latin-8)
ISO 8859-15 west European languages (Latin-9)
Understanding binary can be useful for understanding octal. Octal is simply base 8. Nothing more, nothing less. If you understand how to use the different bases, this will just be review. However, considering we are indoctrinated with base 10 from our first days in the schoolhouse, it is no wonder many people have trouble with octal. First, here is an example of base 10:
1425
What is that number? 1425 you say? It depends on what base you're in. Let's expand it in base 10 form:
   1    4    2    5
1000  100   10    1
You may have learned to do addition like this in elementary school. As you move from right to left, each digit gets a zero added to it:
   5
  20
 400
1000
----
1425
Neat, huh? Except you're not really adding a zero to each, you're raising ten to the power of the position of the digit, starting on the right. 5 is in the 0 position, 2 in the 1 position, 4 in 2, 1 in 3. So what you're actually doing is:
5 * 10^0 = 5 * 1 = 5
2 * 10^1 = 2 * 10 = 20
4 * 10^2 = 4 * 100 = 400
1 * 10^3 = 1 * 1000 = 1000
Make sense? If not, re-read it until you understand it. Now in comes octal, or base 8. The "base" literally means the base of the exponential operation, the number raised to the position of the digit. So taking the sequence 1425, we get:
5 * 8^0 = 5 * 1 = 5
2 * 8^1 = 2 * 8 = 16
4 * 8^2 = 4 * 64 = 256
1 * 8^3 = 1 * 512 = 512
Which yields 789. So now you understand that 1425 is not the number 1425, but rather a sequence of digits before being operated on. As one would expect, since 8 < 10, the resulting 789 < 1425. Note that in base 10 the available digits are 0-9. In octal, only 0-7 are used.
There are several interesting things in the relationship between binary and octal. To demonstrate this, we'll be using the programming construct &, which (in C at least) combines two binary strings bit by bit. It turns on the bits that are on in both operands (the things before and after the &), and turns off the bits that are off in both or on in only one. For example (note that this is pseudo code: you can't actually use plain binary strings in C):
00100011 & 00100010 = 00100010
In C, octal values are distinguished from decimal values by preceding the number with a 0, e.g. 0654. Say you want to keep just the three lowest order bits of a binary string and turn all the others off. You would need to know what the three lowest order bits add up to.
000000111
Using the chart above, we can see that the on bits are 1, 2, 4, which add up to 1 + 2 + 4 = 7. So (01010101 could be any binary string; it is padded to nine bits here so it splits evenly into groups of three):
001010101 & 7 = 000000101
Great, that was easy. Now let's keep just the next three bits, 4, 5, 6. We can see that 2^(4-1) + 2^(5-1) + 2^(6-1) = 2^3 + 2^4 + 2^5 = 8 + 16 + 32 = 56. So
001010101 & 56 = 000010000
And for the next three it would be 448. After that it gets really messy, more than anyone would want to deal with. Now octal enters the picture again. Instead of using 7, 56, 448, etc., we can just use octal 7 in different positions.
001010101 & 07 = 000000101
001010101 & 070 = 000010000
001010101 & 0700 = 001000000
And so on. If we examine this, we can clearly see why this is:
7 * 8^0 = 7 * 1 = 7
7 * 8^1 = 7 * 8 = 56
7 * 8^2 = 7 * 64 = 448
This is a very common use of octal in the computer world. The most prevalent example in the Unix world is file permissions. There are separate permissions for owner, group, and other. These permissions are read, write, and execute.
Owner  Group  Other
 000    000    000
 RWE    RWE    RWE
Thus, to give all permissions to the owner, we would use 0700. For owner and other it would be 0707, and to give everyone all permissions it would be 0777. So now you know better than to 'chmod 777 world', right?
Hexadecimal is base 16. It is used in the same fashion as octal, just for larger bit strings: each hex digit covers four bits, where each octal digit covers three. A good exercise for the reader would be to experiment with hex and see what sorts of bit patterns can be produced. To use hex in C, precede the hex number with 0x, e.g. 0x330. Expanding the sequence 1425 in base 16 gives:
5 * 16^0 = 5 * 1 = 5
2 * 16^1 = 2 * 16 = 32
4 * 16^2 = 4 * 256 = 1024
1 * 16^3 = 1 * 4096 = 4096
Which yields 5157. As with octal, the available digits are different than with base 10. In this case, 0-15 are used. However, it is impossible to distinguish 10 from a 1 followed by a 0, so the numbers 10-15 are represented by the letters A-F.
Decimal Hexadecimal
10 A
11 B
12 C
13 D
14 E
15 F
0x3F8 therefore expands to:
8 * 16^0 = 8 * 1 = 8
F * 16^1 = 15 * 16 = 240
3 * 16^2 = 3 * 256 = 768
Which yields 1016.
Example
Here is a simple C program that takes a sequence of digits, written as a base 10 integer, interprets it in another base (2 through 10), and prints the resulting value in base 10. Compile with 'gcc -o base base.c'.
#include <stdio.h>

int main (void)
{
    /* Base to work in (2 through 10) */
    int base = 8;
    /* Digit sequence to operate on */
    int number = 1425;
    /* place holds (base ^ position), starting at base ^ 0 */
    int place = 1, final = 0;
    /* Get each digit, multiply by (base ^ position) */
    for (; number; number /= 10)
    {
        final += (number % 10) * place;
        place *= base;
    }
    printf ("%d\n", final);
    return 0;
}