Data Representation: Floating point binary
8-bit two's complement floating point binary to denary
Floating point binary allows very large and very small numbers to be stored using a system very similar to scientific notation, where a number is stored in two parts.
The first part is the mantissa, which is a number between -1 and 1. The second part is the exponent, which tells us the power of 2 by which we multiply the mantissa to form our number.
Example
For our example, let's consider an 8-bit floating point number that stores the mantissa in 4 bits and the exponent in 4 bits, both in two's complement. We will convert the binary number 10010101 to denary.
Part | Mantissa | | | | Exponent | | | |
Place value | -1 | 0.5 | 0.25 | 0.125 | -8 | 4 | 2 | 1 |
Bit | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 |
Value | -0.875 | | | | 5 | | | |
This means the number is -0.875 x 2^5 = -28
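The conversion above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation; the function names `twos_complement` and `float_to_denary` and the `mantissa_bits` parameter are made up for this example.

```python
def twos_complement(bits: str) -> int:
    """Interpret a string of 0s and 1s as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        # The leading bit carries a negative weight, so subtract 2^n.
        value -= 1 << len(bits)
    return value


def float_to_denary(bits: str, mantissa_bits: int = 4) -> float:
    """Convert a floating point bit string to denary.

    Assumes the first `mantissa_bits` bits are the mantissa and the
    remaining bits are the exponent, both in two's complement.
    """
    mantissa_int = twos_complement(bits[:mantissa_bits])
    exponent = twos_complement(bits[mantissa_bits:])
    # The binary point sits just after the mantissa's first bit, so
    # scale the integer down to get a value between -1 and 1.
    mantissa = mantissa_int / (1 << (mantissa_bits - 1))
    return mantissa * 2 ** exponent


print(float_to_denary("10010101"))  # -0.875 x 2^5 = -28.0
```

Calling `float_to_denary("10010101")` splits the string into the mantissa 1001 (-0.875) and the exponent 0101 (5), matching the table above.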
If we change the number of bits allocated to the mantissa and the exponent, we affect the range and accuracy with which we can represent numbers. The more bits we use for the exponent, the greater the range of numbers we can represent. The more bits we use for the mantissa, the more accurately we can represent fractional numbers.
With 4 bits each for the mantissa and the exponent, the smallest (most negative) number we can make is -1 x 2^7 = -128 and the biggest number is 0.875 x 2^7 = 112.
If we use 5 bits for the mantissa and 3 bits for the exponent, the smallest number we can make is -1 x 2^3 = -8 and the biggest number is 0.9375 x 2^3 = 7.5.
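To see how the split between mantissa and exponent bits changes the range, here is a small sketch. The function name `representable_range` is hypothetical. It assumes the most negative mantissa is always -1 and the largest mantissa is one step below 1, so a 5-bit mantissa tops out at 0.9375.

```python
def representable_range(mantissa_bits: int, exponent_bits: int) -> tuple:
    """Return (most negative, most positive) representable values."""
    # Largest two's complement exponent, e.g. 0111 = 7 for 4 bits.
    max_exponent = 2 ** (exponent_bits - 1) - 1
    # Most negative mantissa is 1.000... = -1.
    min_mantissa = -1.0
    # Largest mantissa is 0.111..., one step below 1.
    max_mantissa = 1 - 2 ** -(mantissa_bits - 1)
    return (min_mantissa * 2 ** max_exponent,
            max_mantissa * 2 ** max_exponent)


print(representable_range(4, 4))  # (-128.0, 112.0)
print(representable_range(5, 3))  # (-8.0, 7.5)
```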
It is also possible to calculate the denary number by writing the mantissa out in two's complement fixed point binary and shifting the bits according to the exponent: to the left if the exponent is positive and to the right if the exponent is negative.
Let's see that with our example of 10010101.
The exponent is plus 5 so we need to move the number 5 places to the left.
Place value | -32 | 16 | 8 | 4 | 2 | (-)1 | .5 | .25 | .125 |
Old_number | | | | | | 1 | 0 | 0 | 1 |
New_number | 1 | 0 | 0 | 1 | | | | | |
Reading off the shifted number you can see we still get -28.
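The shifting method can also be sketched in Python by treating the mantissa as an integer and moving its bits. This is only an illustration under the same 4-bit mantissa, 4-bit exponent layout; the name `shift_method` and the helper `twos` are invented for this example.

```python
def shift_method(bits: str, mantissa_bits: int = 4) -> float:
    """Decode by shifting the mantissa bits instead of multiplying."""

    def twos(b: str) -> int:
        # Two's complement value of a bit string.
        v = int(b, 2)
        return v - (1 << len(b)) if b[0] == "1" else v

    mantissa_int = twos(bits[:mantissa_bits])
    exponent = twos(bits[mantissa_bits:])
    # The mantissa starts with this many places after the binary point.
    fraction_bits = mantissa_bits - 1
    # Shifting left by `exponent` places leaves this many places after
    # the point (a negative exponent shifts right, leaving more).
    places_after_point = fraction_bits - exponent
    if places_after_point <= 0:
        # All bits are now left of the point, padded with zeros.
        return float(mantissa_int << -places_after_point)
    return mantissa_int / (1 << places_after_point)


print(shift_method("10010101"))  # -28.0
```

For 10010101 the mantissa 1001 moves 5 places left, which pushes it 2 places past the binary point, so the result is the integer 100100 read as two's complement: -32 + 4 = -28.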
Denary to 8-bit two's complement floating point binary
The process of converting from denary to 8-bit two's complement floating point binary involves writing out the number in standard two's complement fixed point binary, using as many bits as necessary. Then shift the number so that the most significant bit sits in the -1 column. For each place you move the number to the right, increment the exponent by 1; for each place you move it to the left, decrement the exponent by 1.
Example
To convert the number -5 to floating point binary with 4 bits for the mantissa and 4 bits for the exponent, both in two's complement format, we first write out the number in fixed point two's complement form. Then we shift the number so that the most significant bit sits in the -1 column.
Place value | -8 | 4 | 2 | (-)1 | .5 | .25 | .125 |
Old_number | 1 | 0 | 1 | 1 | | | |
New_number | | | | 1 | 0 | 1 | 1 |
This number was shifted right three places, so the exponent is 3. In 4-bit two's complement that is:
Part | Exponent | | | |
Place value | -8 | 4 | 2 | 1 |
Bit | 0 | 0 | 1 | 1 |
Value | 3 | | | |
So the final number becomes:
Part | Mantissa | | | | Exponent | | | |
Place value | -1 | 0.5 | 0.25 | 0.125 | -8 | 4 | 2 | 1 |
Bit | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Value | -0.625 | | | | 3 | | | |
By multiplying -0.625 x 2^3 we get -5.
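The encoding steps can be sketched in Python as follows. This is a rough illustration, not a definitive implementation. The names `to_twos_complement` and `denary_to_float` are hypothetical, and the second loop, which shifts small numbers left until the mantissa is as large as it can be, is an assumption that generalises the -5 example. Numbers that cannot be stored exactly in the available bits raise an error rather than being rounded.

```python
def to_twos_complement(value: int, width: int) -> str:
    """Return `value` as a two's complement bit string of `width` bits."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError(f"{value} does not fit in {width} bits")
    # Masking maps negative values onto their two's complement pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")


def denary_to_float(value: float, mantissa_bits: int = 4,
                    exponent_bits: int = 4) -> str:
    """Encode `value` as mantissa bits followed by exponent bits."""
    if value == 0:
        return "0" * (mantissa_bits + exponent_bits)
    mantissa, exponent = float(value), 0
    # Shift right (halve) until the mantissa fits in -1 <= m < 1,
    # adding 1 to the exponent for each place moved.
    while not -1 <= mantissa < 1:
        mantissa /= 2
        exponent += 1
    # Assumption: shift left (double) while there is room to do so,
    # subtracting 1 from the exponent for each place moved.
    while -0.5 <= mantissa < 0.5:
        mantissa *= 2
        exponent -= 1
    # Turn the fractional mantissa into an integer bit pattern.
    scaled = mantissa * (1 << (mantissa_bits - 1))
    if scaled != int(scaled):
        raise ValueError(f"{value} needs more than {mantissa_bits} mantissa bits")
    return (to_twos_complement(int(scaled), mantissa_bits)
            + to_twos_complement(exponent, exponent_bits))


print(denary_to_float(-5))   # 10110011
print(denary_to_float(-28))  # 10010101
```

For -5 the first loop halves the number three times (-5, -2.5, -1.25, -0.625), giving a mantissa of -0.625 (1011) and an exponent of 3 (0011), which matches the worked example.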