# Real Numbers: Floating point numbers

In medieval times, people who were left-handed were considered by some to be evil: a negative view. So if you forget which way to move the decimal point, remember that negative exponents move it left, whilst positive exponents move it right.

If you study other subjects such as Physics or Chemistry, you may come across floating point numbers like this:

$6.63 \times 10^{-34}$ (Planck's constant)

The first part defines the non-zero digits of the number and is called the **Mantissa**. The second part defines how many positions we want to move the decimal point; this is known as the **Exponent**. It is positive when moving the decimal point to the right and negative when moving it to the left.

If you wanted to write out that number in full, you would have to move the decimal point in the mantissa 34 places to the left, resulting in:

0.000000000000000000000000000000000663

Writing this out would take a lot of time, and it is very hard for the human eye to see how many zeros there are. Therefore, when we can accept a certain level of accuracy (6.63 = 3 significant figures), we can store a number with many digits, like Planck's constant, in a small number of digits. You are always weighing up the scope (or range) of the number against its accuracy (number of significant digits).

The same is true of binary numbers, where it is even more important. When you are dealing with numbers and their computational representation, you must always be aware of how much space the numbers will take up in memory. As we saw with the above example, the non-floating-point representation of a number can take up an unfeasible number of digits. Imagine how many digits you would need to store it in binary‽

A binary floating point number may consist of 2, 3 or 4 bytes; however, the only ones you need to worry about are the 2 byte (16 bit) variety. The first 10 bits are the mantissa and the last 6 bits are the exponent.

Just like the denary floating point representation, a binary floating point number will have a mantissa and an exponent, though as you are dealing with binary (base 2) you must remember that instead of having $10^{x}$ you will have to use $2^{x}$.

### Why use binary floating point numbers

Fixed point binary allows a computer to hold fractions but due to its nature is very limited in its scope. Even using 4 bytes to hold each number, with 8 bits for the fractional part after the point, the largest number that can be held is just over 8 million. Another format is needed for holding very large numbers.
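The "just over 8 million" figure can be checked with a short calculation. This is only a sketch: it assumes the 4 bytes are a signed two's complement value, which the text does not state explicitly, and the constant names are made up for illustration.

```python
TOTAL_BITS = 32      # 4 bytes
FRACTION_BITS = 8    # bits after the binary point

# Assumption: signed two's complement, so the largest raw integer is 2**31 - 1.
largest_raw = 2 ** (TOTAL_BITS - 1) - 1

# Scaling by 2**8 puts the binary point 8 places from the right.
largest_value = largest_raw / 2 ** FRACTION_BITS
print(largest_value)  # 8388607.99609375
```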

In decimal, very large numbers can be shown with a mantissa and an exponent, e.g. 0.12 × 10². Here 0.12 is the mantissa and the 2 in 10² is the exponent. The mantissa holds the main digits and the exponent defines where the decimal point should be placed.

The same technique can be used for binary numbers. For example two bytes could be split so that 10 bits are used for the mantissa and the remaining 6 for the exponent. This allows a much greater scope of numbers to be used.
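As a rough illustration of that split, the sketch below pulls the two fields out of a 16 bit pattern with a shift and a mask. The function name `split_fields` is hypothetical, and it assumes the mantissa occupies the first (most significant) 10 bits.

```python
def split_fields(word: int) -> tuple[int, int]:
    """Split a 16 bit pattern into (mantissa, exponent) bit fields."""
    if not 0 <= word < 1 << 16:
        raise ValueError("expected a 16 bit value")
    mantissa = word >> 6         # keep the top 10 bits
    exponent = word & 0b111111   # keep the bottom 6 bits
    return mantissa, exponent


mantissa, exponent = split_fields(0b0100111110_000110)
print(f"{mantissa:010b} {exponent:06b}")  # 0100111110 000110
```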

### Converting binary floating point to decimal

There are several stages to take when working out a floating point number in binary. In fact it is much like a disco dance routine, known on this page as the Noorgat Dance, Kemp variation (you won't be tested on the name, but it should help you to remember):

1. *Sign* - find the sign of the mantissa (make a note of this)
2. *Slide* - find the value of the exponent and whether it is positive or negative
3. *Bounce* - move the binary point the distance the exponent asks, left for a negative exponent, right for a positive
4. *Flip* - if the mantissa is negative, perform two's complement on it
5. *Swim* - starting at the binary point, work out the values of the mantissa, going left, then right. Now make sure you refer back to the sign you recorded on the sign move.
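If you want to check an answer by machine, the sketch below gets the same result arithmetically rather than by following the dance step by step: it reads both fields as two's complement and multiplies. The function names are illustrative only, and the binary point is assumed to sit after the first mantissa bit.

```python
def twos_complement(bits: str) -> int:
    """Interpret a bit string as a signed two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":              # leading 1 means the value is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa: str, exponent: str) -> float:
    """Denary value of a mantissa/exponent pair given as bit strings."""
    # Dividing by 2**(n-1) places the binary point after the first bit.
    m = twos_complement(mantissa) / 2 ** (len(mantissa) - 1)
    e = twos_complement(exponent)
    return m * 2.0 ** e


print(decode("0001101000", "000110"))  # 13.0
```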

Example: binary floating point worked example

Let's try it out. We are given the following 16 bit floating point number, with 10 bits for the mantissa and 6 bits for the exponent. Remember the binary point is between the first and second most significant bits:

0.100000000 000001

1. The first action we need to perform is the *Sign*: look at the first bit of the mantissa. It is 0, so the mantissa is positive.
2. The second step in the Noorgat dance is the *Slide*: the exponent is 000001 = +1. So we know that the exponent is of size positive one and we will have to move the binary point one place to the right.
3. The third step in the Noorgat dance is the *Bounce*: 0.100000000 -> 01.00000000
4. The fourth step is the optional *Flip*: the mantissa is positive, so we skip it.
5. The fifth and final step is the *Swim*: 01.00000000 = 1

Voila! The answer is 1.

Exercise: Simple binary floating point

Work out the denary for the following, using 10 bits for the mantissa and 6 bits for the exponent:

**0.001101000 000110**

Answer:

1. Sign: the mantissa starts with a zero, therefore it is a positive number
2. Slide: work out the exponent value: 000110 = +6
3. Bounce: we need to move the binary point in the mantissa. In this case the exponent was +6, so we move it 6 places to the right: 0.001101000 -> 0001101.000
4. Flip: as the number isn't negative we don't need to do this
5. Swim: work out the value on the left hand side and right hand side of the binary point: 1+4+8 = +13

FINISHED!

**0.101000000 111111**

Answer:

1. Sign: the mantissa starts with a zero, therefore it is a positive number
2. Slide: work out the exponent value: 111111. It starts with a one, therefore it is a negative number. Flipping it gives 000001, so the exponent is -1
3. Bounce: we need to move the binary point in the mantissa. In this case the exponent was -1, so we move it 1 place to the left: 0.101000000 -> 0.0101000000
4. Flip: as the mantissa isn't negative we don't need to do this
5. Swim: work out the value on the left hand side and right hand side of the binary point: 1/4 + 1/16 = +0.3125

FINISHED!

**1.011111010 000101**

Answer:

1. Sign: the mantissa starts with a one, therefore it is a negative number
2. Slide: work out the exponent value: 000101 = +5
3. Bounce: we need to move the binary point in the mantissa. In this case the exponent was +5, so we move it 5 places to the right: 1.011111010 -> 101111.1010
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 101111.1010 -> 010000.0110
5. Swim: work out the value on the left hand side and right hand side of the binary point: 16 + 1/4 + 1/8 = 16.375. Remember the number was negative, so the answer is -16.375

FINISHED!

**1.101000000 111101**

Answer:

1. Sign: the mantissa starts with a one, therefore it is a negative number
2. Slide: work out the exponent value: 111101. It starts with a one, therefore it is a negative number. Flipping it gives 000011, so the exponent is -3
3. Bounce: we need to move the binary point in the mantissa. In this case the exponent was -3, so we move it 3 places to the left: 1.101000000 -> 1.111101000000. Note that we placed extra ones on the front of the number. Consider the exponent being negative and the mantissa positive: we would add extra zeros on the front, e.g. 0.01 * 2^-3 = 0.00001. If both are negative, placing zeros in front of the mantissa would make it positive! Therefore we need to add extra ones to keep the mantissa negative. With the flip we'll lose these 'extra' ones.
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 1.111101000000 -> 0.000011000000
5. Swim: work out the value on the left hand side and right hand side of the binary point: 1/32 + 1/64 = 0.046875. Remember the number was negative, so the answer is -0.046875

FINISHED!

**1.111111010 000011**

Answer:

1. Sign: the mantissa starts with a one, therefore it is a negative number
2. Slide: work out the exponent value: 000011 = +3
3. Bounce: we need to move the binary point in the mantissa. In this case the exponent was +3, so we move it 3 places to the right: 1.111111010 -> 1111.111010
4. Flip: the mantissa is negative as noted in step one, so we need to convert this number: 1111.111010 -> 0000.000110
5. Swim: work out the value on the left hand side and right hand side of the binary point: 1/16 + 1/32 = 0.09375. Remember the number was negative, so the answer is -0.09375

FINISHED!
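The note about placing extra ones on the front of a negative mantissa is a form of sign extension. The sketch below shows one way the Bounce step could be written on bit strings. The function `bounce` is hypothetical, and it assumes the mantissa is given with its binary point after the first bit.

```python
def bounce(mantissa: str, places: int) -> str:
    """Move the binary point of a mantissa such as '1.101000000'.

    Positive places move the point right, negative places move it left.
    Moving left past the first bit pads with copies of the sign bit.
    """
    sign_bit = mantissa[0]
    digits = mantissa.replace(".", "")
    point = 1 + places                    # index where the point now sits
    if point < 1:
        # Moved left past the sign bit: pad with ones if negative, zeros if positive.
        digits = sign_bit * (1 - point) + digits
        point = 1
    elif point > len(digits):
        # Moved right past the last bit: pad with zeros on the right.
        digits = digits + "0" * (point - len(digits))
    return digits[:point] + "." + digits[point:]


print(bounce("1.101000000", -3))  # 1.111101000000
print(bounce("0.101000000", -1))  # 0.0101000000
```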

### Converting denary into binary floating point

You might also be asked to convert a denary number into its binary floating point equivalent.

- work out the binary equivalent
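- work out how far to move the binary point to normalise the number (y places)
- set the exponent to the move that would put the binary point back where it started: +y if you moved it left, -y if you moved it right
- pad the mantissa and exponent with extra bits to fill the required widths (or drop bits that do not fit)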
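The steps above can be sketched in code. This version is an illustration rather than the method you would use by hand: it relies on Python's `math.frexp` to find the normalised form instead of moving the binary point manually, and it truncates any bits that do not fit. The names `to_bits` and `encode` are made up for this example.

```python
import math


def to_bits(value: int, width: int) -> str:
    """Two's complement bit string of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def encode(x: float, mantissa_bits: int = 10, exponent_bits: int = 6) -> tuple[str, str]:
    """Return (mantissa, exponent) bit strings for x, truncating extra bits."""
    if x == 0:
        return "0" * mantissa_bits, "0" * exponent_bits
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= |m| < 1
    if m == -0.5:
        # -0.5 would be 1.1 in two's complement; the normalised form is 1.0 (-1).
        m, e = -1.0, e - 1
    # Scale so the mantissa becomes an integer; floor drops bits that do not fit.
    scaled = math.floor(m * 2 ** (mantissa_bits - 1))
    lowest = -(1 << (exponent_bits - 1))
    highest = (1 << (exponent_bits - 1)) - 1
    if not lowest <= e <= highest:
        raise OverflowError("exponent does not fit in the exponent field")
    return to_bits(scaled, mantissa_bits), to_bits(e, exponent_bits)


print(encode(39.75))  # ('0100111110', '000110')
```

The mantissa is returned without its binary point, so `'0100111110'` stands for 0.100111110.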

Example: denary to binary floating point

If we are asked to convert the denary number 39.75 into binary floating point we first need to find out the binary equivalent:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | . | 1 | 1 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

00.100111110 (6 places to the left)

So to get our binary point back to where it started, we need to move 6 places to the right. 6 now becomes your exponent.

0.100111110 | 000110

If you want to check your answer, convert the number above into decimal. You get 39.75!

Exercise: Simple binary floating point

Work out the binary floating point for the following, using 10 bits for the mantissa and 6 bits for the exponent:

**67**

Answer:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | . | 0 | 0 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

0.1000011000 (7 places to the left)

To get the front to be normalised we must move the binary point 7 places (moving it 6 places would have made the number negative!).

0.100001100 | 000111

**23.25**

Answer:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | . | 0 | 1 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

000.10111010 (5 places to the left)

To get the front to be normalised we must move the binary point 5 places (moving it 4 places would have made the number negative!).

0.101110100 | 000101

**123.80**

Answer:

0.80 cannot be written exactly in binary (it is 0.110011001100...), so the table shows only its first three fractional bits:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | . | 1 | 1 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

0.1111011110 (7 places to the left)

To get the front to be normalised we must move the binary point 7 places.

0.1111011110 | 000111

But this is using 11 bits for the mantissa. We have to drop one, losing accuracy!

0.111101111 | 000111

The number actually stored is 123.75.

**128.25**

Answer:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | . | 0 | 1 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

0.10000000010 (8 places to the left)

To get the front to be normalised we must move the binary point 8 places (moving it 7 places would have made it negative!).

0.100000000 | 001000

Notice that we have had to drop the .25, as this would not have fitted into 10 bits for the mantissa.

**-513**

Answer:

First write out +513:

| 1024 | 512 | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | . | 0 | 0 | 0 |

Convert this into its negative form using the flipping rule:

| 1024 | 512 | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | ⅛ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | . | 0 | 0 | 0 |

How far do we need to move the binary point to the left so that the number is normalised?

1.0111111111000 (10 places to the left)

To get the front to be normalised we must move the binary point 10 places.

1.011111111 | 001010

Notice that we have had to drop the last one, as this would not have fitted into 10 bits for the mantissa. This means that the number shown is only 10111111110.0. Converting this into denary: 01000000010.0 = 514, so the number stored is -514.

You'll look at errors using floating point numbers very soon.
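To see how much accuracy each of these conversions loses, you can work out the value that is actually stored. The sketch below does this by finding the weight of the last mantissa bit and rounding down to a multiple of it, which is what dropping bits from a two's complement mantissa amounts to. The function name `stored_value` is an assumption for illustration, and `math.frexp` stands in for normalising by hand.

```python
import math


def stored_value(x: float, mantissa_bits: int = 10) -> float:
    """Value held after truncating x to a normalised mantissa of the given width."""
    if x == 0:
        return 0.0
    _, e = math.frexp(x)                      # exponent of the normalised form
    step = 2.0 ** (e - (mantissa_bits - 1))   # weight of the last mantissa bit
    return math.floor(x / step) * step        # dropping bits rounds down


for x in (67, 23.25, 123.8, 128.25, -513):
    print(x, "->", stored_value(x))
```

Values such as 67 and 23.25 come back unchanged, while 128.25 and -513 do not fit and are stored as 128 and -514.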

For a 16 bit number where the mantissa is 10 bits and the exponent is 6 bits:

the **largest positive** number will be:

Mantissa: 0.111111111 Exponent: 011111

the **smallest positive** number will be:

Mantissa: 0.000000001 Exponent: 100000

the **largest negative** number (the one furthest from zero) will be:

Mantissa: 1.000000000 Exponent: 011111

the **smallest negative** number (the one closest to zero) will be:

Mantissa: 1.111111111 Exponent: 100000
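The denary values of these four patterns can be worked out exactly with fractions. This is a sketch only: the helper `value` is hypothetical, and each mantissa is entered as its two's complement integer (for example 1.000000000 is -512 out of 512).

```python
from fractions import Fraction


def value(mantissa_int: int, exponent: int) -> Fraction:
    """Exact value of (mantissa_int / 2**9) * 2**exponent."""
    return Fraction(mantissa_int, 2 ** 9) * Fraction(2) ** exponent


largest_positive = value(511, 31)     # 0.111111111, exponent 011111 (+31)
smallest_positive = value(1, -32)     # 0.000000001, exponent 100000 (-32)
largest_negative = value(-512, 31)    # 1.000000000, exponent 011111 (+31)
smallest_negative = value(-1, -32)    # 1.111111111, exponent 100000 (-32)

print(largest_positive)    # 2143289344
print(smallest_positive)   # 1/2199023255552
print(largest_negative)    # -2147483648
print(smallest_negative)   # -1/2199023255552
```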