Floating point:
7. .625= a) 0011 1111 0010 0000 0000 0000 0000 0000 b) same as (a)
+123 4567 8 |---- truncated/rounded off here
.625 = 1*(1/2) + 0*(1/2^2) + 1*(1/2^3) = .101 in binary. Normalizing gives the
digits 101 with an ASSUMED binary point after the first digit, i.e. 1.01. To indicate
that we need to move the radix point left 1 to get the actual value, we need an exponent of -1.
ActualExp + 127 = -1 + 127 = 126, which converts to binary as
16*7 + 14 = 112 + 14 = 126 = 0x7E = 0111 1110
now add the leading digit (0) for the sign (+), giving 0011 1111 0 (9 bits)
now tack on the fractional part (101), omitting the leading one (which leaves 01),
and fill with zeros on the right to get 0011 1111 0 010 0000 ...
now round off (or truncate) the last bit: 0011 1111 0010 0000 (every bit after this is zero, so nothing is lost)
8. 25.625= a) 0100 0001 1100 1101 0000 0000 0000 0000 b) same as (a)
+123 4567 8 |--- trunc/round here
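The hand conversions in 7 and 8 can be double-checked by machine. The following is a minimal sketch in Python, assuming the questions ask for IEEE 754 single precision (32 bits, exponent bias 127); the helper names float_to_bits and fields are made up for this illustration and are not part of the assignment. It lets the standard struct module do the packing and then splits out the sign, biased exponent, and fraction.

```python
import struct

def float_to_bits(x):
    """Return the 32-bit single-precision pattern of x, grouped in nibbles."""
    # Pack as a big-endian float, then reinterpret the same 4 bytes as an unsigned int.
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(n, "032b")
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))

def fields(x):
    """Return (sign, biased exponent, fraction) of x as integers."""
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    sign = n >> 31                 # 1 bit
    biased_exp = (n >> 23) & 0xFF  # 8 bits, actual exponent + 127
    fraction = n & 0x7FFFFF        # 23 bits, leading 1 omitted
    return sign, biased_exp, fraction

print(float_to_bits(0.625))   # 0011 1111 0010 0000 0000 0000 0000 0000
print(float_to_bits(25.625))  # 0100 0001 1100 1101 0000 0000 0000 0000
print(fields(0.625))          # (0, 126, 2097152)
print(fields(25.625))         # (0, 131, 5046272)
```

For 25.625 the biased exponent 131 corresponds to an actual exponent of 4, since 25.625 = 11001.101 in binary = 1.1001101 with the radix point moved right 4.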
Rounding floating-point numbers requires four steps which are: