Module int_to_float

Conversions from integers to floats.

The algorithm is explained here: https://blog.m-ou.se/floats/. It roughly does the following:

1. Calculate a base mantissa by shifting the integer into mantissa position. This gives us a mantissa with the implicit bit set!
2. Figure out if rounding needs to occur by classifying the bits that are to be truncated. The truncated bits may be combined or compressed to simplify this check. Adjust the mantissa with the result if needed.
3. Calculate the exponent from the base-2 logarithm of i (derived from the number of leading zeros), then subtract one.
4. Shift the exponent into position and add the mantissa to create the final representation. Subtracting one from the exponent (above) accounts for the implicit bit being explicitly set in the mantissa.
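
The steps above can be sketched for the u32 to f32 case as follows. This is a minimal illustration, not the module's actual source: the name u32_to_f32_bits_sketch and the local constants are assumptions made for the example, and the rounding expression is one possible branchless way to round to nearest with ties to even.

```rust
/// Hypothetical sketch of a `u32` to `f32` bit conversion (illustrative name).
fn u32_to_f32_bits_sketch(i: u32) -> u32 {
    const BITS: u32 = 32; // width of both `u32` and `f32`
    const SIG_BITS: u32 = 23; // stored mantissa bits of `f32`
    const EXP_BITS: u32 = 8; // exponent bits of `f32`
    const EXP_BIAS: u32 = 127;

    // Zero has no leading one bit, so it is handled separately.
    if i == 0 {
        return 0;
    }

    let n = i.leading_zeros();
    let i_m = i << n; // left-aligned: the leading one sits in bit 31

    // Step 1: base mantissa, with the implicit bit landing in bit 23.
    let m_base = i_m >> EXP_BITS;

    // Step 2: the bits that fall off the end, left-aligned.
    // The top bit is the "half" bit; everything below it is sticky.
    let dropped = i_m << (SIG_BITS + 1);
    // Yields 1 if we must round up, 0 otherwise. On an exact tie the
    // subtraction clears the top bit when `m_base` is already even.
    let round_up = (dropped - ((dropped >> (BITS - 1)) & !m_base)) >> (BITS - 1);
    let m = m_base + round_up;

    // Step 3: biased exponent from the position of the leading one,
    // minus one because `m` still carries the implicit bit.
    let e = EXP_BIAS - 1 + BITS - n - 1;

    // Step 4: `+` rather than `|`, so that the implicit bit (and a
    // possible rounding carry) flows into the exponent field.
    (e << SIG_BITS) + m
}
```

Under these assumptions, f32::from_bits(u32_to_f32_bits_sketch(i)) should agree with i as f32 for every u32 input, since an as cast from an integer to a float also rounds to nearest with ties to even.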

§Terminology

i: the original integer
i_m: the integer, shifted fully left (no leading zeros)
n: number of leading zeros in i
e: the resulting exponent. Usually 1 is subtracted to offset the mantissa implicit bit.
m_base: the mantissa before adjusting for truncated bits. Implicit bit is usually set.
adj: the bits that will be truncated, possibly compressed in some way.
m: the resulting mantissa. Implicit bit is usually set.
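
To make the terms concrete, here is a hedged sketch that records each of them for a u64 to f32 conversion, a case where the integer is wider than the float and adj is compressed into a single u32. The Terms struct, the function u64_to_f32_terms, and the bits helper are hypothetical names introduced for illustration only. The compression used here is an assumption about one workable scheme, not necessarily the one the module uses.

```rust
/// Hypothetical record of the intermediate values, named after the terminology.
#[derive(Debug)]
struct Terms {
    n: u32,      // leading zeros of `i`
    i_m: u64,    // `i` shifted fully left
    m_base: u32, // mantissa before rounding, implicit bit set (bit 23)
    adj: u32,    // truncated bits, compressed into 32 bits
    m: u32,      // mantissa after rounding
    e: u32,      // biased exponent, already reduced by one
}

impl Terms {
    /// Final `f32` bit pattern: exponent shifted into place plus mantissa.
    fn bits(&self) -> u32 {
        (self.e << 23) + self.m
    }
}

/// Illustrative only; zero is excluded because it has no leading one bit.
fn u64_to_f32_terms(i: u64) -> Terms {
    assert!(i != 0, "zero must be special-cased by the caller");
    let n = i.leading_zeros();
    let i_m = i << n;

    // Keep the top 24 bits (implicit bit plus 23 stored mantissa bits).
    let m_base = (i_m >> 40) as u32;

    // 40 bits are truncated. Bits 39..8 fit into a `u32` directly; the low
    // bits are OR-ed in so that any nonzero bit still counts as "sticky".
    let adj = ((i_m >> 8) | (i_m & 0xFFFF)) as u32;

    // Round to nearest, ties to even.
    let round_up = (adj - ((adj >> 31) & !m_base)) >> 31;
    let m = m_base + round_up;

    // Bias 127, integer width 64, minus one for the implicit bit in `m`.
    let e = 127 - 1 + 64 - n - 1;

    Terms { n, i_m, m_base, adj, m, e }
}
```

For i = 5 this gives n = 61, i_m = 0xA000_0000_0000_0000, m_base = 0x00A0_0000, adj = 0, m = m_base, and e = 128, so bits() is 0x40A0_0000, the representation of 5.0f32.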

exp 🔒: Calculate the exponent from the number of leading zeros.
m_adj 🔒: Adjust a mantissa with dropped bits to perform correct rounding.
repr 🔒: Shift the exponent to its position and add the mantissa.
shift_f_gt_i 🔒: Shift distance from an integer with n leading zeros to a larger float.
shift_f_lt_i 🔒: Shift distance from a left-aligned integer to a smaller float.
signed: Perform a signed operation as unsigned, then add the sign back.
u32_to_f32_bits
u32_to_f64_bits
u32_to_f128_bits
u64_to_f32_bits
u64_to_f64_bits
u64_to_f128_bits
u128_to_f32_bits
u128_to_f64_bits
u128_to_f128_bits
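
As a rough illustration of how the helpers listed above might fit together, the sketch below converts u32 to f64 bits, then wraps that conversion to handle a signed input. When the float's significand is wider than the integer, nothing is truncated and the rounding step disappears. The names u32_to_f64_bits_sketch and i32_to_f64_sketch are hypothetical, and the shift and exponent formulas are written out inline as assumptions rather than taken from the private helpers.

```rust
/// Hypothetical sketch: `u32` to `f64` bits. Every `u32` fits in the 53-bit
/// significand of `f64`, so no bits are dropped and no rounding is needed.
fn u32_to_f64_bits_sketch(i: u32) -> u64 {
    const SIG_BITS: u32 = 52; // stored mantissa bits of `f64`
    const EXP_BIAS: u32 = 1023;

    if i == 0 {
        return 0;
    }

    let n = i.leading_zeros();

    // Shift distance that moves the leading one of `i` to bit 52,
    // the position of the implicit bit.
    let shift = SIG_BITS - u32::BITS + 1 + n;
    let m = (i as u64) << shift;

    // Biased exponent, minus one to cancel the implicit bit in `m`.
    let e = (EXP_BIAS - 1 + u32::BITS - n - 1) as u64;

    (e << SIG_BITS) + m
}

/// Hypothetical sketch: perform the conversion on the magnitude as an
/// unsigned value, then put the sign bit back.
fn i32_to_f64_sketch(i: i32) -> f64 {
    // The sign bit of an `f64` is bit 63.
    let sign_bit = ((i < 0) as u64) << 63;
    // `unsigned_abs` avoids overflow for `i32::MIN`.
    f64::from_bits(u32_to_f64_bits_sketch(i.unsigned_abs()) | sign_bit)
}
```

Taking the magnitude with unsigned_abs rather than negating keeps i32::MIN well-defined, since its absolute value does not fit in an i32 but does fit in a u32.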