Module sqrt

Source
Expand description

Generic square root algorithm.

This routine operates around m_u2, a U.2 (fixed point with two integral bits) mantissa within the range [1, 4). A table lookup provides an initial estimate, then goldschmidt iterations at various widths are used to approach the real values.

For the iterations, r is a U0 number that approaches 1/sqrt(m_u2), and s is a U2 number that approaches sqrt(m_u2). Recall that m_u2 โˆˆ [1, 4).

With Newton-Raphson iterations, this would be:

  • w = r * r w ~ 1 / m
  • u = 3 - m * w u ~ 3 - m * w = 3 - m / m = 2
  • r = r * u / 2 r ~ r

(Note that the righthand column does not show anything analytically meaningful (i.e. r ~ r), since the value of performing one iteration is in reducing the error representable by ~).

Instead of Newton-Raphson iterations, Goldschmidt iterations are used to calculate s = m * r:

  • s = m * r s ~ m / sqrt(m)
  • u = 3 - s * r u ~ 3 - (m / sqrt(m)) * (1 / sqrt(m)) = 3 - m / m = 2
  • r = r * u / 2 r ~ r
  • s = s * u / 2 s ~ s

The above is precise because it uses the original value m. There is also a faster version that performs fewer steps but does not use m:

  • u = 3 - s * r u ~ 3 - 1
  • r = r * u / 2 r ~ r
  • s = s * u / 2 s ~ s

Rounding errors accumulate faster with the second version, so it is only used for subsequent iterations within the same width integer. The first version is always used for the first iteration at a new width in order to avoid this accumulation.

Goldschmidt has the advantage over Newton-Raphson that sqrt(x) and 1/sqrt(x) are computed at the same time, i.e. there is no need to calculate 1/sqrt(x) and invert it.

Enumsยง

Exp ๐Ÿ”’
Representation of whether we shift the exponent into a u32, or modify it in place to save the shift operations.

Staticsยง

RSQRT_TAB ๐Ÿ”’
A U0.16 representation of 1/sqrt(x).

Traitsยง

SqrtHelper
Size-specific constants related to the square root routine.

Functionsยง

goldschmidt ๐Ÿ”’
Perform count goldschmidt iterations, returning (r_u0, s_u?).
sqrt
sqrt_round
wmulh ๐Ÿ”’
Multiply at the wider integer size, returning the high half.