ed0ecfff891f9b69d5dfd86fa877b90b4566634c - platform_external_arm-optimized-routines

commit	ed0ecfff891f9b69d5dfd86fa877b90b4566634c
author	Szabolcs Nagy <szabolcs.nagy@arm.com>	Wed Jun 06 18:17:16 2018 +0100
committer	Szabolcs Nagy <szabolcs.nagy@arm.com>	Mon Jun 11 16:01:55 2018 +0100
tree	4bbe6eacb0892b4e870356726480adb5110ace8d
parent	f79ee89ad4abe65c84d95c359c577f346cb2aabf

Add new pow implementation

The algorithm is exp(y * log(x)), where log(x) is computed with about
1.8*2^-66 relative error, returning the result in two doubles, and the
exp part uses the same algorithm (and lookup tables) as exp, but takes
the input as two doubles and a sign (to handle negative bases with odd
integer exponent).

There is a separate code path when fma is not available, but the worst
case error is about 0.67 ULP in both cases.  The lookup table and
constants for log are 4224 bytes, the code is 1196 bytes.  In non-nearest
rounding modes the error is less than 1 ULP.
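
One way to see why log(x) is computed well beyond double precision is a
first-order error estimate: if t = y * log(x) carries relative error e,
then exp(t) carries relative error of about |t| * e.  The sketch below
evaluates that estimate at the largest |t| that gives a finite result.
The helper names are hypothetical, and the estimate is a
back-of-the-envelope assumption, not the error analysis used for the
commit.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: first-order relative error of exp(t) when
   t = y*log(x) inherits relative error log_rel_err from log(x). */
static double amplified_rel_err(double t, double log_rel_err)
{
    return fabs(t) * log_rel_err;
}

/* Express a relative error in units of 2^-53.  Since 1 ULP of a normal
   double is at least 2^-53 relative, this is an upper bound in ULP. */
static double in_units_of_2pow_minus53(double rel_err)
{
    return ldexp(rel_err, 53);
}

/* Largest t for which exp(t) is finite in double precision. */
static double max_finite_exp_arg(void)
{
    return log(DBL_MAX);
}
```

With e = 1.8*2^-66 and t = max_finite_exp_arg() (about 709.78), the
estimate is about 0.16 in units of 2^-53, which leaves room under the
stated 0.67 ULP worst case.  With a log result rounded to double
(e = 2^-53), the same estimate is about 710 such units.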

Improvements on Cortex-A72 compared to current glibc master:
latency: 1.8x
throughput: 2.5x

math/math_config.h
math/pow.c [Added]
math/pow_log_data.c [Added]
test/mathtest.c
test/testcases/directed/pow.tst [Added]
test/testcases/random/double.tst

6 files changed
