string: Optimize strlen

Optimize strlen using a mix of scalar and SIMD code.  On modern
microarchitectures, strlen on large strings is 55% faster than the
current version and 35% faster than strlen-mte.  On the random strlen
benchmark, the speedups are 3.4% and 40%, respectively.
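
For readers unfamiliar with the approach, the following is a minimal
portable sketch of the general "scalar head, wide bulk loop" shape.  It
is not the code in this commit: the name sketch_strlen is hypothetical,
and a 64-bit word zero-byte test stands in for the SIMD compare so that
the example builds on any target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Hypothetical illustration only; not the implementation in this commit. */
size_t sketch_strlen(const char *s)
{
    const char *p = s;

    /* Scalar head: advance one byte at a time until p is 8-byte aligned. */
    while (((uintptr_t)p & 7) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Bulk loop: test 8 bytes per iteration.  The expression below is
       nonzero exactly when at least one byte of w is zero.  An aligned
       load may read bytes past the terminator; it never crosses a page
       boundary, but it is outside what strict ISO C guarantees. */
    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);
        if (((w - ONES) & ~w & HIGHS) != 0)
            break;
        p += 8;
    }

    /* Scalar tail: the current word holds a NUL, so find its position. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

A real SIMD version would replace the bulk loop with vector compares
over wider chunks; the split between a short scalar path and a wide
aligned loop is the part this sketch is meant to show.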
1 file changed