ARMv8.5 MTE: Add MTE-compatible version of memchr

When MTE is enabled, reads outside the range of the buffer are only
allowed within a 16-byte aligned granule that also contains in-range
bytes.

This implementation is based on string/aarch64/memchr.S.

The 64-bit syndrome value is changed to cover only 16 bytes of data.
The 32-byte loop is unrolled into two 16-byte reads.
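
For illustration only, the granule rule can be sketched in portable C.
This is not the assembly in this change: memchr_granule_sketch and
GRANULE are hypothetical names, the syndrome is simplified to one bit
per byte in a 16-bit mask, and the loop handles one granule per
iteration instead of two. It assumes s + n does not overflow, and that
the platform tolerates reads anywhere inside a granule that overlaps
the buffer (strict ISO C does not guarantee this).

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE 16u /* MTE tag granule size in bytes */

/* Hypothetical sketch: search [s, s + n) for c while only ever reading
 * whole 16-byte aligned granules that overlap the buffer. */
void *memchr_granule_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *end = p + n;
    const unsigned char ch = (unsigned char)c;

    if (n == 0)
        return NULL;

    /* Round the start address down to the enclosing granule. */
    const unsigned char *chunk =
        (const unsigned char *)((uintptr_t)p & ~(uintptr_t)(GRANULE - 1));

    while (chunk < end) {
        /* Simplified syndrome: bit i is set if byte i of the granule matches. */
        uint16_t mask = 0;
        for (unsigned i = 0; i < GRANULE; i++)
            if (chunk[i] == ch)
                mask |= (uint16_t)(1u << i);

        /* In the first granule, drop matches that precede the buffer. */
        if (chunk < p)
            mask &= (uint16_t)(0xFFFFu << (p - chunk));

        if (mask != 0) {
            unsigned i = 0;
            while ((mask & 1u) == 0) { /* index of the lowest set bit */
                mask >>= 1;
                i++;
            }
            const unsigned char *hit = chunk + i;
            /* A match past the end lies outside the requested range. */
            return hit < end ? (void *)hit : NULL;
        }
        chunk += GRANULE;
    }
    return NULL;
}
```

Matches found before the start or after the end of the buffer are
discarded, so over-reading inside a granule never changes the result.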

Testing done:
Ran optimized-routines/string/test/memchr.c.
Booted nanodroid with MTE enabled.
Ran bionic string tests with MTE enabled.