Switch to a working UTF-8 mb/wc implementation.
Although glibc gets by with an 8-byte mbstate_t, OpenBSD uses 12 bytes (of
the 128 bytes it reserves!).
We can actually implement UTF-8 encoding/decoding with a 0-byte mbstate_t
which means we can make things work on LP32 too, as long as we accept the
limitation that the caller needs to present us with a complete sequence
before we'll process it.
Our behavior is fine when going from characters to bytes; we just
update the source wchar_t** to say how far through the input we got.
I'll come back and use the 4 bytes we do have to cope with byte sequences
split across multiple input buffers. The fact that we don't support
UTF-8 sequences longer than 4 bytes plus the fact that the first byte of
a UTF-8 sequence encodes the length means we shouldn't need the other
fields OpenBSD used (at the cost of some recomputation in cases where a
sequence is split across buffers).
This patch also makes the minimal changes necessary to setlocale(3) to
make us behave like glibc when an app requests UTF-8. (The difference
being that our "C" locale is the same as our "C.UTF-8" locale.)
Change-Id: Ied327a8c4643744b3611bf6bb005a9b389ba4c2f
diff --git a/libc/Android.mk b/libc/Android.mk
index a0eb612..8c2ebd6 100644
--- a/libc/Android.mk
+++ b/libc/Android.mk
@@ -214,6 +214,7 @@
bionic/utimes.cpp \
bionic/wait.cpp \
bionic/wchar.cpp \
+ bionic/wctype.cpp \
libc_upstream_freebsd_src_files := \
upstream-freebsd/lib/libc/gen/ldexp.c \
@@ -356,6 +357,7 @@
upstream-openbsd/lib/libc/locale/wcstoumax.c \
upstream-openbsd/lib/libc/locale/wcsxfrm.c \
upstream-openbsd/lib/libc/locale/wctob.c \
+ upstream-openbsd/lib/libc/locale/wctomb.c \
upstream-openbsd/lib/libc/stdio/asprintf.c \
upstream-openbsd/lib/libc/stdio/clrerr.c \
upstream-openbsd/lib/libc/stdio/fdopen.c \