Offer explicit 3-byte vs 4-byte modified UTF-8.
As documented in art/runtime/jni/jni_internal.cc, ART has deviated
from the reference implementation (RI) by encoding each supplementary
code point as a single 4-byte sequence instead of the pair of 3-byte
surrogate sequences required by the JNI specification.
Callers that control both the reading and the writing logic can accept
the 4-byte encoding, but other callers require compatibility with the
DataOutput/DataInput API contract. This change lets callers request
either behavior explicitly.
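For illustration only, the difference between the two forms can be
sketched with standard JDK APIs. DataOutputStream.writeUTF produces the
3-byte surrogate form required by the DataOutput contract, while
String.getBytes(UTF_8) produces a standard 4-byte sequence, which is
assumed here to be what the 4-byte mode emits for a supplementary code
point. The class and helper names are hypothetical and are not part of
this change.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the change described here.
final class ModifiedUtf8Demo {
    /** Bytes written by DataOutput.writeUTF, minus the 2-byte length prefix. */
    static byte[] dataOutputBytes(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeUTF(s);
            byte[] all = bytes.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so Java stores it as a surrogate pair.
        String s = new String(Character.toChars(0x1F600));
        // Each surrogate becomes its own 3-byte sequence (6 bytes total).
        System.out.println("3-byte form: " + hex(dataOutputBytes(s)));
        // Standard UTF-8 uses one 4-byte sequence for the same code point.
        System.out.println("4-byte form: " + hex(s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A reader built on DataInput.readUTF rejects the 4-byte form, which is
why callers that exchange data through that contract need the 3-byte
mode.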
All tests now run in both 4-byte and 3-byte modes, and they
exhaustively confirm that every valid code point encodes identically
to the DataOutput/DataInput contract when in 3-byte mode.
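As a rough sketch of what such an exhaustive check could look like
(this is not the test added by this change), the snippet below compares
a hand-written 3-byte encoder against DataOutputStream.writeUTF for
every code point. The class name, encode3Byte and countMismatches are
hypothetical, and lone surrogates are skipped on the assumption that
they are not counted as valid code points.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch; the real tests live in FrameworksCoreTests.
final class ThreeByteContractCheck {
    /** Illustrative 3-byte-mode encoder: every UTF-16 char is encoded on its own. */
    static byte[] encode3Byte(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                out.write(c);
            } else if (c <= 0x07FF) {
                // Also covers U+0000, which modified UTF-8 writes as C0 80.
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Surrogate halves land here, giving 3 bytes each.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    /** Reference bytes from DataOutput.writeUTF, minus the 2-byte length prefix. */
    static byte[] referenceBytes(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeUTF(s);
            byte[] all = bytes.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts code points whose encoding differs from the reference. */
    static int countMismatches() {
        int mismatches = 0;
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            // Assumption: lone surrogates are not "valid code points".
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) continue;
            String s = new String(Character.toChars(cp));
            if (!Arrays.equals(encode3Byte(s), referenceBytes(s))) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

Checking each code point in isolation keeps every string far below the
65535-byte limit that writeUTF enforces.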
Benchmark results still show significant performance benefits over the
upstream RI when using the 3-byte encoding (roughly 2.5x faster reads
and 2.4x faster writes):
    timeRead_Upstream_mean (ns):                    5090068
    timeRead_LocalUsing3ByteSequences_mean (ns):    1996032
    timeRead_LocalUsing4ByteSequences_mean (ns):    1813250
    timeWrite_Upstream_mean (ns):                   3856276
    timeWrite_LocalUsing3ByteSequences_mean (ns):   1632697
    timeWrite_LocalUsing4ByteSequences_mean (ns):    886503
Bug: 236923096
Test: atest FrameworksCoreTests:CharsetUtilsTest
Test: atest FrameworksCoreTests:FastDataTest
Test: atest FrameworksCoreTests:XmlTest
Test: atest FrameworksCoreTests:BinaryXmlTest
Test: ./frameworks/base/libs/hwui/tests/scripts/prep_generic.sh little && atest CorePerfTests:FastDataPerfTest
Change-Id: Ibddd36410a0d4a909522de011f23a337b53d6889
12 files changed