AAPT2: Encode 4-byte strings in Modified UTF-8

Codepoints that are encoded to 4 bytes in UTF-8 are not allowed in
Modified UTF-8. They instead should be encoded as surrogate pairs in the
same way that CESU-8 allows for surrogate pairs. This will also cause 4
byte UTF-8 codes to be represented in 6 bytes.

Bug: 37140916
Test: aapt2_tests
Change-Id: I155dc24f166139d1d36a16bac088dcfcd59eb321
4 files changed