AAPT2: Encode 4-byte UTF-8 sequences as Modified UTF-8
Code points that take 4 bytes in UTF-8 are not allowed in Modified
UTF-8. They must instead be encoded as a surrogate pair, with each
surrogate written as its own 3-byte sequence, the same way CESU-8
encodes them. As a result, each 4-byte UTF-8 sequence grows to 6 bytes.
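
For illustration only, the conversion can be sketched roughly as follows.
This is not the code added by this change: the names
Utf8ToModifiedUtf8Sketch and AppendSurrogate are hypothetical, and the
sketch assumes well-formed UTF-8 input. It copies every other byte through
unchanged, so it does not apply the special NUL encoding (0xC0 0x80) that
Java's Modified UTF-8 also defines.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: writes one UTF-16 surrogate as a 3-byte sequence
// (1110xxxx 10xxxxxx 10xxxxxx), as CESU-8 and Modified UTF-8 do.
static void AppendSurrogate(std::string* out, uint16_t unit) {
  out->push_back(static_cast<char>(0xE0 | (unit >> 12)));
  out->push_back(static_cast<char>(0x80 | ((unit >> 6) & 0x3F)));
  out->push_back(static_cast<char>(0x80 | (unit & 0x3F)));
}

// Hypothetical sketch, assuming well-formed UTF-8 input. Not the AAPT2
// implementation of Utf8ToModifiedUtf8.
std::string Utf8ToModifiedUtf8Sketch(const std::string& utf8) {
  std::string out;
  out.reserve(utf8.size());
  std::size_t i = 0;
  while (i < utf8.size()) {
    const uint8_t lead = static_cast<uint8_t>(utf8[i]);
    // A 4-byte sequence starts with a lead byte of the form 11110xxx.
    if ((lead & 0xF8) == 0xF0 && i + 3 < utf8.size()) {
      // Decode the 21-bit code point from the four bytes.
      uint32_t cp = (static_cast<uint32_t>(lead & 0x07) << 18) |
                    (static_cast<uint32_t>(utf8[i + 1] & 0x3F) << 12) |
                    (static_cast<uint32_t>(utf8[i + 2] & 0x3F) << 6) |
                    static_cast<uint32_t>(utf8[i + 3] & 0x3F);
      // Split into a UTF-16 surrogate pair.
      cp -= 0x10000;
      const uint16_t high = static_cast<uint16_t>(0xD800 | (cp >> 10));
      const uint16_t low = static_cast<uint16_t>(0xDC00 | (cp & 0x3FF));
      // Each surrogate takes 3 bytes, so 4 input bytes become 6.
      AppendSurrogate(&out, high);
      AppendSurrogate(&out, low);
      i += 4;
    } else {
      // 1-, 2-, and 3-byte sequences are copied through unchanged.
      out.push_back(utf8[i]);
      ++i;
    }
  }
  return out;
}
```

Under these assumptions, U+1F600 (UTF-8 bytes F0 9F 98 80) splits into the
surrogates D83D and DE00 and comes out as ED A0 BD ED B8 80.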
Bug: 37140916
Test: aapt2_tests
Change-Id: I155dc24f166139d1d36a16bac088dcfcd59eb321
diff --git a/tools/aapt2/util/Util.h b/tools/aapt2/util/Util.h
index 0eb35d1..36b7333 100644
--- a/tools/aapt2/util/Util.h
+++ b/tools/aapt2/util/Util.h
@@ -197,6 +197,9 @@
return error_.empty();
}
+// Converts a UTF8 string into Modified UTF8.
+std::string Utf8ToModifiedUtf8(const std::string& utf8);
+
// Converts a UTF8 string to a UTF16 string.
std::u16string Utf8ToUtf16(const android::StringPiece& utf8);
std::string Utf16ToUtf8(const android::StringPiece16& utf16);