Fix DOM parsing of character references/entities.

Our DOM parser didn't support numeric character references such as
&#123; (decimal) or &#x161; (hexadecimal), and didn't merge adjacent
text nodes into one (so "a&amp;b" would be three text nodes rather
than one; SAX allows the former, but DOM guarantees the latter).

This patch fixes both bugs, and adds tests.
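
For illustration, the expected behavior can be sketched roughly as
below. This is not the test added by this patch: it assumes a
JAXP-style Java DOM parser, and the class and method names
(CharRefSketch, rootChildren) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the patch's actual test code.
public class CharRefSketch {
    /** Parses the given XML and returns the root element's child nodes. */
    static NodeList rootChildren(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes();
    }

    public static void main(String[] args) throws Exception {
        // One entity reference plus a decimal and a hexadecimal
        // character reference, all inside a single run of text.
        NodeList kids = rootChildren("<foo>a&amp;b &#123; &#x161;</foo>");

        // A conforming DOM parser should report a single text node
        // whose value has every reference already expanded.
        System.out.println("child count: " + kids.getLength());
        System.out.println("text value:  " + kids.item(0).getNodeValue());
    }
}
```

Before the fix, a check like this would have seen several text nodes,
or failed outright on the numeric references.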

Bug: 2607 (and duplicates)
4 files changed