Here's an abridged sample:
ISO-8859-1 (Latin-1) is a single-byte encoding: every character it covers is stored in exactly one byte. It appears that MSXML stores the degree character in the XML document as a two-byte encoding. UTF-8 is a variable-width encoding: the ASCII character codes between 0 and 127 are stored as a single byte with the same value, e.g. the capital letter X is 0x58 in both ASCII and UTF-8, while characters above 127 take two or more bytes.
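A quick way to see the difference between the two encodings is to encode the same characters both ways. This is only a minimal Python sketch, not anything MSXML does internally, and the variable names are made up for illustration.

```python
# Compare how ISO-8859-1 and UTF-8 store an ASCII letter and the degree symbol.
letter = "X"        # U+0058, inside the ASCII range
degree = "\u00b0"   # U+00B0 DEGREE SIGN, outside the ASCII range

letter_utf8 = letter.encode("utf-8")
letter_latin1 = letter.encode("iso-8859-1")
degree_utf8 = degree.encode("utf-8")
degree_latin1 = degree.encode("iso-8859-1")

# Print each result as hex so the byte counts are easy to compare.
for label, data in [
    ("X, UTF-8", letter_utf8),
    ("X, ISO-8859-1", letter_latin1),
    ("degree, UTF-8", degree_utf8),
    ("degree, ISO-8859-1", degree_latin1),
]:
    print(f"{label:20} {data.hex().upper()}")
```

The letter comes out as 58 in both encodings, while the degree symbol is C2B0 in UTF-8 and B0 in ISO-8859-1.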
The hex value C2B0 (1100 0010 1011 0000) is the two-byte UTF-8 encoding of the degree symbol; its first byte, 0xC2, is the "lead unit". After some Googling, I found a very clear explanation from Andy Hassall of a similar problem:
Mike Brown explains it this way:
This was pretty much the same thing I was seeing. The XML declaration does not specify an encoding, so the output is written as UTF-8, the XML default, and the degree character is saved as the lead unit 0xC2 followed by 0xB0. When the XML file is opened with a text editor (jEdit) that reads it as ISO-8859-1, each byte is shown as a character of its own: the first is displayed as the a-circumflex, which happens to be 0xC2 in ISO-8859-1, and the second as the degree symbol, which is 0xB0.
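The stray a-circumflex can be reproduced outside MSXML by taking the two bytes 0xC2 0xB0 and decoding them once as UTF-8 and once as ISO-8859-1. A minimal Python sketch, assuming the editor is reading the file as ISO-8859-1:

```python
# The two bytes as they sit in the file: the UTF-8 encoding of U+00B0.
raw = bytes([0xC2, 0xB0])

as_utf8 = raw.decode("utf-8")          # one character: the degree sign
as_latin1 = raw.decode("iso-8859-1")   # two characters: a-circumflex, degree sign

# Show how many characters each decoding yields and their code points.
print(len(as_utf8), [f"U+{ord(c):04X}" for c in as_utf8])
print(len(as_latin1), [f"U+{ord(c):04X}" for c in as_latin1])
```

The bytes never change; only the decoding does. Read as UTF-8 they are one character, read as ISO-8859-1 they are two.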
I fixed the problem by specifying the encoding in the XML declaration for MSXML:
<?xml version="1.0" encoding="UTF-8"?>
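It works equally well if the encoding is specified as ISO-8859-1. Now the XML declaration names the encoding scheme explicitly. With ISO-8859-1 declared, MSXML renders the degree character as a single byte, 0xB0; with UTF-8 declared it is still the two bytes 0xC2 0xB0, which a reader that honors the declaration decodes as one character.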
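The effect of the declared encoding on the bytes can be checked with any XML serializer. The sketch below uses Python's xml.etree.ElementTree as a stand-in, not MSXML, and assumes Python 3.8 or later for the xml_declaration argument; the element name temperature is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document holding a degree symbol.
elem = ET.Element("temperature")
elem.text = "72\u00b0"

# Serialize twice, each time with an explicit encoding in the declaration.
as_latin1 = ET.tostring(elem, encoding="ISO-8859-1", xml_declaration=True)
as_utf8 = ET.tostring(elem, encoding="UTF-8", xml_declaration=True)

print(as_latin1)  # the degree symbol is the single byte \xb0
print(as_utf8)    # the degree symbol is the two bytes \xc2\xb0

# A parser that honors the declaration recovers the same text from both.
print(ET.fromstring(as_latin1).text == ET.fromstring(as_utf8).text)
```

In this sketch the UTF-8 output still contains 0xC2 0xB0; the difference is that the declaration tells the reader how to decode the bytes, so both files parse back to the same text.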