This can get a little complicated. If you want UTF-8 you can't just use the DOM's .xml property, because it always converts to UTF-16LE/UCS-2. It has to: it returns a String, and a VB String is always UTF-16.
Note the differences in results from the two MsgBox popups (modified example from MSDN Library):
Code:
Dim xmlDoc As New MSXML2.DOMDocument40
Const attribs As String = _
    "version=""1.0"" encoding=""UTF-8"" standalone=""no"""
Dim pi As IXMLDOMProcessingInstruction
Dim stm As New ADODB.Stream

xmlDoc.async = False
xmlDoc.loadXML "<root><child/></root>"
If xmlDoc.parseError.errorCode <> 0 Then
    MsgBox xmlDoc.parseError.reason, , "Parse error"
Else
    Set pi = xmlDoc.createProcessingInstruction("xml", attribs)
    xmlDoc.insertBefore pi, xmlDoc.childNodes.Item(0)
    MsgBox xmlDoc.xml, , "String property .xml"
    With stm
        .Open
        xmlDoc.save stm
        .Position = 0
        .Type = adTypeText
        .Charset = "utf-8"
        MsgBox .ReadText(adReadAll), , "Method .save output converted to UTF-16"
        .Close
    End With
End If

First:
Code:
<?xml version="1.0" standalone="no"?>
<root><child/></root>

Second:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root><child/></root>

The second result is not UTF-8. It shows the UTF-8 output from the xmlDoc.save after converting it from UTF-8 to "Unicode" (UTF-16LE) using an ADODB.Stream object.
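If what you actually need is the UTF-8 bytes rather than a String, one option is to read the saved stream back as binary instead of text. This is only a rough sketch: it assumes xmlDoc is the same loaded document as in the example (with the processing instruction already inserted) and that the project references ADO. The names stmBin and utf8Bytes are made up for illustration.

```vb
' Sketch: capture what xmlDoc.save writes as raw bytes,
' skipping the conversion to a UTF-16 String.
' Assumes xmlDoc is the DOMDocument from the example above.
Dim stmBin As New ADODB.Stream   ' hypothetical name
Dim utf8Bytes() As Byte          ' hypothetical name

With stmBin
    .Type = adTypeBinary         ' binary stream, so no Charset conversion applies
    .Open
    xmlDoc.save stmBin           ' writes using the encoding named in the declaration
    .Position = 0                ' rewind before reading
    utf8Bytes = .Read(adReadAll) ' byte array holding the encoded document
    .Close
End With
```

From there the byte array can be written to a file or passed to whatever expects UTF-8, as long as nothing along the way turns it back into a String.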



