Where are these strange characters coming from?
So I have the following code that uses an XSL file to transform an XML file to HTML and outputs it to a memory stream:
Code:
Public Shared Function GetTransformedHTML(ByVal PesXmlDoc As Xml.XmlDocument, ByVal TransformFile As String) As IO.MemoryStream
    'create an XSL transformation and load the stylesheet
    Dim Transform As New System.Xml.Xsl.XslCompiledTransform
    Transform.Load(AppFramework.Configuration.ConfigFiles.GetConfigurationFileFullName(TransformFile))
    'perform the transform, writing the html into a memory stream
    Dim OutStream As New IO.MemoryStream
    Transform.Transform(PesXmlDoc, Nothing, OutStream)
    Return OutStream
End Function
Then, I output the memory stream to a file using the following:
Code:
Dim Writer As New IO.StreamWriter(OutputFileName)
Dim Buffer As Byte() = HtmlStream.GetBuffer
For I As Integer = 0 To Buffer.Length - 1
    Writer.Write(Chr((Buffer(I))))
Next
Writer.Close()
It works great, except for one wrinkle. The first three characters in the file are "ï»¿". They aren't in the xsl document nor in the xml document - near as I can tell. Where in the world did they come from?
I can hack my way around it by simply skipping the first 3 bytes of the memory stream, but that seems dangerous. What if 'in the wild' those first 3 characters aren't there, or maybe there will be 5, or maybe they will be somewhere else?
Any ideas anyone?
Re: Where are these strange characters coming from?
Have you tried opening this file in, say, Notepad?
You might try using a StreamReader to read it as a text file and see if there are any bad characters in there.
If there aren't any, then my best guess would be that something is getting messed up in the Load method.
-zd
Re: Where are these strange characters coming from?
Yeah, I think my next test will be to write it to a StreamWriter instead of a MemoryStream in the function that transforms the xml - just as a test to see if the strange characters are coming out of the transformation or are tied to the MemoryStream somehow.
Re: Where are these strange characters coming from?
So my next test was actually to pass the memory stream to a 3rd party tool we have that converts html to pdf. That worked fine. So then I wondered if there wasn't something about the buffer that was causing the problem, so I changed the second block of code above to look like this:
Code:
HtmlStream.Position = 0
Dim R As New IO.StreamReader(HtmlStream)
Dim W As New IO.StreamWriter(OutputFileName)
W.Write(R.ReadToEnd)
W.Close()
This works fine. So what I'm guessing is that those first few strange characters were some sort of buffer definition bytes. When I just read the stream, those aren't being included.
Anyway, problem solved - but if anybody has any insight into what's going on under the hood I'd be interested to hear.
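For comparison, here is a minimal sketch of a third way to get the stream onto disk: copy the raw bytes and never turn them into characters at all. The module and the SaveStream helper are hypothetical names made up for this example. Note that ToArray returns only the bytes that were written, while GetBuffer (used in the first version above) returns the whole internal buffer, which may include unused trailing capacity.

```vbnet
Imports System.IO
Imports System.Text

Module StreamCopySketch
    ' Hypothetical helper: saves the contents of a MemoryStream byte for byte.
    Sub SaveStream(ByVal HtmlStream As MemoryStream, ByVal OutputFileName As String)
        ' ToArray copies only the bytes actually written to the stream;
        ' no byte-to-character conversion takes place.
        File.WriteAllBytes(OutputFileName, HtmlStream.ToArray())
    End Sub

    Sub Main()
        ' Stand-in for the transform output from the first post.
        Dim Stream As New MemoryStream()
        Dim Bytes As Byte() = Encoding.UTF8.GetBytes("<html></html>")
        Stream.Write(Bytes, 0, Bytes.Length)

        SaveStream(Stream, Path.Combine(Path.GetTempPath(), "sketch.html"))
    End Sub
End Module
```

Because the bytes are written unchanged, whatever leading bytes the transform produced stay exactly as they were instead of being re-encoded as visible characters.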
Re: Where are these strange characters coming from?
These three odd characters are the UTF-8 BOM (byte order mark, the bytes EF BB BF), which the Unicode standard neither requires nor recommends for UTF-8. You can disable it using:
Code:
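// UTF8Encoding(false) means "do not emit the BOM"; the middle argument (append: false) overwrites the file
var sw = new System.IO.StreamWriter(path, false, new System.Text.UTF8Encoding(false));
doc.Save(sw);
sw.Close();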
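To see the preamble for yourself, here is a small VB.NET sketch. It fakes the transform output by writing the UTF-8 preamble plus a short HTML string into a MemoryStream, then reads it back two ways. Latin-1 stands in for what Chr does on a machine with a Western ANSI code page; that is an assumption about the original setup, and the sample HTML string is made up.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomSketch
    Sub Main()
        ' The UTF-8 preamble (BOM) is three bytes.
        Dim Preamble As Byte() = Encoding.UTF8.GetPreamble()
        Console.WriteLine(BitConverter.ToString(Preamble))   ' EF-BB-BF

        ' Simulated transform output: preamble followed by UTF-8 text.
        Dim Stream As New MemoryStream()
        Stream.Write(Preamble, 0, Preamble.Length)
        Dim Body As Byte() = Encoding.UTF8.GetBytes("<html></html>")
        Stream.Write(Body, 0, Body.Length)

        ' Treating every byte as its own character (roughly what the
        ' Chr loop does) keeps the three preamble bytes as three characters.
        Dim Latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim AsChars As String = Latin1.GetString(Stream.ToArray())
        Console.WriteLine(AsChars.Length)                    ' 16

        ' StreamReader looks for a byte order mark by default and
        ' leaves it out of the decoded text.
        Stream.Position = 0
        Dim Reader As New StreamReader(Stream)
        Dim AsText As String = Reader.ReadToEnd()
        Console.WriteLine(AsText.Length)                     ' 13
    End Sub
End Module
```

That difference is why the StreamReader/StreamWriter version earlier in the thread produces a clean file: the reader consumes the preamble as encoding information rather than as text.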
Re: Where are these strange characters coming from?
Aha! I knew there had to be some explanation. Thanks.
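For anyone who would rather keep the BOM out of the MemoryStream in the first place, here is a hedged VB.NET sketch of one way to do it: run the transform through an XmlWriter whose settings use a UTF-8 encoding with no preamble. GetTransformedHtmlNoBom and StylesheetPath are hypothetical names, and the AppFramework lookup from the first post is left out so the example stands alone. It has not been tested against the original stylesheet.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Xsl

Module TransformSketch
    ' Hypothetical variation on GetTransformedHTML that takes the
    ' stylesheet path directly.
    Public Function GetTransformedHtmlNoBom(ByVal PesXmlDoc As XmlDocument, ByVal StylesheetPath As String) As MemoryStream
        Dim Transform As New XslCompiledTransform()
        Transform.Load(StylesheetPath)

        ' Start from the settings derived from the stylesheet's xsl:output,
        ' then swap in a UTF-8 encoding that does not emit a preamble.
        Dim Settings As XmlWriterSettings = Transform.OutputSettings.Clone()
        Settings.Encoding = New UTF8Encoding(False)
        ' Keep the MemoryStream open after the writer is disposed.
        Settings.CloseOutput = False

        Dim OutStream As New MemoryStream()
        Using Writer As XmlWriter = XmlWriter.Create(OutStream, Settings)
            Transform.Transform(PesXmlDoc, Nothing, Writer)
        End Using

        ' Rewind so callers can read from the start.
        OutStream.Position = 0
        Return OutStream
    End Function
End Module
```

Cloning OutputSettings rather than building fresh settings keeps the output method and indentation options that the stylesheet asked for; only the encoding object changes.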