
Thread: Where are these strange characters coming from?

  1. #1

    Thread Starter
    Frenzied Member dolot
    Join Date
    Nov 2007
    Location
    Music city, U.S.A.
    Posts
    1,253

    Where are these strange characters coming from?


    So I have the following code that uses an xsl file to transform an xml file to html and outputs it to a memory stream:
    Code:
            Public Shared Function GetTransformedHTML(ByVal PesXmlDoc As Xml.XmlDocument, ByVal TransformFile As String) As IO.MemoryStream
    
                'create an XSL transformation
                Dim Transform As New System.Xml.Xsl.XslCompiledTransform
                Transform.Load(AppFramework.Configuration.ConfigFiles.GetConfigurationFileFullName(TransformFile))
    
                'perform the transform into an in-memory stream
                Dim OutStream As New IO.MemoryStream
                Transform.Transform(PesXmlDoc, Nothing, OutStream)
                Return OutStream
    
            End Function
    Then, I output the memory stream to a file using the following:
    Code:
                                Dim Writer As New IO.StreamWriter(OutputFileName)
                                Dim Buffer As Byte() = HtmlStream.GetBuffer
                                For I As Integer = 0 To Buffer.Length - 1
                                    Writer.Write(Chr((Buffer(I))))
                                Next
                                Writer.Close()
    It works great, except for one wrinkle. The first three characters in the file are "ï»¿". They aren't in the xsl document nor in the xml document, as near as I can tell. Where in the world did they come from?

    I can hack my way around it by simply skipping the first 3 bytes of the memory stream, but that seems dangerous. What if 'in the wild' those first 3 characters aren't there, or maybe there will be 5, or maybe they will be somewhere else?

    Any ideas anyone?
    I always add to the reputation of those whose posts are helpful, and even occasionally to those whose posts aren't helpful but who obviously put forth a valiant effort. That is, when the system will allow it.
    My war with a browser-redirect trojan

  2. #2
    Lively Member
    Join Date
    Oct 2006
    Location
    USA
    Posts
    122

    Re: Where are these strange characters coming from?

    Have you tried opening this file in, say, Notepad?

    You might try using a StreamReader to read it as a text file and see if there are any bad characters in there.

    If there aren't any, then my best guess would be that something is getting messed up in the Load method.

    -zd

  3. #3

    Thread Starter
    Frenzied Member dolot

    Re: Where are these strange characters coming from?

    Yeah, I think my next test will be to write it to a StreamWriter instead of a MemoryStream in the function that transforms the xml - just as a test to see if the strange characters are coming out of the transformation or are tied to the memorystream somehow.

  4. #4

    Thread Starter
    Frenzied Member dolot

    Re: Where are these strange characters coming from?

    So my next test was actually to pass the memory stream to a 3rd party tool we have that converts html to pdf. That worked fine. So then I wondered if there wasn't something about the buffer that was causing the problem, so I changed the second block of code above to look like this:
    Code:
                                HtmlStream.Position = 0
                                Dim R As New IO.StreamReader(HtmlStream)
                                Dim W As New IO.StreamWriter(OutputFileName)
                                W.Write(R.ReadToEnd)
                                W.Close()
    This works fine. So what I'm guessing is that those first few strange characters were some sort of buffer definition bytes. When I just read the stream, those aren't being included.

    Anyway, problem solved - but if anybody has any insight into what's going on under the hood I'd be interested to hear.
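
    For anyone curious about what is going on under the hood, here is a minimal sketch of the behaviour, not the code from this project. It assumes the transform output behaves like a MemoryStream written as UTF-8 with a byte order mark; the module name BomDemo and the sample markup are made up for illustration. It shows two separate things: GetBuffer returns the whole internal array (unused capacity included), and a StreamReader recognises the three-byte preamble and leaves it out of the text it returns, which would explain why the ReadToEnd version produces a clean file.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomDemo
    Sub Main()
        ' Stand-in for the transform output (an assumption): UTF-8 text written with a BOM.
        Dim HtmlStream As New MemoryStream()
        Dim W As New StreamWriter(HtmlStream, New UTF8Encoding(True))
        W.Write("<html></html>")
        W.Flush()   ' not closed here, because closing the writer would also close the stream

        ' GetBuffer exposes the whole internal array, including unused capacity;
        ' Length is the number of bytes actually written.
        Console.WriteLine("Length={0}, GetBuffer().Length={1}", HtmlStream.Length, HtmlStream.GetBuffer().Length)

        ' ToArray returns only the written bytes. The first three are the UTF-8 preamble.
        Dim Bytes As Byte() = HtmlStream.ToArray()
        Console.WriteLine(BitConverter.ToString(Bytes, 0, 3))   ' EF-BB-BF

        ' A StreamReader detects the preamble and does not include it in the decoded text.
        HtmlStream.Position = 0
        Dim R As New StreamReader(HtmlStream)
        Dim Html As String = R.ReadToEnd()
        Console.WriteLine(Html.StartsWith("<html>", StringComparison.Ordinal))
    End Sub
End Module
```

    Copying byte by byte through Chr treats each of those three preamble bytes as a separate character, which is where "ï»¿" comes from. Looping over GetBuffer can also write trailing padding past the real data, so ToArray (or Length) is the safer bound if a raw copy is ever needed.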

  5. #5
    Serge
    Join Date
    Feb 1999
    Location
    Scottsdale, Arizona, USA
    Posts
    2,744

    Re: Where are these strange characters coming from?

    These three odd characters are the UTF-8 byte order mark (BOM, bytes EF BB BF), which the Unicode standard neither requires nor recommends for UTF-8. You can disable it with:

    Code:
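    // doc is assumed to be the XmlDocument being saved; path is the output file name
    var sw = new System.IO.StreamWriter(path, false, new System.Text.UTF8Encoding(false)); // false = overwrite; UTF8Encoding(false) = no BOM
    doc.Save(sw);
    sw.Close();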
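
    Since the code in the first post is VB.NET and writes the transform to a stream rather than saving a document, the same idea might look roughly like the sketch below. It is an untested outline under a few assumptions: GetTransformedHtmlNoBom and the module TransformHelper are hypothetical names, and XslPath is taken to be a full path to the stylesheet (the AppFramework lookup from the original is left out). The approach is to clone the stylesheet's output settings, override only the encoding, and hand the transform an XmlWriter instead of the raw stream.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Xsl

Public Module TransformHelper
    ' Hypothetical variant of GetTransformedHTML from the first post.
    Public Function GetTransformedHtmlNoBom(ByVal PesXmlDoc As XmlDocument, ByVal XslPath As String) As MemoryStream
        Dim Transform As New XslCompiledTransform()
        Transform.Load(XslPath)

        ' Start from the stylesheet's own output settings and change only
        ' the encoding: UTF-8 without the byte order mark.
        Dim Settings As XmlWriterSettings = Transform.OutputSettings.Clone()
        Settings.Encoding = New UTF8Encoding(False)

        Dim OutStream As New MemoryStream()
        Using Writer As XmlWriter = XmlWriter.Create(OutStream, Settings)
            Transform.Transform(PesXmlDoc, Nothing, Writer)
        End Using   ' flushes the writer; CloseOutput defaults to False, so the stream stays open

        ' Rewind so callers can read from the start (the original function did not do this).
        OutStream.Position = 0
        Return OutStream
    End Function
End Module
```

    Cloning OutputSettings rather than building fresh settings keeps whatever the stylesheet's xsl:output element asked for, such as the output method and indentation.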

  6. #6

    Thread Starter
    Frenzied Member dolot

    Re: Where are these strange characters coming from?

    Aha! I knew there had to be some explanation. Thanks.
