
Thread: How Do You Ignore Illegal Characters In An XmlDocument?

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Jul 2006
    Posts
    219

    How Do You Ignore Illegal Characters In An XmlDocument?

    Hi,

    I'm trying to read a fairly large (7 MB) XML file that contains a lot of illegal characters. How do I ignore them, or ignore the run-time exceptions they cause?

    Thanks

    Louix

  2. #2
    Frenzied Member
    Join Date
    Jul 2008
    Location
    Rep of Ireland
    Posts
    1,380

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    I don't know if this would work, but it's just a thought: I would read the file as a text document into a string variable, use Regex.Replace() to change whatever is needed to make the document "valid", then save and re-read it.
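
    A minimal sketch of that idea, assuming the problem is raw characters that fall outside the XML 1.0 Char ranges. The class and method names (XmlSanitizer, StripInvalidXmlChars, LoadCleaned) are made up for illustration and are not from any library:

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

static class XmlSanitizer
{
    // Raw characters that XML 1.0 forbids: C0 controls other than tab, LF and CR,
    // U+FFFE/U+FFFF, and surrogate code units that are not part of a valid pair.
    static readonly Regex InvalidXmlChars = new Regex(
        @"[\u0000-\u0008\u000B\u000C\u000E-\u001F\uFFFE\uFFFF]" +
        @"|[\uD800-\uDBFF](?![\uDC00-\uDFFF])" +
        @"|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]");

    // Hypothetical helper: returns the text with the forbidden characters removed.
    public static string StripInvalidXmlChars(string text)
    {
        return InvalidXmlChars.Replace(text, "");
    }

    // Hypothetical helper: cleans the text, then parses it.
    public static XmlDocument LoadCleaned(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(StripInvalidXmlChars(xml));
        return doc;
    }
}
```

    This only deals with raw characters. Character references such as &#x1; are still rejected by the parser and would need separate handling.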

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Jul 2006
    Posts
    219

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    I have this code which uses RegEx, but it doesn't work:

    Code:
       
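        // Needs: using System.IO; using System.Text.RegularExpressions; using System.Xml;
        string xmlFile = File.ReadAllText("myfile.xml");

        // Remove every match of the pattern before parsing.
        string xmlFileFinal = Regex.Replace(xmlFile, "#x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]9[0-9A-F])", "", RegexOptions.IgnoreCase);

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlFileFinal);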

  4. #4
    Frenzied Member
    Join Date
    Jul 2008
    Location
    Rep of Ireland
    Posts
    1,380

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    I'd need the XML file to verify your regex string. Alternatively, you could google RegexBuddy, a program that helps you test regular expressions. It will show you whether your regex is what's wrong.

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Jul 2006
    Posts
    219

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    I'm not allowed to give you the XML file but I can tell you that I'm using XML 1.0 and my RegEx looks like this:

    #x((10?|[2-F])FFF[EF]|FDD[0-9A-F]|7F|8[0-46-9A-F]9[0-9A-F])

  6. #6
    Hyperactive Member Arrow_Raider's Avatar
    Join Date
    Dec 2001
    Location
    AVR Lovers Club
    Posts
    423

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    Can you tell us why the XML file has illegal characters in it? That seems like a very odd situation. If the file is corrupted, removing the illegal characters won't fix it.
    My monkey wearing the fedora points and laughs at you.

  7. #7

    Thread Starter
    Addicted Member
    Join Date
    Jul 2006
    Posts
    219

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    We are converting a CSV file from our product supplier to XML. The CSV file already has some illegal characters in it, so they get passed through when we convert it. Is there any way to ignore these exceptions and continue reading the file?
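
    Since the characters come in from the CSV, one option is to drop them during the conversion so the XML never contains them. A rough sketch only: CsvFieldCleaner, CleanField and RowToXml are hypothetical names, the "row" element is invented, and it assumes the column names are already valid XML element names:

```csharp
using System;
using System.Text;
using System.Xml;

static class CsvFieldCleaner
{
    // True for a single UTF-16 code unit allowed by the XML 1.0 Char production.
    static bool IsXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Copies the field, keeping valid characters and valid surrogate pairs only.
    public static string CleanField(string field)
    {
        StringBuilder sb = new StringBuilder(field.Length);
        for (int i = 0; i < field.Length; i++)
        {
            char c = field[i];
            if (char.IsHighSurrogate(c) && i + 1 < field.Length
                && char.IsLowSurrogate(field[i + 1]))
            {
                sb.Append(c).Append(field[++i]); // keep the pair together
            }
            else if (IsXmlChar(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Writes one CSV row as a <row> element, cleaning each value first.
    public static string RowToXml(string[] names, string[] values)
    {
        StringBuilder output = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("row");
            for (int i = 0; i < names.Length; i++)
            {
                writer.WriteElementString(names[i], CleanField(values[i]));
            }
            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

    Cleaning at this stage means the resulting file can be loaded with a plain XmlDocument.Load, with no exceptions to swallow.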

  8. #8

    Thread Starter
    Addicted Member
    Join Date
    Jul 2006
    Posts
    219

    Re: How Do You Ignore Illegal Characters In An XmlDocument?

    I found the solution!!

    http://seattlesoftware.wordpress.com...lid-character/

    Very handy!!

    Thank you for your help, guys.
