Can anyone offer a more elegant/efficient solution?
I've written a function that cleans up an XML string. The XML string eventually gets read into a dataset like so:
Code:favoriteDataSet.ReadXml(New System.IO.StringReader(sCleanedXML))
so this function is set up to clean the offending characters without removing them if they are actually valid XML characters.
So if I sent in
<Some&Text/ToClean> blah blah blah </Some&Text/ToClean>
I would get back
<Some_Text_ToClean> blah blah blah </Some_Text_ToClean>
It works but I'd really like to learn how to implement it more efficiently.
It accepts...
sOriginal, which is a long string of XML with tags like blah blah blah
asBetween, which is a pipe-delimited pair of delimiters; the function looks between them to find the offending characters. (e.g. asBetween = "<|>")
sRemovalTokens, which is a pipe-delimited string of values to clean out. (e.g. sRemovalTokens = "&|/|#")
Here is the code...
Code:
Private Function CleanXMLTags(sOriginal As String, asBetween As String, sRemovalTokens As String) As String
    Try
        Dim aBookEnds() As String = asBetween.Split("|"c)
        Dim aRemovalTokens() As String = sRemovalTokens.Split("|"c)
        Dim iStartIndex As Integer = 0
        Dim iEndIndex As Integer = 0
        Dim sCurToken As String = String.Empty
        Dim sReplaceToken As String = String.Empty

        ' One full pass over the string per removal token.
        For i As Integer = 0 To aRemovalTokens.Length - 1
            If aBookEnds.Length = 2 AndAlso Not String.Equals(aRemovalTokens(i).Trim, "", StringComparison.CurrentCultureIgnoreCase) Then
                iStartIndex = 0
                iEndIndex = 0
                While iStartIndex > -1
                    ' Find the next opening/closing delimiter pair.
                    iStartIndex = sOriginal.IndexOf(aBookEnds(0), iEndIndex)
                    iEndIndex = sOriginal.IndexOf(aBookEnds(1), iEndIndex + 1)
                    If iStartIndex < 0 OrElse iEndIndex < 0 Then
                        Exit While
                    End If

                    sCurToken = sOriginal.Substring(iStartIndex, iEndIndex - iStartIndex + 1)

                    ' Skip the "/" in "</" and "/>" so valid slashes are kept.
                    If sCurToken.StartsWith("</") Then
                        sCurToken = sCurToken.Substring(2)
                    End If
                    If sCurToken.EndsWith("/>") Then
                        sCurToken = sCurToken.Substring(0, sCurToken.Length - 2)
                    End If

                    If sCurToken.Contains(aRemovalTokens(i)) Then
                        sReplaceToken = sCurToken.Replace(aRemovalTokens(i), "_")
                        sOriginal = sOriginal.Replace(sCurToken, sReplaceToken)
                    End If
                End While
            End If
        Next

        Return sOriginal 'Cleaned
    Catch ex As Exception
        Return sOriginal
    End Try
End Function
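For comparison, a single-pass approach built on Regex.Replace with a match evaluator might look something like the sketch below. It is only a rough illustration, not a drop-in replacement: the name CleanXmlTagsRegex and its sRemovalChars parameter (a plain string of single characters such as "&/#" instead of a pipe-delimited list) are made up for this example, and the "<" and ">" delimiters are hard-coded in the pattern rather than passed in. It also replaces the listed characters anywhere between "<" and ">", including attribute values and comments, so treat it as a starting point only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CleanXmlSketch

    ' Hypothetical helper (not from the original post): replaces every character
    ' listed in sRemovalChars with "_" inside anything delimited by < and >.
    Function CleanXmlTagsRegex(sOriginal As String, sRemovalChars As String) As String
        ' open  : "<" or "</"
        ' name  : everything up to the closing delimiter (lazy, no < or >)
        ' close : ">" or "/>"
        Dim tagPattern As New Regex("(?<open></?)(?<name>[^<>]+?)(?<close>/?>)")

        Return tagPattern.Replace(sOriginal,
            Function(m As Match) As String
                Dim sName As String = m.Groups("name").Value
                For Each c As Char In sRemovalChars
                    sName = sName.Replace(c, "_"c)
                Next
                ' The structural "<", "</", ">" and "/>" are put back untouched.
                Return m.Groups("open").Value & sName & m.Groups("close").Value
            End Function)
    End Function

    Sub Main()
        Dim sInput As String = "<Some&Text/ToClean> blah blah blah </Some&Text/ToClean>"
        ' Prints: <Some_Text_ToClean> blah blah blah </Some_Text_ToClean>
        Console.WriteLine(CleanXmlTagsRegex(sInput, "&/#"))
    End Sub

End Module
```

The string is scanned once regardless of how many characters are being removed, whereas the loop-based version rescans the whole string for each removal token.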



