... Are you seriously saying there is no way of ever removing the anomalous characters easily? ...
That's exactly right, unfortunately. The real problem is that you don't know what to look for: those binary/Unicode characters stay unknown until you "translate" (convert, if you will) them, if that is possible at all.
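
If it helps, here is a rough sketch of how you could at least find out what is in there before deciding what to remove. It assumes the text has already been decoded into a Python string; the function name `report_non_ascii` and the choice of what counts as "anomalous" (anything outside printable ASCII, except tab and line breaks) are my own assumptions, not something from your data.

```python
import unicodedata
from collections import Counter


def report_non_ascii(text):
    """List every character outside printable ASCII found in `text`.

    Returns rows of (character, code point, Unicode name, count), sorted by
    code point. Tab, line feed and carriage return are treated as normal.
    """
    counts = Counter(
        ch
        for ch in text
        if ord(ch) > 126 or (ord(ch) < 32 and ch not in "\t\n\r")
    )
    return [
        # Control characters have no Unicode name, so fall back to a label.
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), n)
        for ch, n in sorted(counts.items())
    ]


if __name__ == "__main__":
    sample = "caf\u00e9 \u201cquoted\u201d\x00"
    for ch, code_point, name, n in report_non_ascii(sample):
        print(code_point, name, n)
```

This does not remove anything; it only tells you which characters are present and how often, so you can judge whether a conversion is possible.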