Replacing characters in a string
The following code, instead of working as expected, replaces the characters in Lookup with question marks, so that, for instance, the string "l'élection" becomes "l'?lection". Note that the accented e (é) is represented by character code 233 in the string. If I substitute ordinary alphabetic characters in the ChangeTo array (e.g. 88 for "X" instead of 130), it converts to them OK. Can you tell me what is wrong here?
Code:
Public Function UnFixAccents(ByVal p1 As String) As String
    Static ChangeTo() As Byte = {130, 133, 138, 136, 131, 135, 137, 132, 134, 129, 139, 140, 150, 151, 154, 156, 147}
    Static Lookup() As Byte = {233, 224, 232, 234, 226, 231, 235, 228, 229, 252, 239, 238, 251, 249, 220, 163, 244}
    Dim temp As New StringBuilder(p1)
    Dim c As Char
    For i As Integer = 0 To Lookup.Count - 1
        c = ChrW(Lookup(i))
        temp.Replace(ChrW(Lookup(i)), ChrW(ChangeTo(i)))
    Next
    Return temp.ToString
End Function
Re: Replacing characters in a string
This is ChrW of ChangeTo >
This is Chr of ChangeTo > ‚…Šˆƒ‡‰„†‹Œ–—šœ“
This is ChrW of LookUp > éàèêâçëäåüïîûùÜ£ô
This is Chr of LookUp > éàèêâçëäåüïîûùÜ£ô
Would I be right in thinking that you assumed you'd get ASCII?
Re: Replacing characters in a string
You've got your numeric Unicode values wrong. For this to work, 'e' is not 130 in decimal, it's 101. So going from 'e' with the accent to the regular 'e' would be from 233 to 101 in decimal.
vbnet Code:
Static ChangeTo() As Byte = {130, 133, 138....
Should be:
vbnet Code:
Static ChangeTo() As Byte = {101, 133, 138....
And I haven't checked the rest.
Re: Replacing characters in a string
Quote:
This is ChrW of ChangeTo >
This is Chr of ChangeTo > ‚…Šˆƒ‡‰„†‹Œ–—šœ“
This is ChrW of LookUp > éàèêâçëäåüïîûùÜ£ô
This is Chr of LookUp > éàèêâçëäåüïîûùÜ£ô
Would I be right in thinking that you assumed you'd get ASCII?
Yes, I wanted ASCII. So what is my problem?
Re: Replacing characters in a string
Quote:
Originally posted by AceInfinity
You've got your numeric Unicode values wrong. For this to work, 'e' is not 130 in decimal, it's 101. So going from 'e' with the accent to the regular 'e' would be from 233 to 101 in decimal.
vbnet Code:
Static ChangeTo() As Byte = {130, 133, 138....
Should be:
vbnet Code:
Static ChangeTo() As Byte = {101, 133, 138....
And I haven't checked the rest.
I do not want to replace accented e with unaccented e. I want to replace accented e when it is represented as code 233 with accented e as represented by 130. So why do I get question marks?
Re: Replacing characters in a string
Self-evidently, VB.Net characters aren't in ASCII! Windows hasn't used ASCII since ... well, actually I don't think it's ever used ASCII. The latest versions of Windows use UTF-16 (see the complete character set here).
Re: Replacing characters in a string
You get question marks because there is no printable ChrW(130), and you'd get a ‚ (a low quotation mark that looks like a comma) from Chr(130), because there is no accented e at that position in the character set either. It really shouldn't be that difficult to work out that a character set isn't going to contain two copies of the same character, whether it's ASCII or not.
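To make the difference concrete, here is a small sketch that prints the code points behind ChrW(130) and Chr(130). The module name and the CodePoint helper are made up for illustration. The Chr result depends on the system ANSI code page (on code page 1252 I would expect U+201A), so no output is claimed for that line. The last step uses Encoding.ASCII only as one example of an encoding that cannot represent the character; the thread does not say which encoding the original output actually went through.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module ChrVersusChrW
    ' Hypothetical helper: formats a character as a U+XXXX code point.
    Function CodePoint(ByVal ch As Char) As String
        Return "U+" & AscW(ch).ToString("X4")
    End Function

    Sub Main()
        ' ChrW treats 130 as a Unicode code point: U+0082, a C1 control character.
        Dim wide As Char = ChrW(130)
        Console.WriteLine(CodePoint(wide))                  ' U+0082
        Console.WriteLine(Char.IsControl(wide).ToString())  ' True

        ' Chr treats 130 as a byte value in the system ANSI code page.
        ' Assumption: the result varies by machine, so no output is shown here.
        Console.WriteLine(CodePoint(Chr(130)))

        ' An encoding that cannot represent a character substitutes "?" (byte 63).
        Dim asciiBytes() As Byte = Encoding.ASCII.GetBytes(wide.ToString())
        Console.WriteLine(asciiBytes(0).ToString())         ' 63
    End Sub
End Module
```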
Re: Replacing characters in a string
You're going to have to change the encoding then. Where are you outputting these values? If this is a textbox then you'll have to be using a font that will display these characters.
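As a sketch of what "changing the encoding" could look like: the values in the ChangeTo array happen to match the positions of those same accented characters in the DOS code pages 437 and 850 (130 is é, 133 is à, and so on). If the real goal is to produce bytes in one of those code pages (the thread does not say so, this is a guess), the framework can do the mapping without a lookup table. The module name and ToOemBytes are hypothetical. This assumes .NET Framework, where code page 437 is available by default; on .NET Core and later the code-pages encoding provider has to be registered before Encoding.GetEncoding(437) will work.

```vbnet
Imports System
Imports System.Text

Module OemConversion
    ' Hypothetical helper: encodes text as bytes in DOS code page 437.
    ' Assumption: running on .NET Framework, where this code page is built in.
    Function ToOemBytes(ByVal text As String) As Byte()
        Dim oem As Encoding = Encoding.GetEncoding(437)
        Return oem.GetBytes(text)
    End Function

    Sub Main()
        ' ChrW(233) is é; built this way to avoid depending on the source file encoding.
        Dim sample As String = "l'" & ChrW(233) & "lection"
        Dim bytes() As Byte = ToOemBytes(sample)
        Console.WriteLine(bytes(2).ToString())   ' 130
    End Sub
End Module
```

The result is a byte array rather than a String, which avoids storing code-page byte values inside UTF-16 characters in the first place.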
Re: Replacing characters in a string
I've got it figured out now. I should have used Chr instead of ChrW. Now everything's fine.
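For anyone landing here later, a minimal sketch of the function with that change applied. The lookup side still uses ChrW, because 233 and the other lookup values are the Unicode code points already in the string. Only the replacement side uses Chr, which interprets 130 in the system ANSI code page, so the result depends on the machine's code page (assumed here: .NET Framework on a Windows-1252 system). The module name is made up, and moving the arrays to module level and using Length instead of Count are my own tidy-ups, not something from the thread.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module AccentFix
    ' Target values from the original post (byte values in the destination character set).
    Private ReadOnly ChangeTo() As Byte = {130, 133, 138, 136, 131, 135, 137, 132, 134, 129, 139, 140, 150, 151, 154, 156, 147}
    ' Source values from the original post (Unicode code points of the accented characters).
    Private ReadOnly Lookup() As Byte = {233, 224, 232, 234, 226, 231, 235, 228, 229, 252, 239, 238, 251, 249, 220, 163, 244}

    Public Function UnFixAccents(ByVal p1 As String) As String
        Dim temp As New StringBuilder(p1)
        For i As Integer = 0 To Lookup.Length - 1
            ' ChrW: 233 is a Unicode code point. Chr: 130 is an ANSI code-page byte.
            temp.Replace(ChrW(Lookup(i)), Chr(ChangeTo(i)))
        Next
        Return temp.ToString()
    End Function

    Sub Main()
        Dim result As String = UnFixAccents("l'" & ChrW(233) & "lection")
        ' Print the code point now at position 2; the value depends on the ANSI code page.
        Console.WriteLine("U+" & AscW(result(2)).ToString("X4"))
    End Sub
End Module
```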