-
Oct 27th, 2022, 01:15 PM
#1
Safer way to convert full ANSI strings
Thanks to wqweto, I found a faster and safer way to convert full ANSI strings to a byte array.
Code:
Option Explicit
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
Private Sub DebugPrintByte(sDescr As String, bArray() As Byte)
Dim lPtr As Long
Debug.Print sDescr & ":"
If GetbSize(bArray) = 0 Then Exit Sub
For lPtr = 0 To UBound(bArray)
Debug.Print Right$("0" & Hex$(bArray(lPtr)), 2) & " ";
If (lPtr + 1) Mod 16 = 0 Then Debug.Print
Next lPtr
Debug.Print
End Sub
Private Function GetbSize(bArray() As Byte) As Long
On Error GoTo GetSizeErr
GetbSize = UBound(bArray) + 1
Exit Function
GetSizeErr:
GetbSize = 0
End Function
Private Sub Command1_Click()
Dim sTmp As String
Dim bTmp() As Byte
sTmp = "0123456789ƒ„…†‡ˆ"
ReDim bTmp(Len(sTmp) - 1)
CopyMemory bTmp(0), ByVal sTmp, Len(sTmp)
DebugPrintByte "bTmp", bTmp
ReDim bTmp(LenB(sTmp) - 1)
CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
DebugPrintByte "bTmp", bTmp
End Sub
The last 6 characters of the string are ANSI characters &H83 to &H88.
bTmp:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
However, VB stores strings internally as wide (UTF-16) characters, so each ANSI character is converted to its Unicode code point. For &H83 to &H88, those code points have different numeric values:
bTmp:
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02
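The two dumps above can be reproduced outside VB. This is a minimal sketch in Python rather than VB6 (chosen only so it can be run anywhere), and it assumes the system ANSI code page is Windows-1252, which the output suggests but the post does not state. The helper name `hex_dump` is made up for the illustration.

```python
# Assumption: the ANSI code page is Windows-1252 (Python codec name "cp1252").
ansi_bytes = bytes([0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39,
                    0x83, 0x84, 0x85, 0x86, 0x87, 0x88])

# Decoding maps each ANSI byte to a Unicode code point.
text = ansi_bytes.decode("cp1252")

# VB keeps strings in memory as UTF-16 little-endian (two bytes per character here).
wide_bytes = text.encode("utf-16-le")


def hex_dump(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_dump(ansi_bytes))
print(hex_dump(wide_bytes))
```

The last twelve bytes of the second dump are the UTF-16 forms of U+0192, U+201E, U+2026, U+2020, U+2021 and U+02C6.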
J.A. Coutts
-
Oct 27th, 2022, 10:16 PM
#2
Re: Safer way to convert full ANSI strings
strconv can do the same thing
-
Nov 1st, 2022, 02:33 AM
#3
Re: Safer way to convert full ANSI strings
Originally Posted by xxdoc123
strconv can do the same thing
Code:
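' lstrlen is not declared in post #1; this Declare belongs in the declarations section.
Private Declare Function lstrlen Lib "kernel32" Alias "lstrlenA" (ByVal lpString As String) As Long
Private Sub Command1_Click()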
Dim sTmp As String
Dim bTmp() As Byte
sTmp = "小0123456789"
MsgBox lstrlen(sTmp)
bTmp = StrConv(sTmp, vbFromUnicode)
DebugPrintByte "bTmp2", bTmp
Erase bTmp
ReDim bTmp(lstrlen(sTmp) - 1)
CopyMemory bTmp(0), ByVal sTmp, lstrlen(sTmp)
DebugPrintByte "bTmp", bTmp
ReDim bTmp(LenB(sTmp) - 1)
CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
DebugPrintByte "bTmp", bTmp
End Sub
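The snippet above uses lstrlen instead of Len, presumably because the test string contains a double-byte character. Here is a rough sketch of the three lengths involved, in Python rather than VB6 so it can be run anywhere. It assumes the ANSI code page is 936 (Simplified Chinese), which is a guess and not something the post states.

```python
# Assumption: the ANSI code page is 936 (Python codec name "cp936").
text = "小0123456789"

# Character count, which is what VB's Len() reports.
char_count = len(text)

# ANSI byte count, which is what lstrlenA sees; the first character takes 2 bytes.
ansi_byte_count = len(text.encode("cp936"))

# UTF-16 byte count, which is what VB's LenB() reports.
wide_byte_count = len(text.encode("utf-16-le"))

print(char_count, ansi_byte_count, wide_byte_count)
```

Sizing the buffer with the character count would be one byte short for the ANSI copy under this assumption.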
-
Nov 1st, 2022, 11:56 AM
#4
Re: Safer way to convert full ANSI strings
Originally Posted by xxdoc123
strconv can do the same thing
As long as you use StrConv to convert to and from wide character strings, everything is fine. But if you want to send just the single byte characters (ANSI) to the other end, what you end up with is:
30 31 32 33 34 35 36 37 38 39 92 1E 26 20 21 C6
instead of:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
When you are using encryption, you use the entire ANSI range (0 to 255), not just the ASCII range (0 to 127).
It is for that reason (among others), that I no longer use the StrConv function.
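One way to arrive at the first byte sequence is to keep only the low byte of each wide character. Whether that is what happened in the code that produced it is an assumption. A small sketch in Python (not VB6, chosen only so it can be run anywhere), assuming Windows-1252 as the ANSI code page:

```python
text = "0123456789ƒ„…†‡ˆ"


def hex_dump(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# UTF-16 little-endian puts the low byte of each character first,
# so every second byte starting at index 0 is a low byte.
low_bytes = text.encode("utf-16-le")[0::2]

# A real code-page conversion (assumed Windows-1252) restores the original values.
ansi_bytes = text.encode("cp1252")

print(hex_dump(low_bytes))
print(hex_dump(ansi_bytes))
```

The two results agree for the ASCII digits and differ for all six of the remaining characters.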
J.A. Coutts
-
Nov 1st, 2022, 12:15 PM
#5
Re: Safer way to convert full ANSI strings
I use this code when a byte array has to be a string.
Code:
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" ( _
lpvDest As Any, lpvSource As Any, ByVal cbCopy As Long)
Function ByteArray2String(bIn() As Byte, Bytes As Long) As String
If Bytes Mod 2 = 1 Then
ByteArray2String = StrConv(String$(Bytes, Chr(0)), vbFromUnicode)
Else
ByteArray2String = String$((Bytes + 1) \ 2, Chr(0))
End If
CopyMemory ByVal StrPtr(ByteArray2String), bIn(0), LenB(ByteArray2String)
End Function
Last edited by georgekar; Nov 1st, 2022 at 12:27 PM.
-
Nov 1st, 2022, 07:15 PM
#6
Re: Safer way to convert full ANSI strings
Originally Posted by couttsj
As long as you use StrConv to convert to and from wide character strings, everything is fine. But if you want to send just the single byte characters (ANSI) to the other end, what you end up with is:
30 31 32 33 34 35 36 37 38 39 92 1E 26 20 21 C6
instead of:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
When you are using encryption, you use the entire ANSI range (0 to 255), not just the ASCII range (0 to 127).
It is for that reason (among others), that I no longer use the StrConv function.
J.A. Coutts
Maybe you are right.
-
Nov 5th, 2022, 11:38 AM
#7
Re: Safer way to convert full ANSI strings
Previously I showed how the ANSI string "0123456789ƒ„…†‡ˆ"
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
was stored in memory as:
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02
The question now is how to restore it to a string.
If you use StrConv:
Code:
ANSII String: 0123456789ƒ„…†‡ˆ
StrConv vbFromUnicode:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
StrConv vbUnicode:
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02
0123456789ƒ„…†‡ˆ
If you use UTF-8 conversion (WideCharToMultiByte and MultiByteToWideChar with CP_UTF8):
Code:
ANSII String: 0123456789ƒ„…†‡ˆ
StrToUtf8:
30 31 32 33 34 35 36 37 38 39 C6 92 E2 80 9E E2
80 A6 E2 80 A0 E2 80 A1 CB 86
Utf8ToStr:
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02
0123456789ƒ„…†‡ˆ
Using my own routines:
Code:
ANSII String: 0123456789ƒ„…†‡ˆ
StrToByte:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
ByteToStr:
30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
38 00 39 00 83 00 84 00 85 00 86 00 87 00 88 00
0123456789??????
The answer depends entirely on what you want to accomplish, but it is anything but straightforward.
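The three round trips above can be compared side by side. This sketch is in Python rather than VB6 (only so it can be run anywhere) and assumes the ANSI code page is Windows-1252. It also treats ByteToStr as plain zero-extension of each byte, which is how the routine reads.

```python
ansi = bytes(range(0x30, 0x3A)) + bytes(range(0x83, 0x89))
text = ansi.decode("cp1252")  # the string as VB holds it


def hex_dump(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# 1. Code-page round trip (the StrConv case, assuming Windows-1252).
codepage_bytes = text.encode("cp1252")
via_codepage = codepage_bytes.decode("cp1252")

# 2. UTF-8 round trip (the StrToUtf8 / Utf8ToStr case).
utf8_bytes = text.encode("utf-8")
via_utf8 = utf8_bytes.decode("utf-8")

# 3. Zero-extension (the ByteToStr case): each byte becomes the code point
#    with the same numeric value, which is what the "latin-1" codec does.
via_zero_ext = ansi.decode("latin-1")

print(hex_dump(codepage_bytes), via_codepage == text)
print(hex_dump(utf8_bytes), via_utf8 == text)
print(hex_dump(via_zero_ext.encode("utf-16-le")), via_zero_ext == text)
```

Only the third path returns a different string: &H83 to &H88 become the control characters U+0083 to U+0088, which is why the last dump prints as question marks.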
J.A. Coutts
Code:
Option Explicit
Private Const CP_UTF8 = 65001
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Function ByteToStr(bArray() As Byte) As String
Dim lPntr As Long
Dim bTmp() As Byte
On Error GoTo ByteErr
ReDim bTmp(UBound(bArray) * 2 + 1)
For lPntr = 0 To UBound(bArray)
bTmp(lPntr * 2) = bArray(lPntr)
Next lPntr
Let ByteToStr = bTmp
Exit Function
ByteErr:
ByteToStr = ""
End Function
Private Sub DebugPrintByte(sDescr As String, bArray() As Byte)
Dim lPtr As Long
Debug.Print sDescr & ":"
If GetbSize(bArray) = 0 Then Exit Sub
For lPtr = 0 To UBound(bArray)
Debug.Print Right$("0" & Hex$(bArray(lPtr)), 2) & " ";
If (lPtr + 1) Mod 16 = 0 Then Debug.Print
Next lPtr
Debug.Print
End Sub
Private Function GetbSize(bArray() As Byte) As Long
On Error GoTo GetSizeErr
GetbSize = UBound(bArray) + 1
Exit Function
GetSizeErr:
GetbSize = 0
End Function
Private Function StrToByte(strInput As String) As Byte()
Dim bTmp() As Byte
ReDim bTmp(Len(strInput) - 1)
CopyMemory bTmp(0), ByVal strInput, Len(strInput)
StrToByte = bTmp
End Function
Private Function StrToUtf8(strInput As String) As Byte()
Dim nBytes As Long
Dim abBuffer() As Byte
If Len(strInput) < 1 Then Exit Function
' Get the length in bytes; passing an explicit character count excludes the terminating null
nBytes = WideCharToMultiByte(CP_UTF8, 0&, StrPtr(strInput), Len(strInput), 0&, 0&, 0&, 0&)
If nBytes = 0 Then Exit Function
' The buffer now holds exactly nBytes bytes, so the second call cannot overrun it
ReDim abBuffer(nBytes - 1)
nBytes = WideCharToMultiByte(CP_UTF8, 0&, StrPtr(strInput), Len(strInput), VarPtr(abBuffer(0)), nBytes, 0&, 0&)
StrToUtf8 = abBuffer
End Function
Public Function Utf8ToStr(abUtf8Array() As Byte) As String
Dim nBytes As Long
Dim nChars As Long
Dim strOut As String
' Catch uninitialized input array
nBytes = GetbSize(abUtf8Array)
If nBytes <= 0 Then Exit Function
' Get number of characters in output string
nChars = MultiByteToWideChar(CP_UTF8, 0&, VarPtr(abUtf8Array(0)), nBytes, 0&, 0&)
' Dimension output buffer to receive string
strOut = String(nChars, 0)
nChars = MultiByteToWideChar(CP_UTF8, 0&, VarPtr(abUtf8Array(0)), nBytes, StrPtr(strOut), nChars)
Utf8ToStr = Replace(strOut, Chr$(0), "") 'Remove Null terminating characters
End Function
Private Sub Command1_Click()
Dim sTmp As String
Dim bTmp() As Byte
sTmp = "0123456789ƒ„…†‡ˆ"
Debug.Print "ANSII String: "; sTmp
ReDim bTmp(Len(sTmp) - 1)
CopyMemory bTmp(0), ByVal sTmp, Len(sTmp)
DebugPrintByte "CopyMemory ByVal", bTmp
ReDim bTmp(LenB(sTmp) - 1)
CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
DebugPrintByte "ByVal StrPtr", bTmp
End Sub
Private Sub Command2_Click()
Dim sTmp As String
Dim sTemp As String
Dim bTmp() As Byte
Dim bTemp() As Byte
sTmp = "0123456789ƒ„…†‡ˆ"
Debug.Print "ANSII String: "; sTmp
bTmp = StrConv(sTmp, vbFromUnicode)
DebugPrintByte "StrConv vbFromUnicode", bTmp
sTemp = StrConv(bTmp, vbUnicode)
Debug.Print sTemp
ReDim bTemp(LenB(sTemp) - 1)
CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
DebugPrintByte "StrConv vbUnicode", bTemp
End Sub
Private Sub Command3_Click()
Dim sTmp As String
Dim sTemp As String
Dim bTmp() As Byte
Dim bTemp() As Byte
sTmp = "0123456789ƒ„…†‡ˆ"
Debug.Print "ANSII String: "; sTmp
bTmp = StrToUtf8(sTmp)
DebugPrintByte "StrToUtf8", bTmp
sTemp = Utf8ToStr(bTmp)
Debug.Print sTemp
ReDim bTemp(LenB(sTemp) - 1)
CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
DebugPrintByte "Utf8ToStr", bTemp
End Sub
Private Sub Command4_Click()
Dim sTmp As String
Dim sTemp As String
Dim bTmp() As Byte
Dim bTemp() As Byte
sTmp = "0123456789ƒ„…†‡ˆ"
Debug.Print "ANSII String: "; sTmp
bTmp = StrToByte(sTmp)
DebugPrintByte "StrToByte", bTmp
sTemp = ByteToStr(bTmp)
Debug.Print sTemp
ReDim bTemp(LenB(sTemp) - 1)
CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
DebugPrintByte "ByteToStr", bTemp
End Sub
-
Nov 5th, 2022, 12:58 PM
#8
Re: Safer way to convert full ANSI strings
I still fail to understand how StrToByte (i.e. CopyMemory ByVal string) is any different than StrConv vbFromUnicode.
Btw, UTF-8 encoded strings do not use wide chars. Only a subset of UTF-16 code points can be called wide chars (the ones which take up 2 bytes, so-called UCS-2).
Btw, your ByteToStr is completely wrong for any codepage other than 1252, i.e. the one you are (obviously) testing this under. For anyone else it yields unwieldy results.
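To illustrate the codepage point: the same six bytes mean different characters under different ANSI codepages, and zero-extension matches neither. This is a sketch in Python rather than VB6 (only so it can be run anywhere). Codepage 1251 (Cyrillic) is picked here purely as an example of a second codepage.

```python
raw = bytes([0x83, 0x84, 0x85, 0x86, 0x87, 0x88])

# The same bytes decoded under two different ANSI codepages.
as_1252 = raw.decode("cp1252")  # Western European
as_1251 = raw.decode("cp1251")  # Cyrillic, chosen only as an example

# Zero-extension ignores the codepage entirely (the "latin-1" codec does the same).
zero_extended = raw.decode("latin-1")

for label, s in (("cp1252", as_1252), ("cp1251", as_1251), ("zero-ext", zero_extended)):
    print(label, " ".join(f"U+{ord(c):04X}" for c in s))
```

Byte &H83 is U+0192 under 1252 and U+0453 under 1251, while zero-extension gives U+0083 in both cases.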
cheers,
</wqw>
-
Nov 5th, 2022, 12:58 PM
#9
Re: Safer way to convert full ANSI strings
I'm trying to figure out what you are trying to do, but if it's just to stuff a byte array into a string, and then get the same bytes back out of the string, then direct assignment string = bytes() and back bytes() = string will do that. When you need the bytes as Unicode (to put into a TextBox.Text property, for example), then using StrConv(bytestring, vbUnicode) will do the trick.
Code:
Option Explicit
Sub Test()
Dim s As String
Dim b() As Byte
Dim ii As Long
s = GetBytes ' Stuff bytes in a string unmolested
b = s ' Stuff string bytes back into a byte array unmolested
Debug.Print
Debug.Print "Bytes after String<->ByteArray roundtrip: "
For ii = LBound(b) To UBound(b)
Debug.Print Hex$(b(ii)) & " ";
Next ii
Debug.Print
Debug.Print
Debug.Print "StrConv ""ANSI bytes"" String to Unicode: " & StrConv(s, vbUnicode)
Debug.Print "StrConv ANSI Byte Array to Unicode: " & StrConv(b, vbUnicode)
End Sub
Function GetBytes() As Byte()
Const BytesHex = "30313233343536373839838485868788"
Dim ii As Long
Dim la_Bytes() As Byte
ReDim la_Bytes(Len(BytesHex) \ 2 - 1)
' Convert Hex "bytes" to actual bytes and stuff them into array
Debug.Print "Source Bytes:"
For ii = LBound(la_Bytes) To UBound(la_Bytes)
la_Bytes(ii) = CLng("&H" & Mid$(BytesHex, ii * 2 + 1, 2))
Debug.Print Hex$(la_Bytes(ii)) & " ";
Next ii
Debug.Print
' Return "ANSI" byte array
GetBytes = la_Bytes
End Function
Results:
Code:
Source Bytes:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
Bytes after String<->ByteArray roundtrip:
30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
StrConv "ANSI bytes" String to Unicode: 0123456789ƒ„…†‡ˆ
StrConv ANSI Byte Array to Unicode: 0123456789ƒ„…†‡ˆ
Or have I missed something?
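For readers without VB at hand, the round trip above can be modelled roughly as follows. This is a Python sketch, not VB6 (chosen only so it can be run anywhere). It assumes an even number of bytes and Windows-1252 as the ANSI code page. Treating `s = b` as "pack each pair of bytes into one UTF-16 code unit" is a model of the behaviour, not VB's actual implementation.

```python
payload = bytes([0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39,
                 0x83, 0x84, 0x85, 0x86, 0x87, 0x88])

# Model of `s = b`: the bytes are copied verbatim, so every two bytes form
# one UTF-16 code unit. "surrogatepass" lets any 16-bit value through.
packed = payload.decode("utf-16-le", errors="surrogatepass")

# Model of `b = s`: the same bytes come back out unchanged.
restored = packed.encode("utf-16-le", errors="surrogatepass")

# Model of StrConv(b, vbUnicode): interpret the raw bytes as ANSI text.
readable = restored.decode("cp1252")

print(len(packed), restored == payload, readable)
```

The packed string is only 8 characters long and is not readable text, but nothing is lost; the conversion to readable characters happens only in the last step.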
-
Nov 5th, 2022, 02:57 PM
#10
Re: Safer way to convert full ANSI strings
Originally Posted by jpbro
I'm trying to figure out what you are trying to do, but if it's just to stuff a byte array into a string, and then get the same bytes back out of the string, then direct assignment string = bytes() and back bytes() = string will do that. When you need the bytes as Unicode (to put into a TextBox.Text property, for example), then using StrConv(bytestring, vbUnicode) will do the trick.
Or have I missed something?
What I am trying to get across is that anyone who thinks VB wide character strings store all ANSI characters unchanged is mistaken. They can only be counted on to retain ASCII characters. Characters above &H7F are not necessarily stored with their ANSI values.
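Which byte values are affected can be listed directly. This is a sketch in Python rather than VB6 (only so it can be run anywhere), assuming Windows-1252. Python's cp1252 codec rejects five byte values that the code page leaves unassigned, and Windows itself may treat those differently, so they are reported separately here.

```python
changed = []    # bytes whose Unicode code point differs from the byte value
undefined = []  # bytes that Python's cp1252 codec does not define

for value in range(0x00, 0x100):
    try:
        ch = bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        undefined.append(value)
        continue
    if ord(ch) != value:
        changed.append(value)

print("changed:  ", " ".join(f"{v:02X}" for v in changed))
print("undefined:", " ".join(f"{v:02X}" for v in undefined))
```

Under this assumption only the range &H80 to &H9F is affected. Values &H00 to &H7F and &HA0 to &HFF keep the same numeric value when widened.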
J.A. Coutts