
Thread: Safer way to convert full ANSI strings

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Dec 2012
    Posts
    1,477

    Safer way to convert full ANSI strings

    Thanks to wqweto, I found a faster and safer way to convert full ANSI strings to a byte array.
    Code:
    Option Explicit
    
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    
    Private Sub DebugPrintByte(sDescr As String, bArray() As Byte)
        Dim lPtr As Long
        Debug.Print sDescr & ":"
        If GetbSize(bArray) = 0 Then Exit Sub
        For lPtr = 0 To UBound(bArray)
            Debug.Print Right$("0" & Hex$(bArray(lPtr)), 2) & " ";
            If (lPtr + 1) Mod 16 = 0 Then Debug.Print
        Next lPtr
        Debug.Print
    End Sub
    
    Private Function GetbSize(bArray() As Byte) As Long
        On Error GoTo GetSizeErr
        GetbSize = UBound(bArray) + 1
        Exit Function
    GetSizeErr:
        GetbSize = 0
    End Function
    
    Private Sub Command1_Click()
        Dim sTmp As String
        Dim bTmp() As Byte
        sTmp = "0123456789ƒ„…†‡ˆ"
        ReDim bTmp(Len(sTmp) - 1)
        CopyMemory bTmp(0), ByVal sTmp, Len(sTmp)
        DebugPrintByte "bTmp", bTmp
        ReDim bTmp(LenB(sTmp) - 1)
        CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
        DebugPrintByte "bTmp", bTmp
    End Sub
    The last 6 characters of the string are ANSI characters &H83 to &H88. Passing the string ByVal to CopyMemory copies the ANSI version:
    bTmp:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88

    However, VB stores strings internally as wide characters, so the same six characters are held as their Unicode equivalents. Reading the string's own memory through StrPtr shows this:
    bTmp:
    30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
    38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02
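
    As a minimal sketch of that conversion, the routine below prints the wide character value VB produces for each ANSI byte in a range. The name ShowAnsiToUnicode is hypothetical, and the result assumes a single-byte system ANSI code page (1252 for the dumps above).

    ```vb
    ' Hypothetical helper: prints the UTF-16 code unit VB stores for each ANSI byte value.
    Private Sub ShowAnsiToUnicode(ByVal lFirst As Long, ByVal lLast As Long)
        Dim lByte As Long
        Dim lUnit As Long
        For lByte = lFirst To lLast
            ' Chr$ maps an ANSI code to a character through the system ANSI code page.
            ' AscW returns the stored UTF-16 code unit; the mask keeps it non-negative.
            lUnit = AscW(Chr$(lByte)) And &HFFFF&
            Debug.Print Right$("0" & Hex$(lByte), 2) & " -> U+" & Right$("000" & Hex$(lUnit), 4)
        Next lByte
    End Sub
    ```

    Calling ShowAnsiToUnicode &H83, &H88 on a code page 1252 system should list the same pairs as the second dump, for example 83 -> U+0192.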

    J.A. Coutts

  2. #2
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    679

    Re: Safer way to convert full ANSI strings

    strconv can do the same thing

  3. #3
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    679

    Re: Safer way to convert full ANSI strings

    Quote Originally Posted by xxdoc123 View Post
    strconv can do the same thing
    Code:
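    ' lstrlen is not declared in the earlier posts; the ANSI version is assumed here.
    ' It counts the bytes of the ANSI copy VB passes in, which can exceed Len() for DBCS text.
    ' Place the Declare at the top of the module, before any procedures.
    Private Declare Function lstrlen Lib "kernel32" Alias "lstrlenA" (ByVal lpString As String) As Long
    
    Private Sub Command1_Click()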
        Dim sTmp   As String
        Dim bTmp() As Byte
        
        sTmp = "小0123456789"
        MsgBox lstrlen(sTmp)
        bTmp = StrConv(sTmp, vbFromUnicode)
        DebugPrintByte "bTmp2", bTmp
        Erase bTmp
        ReDim bTmp(lstrlen(sTmp) - 1)
        CopyMemory bTmp(0), ByVal sTmp, lstrlen(sTmp)
        DebugPrintByte "bTmp", bTmp
        ReDim bTmp(LenB(sTmp) - 1)
        CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
        DebugPrintByte "bTmp", bTmp
    End Sub

  4. #4

    Thread Starter
    Frenzied Member
    Join Date
    Dec 2012
    Posts
    1,477

    Re: Safer way to convert full ANSI strings

    Quote Originally Posted by xxdoc123 View Post
    strconv can do the same thing
    As long as you use StrConv to convert to and from wide character strings, everything is fine. But if you want to send just the single byte characters (ANSI) to the other end, what you end up with is:
    30 31 32 33 34 35 36 37 38 39 92 1E 26 20 21 C6
    instead of:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
    When you are using encryption, you use the entire ANSI range (0 to 255), not just the ASCII range (0 to 127).

    It is for that reason (among others), that I no longer use the StrConv function.
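
    One way to check how StrConv treats the full 0 to 255 range on a particular machine is to round-trip every byte value. This is only a sketch: the function name CountStrConvMismatches is hypothetical, and the count it returns depends on the system ANSI code page.

    ```vb
    ' Hypothetical check: converts bytes 0..255 to a wide string and back with StrConv,
    ' then counts how many byte values changed. Returns -1 if the length changed.
    Private Function CountStrConvMismatches() As Long
        Dim bIn() As Byte
        Dim bOut() As Byte
        Dim sWide As String
        Dim lCount As Long
        Dim i As Long

        ReDim bIn(0 To 255)
        For i = 0 To 255
            bIn(i) = i
        Next i

        sWide = StrConv(bIn, vbUnicode)        ' ANSI bytes -> wide characters
        bOut = StrConv(sWide, vbFromUnicode)   ' wide characters -> ANSI bytes

        If UBound(bOut) <> UBound(bIn) Then
            CountStrConvMismatches = -1
            Exit Function
        End If

        For i = 0 To 255
            If bOut(i) <> bIn(i) Then lCount = lCount + 1
        Next i
        CountStrConvMismatches = lCount
    End Function
    ```

    A result of 0 means every byte value survived the round trip on that system; anything else means some values were altered.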

    J.A. Coutts

  5. #5
    Frenzied Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    1,289

    Re: Safer way to convert full ANSI strings

    I use this code when a byte array has to be a string.

    Code:
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" ( _
        lpvDest As Any, lpvSource As Any, ByVal cbCopy As Long)
    
    
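    ' Copies the first Bytes bytes of bIn into a String without any code page conversion.
    Function ByteArray2String(bIn() As Byte, Bytes As Long) As String
        If Bytes Mod 2 = 1 Then
            ' Odd count: StrConv yields a string whose LenB is exactly Bytes
            ByteArray2String = StrConv(String$(Bytes, Chr(0)), vbFromUnicode)
        Else
            ' Even count: Bytes \ 2 wide characters occupy exactly Bytes bytes
            ByteArray2String = String$((Bytes + 1) \ 2, Chr(0))
        End If
        CopyMemory ByVal StrPtr(ByteArray2String), bIn(0), LenB(ByteArray2String)
    End Function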
    Last edited by georgekar; Nov 1st, 2022 at 12:27 PM.

  6. #6
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    679

    Re: Safer way to convert full ANSI strings

    Quote Originally Posted by couttsj View Post
    As long as you use StrConv to convert to and from wide character strings, everything is fine. But if you want to send just the single byte characters (ANSI) to the other end, what you end up with is:
    30 31 32 33 34 35 36 37 38 39 92 1E 26 20 21 C6
    instead of:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
    When you are using encryption, you use the entire ANSI range (0 to 255), not just the ASCII range (0 to 127).

    It is for that reason (among others), that I no longer use the StrConv function.

    J.A. Coutts
    Maybe you are right.

  7. #7

    Thread Starter
    Frenzied Member
    Join Date
    Dec 2012
    Posts
    1,477

    Re: Safer way to convert full ANSI strings

    Previously I showed how the ANSI string "0123456789ƒ„…†‡ˆ"
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88
    was stored in memory as:
    30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00
    38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02

    The question now is how to restore it to a string.
    If you use StrConv:
    Code:
    ANSII String: 0123456789ƒ„…†‡ˆ
    StrConv vbFromUnicode:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88 
    
    StrConv vbUnicode:
    30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00 
    38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02 
    0123456789ƒ„…†‡ˆ
    If you use wide character conversion:
    Code:
    ANSII String: 0123456789ƒ„…†‡ˆ
    StrToUtf8:
    30 31 32 33 34 35 36 37 38 39 C6 92 E2 80 9E E2 
    80 A6 E2 80 A0 E2 80 A1 CB 86 
    
    Utf8ToStr:
    30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00 
    38 00 39 00 92 01 1E 20 26 20 20 20 21 20 C6 02 
    0123456789ƒ„…†‡ˆ
    Using my own routines:
    Code:
    ANSII String: 0123456789ƒ„…†‡ˆ
    StrToByte:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88 
    
    ByteToStr:
    30 00 31 00 32 00 33 00 34 00 35 00 36 00 37 00 
    38 00 39 00 83 00 84 00 85 00 86 00 87 00 88 00 
    0123456789??????
    The answer depends entirely on what you want to accomplish, but it is anything but straightforward.

    J.A. Coutts
    Code:
    Option Explicit
    
    Private Const CP_UTF8 = 65001
    
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    
    Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
    Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    
    Private Function ByteToStr(bArray() As Byte) As String
        Dim lPntr As Long
        Dim bTmp() As Byte
        On Error GoTo ByteErr
        ReDim bTmp(UBound(bArray) * 2 + 1)
        For lPntr = 0 To UBound(bArray)
            bTmp(lPntr * 2) = bArray(lPntr)
        Next lPntr
        Let ByteToStr = bTmp
        Exit Function
    ByteErr:
        ByteToStr = ""
    End Function
    
    Private Sub DebugPrintByte(sDescr As String, bArray() As Byte)
        Dim lPtr As Long
        Debug.Print sDescr & ":"
        If GetbSize(bArray) = 0 Then Exit Sub
        For lPtr = 0 To UBound(bArray)
            Debug.Print Right$("0" & Hex$(bArray(lPtr)), 2) & " ";
            If (lPtr + 1) Mod 16 = 0 Then Debug.Print
        Next lPtr
        Debug.Print
    End Sub
    
    Private Function GetbSize(bArray() As Byte) As Long
        On Error GoTo GetSizeErr
        GetbSize = UBound(bArray) + 1
        Exit Function
    GetSizeErr:
        GetbSize = 0
    End Function
    
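    Private Function StrToByte(strInput As String) As Byte()
        Dim bTmp() As Byte
        ' Guard: ReDim bTmp(-1) would raise an error for an empty string
        If Len(strInput) = 0 Then Exit Function
        ReDim bTmp(Len(strInput) - 1)
        ' Passing the string ByVal makes VB hand the API an ANSI copy of it
        CopyMemory bTmp(0), ByVal strInput, Len(strInput)
        StrToByte = bTmp
    End Function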
    
    Private Function StrToUtf8(strInput As String) As Byte()
        Dim nBytes As Long
        Dim abBuffer() As Byte
        If Len(strInput) < 1 Then Exit Function
        ' Get length in bytes *including* terminating null
        nBytes = WideCharToMultiByte(CP_UTF8, 0&, ByVal StrPtr(strInput), -1, 0&, 0&, 0&, 0&)
        ' We don't want the terminating null in our byte array, so ask for `nBytes-1` bytes
        ReDim abBuffer(nBytes - 2)  ' NB ReDim with one less byte than you need
        nBytes = WideCharToMultiByte(CP_UTF8, 0&, ByVal StrPtr(strInput), -1, ByVal VarPtr(abBuffer(0)), nBytes - 1, 0&, 0&)
        StrToUtf8 = abBuffer
    End Function
    
    Public Function Utf8ToStr(abUtf8Array() As Byte) As String
        Dim nBytes As Long
        Dim nChars As Long
        Dim strOut As String
        ' Catch uninitialized input array
        nBytes = GetbSize(abUtf8Array)
        If nBytes <= 0 Then Exit Function
        ' Get number of characters in output string
        nChars = MultiByteToWideChar(CP_UTF8, 0&, VarPtr(abUtf8Array(0)), nBytes, 0&, 0&)
        ' Dimension output buffer to receive string
        strOut = String(nChars, 0)
        nChars = MultiByteToWideChar(CP_UTF8, 0&, VarPtr(abUtf8Array(0)), nBytes, StrPtr(strOut), nChars)
        Utf8ToStr = Replace(strOut, Chr$(0), "") 'Remove Null terminating characters
    End Function
    
    Private Sub Command1_Click()
        Dim sTmp As String
        Dim bTmp() As Byte
        sTmp = "0123456789ƒ„…†‡ˆ"
        Debug.Print "ANSII String: "; sTmp
        ReDim bTmp(Len(sTmp) - 1)
        CopyMemory bTmp(0), ByVal sTmp, Len(sTmp)
        DebugPrintByte "CopyMemory ByVal", bTmp
        ReDim bTmp(LenB(sTmp) - 1)
        CopyMemory bTmp(0), ByVal StrPtr(sTmp), LenB(sTmp)
        DebugPrintByte "ByVal StrPtr", bTmp
    End Sub
    
    Private Sub Command2_Click()
        Dim sTmp As String
        Dim sTemp As String
        Dim bTmp() As Byte
        Dim bTemp() As Byte
        sTmp = "0123456789ƒ„…†‡ˆ"
        Debug.Print "ANSII String: "; sTmp
        bTmp = StrConv(sTmp, vbFromUnicode)
        DebugPrintByte "StrConv vbFromUnicode", bTmp
        sTemp = StrConv(bTmp, vbUnicode)
        Debug.Print sTemp
        ReDim bTemp(LenB(sTemp) - 1)
        CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
        DebugPrintByte "StrConv vbUnicode", bTemp
    End Sub
    
    Private Sub Command3_Click()
        Dim sTmp As String
        Dim sTemp As String
        Dim bTmp() As Byte
        Dim bTemp() As Byte
        sTmp = "0123456789ƒ„…†‡ˆ"
        Debug.Print "ANSII String: "; sTmp
        bTmp = StrToUtf8(sTmp)
        DebugPrintByte "StrToUtf8", bTmp
        sTemp = Utf8ToStr(bTmp)
        Debug.Print sTemp
        ReDim bTemp(LenB(sTemp) - 1)
        CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
        DebugPrintByte "Utf8ToStr", bTemp
    End Sub
    
    Private Sub Command4_Click()
        Dim sTmp As String
        Dim sTemp As String
        Dim bTmp() As Byte
        Dim bTemp() As Byte
        sTmp = "0123456789ƒ„…†‡ˆ"
        Debug.Print "ANSII String: "; sTmp
        bTmp = StrToByte(sTmp)
        DebugPrintByte "StrToByte", bTmp
        sTemp = ByteToStr(bTmp)
        Debug.Print sTemp
        ReDim bTemp(LenB(sTemp) - 1)
        CopyMemory bTemp(0), ByVal StrPtr(sTemp), LenB(sTemp)
        DebugPrintByte "ByteToStr", bTemp
    End Sub

  8. #8
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    5,163

    Re: Safer way to convert full ANSI strings

    I still fail to understand how StrToByte (i.e. CopyMemory ByVal string) is any different than StrConv vbFromUnicode.

    Btw, UTF-8 encoded strings do not use wide chars. Only a subset of UTF-16 codepoints can be called wide chars (the ones which take up 2 bytes, so-called UCS-2).

    Btw, your ByteToStr is completely wrong for any codepage other than 1252, i.e. the one you (obviously) test this under. For anyone else it yields unwieldy results.
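
    A code-page-independent alternative can be sketched by naming the code page explicitly instead of relying on the system default. The function name BytesToStrCP and its lCodePage parameter are hypothetical; the Declare is the same MultiByteToWideChar declaration used earlier in the thread.

    ```vb
    Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

    ' Hypothetical helper: decodes bArray as text in the given code page (e.g. 1252).
    ' Returns an empty string for an uninitialized array or a failed conversion.
    Private Function BytesToStrCP(bArray() As Byte, ByVal lCodePage As Long) As String
        Dim nBytes As Long
        Dim nChars As Long
        Dim lFirst As Long
        Dim sOut As String

        On Error GoTo EmptyArray            ' UBound raises an error on an uninitialized array
        lFirst = LBound(bArray)
        nBytes = UBound(bArray) - lFirst + 1
        On Error GoTo 0
        If nBytes <= 0 Then Exit Function

        ' First call: ask how many wide characters the output needs
        nChars = MultiByteToWideChar(lCodePage, 0&, VarPtr(bArray(lFirst)), nBytes, 0&, 0&)
        If nChars = 0 Then Exit Function

        ' Second call: convert into a pre-sized string buffer
        sOut = String$(nChars, vbNullChar)
        MultiByteToWideChar lCodePage, 0&, VarPtr(bArray(lFirst)), nBytes, StrPtr(sOut), nChars
        BytesToStrCP = sOut
        Exit Function
    EmptyArray:
    End Function
    ```

    With lCodePage set to 1252, byte &H83 should come back as ChrW$(&H192) regardless of the system default code page.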

    cheers,
    </wqw>

  9. #9
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,452

    Re: Safer way to convert full ANSI strings

    I'm trying to figure out what you are trying to do, but if it's just to stuff a byte array into a string, and then get the same bytes back out of the string, then direct assignment string = bytes() and back bytes() = string will do that. When you need the bytes as Unicode (into a TextBox.Text property, for example), then using StrConv(bytestring, vbUnicode) will do the trick.

    Code:
    Option Explicit
    
    Sub Test()
       Dim s As String
       Dim b() As Byte
       Dim ii As Long
       
       s = GetBytes   ' Stuff bytes in a string unmolested
       b = s ' Stuff string bytes back into a byte array unmolested
       
       Debug.Print
       Debug.Print "Bytes after String<->ByteArray roundtrip: "
       For ii = LBound(b) To UBound(b)
          Debug.Print Hex$(b(ii)) & " ";
       Next ii
       Debug.Print
       
       Debug.Print
       Debug.Print "StrConv ""ANSI bytes"" String to Unicode: " & StrConv(s, vbUnicode)
       Debug.Print "StrConv ANSI Byte Array to Unicode: " & StrConv(b, vbUnicode)
    End Sub
    
    Function GetBytes() As Byte()
       Const BytesHex = "30313233343536373839838485868788"
       
       Dim ii As Long
       Dim la_Bytes() As Byte
       
       ReDim la_Bytes(Len(BytesHex) \ 2 - 1)
       
       ' Convert Hex "bytes" to actual bytes and stuff them into array
       Debug.Print "Source Bytes:"
       For ii = LBound(la_Bytes) To UBound(la_Bytes)
          la_Bytes(ii) = CLng("&H" & Mid$(BytesHex, ii * 2 + 1, 2))
          Debug.Print Hex$(la_Bytes(ii)) & " ";
       Next ii
       Debug.Print
       
       ' Return "ANSI" byte array
       GetBytes = la_Bytes
    End Function
    Results:

    Code:
    Source Bytes:
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88 
    
    Bytes after String<->ByteArray roundtrip: 
    30 31 32 33 34 35 36 37 38 39 83 84 85 86 87 88 
    
    StrConv "ANSI bytes" String to Unicode: 0123456789ƒ„…†‡ˆ
    StrConv ANSI Byte Array to Unicode: 0123456789ƒ„…†‡ˆ
    Or have I missed something?

  10. #10

    Thread Starter
    Frenzied Member
    Join Date
    Dec 2012
    Posts
    1,477

    Re: Safer way to convert full ANSI strings

    Quote Originally Posted by jpbro View Post
    I'm trying to figure out what you are trying to do, but if it's just to stuff a byte array into a string, and then get the same bytes back out of the string, then direct assignment string = bytes() and back bytes() = string will do that. When you need the bytes as Unicode (into a TextBox.Text property, for example), then using StrConv(bytestring, vbUnicode) will do the trick.

    Or have I missed something?
    What I am trying to get across is that VB wide character strings do not handle all ANSI characters unchanged, contrary to what some may assume. They can only be counted on to retain ASCII characters. Characters above &H7F are not necessarily stored with their ANSI byte values.

    J.A. Coutts
