
Thread: [RESOLVED] Replacement for StrConv(vbFromUnicode)

  1. #1

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Resolved [RESOLVED] Replacement for StrConv(vbFromUnicode)

    I'm looking for a replacement for the StrConv function with vbFromUnicode, because the function causes problems when the app runs on a Windows system with unicode languages like Chinese:

    Code:
    StrConv(sText, vbFromUnicode)
    I need to convert a string like

    [Attached image: Screenshot - 10.08.2021 , 09_47_08.png]

    to this format:

    [Attached image: Screenshot - 10.08.2021 , 09_47_18.png]

    Maybe someone already owns a handy function to convert such strings?

  2. #2
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Replacement for StrConv(vbFromUnicode)

    I'm not sure what that is.

    If those are DBCS characters playing "let's pretend this is ANSI" then you might consider additional StrConv() conversion options such as vbNarrow and then a second call using vbWide. Those don't mean anything unless you have a Far East locale set though, so most people have no experience with them.

    But it isn't clear how you created that String literal. If I had to guess I'd think you pasted something from the clipboard manually and it got mangled in transition.

  3. #3

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by dilettante View Post
    I'm not sure what that is.
    The original text:
    [Attached image: Screenshot - 10.08.2021 , 09_47_08.png]
    comes from:
    Code:
    Call SendMessageW(hEditBox, WM_GETTEXT, lLen, ByVal sText)
    and the converted text:
    [Attached image: Screenshot - 10.08.2021 , 09_47_18.png]
    is from:
    Code:
    sEditText = StrConv(sText, vbFromUnicode)
    The pictures of the text were taken from Debug.Print output in the IDE Immediate window.

    The text displayed in the GUI looks like this:
    [Attached image: Screenshot - 10.08.2021 , 10_37_19.png]

  4. #4
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Replacement for StrConv(vbFromUnicode)

    It looks like sText is As String, so that SendMessage() call is probably doing several rounds of conversion to and from ANSI.

    Most important here, when SendMessageA() gets a return from SendMessageW() (which it calls) it translates the "sText" contents from Unicode to ANSI, and then when VB6 gets the return from SendMessageA() it translates this translated "sText" back to Unicode from ANSI and stores it into your sText.

    Even if your SendMessage() is an alias for SendMessageW() passing sText As String bumbles through the second half of the output translation by VB6. There is also input translation on String parameters to Declare Function calls, but we don't care about that here.

    If you are using explicit calls to StrConv() on top of all of that things can get pretty mangled up. It sounds like you were trying to un-break a broken egg.


    I think you are FAR over your head here. API calls are not for casual VB users. There is a reason why after the gamma-test version of VB6 (marketed as "Visual Basic 5.0") Microsoft dropped all but Professional and Enterprise SKUs.


    If we can assume that hEditBox is the window handle of a Unicode Edit control, then you'd want to Declare Function SendMessage() as an alias for SendMessageW() passing ByVal lParam As Long (a pointer).

    Then when calling SendMessage() you must first allocate sText to enough characters to hold the longest result expected. You could use WM_GETTEXTLENGTH for that. Once you have allocated the text buffer in sText, you call SendMessage() passing StrPtr(sText) and NOT sText itself.
    Last edited by dilettante; Aug 9th, 2021 at 11:17 PM.

  5. #5
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Replacement for StrConv(vbFromUnicode)

    BTW:

    After the return from your SendMessage() call you will have to trim the excess buffer if any. You also need to be careful about managing the terminating NUL.

    If you used WM_GETTEXTLENGTH and made the buffer exactly that many characters, the NUL can be the out-of-band NUL character after the end of the text of any proper String (BSTR).

    That is safe for Unicode but not for ANSI calls, because WM_GETTEXTLENGTH cannot return a precise result and might be too large due to possible DBCS characters. See the documentation for WM_GETTEXTLENGTH.
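
    Putting the steps from the last two posts together, a minimal sketch could look like the following. The helper name GetEditTextW is made up for illustration, and the code assumes hWnd belongs to a Unicode window:

```vb
Option Explicit

' lParam is declared As Long so VB6 passes the pointer through untouched.
Private Declare Function SendMessageW Lib "user32" ( _
    ByVal hWnd As Long, ByVal wMsg As Long, _
    ByVal wParam As Long, ByVal lParam As Long) As Long

Private Const WM_GETTEXT As Long = &HD
Private Const WM_GETTEXTLENGTH As Long = &HE

' Hypothetical helper: reads the text of a Unicode window without an ANSI round trip.
Public Function GetEditTextW(ByVal hWnd As Long) As String
    Dim lLen As Long
    Dim lCopied As Long
    Dim sBuf As String

    ' Length in characters, not counting the terminating NUL.
    lLen = SendMessageW(hWnd, WM_GETTEXTLENGTH, 0, 0)
    If lLen <= 0 Then Exit Function

    ' A BSTR always carries a NUL after its last character, so a String of
    ' lLen characters has room for lLen characters plus the terminator.
    sBuf = String$(lLen, vbNullChar)

    ' wParam is the buffer size in characters, including the NUL.
    lCopied = SendMessageW(hWnd, WM_GETTEXT, lLen + 1, StrPtr(sBuf))

    ' Trim in case fewer characters were copied than announced.
    GetEditTextW = Left$(sBuf, lCopied)
End Function
```

    Using the return value of WM_GETTEXT (the number of characters copied) avoids searching the buffer for the NUL afterwards.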

  6. #6

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: Replacement for StrConv(vbFromUnicode)

    Thanks for pointing me in the right direction!

    I changed the code and now it works fine on a Windows system set to Chinese:

    Code:
    Private Declare Function SendMessageW2 Lib "user32" Alias "SendMessageW" (ByVal hWnd As Long, _
                                                       ByVal wMsg As Long, _
                                                       ByVal wParam As Long, _
                                                       ByVal lParam As Long) As Long
    
    Call SendMessageW2(hEditBox, WM_GETTEXT, lLen, StrPtr(sText))
    sEditText = Left$(sText, InStr(sText, Chr$(0)) - 1)

  7. #7
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Great! I am happy to have been of assistance to you.

  8. #8
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    597

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    StrConv Replacement? LCMapStringEx API is the core.

  9. #9

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by DaveDavis View Post
    StrConv Replacement? LCMapStringEx API is the core.
    I guess StrConv internally uses the LCMapStringEx API to convert strings, right?

    My problem is that StrConv and LCMapStringEx use a locale to convert the text, and this screws up the text on a Windows system where the option for non-Unicode programs is set to a unicode language like Chinese, Russian, etc.

    Example:

    I read the whole content of a small file into a byte array buffer and convert the buffer to a string:

    Code:
    hFile = CreateFileWide(sFile, GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0)
    Call ReadFile(hFile, bBuffer(0), lBufferLen, lBytesRead, 0)
    sText = StrConv(bBuffer, vbUnicode)
    This code works fine on any Windows system if the Windows option for non-Unicode programs is set to a non-unicode language like English:

    [Attached image: windows-language-for-non-unicode-programs-eng.jpg]

    If the Windows option for non-Unicode programs is set to a unicode language like Chinese, some characters in the text are converted wrongly:

    [Attached image: windows-language-for-non-unicode-programs-chinese.png]

    That's the reason I need a function to convert the buffer to a VB string without using a locale ID.
    I just want to convert the raw data from the buffer without any language-specific conversion.

    I guess this must be possible somehow. Maybe by stepping through the whole buffer two bytes at a time and converting each pair of bytes into one character of a VB string...

    I checked the encoding of the files with Notepad: some are encoded as ANSI and some as UTF-8.
    Last edited by Mith; Aug 10th, 2021 at 08:38 PM.

  10. #10
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Man you really are in love with StrConv aren't you lol....

    If you're dealing in pure Unicode, StrConv shouldn't even be part of the conversation. Your relationship with Unicode would require a divorce from StrConv. Hope you have a prenup
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  11. #11

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Niya View Post
    Man you really are in love with StrConv aren't you lol
    This function gave me a lot of headaches, not to mention the extra work because of the incompatibility with unicode languages!
    Hey, now my code is nearly free of StrConv!
    The last one is its use with the ReadFile API... any help would be nice

    How do you read the whole content of a file (with unicode characters in the name) into a string without using StrConv?

  12. #12
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Mith View Post
    This function gave me a lot of headaches, not to mention the extra work because of the incompatibility with unicode languages!
    lol now I know why dilettante said this:-
    Quote Originally Posted by dilettante View Post
    I think you are FAR over your head here. API calls are not for casual VB users.
    However, I don't think you're over your head. You have shown enough understanding to be able to handle APIs. I just think you don't understand strings.

    There is no such thing as a "Unicode language". ANSI, DBCS, ASCII, UTF-16 etc are all different ways of encoding text. Now, the older formats like ANSI need to be paired with a code page which would be different depending on locale settings. If you have a piece of ANSI text in a buffer, how that text gets displayed will depend on the code page. You use the wrong code page, it comes out wrong. Code pages are a non issue when you're writing code in and for a single language. You just use a single code page and everything works. But when you start to mess with locale settings, all hell can break loose which is what is happening with you.

    Unicode was invented to solve just that problem. Unicode, as the name implies is a universal code for representing text. When a piece of text is in a Unicode format like UTF-16, it doesn't need something like a code page to tell your computer what language it is in. A Unicode document can contain multiple languages at the same time, something you cannot do with ANSI without a lot of extra work to switch up between multiple code pages so they all display correctly.
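
    To make the code page point concrete, here is a small sketch (not from any poster in this thread; DecodeWithCodePage and DemoCodePages are hypothetical names) that decodes the same two bytes with two different code pages through MultiByteToWideChar. It assumes code pages 1252 and 936 are available on the machine:

```vb
Option Explicit

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

' Hypothetical helper: interprets bData using an explicitly chosen code page.
Public Function DecodeWithCodePage(ByRef bData() As Byte, ByVal lCodePage As Long) As String
    Dim cb As Long
    Dim cch As Long
    Dim sOut As String

    cb = UBound(bData) - LBound(bData) + 1

    ' First call: ask how many UTF-16 characters the result needs.
    cch = MultiByteToWideChar(lCodePage, 0, VarPtr(bData(LBound(bData))), cb, 0, 0)
    If cch = 0 Then Exit Function

    ' Second call: convert into a buffer of exactly that size.
    sOut = String$(cch, vbNullChar)
    MultiByteToWideChar lCodePage, 0, VarPtr(bData(LBound(bData))), cb, StrPtr(sOut), cch
    DecodeWithCodePage = sOut
End Function

Public Sub DemoCodePages()
    Dim b(0 To 1) As Byte
    b(0) = &HC4: b(1) = &HE3

    ' Windows-1252 (Western European): each byte is its own character.
    Debug.Print Len(DecodeWithCodePage(b, 1252))
    ' Code page 936 (Simplified Chinese): the two bytes form a single character.
    Debug.Print Len(DecodeWithCodePage(b, 936))
End Sub
```

    The bytes never change; only the code page used to interpret them does, which is why the same ANSI file can read differently after the system locale is switched.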

    The good thing for all VB6 programmers programming for any computer that runs any Windows version from XP to Windows 10 is that VB6 and Windows itself represent their Strings as UTF-16, which is a Unicode encoding. The problem is that VB6 has this one tiny quirk: whenever you pass a String to an API parameter declared As String, VB6 converts that String to an ANSI String (or some MBCS String), and without the proper code page to represent the correct language, things start getting screwed up. Now since both VB6's String type and the W versions of Win32 calls are UTF-16 Unicode systems, they can talk to each other very easily with no extra work on your part. All you have to do is stop VB6 from trying to convert the Unicode Strings to ANSI Strings. You do that by passing Strings to API functions as either a pointer or a byte array.

    StrConv was invented for the purpose of converting between Unicode and ANSI among other things. This was necessary when Windows itself was still an ANSI operating system but it is not anymore so StrConv is not really needed for any of this. When you have Unicode functions talking to Unicode functions, StrConv is completely unnecessary.
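
    As a rough illustration of the "pass a pointer" approach, here is a sketch using MessageBoxW; the wrapper names ShowUnicodeMessage and DemoUnicodeMessage are hypothetical. Both text parameters are declared As Long, so VB6 has no String parameter to convert:

```vb
Option Explicit

' Text parameters are declared As Long (pointers), not As String,
' so VB6 performs no Unicode-to-ANSI conversion on the way in.
Private Declare Function MessageBoxW Lib "user32" ( _
    ByVal hWnd As Long, ByVal lpText As Long, _
    ByVal lpCaption As Long, ByVal uType As Long) As Long

Public Sub ShowUnicodeMessage(ByVal sText As String, ByVal sCaption As String)
    ' StrPtr returns the address of the UTF-16 data inside the VB String.
    MessageBoxW 0, StrPtr(sText), StrPtr(sCaption), 0
End Sub

Public Sub DemoUnicodeMessage()
    ' U+4F60 U+597D built with ChrW$ so the source file can stay ANSI.
    ShowUnicodeMessage ChrW$(&H4F60) & ChrW$(&H597D), "Unicode test"
End Sub
```

    The same pattern applies to any W function: declare the string parameter As Long and pass StrPtr of the VB String.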

    Quote Originally Posted by Mith View Post
    Hey, now my code is nearly free of StrConv!
    The last one is its use with the ReadFile API... any help would be nice

    How do you read the whole content of a file (with unicode characters in the name) into a string without using StrConv?
    You mean CreateFileW, right? ReadFile doesn't take a file name. In any case, the principle is exactly the same as we've shown before. Whenever you want to pass a String to an API, use StrPtr on the VB String and, in the API declaration, declare the String parameter as a Long. This allows the String to pass from VB6 to the Win32 API unmolested. Byte arrays are another way to pass Strings unmolested.

    If you're still having trouble, feel free to post your ReadFile code and we can help you out.

  13. #13

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Niya View Post
    If you're still having trouble, feel free to post your ReadFile code and we can help you out.
    My code to read the whole content of small files:

    Code:
    Public Declare Function CreateFileWide Lib "kernel32" Alias "CreateFileW" (ByVal lpFileName As Long, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
    Public Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    Dim hFile As Long
    Dim lBytesRead As Long
    Dim lBufferLen As Long
    Dim bBuffer() As Byte
    Dim sText As String
    hFile = CreateFileWide(StrPtr(sFile), GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0)
    Call ReadFile(hFile, bBuffer(0), lBufferLen, lBytesRead, 0)
    sText = StrConv(bBuffer, vbUnicode)
    ...
    The files I read can be encoded as ANSI, Unicode or UTF-8. I checked the encoding with Notepad.

    Maybe I could read the file content directly into a string variable using:

    Code:
    Call ReadFile(hFile, StrPtr(sText), lBufferLen, lBytesRead, 0)

  14. #14

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    I tried this code to read the file content directly into a string variable but the app crashes at ReadFile:

    Code:
    Public Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    lBufferLen = lFileSize
    sText = Space(lBufferLen)
    ReadFile(hFile, StrPtr(sText), lBufferLen, lBytesRead, 0)
    I changed the API declaration for the buffer to Long but the app still crashes:

    Code:
    Public Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Long, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    lBufferLen = lFileSize
    sText = Space(lBufferLen)
    ReadFile(hFile, StrPtr(sText), lBufferLen, lBytesRead, 0)
    Any ideas to read the content directly into a String to avoid StrConv?

  15. #15

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    OK, using a string buffer with the ReadFile API is not possible.

    Now I'm trying to understand how the byte array buffer is filled.

    I opened a file with UTF-8 encoding.
    File size: 3419
    Buffer length: 3419
    Array Buffer Ubound: 3418

    After calling the ReadFile API the whole content of the file is inside the byte array.
    The first 10 values of the array, buffer(0) to buffer(9), are:
    Code:
    54   54   49   49   49   51   49   49   49   49
    Now I call StrConv(bBuffer, vbUnicode) and store the result in a string variable.
    String variable length: 3419 (LenB: 6838)

    The first line of the text in the string variable is:
    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    The first 10 ASCII codes of the string variable are:
    Code:
    60   63   120   109   108   32   118   101   114   115
    How did the StrConv function convert these values?

    54/54 into 60
    49/49 into 63
    49/51 into 120
    49/49 into 109
    49/49 into 108

    How is it possible to convert the same values in the array into different ASCII values?
    The byte array buffer and the converted string have the same size: how is it possible that the buffer uses two bytes to represent one character?
    If the buffer uses two bytes per character, the converted string should be half the size of the buffer, right?
    I'm really confused
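
    One possible explanation for the odd buffer values, offered as an assumption because the printing code isn't shown: 54, 54, 49, 49, 49, 51... are the character codes of the first digits of 60, 63, 120, 109, 108, 32... That is what Asc() returns when it is handed a Byte, because VB first coerces the number to its decimal string. As for the sizes, StrConv(..., vbUnicode) turns each single-byte character into one UTF-16 character, so Len stays at 3419 while LenB doubles to 6838. A quick sketch (DemoAscOnByte is a made-up name):

```vb
Option Explicit

Public Sub DemoAscOnByte()
    Dim b As Byte
    b = 60                      ' the byte value of "<"

    Debug.Print b               ' the numeric value itself: 60
    Debug.Print Asc(b)          ' b is coerced to the string "60"; Asc gives the code of "6": 54
    Debug.Print Asc(Chr$(b))    ' through a one-character string instead: 60
End Sub
```

    If that is what happened, the buffer really does hold 60, 63, 120, ... and there is no pairing of bytes going on: printing bBuffer(i) directly should show the same numbers as the ASCII codes of the converted string.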

  16. #16
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    597

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Mith View Post
    I tried this code to read the file content directly into a string variable but the app crashes at ReadFile:

    Code:
    Public Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    lBufferLen = lFileSize
    sText = Space(lBufferLen)
    ReadFile(hFile, StrPtr(sText), lBufferLen, lBytesRead, 0)
    I changed the API declaration for the buffer to Long but the app still crashes:

    Code:
    Public Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, lpBuffer As Long, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    lBufferLen = lFileSize
    sText = Space(lBufferLen)
    ReadFile(hFile, StrPtr(sText), lBufferLen, lBytesRead, 0)
    Any ideas to read the content directly into a String to avoid StrConv?
    For 2nd argument, try to use:
    Call ReadFile(hFile, ByVal StrPtr(sText), lBufferLen, lBytesRead, ByVal 0&)
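
    Why the ByVal matters, as far as I understand it: lpBuffer is a ByRef parameter, so passing StrPtr(sText) on its own hands ReadFile the address of a temporary Long that holds the pointer, and the file data is written over that temporary. ByVal at the call site passes the pointer value itself. With a byte array the As Any declaration can be used without any pointer at all; a sketch with a hypothetical wrapper name, ReadIntoBytes:

```vb
Option Explicit

Private Declare Function ReadFile Lib "kernel32" ( _
    ByVal hFile As Long, ByRef lpBuffer As Any, _
    ByVal nNumberOfBytesToRead As Long, ByRef lpNumberOfBytesRead As Long, _
    ByVal lpOverlapped As Long) As Long

' Hypothetical wrapper: fills bBuf from an open file handle and
' returns the number of bytes read (0 on failure or end of file).
' Assumes bBuf has already been dimensioned by the caller.
Public Function ReadIntoBytes(ByVal hFile As Long, ByRef bBuf() As Byte) As Long
    Dim lRead As Long
    Dim cb As Long

    cb = UBound(bBuf) - LBound(bBuf) + 1

    ' Passing the first element ByRef gives ReadFile the address of the array data.
    If ReadFile(hFile, bBuf(LBound(bBuf)), cb, lRead, 0) <> 0 Then
        ReadIntoBytes = lRead
    End If
End Function
```

    Reading into a byte array also keeps the raw file bytes out of a String until the encoding is known.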

  17. #17

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by DaveDavis View Post
    For 2nd argument, try to use:
    No crash anymore!

    I opened an ANSI-encoded file but the string contains only garbage:

    [Attached image: Screenshot - 11.08.2021 , 16_29_25.png]

    The string content should look like this:

    [Attached image: Screenshot - 11.08.2021 , 16_33_25.png]

    My code:

    Code:
    Declare Function ReadFile2 Lib "kernel32" (ByVal hFile As Long, lpBuffer As Long, ByVal nNumberOfBytesToRead As Long, lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    sText = Space(lBufferLen)
    Call ReadFile2(hFile, ByVal StrPtr(sText), lBufferLen, lBytesRead, 0)

  18. #18
    Junior Member
    Join Date
    Sep 2019
    Posts
    22

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Well, you've just read a file encoded in UTF-8. You need to use StrConv to convert it to UTF-16.

    Since you're trying to get rid of StrConv, you could use MultiByteToWideChar. Here's a snippet (air code):
    Code:
    Private Declare Function MultiByteToWideChar Lib "kernel32.dll" ( _
      ByVal CodePage As Long, _
      ByVal dwFlags As Long, _
      ByVal lpMultiByteStr As Long, _
      ByVal cchMultiByte As Long, _
      ByVal lpWideCharStr As Long, _
      ByVal cchWideChar As Long _
    ) As Long
    Private Const CP_UTF8 As Long = 65001
    
    Private Function ToUTF16(Data As String, Optional FromEncoding As Long = CP_UTF8) As String
      Dim lSize As Long
      
      'Use the requested code page instead of always assuming UTF-8
      lSize = MultiByteToWideChar(FromEncoding, 0, StrPtr(Data), Len(Data), 0, 0)
      ToUTF16 = Space$(lSize)
      MultiByteToWideChar FromEncoding, 0, StrPtr(Data), Len(Data), StrPtr(ToUTF16), lSize
    End Function
    
    'And you'd call it like this after your ReadFile2 call:
    sText = ToUTF16(sText)
    'and do something else with it

  19. #19
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    For many purposes you can just do something like:

    Code:
    Private Function GetUtf8AsUnicode(ByVal File As String) As String
        Dim BOM() As Byte
        Dim HasBOM As Boolean
    
        With New ADODB.Stream
            .Open
            .Type = adTypeBinary
            .LoadFromFile File
            BOM = .Read(3)
            HasBOM = BOM(0) = &HEF And BOM(1) = &HBB And BOM(2) = &HBF
            .Position = 0
            .Type = adTypeText
            .Charset = "utf-8"
            If HasBOM Then .Position = 3 'Skip the BOM.
            GetUtf8AsUnicode = .ReadText(adReadAll)
            .Close
        End With
    End Function

  20. #20
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Caine View Post
    Well, you've just read a file encoded in UTF-8. You need to use StrConv to convert it to UTF-16.

    Since you're trying to get rid of StrConv, you could use MultiByteToWideChar. Here's a snippet (air code):
    Code:
    Private Declare Function MultiByteToWideChar Lib "kernel32.dll" ( _
      ByVal CodePage As Long, _
      ByVal dwFlags As Long, _
      ByVal lpMultiByteStr As Long, _
      ByVal cchMultiByte As Long, _
      ByVal lpWideCharStr As Long, _
      ByVal cchWideChar As Long _
    ) As Long
    Private Const CP_UTF8 As Long = 65001
    
    Private Function ToUTF16(Data As String, Optional FromEncoding As Long = CP_UTF8) As String
      Dim lSize As Long
      
      'Use the requested code page instead of always assuming UTF-8
      lSize = MultiByteToWideChar(FromEncoding, 0, StrPtr(Data), Len(Data), 0, 0)
      ToUTF16 = Space$(lSize)
      MultiByteToWideChar FromEncoding, 0, StrPtr(Data), Len(Data), StrPtr(ToUTF16), lSize
    End Function
    
    'And you'd call it like this after your ReadFile2 call:
    sText = ToUTF16(sText)
    'and do something else with it
    You have the right idea here but this implementation looks a lil suspect.

    I am assuming the data is read from a UTF-8 text file. You then stuff that directly into a VB String and then you then convert it to UTF-16 String. One of the problems I see here is that you're using the Len function which I expect would make the assumption that you're trying to measure a UTF-16 String. But it's not, it's measuring a UTF-8 String that's been shoved into a UTF-16 BSTR. I'm surprised that even works but it looks like it would break as soon as you put it to work on a very serious UTF-8 document with all kinds of languages and glyphs mixed in.

    I'll work up a more trustworthy example in a subsequent post. But you definitely have the right idea here. You should be working with byte arrays right up until you convert them to UTF-16.

  21. #21

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Caine View Post
    Well, you've just read a file encoded in UTF-8. You need to use StrConv to convert it to UTF-16.

    Since you're trying to get rid of StrConv, you could use MultiByteToWideChar
    You pointed me in the right direction! One more problem bites the dust! Thanks a lot!

    Now I check the beginning of the file for the different BOM headers and use a different conversion function depending on the header!
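
    A BOM check along those lines might look like the sketch below. This is not Mith's actual code; DetectBom and its return labels are hypothetical, and only the three most common BOMs are covered:

```vb
Option Explicit

' Hypothetical helper: returns a label for the BOM found at the start of bData.
' Assumes bData is a dimensioned, zero-based byte array holding the file contents.
Public Function DetectBom(ByRef bData() As Byte) As String
    Dim n As Long
    n = UBound(bData) + 1

    ' UTF-8 BOM: EF BB BF
    If n >= 3 Then
        If bData(0) = &HEF And bData(1) = &HBB And bData(2) = &HBF Then
            DetectBom = "UTF-8": Exit Function
        End If
    End If

    If n >= 2 Then
        ' UTF-16 little endian BOM: FF FE
        If bData(0) = &HFF And bData(1) = &HFE Then
            DetectBom = "UTF-16LE": Exit Function
        End If
        ' UTF-16 big endian BOM: FE FF
        If bData(0) = &HFE And bData(1) = &HFF Then
            DetectBom = "UTF-16BE": Exit Function
        End If
    End If

    ' No BOM: the file may be ANSI or UTF-8 without a BOM; this sketch cannot tell which.
    DetectBom = "none"
End Function
```

    The length checks are nested rather than combined with And because VB6 evaluates both sides of And, which would index past the end of a very short file.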

  22. #22
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Here's my own take on reading a UTF-8 file using pure Win32 with no risk of corruption by VB6 String marshalling:-
    Code:
    Private Declare Function CreateFileW Lib "kernel32" (ByVal lpFileName As Long, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
    Private Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, ByVal lpBuffer As Long, ByVal nNumberOfBytesToRead As Long, ByRef lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, ByRef fileSize As Long) As Long
    Private Declare Function CloseHandle Lib "kernel32" (ByVal handle As Long) As Long
    Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal codepage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    
    
    Private Const GENERIC_READ As Long = &H80000000
    Private Const FILE_SHARE_READ As Long = &H1
    Private Const OPEN_EXISTING As Long = 3
    Private Const FILE_ATTRIBUTE_NORMAL As Long = &H80
    Private Const INVALID_HANDLE_VALUE As Long = -1
    Private Const INVALID_FILE_SIZE As Long = INVALID_HANDLE_VALUE
    Private Const CP_UTF8 As Long = 65001
    
    Public Function ReadFileData(ByVal sFileName As String) As Byte()
        
        Dim hFile As Long
        Dim buffer() As Byte
        Dim fileSize As Long
        Dim bufferSize As Long
        Dim bytesRead As Long
        Dim buffPos As Long
        Dim tmpBuffer() As Byte
        Dim i As Long
        
        hFile = CreateFileW(StrPtr(sFileName), GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0)
        
        If Not (hFile = INVALID_HANDLE_VALUE) Then
            fileSize = GetFileSize(hFile, 0)
            
            If Not (fileSize = INVALID_FILE_SIZE) Then
                
                'Buffer is one byte larger than the file size. This allows us to call
                'ReadFile once more after the whole file has been read, so it can return 0 bytes and we can close the file.
                'If the buffer were exactly the length of the file, buffer(buffPos) would raise a "Subscript out of range" error
                'on that final ReadFile call, after all the file's data has been read
                ReDim buffer(0 To fileSize)
                
                Do
                    'Read file 1000 bytes at a time
                    If CBool(ReadFile(hFile, VarPtr(buffer(buffPos)), 1000&, bytesRead, 0)) Then
                        If bytesRead > 0 Then
                            buffPos = buffPos + bytesRead
                        Else
                            'Close file
                            If Not CBool(CloseHandle(hFile)) Then
                                Err.Raise 9093, "", "Failure closing file handle"
                            Else
                               
                                ReDim tmpBuffer(0 To fileSize - 1)
                                For i = 0 To fileSize - 1
                                    tmpBuffer(i) = buffer(i)
                                Next
                                
                                ReadFileData = tmpBuffer
                                
                                Exit Do
                            End If
                        End If
                    Else
                        CloseHandle hFile
                        Err.Raise 9092, "", "Failure reading file contents. Windows error code " & CStr(Err.LastDllError)
                    End If
    
                Loop
                
            Else
                CloseHandle (hFile)
                Err.Raise 9091, "", "Error determining file size"
            End If
        
        Else
            Err.Raise 9090, "", "Failed to open file"
        End If
    
    End Function
    
    Public Function ReadUTF8TextFile(ByVal sFileName As String) As String
        
        Dim blob() As Byte
        Dim blobSize As Long
        Dim numWideChar As Long
        Dim Text As String
        
        blob = ReadFileData(sFileName)
        
        blobSize = (UBound(blob) - LBound(blob)) + 1
        
        'Calculate the number of wide characters the conversion would result in
        numWideChar = MultiByteToWideChar(CP_UTF8, 0, VarPtr(blob(0)), blobSize, 0, 0)
        
        'Allocate String to hold the converted UTF-16 characters
        Text = Space(numWideChar)
        
        'Convert byte array data from UTF-8 to UTF-16
        numWideChar = MultiByteToWideChar(CP_UTF8, 0, VarPtr(blob(0)), blobSize, StrPtr(Text), numWideChar)
    
        ReadUTF8TextFile = Text
    End Function
    
    
    Private Sub Form_Load()
        Dim fileContents As String
        Dim Data() As Byte
        Dim p As Long
        
        fileContents = ReadUTF8TextFile(App.Path + "\text file\text1.txt")
        
        uctb1.TextUnicode = fileContents
        
    End Sub
    Tested on a multi-language UTF-8 text file:-



    Ignore the square boxes you see in place of the Hindi and the other language. That has nothing to do with the conversion process. For some reason Elroy's Unicode Textbox, which I used to test this, doesn't quite know how to display certain characters. If you copied those characters from that Textbox into any modern Unicode control that can handle it(eg NotePad or your web browser), you'd see it display the characters correctly which means that Unicode data is preserved.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  23. #23

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Niya View Post
    For some reason Elroy's Unicode Textbox, which I used to test this, doesn't quite know how to display certain characters.
    Replace Elroy's Unicode Textbox with the Unicode TextBox control from Krool if you need to display Hindi characters:

    [Screenshot: Screenshot - 12.08.2021 , 06_29_07.png]

    I use all his unicode controls in my projects to display unicode languages and everything is working very well:

    Krool's CommonControls (Replacement of the MS common controls)

  24. #24
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    I actually tried to use Krool's controls when I was doing this String stuff in your other thread. I couldn't get it to work. I tried to use the OCX version but when I downloaded the zip/rar from his link, WinRar couldn't open it. Was reading it as corrupted so I opted to use Elroy's control instead since I just wanted a TextBox anyway.

  25. #25
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    I'm not sure why we are digging over the old field hoping to find new potatoes. This topic was done to death years ago, both here and in the CodeBank.

  26. #26

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Niya View Post
    I tried to use the OCX version but when I downloaded the zip/rar from his link, WinRar couldn't open it.
    Did you read the note above the download link:

    "The attached file VBCCR17.OCX.rar.zip should be renamed to VBCCR17.OCX.rar after download and it contains the pre-compiled OCX."

    Maybe that's the problem with the file you have downloaded...

  27. #27
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Mith View Post
    Did you read the note above the download link:

    "The attached file VBCCR17.OCX.rar.zip should be renamed to VBCCR17.OCX.rar after download and it contains the pre-compiled OCX."

    Maybe that's the problem with the file you have downloaded...
    Yea, I remember trying all that. My computer just refused to open the rar. I'll go try it again and see what happens.

  28. #28
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Yea, still doesn't work for me:-


  29. #29
    PowerPoster
    Join Date
    Jan 2020
    Posts
    3,746

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    # VBCCR
    VB Common Controls Replacement Library (Replacement of the MS common controls)

    This project is intended to replace the MS common controls for VB6.

    http://www.vbforums.com/showthread.p...mmon-controls)

    Code:
    TextBoxHandle = CreateWindowEx(dwExStyle, StrPtr("Edit"), 0, dwStyle, 0, 0, UserControl.ScaleWidth, UserControl.ScaleHeight, UserControl.hWnd, 0, App.hInstance, ByVal 0&)
    
    
    T1 = ReadUniFile(App.Path & "\text file\unicodefile1.txt")
    TextBoxW1.Text = T1
    
     Function ReadUniFile(ByVal sFile As String) As String
        Dim a As Long
        a = FileLen(sFile)
        ReDim buff(a - 1) As Byte
        ReDim buff1(a - 3) As Byte
        Open sFile For Binary As #1
        Get #1, , buff
        Close #1
        CopyMemory buff1(0), buff(2), a - 2
        ReadUniFile = buff1
    End Function

  30. #30
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by xiaoyao View Post
    Code:
    TextBoxHandle = CreateWindowEx(dwExStyle, StrPtr("Edit"), 0, dwStyle, 0, 0, UserControl.ScaleWidth, UserControl.ScaleHeight, UserControl.hWnd, 0, App.hInstance, ByVal 0&)
    
    
    T1 = ReadUniFile(App.Path & "\text file\unicodefile1.txt")
    TextBoxW1.Text = T1
    
     Function ReadUniFile(ByVal sFile As String) As String
        Dim a As Long
        a = FileLen(sFile)
        ReDim buff(a - 1) As Byte
        ReDim buff1(a - 3) As Byte
        Open sFile For Binary As #1
        Get #1, , buff
        Close #1
        CopyMemory buff1(0), buff(2), a - 2
        ReadUniFile = buff1
    End Function
    That won't work. Aside from the buffer measurements and offsets looking a little suspect, that code won't properly read a UTF-8 text file. However, with some modifications:-
    Code:
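    Function ReadUniFile(ByVal sFile As String) As String
        Dim a As Long
        Dim numWideChar As Long
        Dim fileNum As Integer
        Dim buff() As Byte
        
        a = FileLen(sFile)
        If a = 0 Then Exit Function 'Empty file: return an empty String
        ReDim buff(0 To a - 1)
        
        'Read the raw UTF-8 bytes, then close the file before converting
        fileNum = FreeFile
        Open sFile For Binary Access Read As #fileNum
        Get #fileNum, , buff
        Close #fileNum
        
        'First call returns the number of UTF-16 characters required
        numWideChar = MultiByteToWideChar(CP_UTF8, 0, VarPtr(buff(0)), a, 0, 0)
        
        ReadUniFile = Space$(numWideChar)
        
        'Second call converts directly into the String's buffer
        numWideChar = MultiByteToWideChar(CP_UTF8, 0, VarPtr(buff(0)), a, StrPtr(ReadUniFile), numWideChar)
    End Function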
    You must convert the UTF-8 data to UTF-16 for it to work properly with VB Strings.
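    For reference, here is a self-contained sketch of the same two-call conversion with the API declaration and constant included. The helper name Utf8ToString and the byte order mark check are my own additions (assumptions, not part of the code above). The Declare passes every pointer as a Long, which matches how VarPtr and StrPtr are used in this thread.
    Code:
    Option Explicit
    
    Private Const CP_UTF8 As Long = 65001
    
    Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
        ByVal CodePage As Long, ByVal dwFlags As Long, _
        ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
        ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    
    'Hypothetical helper: converts UTF-8 bytes to a VB (UTF-16) String,
    'skipping a leading UTF-8 byte order mark (EF BB BF) if present.
    'The array must already be dimensioned by the caller.
    Public Function Utf8ToString(ByRef bytes() As Byte) As String
        Dim first As Long
        Dim count As Long
        Dim numWideChar As Long
        
        first = LBound(bytes)
        count = UBound(bytes) - first + 1
        
        'Skip the BOM so it does not end up as U+FEFF at the start of the text
        If count >= 3 Then
            If bytes(first) = &HEF And bytes(first + 1) = &HBB And bytes(first + 2) = &HBF Then
                first = first + 3
                count = count - 3
            End If
        End If
        If count = 0 Then Exit Function
        
        'First call: ask how many UTF-16 characters are needed
        numWideChar = MultiByteToWideChar(CP_UTF8, 0, VarPtr(bytes(first)), count, 0, 0)
        If numWideChar = 0 Then Exit Function
        
        Utf8ToString = Space$(numWideChar)
        
        'Second call: convert straight into the String's buffer
        MultiByteToWideChar CP_UTF8, 0, VarPtr(bytes(first)), count, StrPtr(Utf8ToString), numWideChar
    End Function
    Separating the conversion from the file reading means the same helper can be used on bytes that come from somewhere other than a file, such as a socket or a database field.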

  31. #31
    PowerPoster
    Join Date
    Jan 2020
    Posts
    3,746

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    You need to load from a Unicode file (text file\unicodefile1.txt), or use:
    Code:
    Dim fileContents As String
    fileContents = ReadUTF8TextFile(App.Path & "\text file\text1.txt")
    TextBoxW1.Text = fileContents
    Last edited by xiaoyao; Aug 11th, 2021 at 09:28 PM.

  32. #32
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by xiaoyao View Post
    You need to load from a Unicode file (text file\unicodefile1.txt)
    huh?

  33. #33

    Thread Starter
    Hyperactive Member Mith's Avatar
    Join Date
    Jul 2017
    Location
    Thailand
    Posts
    445

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Niya View Post
    Yea, still doesn't work for me:-
    I downloaded the file again and it works without any problems.
    The file size must be 987,996 bytes and the checksums are:

    File: VBCCR17.OCX.rar
    CRC-32: 2fe644e3
    MD5: 4e3e2f857a28c7894c601fafa83d04fd
    SHA-1: 663d415a04b1fd274900925f1328f0cffc9c64a2

  34. #34
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Mith View Post
    I downloaded the file again and it works without any problems.
    The file size must be 987,996 bytes and the checksums are:

    File: VBCCR17.OCX.rar
    CRC-32: 2fe644e3
    MD5: 4e3e2f857a28c7894c601fafa83d04fd
    SHA-1: 663d415a04b1fd274900925f1328f0cffc9c64a2
    Well, the file size and the MD5 and SHA-1 hashes match, and I think it's safe to say the CRC would also match if I checked. So I dunno. Just one of those weird things I guess. It's no big deal as I don't really use VB6 outside of answering questions on these forums and some maintenance work on old code we have.

  35. #35
    PowerPoster
    Join Date
    Jan 2020
    Posts
    3,746

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Sometimes I only need the TextBoxW control. Compiled on its own it makes a 210 KB file:
    TextboxW.OCX

    Using only the ListView control takes about 1 MB.
    VBCCR16.OCX is about 5 MB in size.

    The OCX as a whole is rather large. It is very difficult to separate out the individual controls to reduce the size of the compiled EXE.

  36. #36
    New Member
    Join Date
    Aug 2022
    Posts
    1

    Re: [RESOLVED] Replacement for StrConv(vbFromUnicode)

    Quote Originally Posted by Mith View Post
    How did the StrConv function convert these values?

    54/54 into 60
    49/49 into 63
    49/51 into 120
    49/49 into 109
    49/49 into 108

    How is it possible to convert the same values in the array into different ASCII values?
    The byte array buffer and the converted string text have the same size: how is it possible that the buffer uses double bytes to represent one character?
    If the buffer uses double bytes to represent one character, the converted text string must be half the size of the buffer, right?
    I'm really confused.
    Try adding 1033 as the LocaleID argument to StrConv.
    StrConv is fast, and all "pure VB" workarounds suck more or less. After setting StrConv's locale to English / United States (1033), unwanted garbage replacements don't seem to happen. I tested it with a 50 MB ZIP file. I received it as a BLOB from ADODB (which results in a byte array), and then StrConv'd this byte array to a string to send it out via Winsock to a browser. It worked fast (at least by VB6 standards), and more importantly, the browser got the same file that resided in the database, not some StrConv remix.
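    A quick way to check this on your own system is to round-trip every possible byte value through StrConv with the LCID given explicitly. This is only a sketch: the Sub name is made up, and I am assuming the same LCID is used in both directions. It prints a line for every byte that does not survive, so no output means the round trip was lossless on that machine.
    Code:
    'Hypothetical test routine: round-trips bytes 0..255 through StrConv
    'using LCID 1033 (English / United States) in both directions
    Public Sub CheckStrConvRoundTrip()
        Dim original() As Byte
        Dim asText As String
        Dim roundTrip() As Byte
        Dim i As Long
        
        ReDim original(0 To 255)
        For i = 0 To 255
            original(i) = i
        Next
        
        'Byte array -> String, one character per byte
        asText = StrConv(original, vbUnicode, 1033)
        
        'String -> byte array, using the same locale for the reverse mapping
        roundTrip = StrConv(asText, vbFromUnicode, 1033)
        
        If UBound(roundTrip) <> UBound(original) Then
            Debug.Print "Length changed:"; UBound(roundTrip) + 1; "bytes"
            Exit Sub
        End If
        
        For i = 0 To 255
            If roundTrip(i) <> original(i) Then Debug.Print "Mismatch at byte"; i
        Next
    End Sub
    Running the same routine without the third argument on a system with a Far East locale should show where the default conversion goes wrong.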
