Thread: Universal FileRead function

  1. #1

    Thread Starter
    Member dseaman's Avatar
    Join Date
    Oct 2004
    Location
    Natal, Brazil
    Posts
    38

    Universal FileRead function

    A universal, Unicode-aware FileRead function that automatically detects ANSI, UTF8, UTF8 (no BOM), UTF16LE, and UTF16BE.

    Includes an error-handling routine that:
    1. Outputs to the Debug window.
    2. Outputs to a log file.
    3. Outputs via OutputDebugString (requires DebugView.exe; see https://docs.microsoft.com/en-us/sys...oads/debugview).
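
    For readers who only want the idea, here is a minimal sketch of a three-way logger along the lines of the list above. It is not the routine from the attached project: the name LogError and the Error.log file name are made up for illustration, and only the OutputDebugStringA declaration is a real Win32 API.

```vb
Option Explicit

' Real kernel32 export; DebugView.exe displays whatever is sent through it.
Private Declare Sub OutputDebugString Lib "kernel32" Alias "OutputDebugStringA" _
    (ByVal lpOutputString As String)

' Hypothetical helper: reports one error to all three destinations.
Public Sub LogError(ByVal Source As String, ByVal Number As Long, ByVal Description As String)
    Dim Msg As String
    Dim f As Integer

    Msg = Format$(Now, "yyyy-mm-dd hh:nn:ss") & " " & Source & _
          " error " & Number & ": " & Description

    ' 1. Debug (Immediate) window, only visible inside the IDE.
    Debug.Print Msg

    ' 2. Log file next to the executable (file name is an assumption).
    On Error Resume Next            ' logging must never raise a second error
    f = FreeFile
    Open App.Path & "\Error.log" For Append As #f
    Print #f, Msg
    Close #f

    ' 3. System debug output, readable with DebugView.exe.
    OutputDebugString Msg & vbCrLf
End Sub
```

    A typical call from an error handler would be: LogError "FileRead", Err.Number, Err.Description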

    List of revisions:
    27-Dec-2019 01:08 UTC - Added ADODB.Stream FileWrite.
    19-Jan-2020 13:53 UTC - Fixed logic in HasBytes/GetEncoding (Thanks Wqweto)


    Attached image: FileRead.png (20.7 KB)

    FileReadWrite.zip
    Last edited by dseaman; Jan 19th, 2020 at 09:02 AM. Reason: Bug fix HasBytes/GetEncoding

  2. #2
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    597

    Re: Universal FileRead function

    The UTF32LE code page is 12000 and the UTF32BE code page is 12001. Does that mean they can be read into VB6 Unicode text with a simple StrConv and written/converted with WideCharToMultiByte?

    Something like:
    Code:
    Private Function ToUTF32(ByVal Text As String) As Byte()
      Dim lngOutLen        As Long
      Dim UTF32()           As Byte
    
      lngOutLen = WideCharToMultiByte(CP_UTF32, 0, StrPtr(Text), Len(Text), 0, 0, 0, 0)
      If lngOutLen = 0 Then
        Err.Raise Err.LastDllError, "ToUTF32", "WideCharToMultiByte"
      Else
        ReDim UTF32(lngOutLen - 1)
        WideCharToMultiByte CP_UTF32, 0, StrPtr(Text), Len(Text), VarPtr(UTF32(0)), lngOutLen, 0, 0
        ToUTF32 = UTF32
      End If
    End Function
    Last edited by DaveDavis; Dec 26th, 2019 at 10:04 PM.

  3. #3

    dseaman (Thread Starter)

    Re: Universal FileRead function

    Quote (DaveDavis): The UTF32LE code page is 12000 and the UTF32BE code page is 12001. Does that mean they can be read into VB6 Unicode text with a simple StrConv and written/converted with WideCharToMultiByte?

    WideCharToMultiByte returns 0 (lngOutLen = 0) when called with code page 12000 (UTF32LE).
    Do you have a working version of Function ToUTF32 using code page 12000?

    Here is a quote from https://stackoverflow.com/questions/...e-utf-32-files:

    "And the Win32 API: it has a function MultiByteToWideChar which can convert UTF-8 to UTF-16 (which you need to pass in to all Win32 calls) but it doesn’t accept UTF-32."
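
    Since MultiByteToWideChar will not take UTF-32, the decoding can be done by hand: each UTF-32LE unit is four bytes holding one code point, and anything above U+FFFF becomes a UTF-16 surrogate pair. The following is only a sketch under my own assumptions (FromUtf32LE is a hypothetical name, the array must be allocated, invalid units become U+FFFD, and a leading BOM is skipped). It is not part of the attached project.

```vb
Option Explicit

' Hypothetical helper: decodes UTF-32LE bytes into a VB6 (UTF-16) String.
' Assumes Data is an allocated array; a trailing partial unit is ignored.
Public Function FromUtf32LE(ByRef Data() As Byte) As String
    Dim First As Long, Count As Long, i As Long, Ofs As Long
    Dim Cp As Long, Pos As Long
    Dim Result As String

    First = LBound(Data)
    Count = (UBound(Data) - First + 1) \ 4      ' number of whole 4-byte units
    If Count <= 0 Then Exit Function

    Result = Space$(Count * 2)                  ' worst case: all surrogate pairs
    Pos = 1
    For i = 0 To Count - 1
        Ofs = First + i * 4
        If Data(Ofs + 3) <> 0 Or Data(Ofs + 2) > &H10 Then
            Cp = &HFFFD&                        ' beyond U+10FFFF: replacement char
        Else
            Cp = Data(Ofs) + Data(Ofs + 1) * &H100& + Data(Ofs + 2) * &H10000
        End If
        ' Surrogate code points are not legal in UTF-32.
        If Cp >= &HD800& And Cp <= &HDFFF& Then Cp = &HFFFD&

        If Not (i = 0 And Cp = &HFEFF&) Then    ' skip a leading BOM (FF FE 00 00)
            If Cp < &H10000 Then
                Mid$(Result, Pos, 1) = ChrW$(Cp)
                Pos = Pos + 1
            Else
                Cp = Cp - &H10000               ' split into high/low surrogate
                Mid$(Result, Pos, 1) = ChrW$(&HD800& + (Cp \ &H400&))
                Mid$(Result, Pos + 1, 1) = ChrW$(&HDC00& + (Cp And &H3FF&))
                Pos = Pos + 2
            End If
        End If
    Next
    FromUtf32LE = Left$(Result, Pos - 1)
End Function
```

    For UTF-32BE the same loop should work with the four byte offsets reversed.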
    Last edited by dseaman; Dec 27th, 2019 at 10:36 AM.

  4. #4

    dseaman (Thread Starter)

    Re: Universal FileRead function

    The code doesn't work for me, and I don't know why!
    Apparently CP_UTF32 does not work with the WideCharToMultiByte API.

    BTW, http://cknotes.com/chilkat-charsets-...ngs-supported/ lists the following constants, but they do not work with the WideCharToMultiByte API either.
    utf-32 = 65005
    utf-32be = 65006

    Encoding of Text Files in VB 6.0
    https://stackoverflow.com/questions/...iles-in-vb-6-0
    This project also does not work with UTF32 (it relies on WideCharToMultiByte).
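
    As far as I know, code pages 12000 and 12001 are documented as available only to managed applications, which would explain why WideCharToMultiByte returns 0 for them. A possible workaround is to build the UTF-32LE bytes directly from the UTF-16 string. This is a rough sketch, not a tested replacement for ToUTF32: the name ToUtf32LE is hypothetical, no BOM is written, and unpaired surrogates are replaced with U+FFFD.

```vb
Option Explicit

' Hypothetical helper: encodes a VB6 (UTF-16) String as UTF-32LE without any API call.
' Returns an unallocated array for an empty string.
Public Function ToUtf32LE(ByVal Text As String) As Byte()
    Dim Out() As Byte
    Dim i As Long, n As Long, Pos As Long
    Dim Cp As Long, Lo As Long

    n = Len(Text)
    If n = 0 Then Exit Function

    ReDim Out(0 To n * 4 - 1)                   ' worst case: no surrogate pairs
    i = 1
    Do While i <= n
        ' AscW returns a signed Integer, so mask it into the range 0..65535.
        Cp = AscW(Mid$(Text, i, 1)) And &HFFFF&

        If Cp >= &HD800& And Cp <= &HDBFF& And i < n Then
            Lo = AscW(Mid$(Text, i + 1, 1)) And &HFFFF&
            If Lo >= &HDC00& And Lo <= &HDFFF& Then
                ' Combine high and low surrogate into one code point.
                Cp = &H10000 + (Cp - &HD800&) * &H400& + (Lo - &HDC00&)
                i = i + 1
            End If
        End If
        ' Anything still in the surrogate range was unpaired.
        If Cp >= &HD800& And Cp <= &HDFFF& Then Cp = &HFFFD&

        Out(Pos) = Cp And &HFF&                 ' least significant byte first
        Out(Pos + 1) = (Cp \ &H100&) And &HFF&
        Out(Pos + 2) = (Cp \ &H10000) And &HFF&
        Out(Pos + 3) = 0                        ' code points never exceed &H10FFFF
        Pos = Pos + 4
        i = i + 1
    Loop

    ReDim Preserve Out(0 To Pos - 1)            ' trim space saved by surrogate pairs
    ToUtf32LE = Out
End Function
```

    If a BOM is wanted, the bytes FF FE 00 00 would have to be written in front of the result.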
    Last edited by dseaman; Dec 28th, 2019 at 10:51 AM. Reason: More Info

  5. #5
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    5,121

    Re: Universal FileRead function

    I am wondering why HasBytes returns False for single-element byte arrays; GetEncoding will probably fail on single-element byte arrays too.

    Is this designed to read zero-length and single-byte files, or are these cases deliberately not supported?

    cheers,
    </wqw>

  6. #6

    dseaman (Thread Starter)

    Re: Universal FileRead function

    Quote (wqweto): I am wondering why HasBytes returns False for single-element byte arrays; GetEncoding will probably fail on single-element byte arrays too.

    Is this designed to read zero-length and single-byte files, or are these cases deliberately not supported?

    Thanks for pointing this out. I think it was working OK in earlier versions and then I broke it.
    HasBytes now returns True for a single-element array.
    All size checking should be done in GetEncoding to ensure that there are sufficient bytes to check for BOM.

    Project updated at http://www.vbforums.com/showthread.p...=1#post5441071

    Would welcome any suggestions for improving GetEncoding.
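
    One possible shape for this, offered only as a sketch and not as the code in the attached project: count the bytes once, test each BOM only when enough bytes are present, and fall back to a UTF-8 validity scan for BOM-less input. All names here (TextEncoding, DetectEncoding, ByteCount, LooksLikeUtf8) are hypothetical, and the UTF-8 scan is simplified: it does not reject every overlong or surrogate sequence.

```vb
Option Explicit

Public Enum TextEncoding            ' hypothetical names
    encAnsi
    encUtf8Bom
    encUtf8NoBom
    encUtf16LE
    encUtf16BE
End Enum

' Number of elements; 0 for an unallocated or empty array.
Private Function ByteCount(ByRef Data() As Byte) As Long
    On Error Resume Next            ' UBound raises error 9 if unallocated
    ByteCount = UBound(Data) - LBound(Data) + 1
End Function

Public Function DetectEncoding(ByRef Data() As Byte) As TextEncoding
    Dim First As Long, Count As Long

    DetectEncoding = encAnsi        ' default, also used for empty input
    Count = ByteCount(Data)
    If Count = 0 Then Exit Function
    First = LBound(Data)

    If Count >= 3 Then              ' UTF-8 BOM: EF BB BF
        If Data(First) = &HEF And Data(First + 1) = &HBB And Data(First + 2) = &HBF Then
            DetectEncoding = encUtf8Bom
            Exit Function
        End If
    End If
    If Count >= 2 Then
        If Data(First) = &HFF And Data(First + 1) = &HFE Then       ' FF FE
            DetectEncoding = encUtf16LE
            Exit Function
        End If
        If Data(First) = &HFE And Data(First + 1) = &HFF Then       ' FE FF
            DetectEncoding = encUtf16BE
            Exit Function
        End If
    End If
    If LooksLikeUtf8(Data, First, First + Count - 1) Then DetectEncoding = encUtf8NoBom
End Function

' True only if every multi-byte sequence is well formed and at least one exists.
' Pure ASCII returns False because it cannot be told apart from ANSI.
Private Function LooksLikeUtf8(ByRef Data() As Byte, ByVal First As Long, ByVal Last As Long) As Boolean
    Dim i As Long, Need As Long, HighSeen As Boolean

    i = First
    Do While i <= Last
        Select Case Data(i)
            Case Is < &H80: Need = 0
            Case &HC2 To &HDF: Need = 1
            Case &HE0 To &HEF: Need = 2
            Case &HF0 To &HF4: Need = 3
            Case Else: Exit Function            ' stray continuation or invalid lead byte
        End Select
        If Need > 0 Then HighSeen = True
        If i + Need > Last Then Exit Function   ' sequence truncated at end of data
        Do While Need > 0
            i = i + 1
            If (Data(i) And &HC0) <> &H80 Then Exit Function
            Need = Need - 1
        Loop
        i = i + 1
    Loop
    LooksLikeUtf8 = HighSeen
End Function
```

    With this layout a one-byte file never reaches the two- or three-byte BOM tests, so it simply comes back as ANSI.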
