Results 1 to 30 of 30

Thread: Help for encoding a text file with VB6

  1. #1

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Help for encoding a text file with VB6

    Hello friends
    First, forgive me for being weak English
    I want to write a program with vb6
    When the text file is dropped and released, we convert the file to "Windows-1256"
    "Utf8" and "Ansi" and "unicode" are converted to "Windows-1256"
    If you get a lot better

    Please help me

  2. #2
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Help for encoding a text file with VB6

    Windows-1256 is an ANSI codepage, not a non-ANSI encoding.

    Many text files have no BOM or other indicator of encoding, so guessing what encoding a file has can be problematic. Guessing what you want to re-encode the text file into is even harder but it sounds like you always want to produce Windows-1256.

  3. #3

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Thank you for replying
    I was wondering if you or anyone can provide a code that
    First check a file encode
    If the file encode is utf-8 show a text and do nothing else
    And if the file encode is anything beside utf-8 then change the file encode to utf-8

  4. #4
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Help for encoding a text file with VB6

    The problem with detecting UTF-8 is that both ASCII and all ANSI and DBCS encodings are valid UTF-8. Even if the file has bytes with the high bit set (greater than 127) it might be ANSI or UTF-16.

    The Notepad file encoding problem, redux discusses some of this.

  5. #5

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    In essence, I need to make a program for encoding Persian and Arabic subtitles
    I have a program that is made with .NET
    Is it not possible in Visual Basic 6?
    I'm attaching the application made with .NET
    I have created the second program and I attach it

    Thank you again
    Attached Files Attached Files

  6. #6
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    6,747

    Re: Help for encoding a text file with VB6

    Some of the routines I use for dealing with UTF8:
    Code:
    Private Const CP_UTF8 As Long = 65001
    
    Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
                             ByVal CodePage As Long, _
                             ByVal dwFlags As Long, _
                             ByVal lpWideCharStr As Long, _
                             ByVal cchWideChar As Long, _
                             ByVal lpMultiByteStr As Long, _
                             ByVal cbMultiByte As Long, _
                             ByVal lpDefaultChar As Long, _
                             ByVal lpUsedDefaultChar As Long) As Long
                             
    Public Function ConvertToUTF8(ByRef Source As String) As Byte()
      Dim Length As Long
      Dim Pointer As Long
      Dim Size As Long
      Dim buffer() As Byte
      
      If Len(Source) > 0 Then
        Length = Len(Source)
        Pointer = StrPtr(Source)
        Size = WideCharToMultiByte(CP_UTF8, 0, Pointer, Length, 0, 0, 0, 0)
        If Size > 0 Then
          ReDim buffer(0 To Size - 1)
          
          WideCharToMultiByte CP_UTF8, 0, Pointer, Length, VarPtr(buffer(0)), Size, 0, 0
          ConvertToUTF8 = buffer
        End If
      End If
      
    End Function
    
    '---------------------------------------------------------------------------------------
    ' Procedure : ContainsUTF8
    ' DateTime  : 2012-10-23
    ' Author    : Arnoutdv
    ' Purpose   : Recognize UTF8 characters in byte array
    '           : If byte1 >= "&C2" and byte1 <= "D3" and
    '           :    byte2 >= "&80" and byte2 <= "BF" then UTF8 encoded character
    ' Source    : http://www.utf8-chartable.de/unicode-utf8-table.pl
    ' Revision  : 2014-04-02 - 3 bytes UTF8 : http://www.utf8-chartable.de/unicode-utf8-table.pl?start=2048
    '---------------------------------------------------------------------------------------
    Public Function ContainsUTF8(ByRef Source() As Byte) As Boolean
      Dim i As Long, lUBound As Long
      Dim hexC2 As Byte, hexDB As Byte, hexBF As Byte, hex80 As Byte, hexE0 As Byte
      Dim CurByte As Byte
      
      If pUTF8header(Source) Then
        ContainsUTF8 = True
      Else
        hexC2 = &HC2
        hexDB = &HDB
        hex80 = &H80
        hexBF = &HBF
        hexE0 = &HE0
        
        lUBound = UBound(Source)
        
        For i = 0 To lUBound - 1
          CurByte = Source(i)
          If CurByte >= hexC2 And CurByte <= hexDB Then
            If (Source(i + 1) And hex80) Then
              ContainsUTF8 = True
              Exit For
            End If
          End If
          If i + 2 <= lUBound Then
            If CurByte >= hexE0 Then
              If (Source(i + 1) And hex80) And (Source(i + 2) And hex80) Then
                ContainsUTF8 = True
                Exit For
              End If
            End If
          End If
        Next i
        
      End If
    End Function
    
    '---------------------------------------------------------------------------------------
    ' Procedure : pUTF8header
    ' DateTime  : 2013-06-14
    ' Author    : Arnoutdv
    ' Purpose   :
    ' Reference : http://en.wikipedia.org/wiki/Byte_order_mark
    '---------------------------------------------------------------------------------------
    Private Function pUTF8header(Source() As Byte) As Boolean
    
      If UBound(Source) >= 2 Then
        If Source(0) = &HEF Then
          If Source(1) = &HBB Then
            If Source(2) = &HBF Then
              pUTF8header = True
              Exit Function
            End If
          End If
        End If
      End If
    
    End Function

  7. #7

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by Arnoutdv View Post
    Some of the routines I use for dealing with UTF8:
    Code:
    Private Const CP_UTF8 As Long = 65001
    
    Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
                             ByVal CodePage As Long, _
                             ByVal dwFlags As Long, _
                             ByVal lpWideCharStr As Long, _
                             ByVal cchWideChar As Long, _
                             ByVal lpMultiByteStr As Long, _
                             ByVal cbMultiByte As Long, _
                             ByVal lpDefaultChar As Long, _
                             ByVal lpUsedDefaultChar As Long) As Long
                             
    Public Function ConvertToUTF8(ByRef Source As String) As Byte()
      Dim Length As Long
      Dim Pointer As Long
      Dim Size As Long
      Dim buffer() As Byte
      
      If Len(Source) > 0 Then
        Length = Len(Source)
        Pointer = StrPtr(Source)
        Size = WideCharToMultiByte(CP_UTF8, 0, Pointer, Length, 0, 0, 0, 0)
        If Size > 0 Then
          ReDim buffer(0 To Size - 1)
          
          WideCharToMultiByte CP_UTF8, 0, Pointer, Length, VarPtr(buffer(0)), Size, 0, 0
          ConvertToUTF8 = buffer
        End If
      End If
      
    End Function
    
    '---------------------------------------------------------------------------------------
    ' Procedure : ContainsUTF8
    ' DateTime  : 2012-10-23
    ' Author    : Arnoutdv
    ' Purpose   : Recognize UTF8 characters in byte array
    '           : If byte1 >= "&C2" and byte1 <= "D3" and
    '           :    byte2 >= "&80" and byte2 <= "BF" then UTF8 encoded character
    ' Source    : http://www.utf8-chartable.de/unicode-utf8-table.pl
    ' Revision  : 2014-04-02 - 3 bytes UTF8 : http://www.utf8-chartable.de/unicode-utf8-table.pl?start=2048
    '---------------------------------------------------------------------------------------
    Public Function ContainsUTF8(ByRef Source() As Byte) As Boolean
      Dim i As Long, lUBound As Long
      Dim hexC2 As Byte, hexDB As Byte, hexBF As Byte, hex80 As Byte, hexE0 As Byte
      Dim CurByte As Byte
      
      If pUTF8header(Source) Then
        ContainsUTF8 = True
      Else
        hexC2 = &HC2
        hexDB = &HDB
        hex80 = &H80
        hexBF = &HBF
        hexE0 = &HE0
        
        lUBound = UBound(Source)
        
        For i = 0 To lUBound - 1
          CurByte = Source(i)
          If CurByte >= hexC2 And CurByte <= hexDB Then
            If (Source(i + 1) And hex80) Then
              ContainsUTF8 = True
              Exit For
            End If
          End If
          If i + 2 <= lUBound Then
            If CurByte >= hexE0 Then
              If (Source(i + 1) And hex80) And (Source(i + 2) And hex80) Then
                ContainsUTF8 = True
                Exit For
              End If
            End If
          End If
        Next i
        
      End If
    End Function
    
    '---------------------------------------------------------------------------------------
    ' Procedure : pUTF8header
    ' DateTime  : 2013-06-14
    ' Author    : Arnoutdv
    ' Purpose   :
    ' Reference : http://en.wikipedia.org/wiki/Byte_order_mark
    '---------------------------------------------------------------------------------------
    Private Function pUTF8header(Source() As Byte) As Boolean
    
      If UBound(Source) >= 2 Then
        If Source(0) = &HEF Then
          If Source(1) = &HBB Then
            If Source(2) = &HBF Then
              pUTF8header = True
              Exit Function
            End If
          End If
        End If
      End If
    
    End Function
    Thanks
    But that's not the case

  8. #8
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    6,747

    Re: Help for encoding a text file with VB6

    What is the case?
    Please be more specific

  9. #9

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by Arnoutdv View Post
    What is the case?
    Please be more specific
    I explained in post 5 and attached the file
    If I need to explain again?

  10. #10
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by aghamali View Post
    When the text file is dropped and released, we convert the file to "Windows-1256"
    "Utf8" and "Ansi" and "unicode" are converted to "Windows-1256"
    So you basically have text files in Ansi (1252?), Utf-8 and Unicode as input, and you want Ansi-1256 as output, right?

    EDIT: I see that there is a Unicode.txt in your project folder. But that one is not Unicode, it's Utf-8. Which reduces the problem to "Convert UTF-8 to codepage 1256".

    There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
    Last edited by jj2007; Jul 12th, 2017 at 08:13 AM.

  11. #11
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by jj2007 View Post
    There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
    This either trolling level 11 or just very hilarious :-))

    @aghamali: Try this implementation of a universal `ReadTextFile` function:
    Code:
    '--- for WideCharToMultiByte
    Private Const CP_UTF8                       As Long = 65001
    
    Private Declare Function IsTextUnicode Lib "advapi32" (lpBuffer As Any, ByVal cb As Long, lpi As Long) As Long
    Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
    
    Private Function ReadTextFile(sFile As String) As String
        Const ForReading    As Long = 1
        Const BOM_UTF       As String = "п»ї"   '--- "\xEF\xBB\xBF"
        Const BOM_UNICODE   As String = "яю"    '--- "\xFF\xFE"
        Dim lSize           As Long
        Dim sPrefix         As String
        Dim nFile           As Integer
        Dim sCharset        As String
        Dim oStream         As Object
        
        On Error GoTo EH
        '--- get file size
        lSize = FileLen(sFile)
        If lSize = 0 Then
            Exit Function
        End If
        '--- read first 50 chars
        nFile = FreeFile()
        Open sFile For Binary Access Read As nFile
        sPrefix = String$(IIf(lSize < 50, lSize, 50), 0)
        Get nFile, , sPrefix
        Close nFile
        '--- figure out charset
        If Left$(sPrefix, 3) = BOM_UTF Then
            sCharset = "UTF-8"
        ElseIf Left$(sPrefix, 2) = BOM_UNICODE Or IsTextUnicode(ByVal sPrefix, Len(sPrefix), &HFFFF& - 2) <> 0 Then
            sCharset = "Unicode"
        ElseIf InStr(1, sPrefix, "<?xml", vbTextCompare) > 0 And InStr(1, sPrefix, "utf-8", vbTextCompare) > 0 Then
            '--- special xml encoding test
            sCharset = "UTF-8"
        Else
            sCharset = "Windows-1256"
        End If
        '--- plain text: direct VB6 read
        If LenB(ReadTextFile) = 0 And sCharset = "Windows-1256" Then
            nFile = FreeFile()
            Open sFile For Binary Access Read As nFile
            ReadTextFile = String$(lSize, 0)
            Get nFile, , ReadTextFile
            Close nFile
        End If
        '--- plain text + unicode: use FileSystemObject
        If LenB(ReadTextFile) = 0 And sCharset <> "UTF-8" Then
            On Error Resume Next  '--- checked
            ReadTextFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(sFile, ForReading, False, sCharset = "Unicode").ReadAll()
            On Error GoTo EH
        End If
        '--- plain text + unicode + utf-8: use ADODB.Stream
        If LenB(ReadTextFile) = 0 Then
            Set oStream = CreateObject("ADODB.Stream")
            With oStream
                .Open
                .Charset = sCharset
                .LoadFromFile sFile
                ReadTextFile = .ReadText()
            End With
        End If
        '--- pseudo-fix UTF-8
        If sCharset = "Windows-1256" Then
            For lSize = &H80 To &HFF
                sPrefix = ToUtf8(Chr$(lSize))
                If InStr(ReadTextFile, sPrefix) > 0 Then
                    ReadTextFile = Replace(ReadTextFile, sPrefix, Chr$(lSize))
                End If
            Next
        End If
        Exit Function
    EH:
        Debug.Print Err.Description
    End Function
    
    Private Function ToUtf8(sText As String) As String
        Dim lSize           As Long
        
        ToUtf8 = String$(4 * Len(sText), 0)
        lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), ByVal ToUtf8, Len(ToUtf8), 0, 0)
        ToUtf8 = Left$(ToUtf8, lSize)
    End Function
    
    Private Sub Command1_Click()
        Dim sText As String
        sText = ReadTextFile(App.Path & "\utf8.txt")
        Text1.SelText = "utf8.txt:" & vbCrLf & sText
        sText = ReadTextFile(App.Path & "\unicode.txt")
        Text1.SelText = "unicode.txt:" & vbCrLf & sText
        sText = ReadTextFile(App.Path & "\ascii.txt")
        Text1.SelText = "ascii.txt:" & vbCrLf & sText
    End Sub
    It should handle cp1256/utf-8/unicode text files in one go.

    Edit: It tries to handle utf-8 w/o BOM files too (last "pseudo-fix" comment).

    cheers,
    </wqw>

  12. #12
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by wqweto View Post
    This either trolling level 11 or just very hilarious :-))
    I need 5 lines of code to translate Unicode.txt to an exact copy of ASCII.txt (the cp1256 format). But I like complicated solutions, too.

  13. #13
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by jj2007 View Post
    I need 5 lines of code to translate Unicode.txt to an exact copy of ASCII.txt (the cp1256 format). But I like complicated solutions, too.
    Hey, I'm copy/pasting production code here -- the cruft is 10+ years old at least.

    This is like a honey badger kind of read-every-format-of-text-file the end users manage to find kind of function. Works for me, probably will work for universal subtitles converter too (*and* supports utf-8 w/o BOM crazy popular format).

    No hard feelings but requesting a second explanation when the OP is just fed up explaining everything is in post #5 is kind of trollish. . . Finally suggesting MasmBasic reimpl in VB6 forum thread is level 11 :-))

    cheers,
    </wqw>

  14. #14

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by jj2007 View Post
    So you basically have text files in Ansi (1252?), Utf-8 and Unicode as input, and you want Ansi-1256 as output, right?

    EDIT: I see that there is a Unicode.txt in your project folder. But that one is not Unicode, it's Utf-8. Which reduces the problem to "Convert UTF-8 to codepage 1256".

    There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
    I want an app for converting movie subtitles to a playable version on TV with support from Persian and Arabic, which is UTF-8.
    And I only know his language
    So far, I've just tried to get the following results
    If the file "Ansi" is turned on, it will be done
    If the "utf8" file is corrupted, the file will be corrupted
    If the file and "unicode" is not converted

  15. #15
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    6,747

    Re: Help for encoding a text file with VB6

    No hard feelings but requesting a second explanation when the OP is just fed up explaining everything is in post #5 is kind of trollish.
    I just misunderstood his quite cryptic original question.
    I assumed he wanted UTF8 as output and detect whether the input file was already UTF8, hence the code I posted.

  16. #16

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by wqweto View Post
    This either trolling level 11 or just very hilarious :-))

    @aghamali: Try this implementation of a universal `ReadTextFile` function:
    Code:
    '--- for WideCharToMultiByte
    Private Const CP_UTF8                       As Long = 65001
    
    Private Declare Function IsTextUnicode Lib "advapi32" (lpBuffer As Any, ByVal cb As Long, lpi As Long) As Long
    Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
    
    Private Function ReadTextFile(sFile As String) As String
        Const ForReading    As Long = 1
        Const BOM_UTF       As String = "п»ї"   '--- "\xEF\xBB\xBF"
        Const BOM_UNICODE   As String = "яю"    '--- "\xFF\xFE"
        Dim lSize           As Long
        Dim sPrefix         As String
        Dim nFile           As Integer
        Dim sCharset        As String
        Dim oStream         As Object
        
        On Error GoTo EH
        '--- get file size
        lSize = FileLen(sFile)
        If lSize = 0 Then
            Exit Function
        End If
        '--- read first 50 chars
        nFile = FreeFile()
        Open sFile For Binary Access Read As nFile
        sPrefix = String$(IIf(lSize < 50, lSize, 50), 0)
        Get nFile, , sPrefix
        Close nFile
        '--- figure out charset
        If Left$(sPrefix, 3) = BOM_UTF Then
            sCharset = "UTF-8"
        ElseIf Left$(sPrefix, 2) = BOM_UNICODE Or IsTextUnicode(ByVal sPrefix, Len(sPrefix), &HFFFF& - 2) <> 0 Then
            sCharset = "Unicode"
        ElseIf InStr(1, sPrefix, "<?xml", vbTextCompare) > 0 And InStr(1, sPrefix, "utf-8", vbTextCompare) > 0 Then
            '--- special xml encoding test
            sCharset = "UTF-8"
        Else
            sCharset = "Windows-1256"
        End If
        '--- plain text: direct VB6 read
        If LenB(ReadTextFile) = 0 And sCharset = "Windows-1256" Then
            nFile = FreeFile()
            Open sFile For Binary Access Read As nFile
            ReadTextFile = String$(lSize, 0)
            Get nFile, , ReadTextFile
            Close nFile
        End If
        '--- plain text + unicode: use FileSystemObject
        If LenB(ReadTextFile) = 0 And sCharset <> "UTF-8" Then
            On Error Resume Next  '--- checked
            ReadTextFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(sFile, ForReading, False, sCharset = "Unicode").ReadAll()
            On Error GoTo EH
        End If
        '--- plain text + unicode + utf-8: use ADODB.Stream
        If LenB(ReadTextFile) = 0 Then
            Set oStream = CreateObject("ADODB.Stream")
            With oStream
                .Open
                .Charset = sCharset
                .LoadFromFile sFile
                ReadTextFile = .ReadText()
            End With
        End If
        '--- pseudo-fix UTF-8
        If sCharset = "Windows-1256" Then
            For lSize = &H80 To &HFF
                sPrefix = ToUtf8(Chr$(lSize))
                If InStr(ReadTextFile, sPrefix) > 0 Then
                    ReadTextFile = Replace(ReadTextFile, sPrefix, Chr$(lSize))
                End If
            Next
        End If
        Exit Function
    EH:
        Debug.Print Err.Description
    End Function
    
    Private Function ToUtf8(sText As String) As String
        Dim lSize           As Long
        
        ToUtf8 = String$(4 * Len(sText), 0)
        lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), ByVal ToUtf8, Len(ToUtf8), 0, 0)
        ToUtf8 = Left$(ToUtf8, lSize)
    End Function
    
    Private Sub Command1_Click()
        Dim sText As String
        sText = ReadTextFile(App.Path & "\utf8.txt")
        Text1.SelText = "utf8.txt:" & vbCrLf & sText
        sText = ReadTextFile(App.Path & "\unicode.txt")
        Text1.SelText = "unicode.txt:" & vbCrLf & sText
        sText = ReadTextFile(App.Path & "\ascii.txt")
        Text1.SelText = "ascii.txt:" & vbCrLf & sText
    End Sub
    It should handle cp1256/utf-8/unicode text files in one go.

    Edit: It tries to handle utf-8 w/o BOM files too (last "pseudo-fix" comment).

    cheers,
    </wqw>

    I did not catch
    What are these codes?

  17. #17

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    I just want a code to "encode" a text file with the "ansi" and "unicode" program to "utf8" and if the text file "utf8" does not convert.

    I put a program in post 5 that does exactly the same thing, but it is made with .NET
    I want to create a program like this

    this program SubtitleFixer.zip

    Unfortunately, because of my poor English language, I can not explain well
    All sorry for me

  18. #18
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by aghamali View Post
    I just want a code to "encode" a text file with the "ansi" and "unicode" program to "utf8" and if the text file "utf8" does not convert
    Please do not feel sorry that the rest of the World does not understand your language ;-)

    Btw Google translate understands it, perhaps you could write some very clear and simple phrases in Arabic or Urdu and let Google translate it. See what it does with this phrase:

    شاید آپ عربی یا اردو میں بہت واضح اور آسان جملے لکھ سکتے ہیں اور گوگل کا ترجمہ کرتے ہیں.

    ربما يمكنك كتابة بعض العبارات واضحة جدا وبسيطة باللغة العربية أو الأردية والسماح جوجل ترجمة ذلك.

    Another approach - which situation is correct?
    A: My source files are in Unicode format, but I need codepage 1256
    B: My source files are in Utf-8 format, but I need codepage 1256
    C: My source files are in codepage 1256 format, but I need UTF-8
    D: My source files are in codepage 1256 format, but I need Unicode

  19. #19
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Help for encoding a text file with VB6

    If your files are not extremely large you could use the ADO Stream class for this (ADODB.Stream):

    Example:

    Code:
    Option Explicit
    
    'Requires a reference to:
    '
    '   Microsoft ActiveX Data Objects 2.5 Library (or later).
    
    Private Sub Main()
        Dim StreamIn As ADODB.Stream
        Dim Header() As Byte
        Dim HasBOM As Boolean
        Dim StreamOut As ADODB.Stream
    
        Set StreamIn = New ADODB.Stream
        With StreamIn
            'Load "utf-8.txt" UTF-8 file, detect BOM and skip over if present:
            .Open
            .Type = adTypeBinary
            .LoadFromFile "utf-8.txt"
            If .Size >= 3 Then
                Header = .Read(3)
                HasBOM = Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF
            End If
            .Position = 0 'Allows us to change to text mode and assign Charset.
            .Type = adTypeText
            .Charset = "utf-8"
        End With
    
        Set StreamOut = New ADODB.Stream
        With StreamOut
            'Copy the text to "windows-1256.txt" ANSI 1252 file:
            .Open
            .Charset = "windows-1256"
            .Type = adTypeText
            'Move to beginning:
            If HasBOM Then
                StreamIn.Position = 3
            Else
                StreamIn.Position = 0
            End If
            .WriteText StreamIn.ReadText(adReadAll)
            .SaveToFile "windows-1256.txt", adSaveCreateOverWrite
            .Close
    
            'Copy the text to "unicode.txt" UTF-16LE file:
            .Open
            .Charset = "unicode"
            .Type = adTypeText
            'Move to beginning:
            If HasBOM Then
                StreamIn.Position = 3
            Else
                StreamIn.Position = 0
            End If
            .WriteText StreamIn.ReadText(adReadAll)
            .SaveToFile "unicode.txt", adSaveCreateOverWrite
            .Close
        End With
        StreamIn.Close
    
        MsgBox "Finished"
    End Sub
    Regarding the Charset property:

    String value that specifies the character set into which the contents of the Stream will be translated. The default value is "Unicode". Allowed values are typical strings passed over the interface as Internet character set strings (for example, "iso-8859-1", "Windows-1252", etc.). For a list of the character set strings that is known by a system, see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows Registry.

  20. #20

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by jj2007 View Post
    Please do not feel sorry that the rest of the World does not understand your language ;-)

    Btw Google translate understands it, perhaps you could write some very clear and simple phrases in Arabic or Urdu and let Google translate it. See what it does with this phrase:

    شاید آپ عربی یا اردو میں بہت واضح اور آسان جملے لکھ سکتے ہیں اور گوگل کا ترجمہ کرتے ہیں.

    ربما يمكنك كتابة بعض العبارات واضحة جدا وبسيطة باللغة العربية أو الأردية والسماح جوجل ترجمة ذلك.

    Another approach - which situation is correct?
    A: My source files are in Unicode format, but I need codepage 1256
    B: My source files are in Utf-8 format, but I need codepage 1256
    C: My source files are in codepage 1256 format, but I need UTF-8
    D: My source files are in codepage 1256 format, but I need Unicode
    Thank you for your thought

    None of the above
    It's not clear what the unicode text file is
    I want the app to recognize it

  21. #21

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by dilettante View Post
    If your files are not extremely large you could use the ADO Stream class for this (ADODB.Stream):

    Example:

    Code:
    Option Explicit
    
    'Requires a reference to:
    '
    '   Microsoft ActiveX Data Objects 2.5 Library (or later).
    
    Private Sub Main()
        Dim StreamIn As ADODB.Stream
        Dim Header() As Byte
        Dim HasBOM As Boolean
        Dim StreamOut As ADODB.Stream
    
        Set StreamIn = New ADODB.Stream
        With StreamIn
            'Load "utf-8.txt" UTF-8 file, detect BOM and skip over if present:
            .Open
            .Type = adTypeBinary
            .LoadFromFile "utf-8.txt"
            If .Size >= 3 Then
                Header = .Read(3)
                HasBOM = Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF
            End If
            .Position = 0 'Allows us to change to text mode and assign Charset.
            .Type = adTypeText
            .Charset = "utf-8"
        End With
    
        Set StreamOut = New ADODB.Stream
        With StreamOut
            'Copy the text to "windows-1256.txt" ANSI 1252 file:
            .Open
            .Charset = "windows-1256"
            .Type = adTypeText
            'Move to beginning:
            If HasBOM Then
                StreamIn.Position = 3
            Else
                StreamIn.Position = 0
            End If
            .WriteText StreamIn.ReadText(adReadAll)
            .SaveToFile "windows-1256.txt", adSaveCreateOverWrite
            .Close
    
            'Copy the text to "unicode.txt" UTF-16LE file:
            .Open
            .Charset = "unicode"
            .Type = adTypeText
            'Move to beginning:
            If HasBOM Then
                StreamIn.Position = 3
            Else
                StreamIn.Position = 0
            End If
            .WriteText StreamIn.ReadText(adReadAll)
            .SaveToFile "unicode.txt", adSaveCreateOverWrite
            .Close
        End With
        StreamIn.Close
    
        MsgBox "Finished"
    End Sub
    Regarding the Charset property:


    Thank you dear friend
    My file is a movie subtitle file

    You can further explain this class

  22. #22

    Thread Starter
    Member
    Join Date
    Jul 2017
    Posts
    46

    Re: Help for encoding a text file with VB6

    Name:  photo_2017-07-13_00-32-13.jpg
Views: 7569
Size:  39.1 KB

  23. #23
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    7,667

    Re: Help for encoding a text file with VB6

    Semi-related, couldn't help myself, I had the idea to duplicate Notepad's Open dialog where the user selects the encoding manually, and combined that with wqweto's ReadTextFile function; to manually force which option you use. Really I just wanted to do the dialog lol... but it works. The textbox on the form is API-created so it supports all characters. (Note: Doesn't work on XP and requires oleexp.tlb, the default folder code can be enabled-, it needs mIID.bas from the oleexp zip which otherwise wasn't needed so wasn't added)



    Code:
    Private Sub mnuOpen_Click()
    On Error Resume Next
    Dim isiRes As IShellItem, sPath As String
    Dim nEncode As Long, sEncode As String
    
    'Dim isiDef As IShellItem 'default folder
    'Dim kfm As New KnownFolderManager
    'Dim kfDocs As IKnownFolder
    'kfm.GetFolder FOLDERID_Documents, kfDocs
    'kfDocs.GetShellItem KF_FLAG_DEFAULT, IID_IShellItem, isiDef
    
    Dim lPtr As Long
    Dim lOptions As FILEOPENDIALOGOPTIONS
    Dim fdc As IFileDialogCustomize
    
    'Set up filter
    Dim FileFilter() As COMDLG_FILTERSPEC
    ReDim FileFilter(1)
    
    FileFilter(0).pszName = "Text Files"
    FileFilter(0).pszSpec = "*.txt"
    
    FileFilter(1).pszName = "All Files"
    FileFilter(1).pszSpec = "*.*"
    
    Set fod = New FileOpenDialog
    With fod
        .SetTitle "Open"
        .SetOptions FOS_FILEMUSTEXIST Or FOS_DONTADDTORECENT
    '    .SetDefaultFolder isiDef
        .SetFileTypes 2, VarPtr(FileFilter(0).pszName)
        
        Set fdc = fod
        fdc.AddComboBox 100&
        fdc.AddControlItem 100&, 1&, "Windows-1256"
        fdc.AddControlItem 100&, 2&, "Unicode"
        fdc.AddControlItem 100&, 3&, "UTF-8"
        fdc.SetSelectedControlItem 100&, 1&
        
        .Show Me.hwnd
        
        .GetResult isiRes
        If (isiRes Is Nothing) = False Then
            isiRes.GetDisplayName SIGDN_FILESYSPATH, lPtr
            sPath = BStrFromLPWStr(lPtr)
            
            fdc.GetSelectedControlItem 100&, nEncode
            
            Select Case nEncode
                Case 1: sEncode = "Windows-1256"
                Case 2: sEncode = "Unicode"
                Case 3: sEncode = "UTF-8    "
            End Select
            
            Edit_SetText hEdit, ReadTextFile(sPath, sEncode)
        End If
    End With
    End Sub
    Attached Files Attached Files
    Last edited by fafalone; Jul 13th, 2017 at 01:25 AM.

  24. #24
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: Help for encoding a text file with VB6

    @aghamali: This workflow diagram is not the way things work in VB6. It's actually much simpler -- you just need 2 simple routines to impl your conversion effort in 3 steps.

    1. Select filename w/ std file open dialog
    2. Read file contents and convert it to unicode string while reading
    3. Save unicode string to file and convert it to utf-8 while saving

    First step is something completely out of the scope of this thread but fafalone posted some snippet anyway.

    Second step can be done w/ the universal `ReadTextFile` function from post #11

    Third step can be done with `ADODB.Stream` as simple as this
    Code:
    Private Sub WriteUtf8File(sFile As String, sText As String)
        Const adSaveCreateOverWrite = 2
        
        On Error GoTo EH
        With CreateObject("ADODB.Stream")
            .Open
            .Charset = "UTF-8"
            .WriteText sText
            .SaveToFile sFile, adSaveCreateOverWrite
        End With
        Exit Sub
    EH:
        Debug.Print Err.Description
    End Sub
    In VB6 native strings are already in unicode (16-bit UCS-2 obsolete encoding, not UTF-16) and it's very inconvenient (next to impossible) to use utf-8 strings in memory for some kind of processing, so better stick with "normal" unicode BSTR strings.

    Also note that its best to convert ansi/utf-8/whatever/encoding to unicode VB6 string exactly when reading the input file. Conversion is not a separate process -- like first read all the file as a binary blog, then convert to unicode BSTR string in this case.

    cheers,
    </wqw>

  25. #25
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by fafalone View Post
    Semi-related, couldn't help myself, I had the idea to duplicate Notepad's Open dialog...
    Chokes (not surprisingly) here: Private fod As FileOpenDialog

    Missing oleexp, yes. And Installing oleexp is only slightly more complicated than installing GCC on Windows. You are dealing with a person who is not very fluent in English, and of unknown skills in VB6.

    I won't post my five-liner because I know how allergic VB users are to other dialects, but if you really want to help aghamali, try to give him something short and crispy, such as
    - get filename from commandline (Unicode)
    - check which encoding by reading a handful of bytes (BOM? Wide chars?)
    - translate and write to file.

  26. #26
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    7,667

    Re: Help for encoding a text file with VB6

    You copy the .tlb file from the zip into SysWOW64, that's it. How's that as bad as GCC? Then it's no longer missing and there's no more error.

  27. #27
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Help for encoding a text file with VB6

    Don't dump random things into system directories like that.

    The more appropriate location is a subfolder you create under %ProgramFiles%\Common Files or %ProgramFiles(x86)%\Common Files and ideally such a type library would come wrapped in an installer that does just that.

  28. #28
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    7,667

    Re: Help for encoding a text file with VB6

    Hmm I'm not sure I can even recall an installer that was just for a VB TLB. Although with all the other files and versions an installer for oleexp isn't a bad idea...

  29. #29
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Help for encoding a text file with VB6

    Quote Originally Posted by fafalone View Post
    You copy the .tlb file from the zip into SysWOW64, that's it. How's that as bad as GCC? Then it's no longer missing and there's no more error.
    Against dilettante's wise advice not to dump things into system folders, I moved it to C:\Windows\SysWOW64\oleexp.tlb

    Same error as before. Interesting: The error box says it is missing in the project's folder. But if I place a copy there, VB won't find it.
    Last edited by jj2007; Jul 15th, 2017 at 07:18 PM.

  30. #30
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    7,667

    Re: Help for encoding a text file with VB6

    When the references dialog that comes up still says missing, click 'Browse...' and select it from the dialog. You'll only have to do that once.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width