Hello friends
First, forgive me for being weak English
I want to write a program with vb6
When the text file is dropped and released, we convert the file to "Windows-1256"
"Utf8" and "Ansi" and "unicode" are converted to "Windows-1256"
If you get a lot better
Windows-1256 is an ANSI codepage, not a non-ANSI encoding.
Many text files have no BOM or other indicator of encoding, so guessing what encoding a file has can be problematic. Guessing what you want to re-encode the text file into is even harder but it sounds like you always want to produce Windows-1256.
Thank you for replying
I was wondering if you or anyone can provide a code that
First check a file encode
If the file encode is utf-8 show a text and do nothing else
And if the file encode is anything beside utf-8 then change the file encode to utf-8
The problem with detecting UTF-8 is that both ASCII and all ANSI and DBCS encodings are valid UTF-8. Even if the file has bytes with the high bit set (greater than 127) it might be ANSI or UTF-16.
In essence, I need to make a program for encoding Persian and Arabic subtitles
I have a program that is made with .NET
Is it not possible in Visual Basic 6?
I'm attaching the application made with .NET
I have created the second program and I attach it
Private Const CP_UTF8 As Long = 65001
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpDefaultChar As Long, _
ByVal lpUsedDefaultChar As Long) As Long
Public Function ConvertToUTF8(ByRef Source As String) As Byte()
Dim Length As Long
Dim Pointer As Long
Dim Size As Long
Dim buffer() As Byte
If Len(Source) > 0 Then
Length = Len(Source)
Pointer = StrPtr(Source)
Size = WideCharToMultiByte(CP_UTF8, 0, Pointer, Length, 0, 0, 0, 0)
If Size > 0 Then
ReDim buffer(0 To Size - 1)
WideCharToMultiByte CP_UTF8, 0, Pointer, Length, VarPtr(buffer(0)), Size, 0, 0
ConvertToUTF8 = buffer
End If
End If
End Function
'---------------------------------------------------------------------------------------
' Procedure : ContainsUTF8
' DateTime : 2012-10-23
' Author : Arnoutdv
' Purpose : Recognize UTF8 characters in byte array
' : If byte1 >= "&C2" and byte1 <= "D3" and
' : byte2 >= "&80" and byte2 <= "BF" then UTF8 encoded character
' Source : http://www.utf8-chartable.de/unicode-utf8-table.pl
' Revision : 2014-04-02 - 3 bytes UTF8 : http://www.utf8-chartable.de/unicode-utf8-table.pl?start=2048
'---------------------------------------------------------------------------------------
Public Function ContainsUTF8(ByRef Source() As Byte) As Boolean
Dim i As Long, lUBound As Long
Dim hexC2 As Byte, hexDB As Byte, hexBF As Byte, hex80 As Byte, hexE0 As Byte
Dim CurByte As Byte
If pUTF8header(Source) Then
ContainsUTF8 = True
Else
hexC2 = &HC2
hexDB = &HDB
hex80 = &H80
hexBF = &HBF
hexE0 = &HE0
lUBound = UBound(Source)
For i = 0 To lUBound - 1
CurByte = Source(i)
If CurByte >= hexC2 And CurByte <= hexDB Then
If (Source(i + 1) And hex80) Then
ContainsUTF8 = True
Exit For
End If
End If
If i + 2 <= lUBound Then
If CurByte >= hexE0 Then
If (Source(i + 1) And hex80) And (Source(i + 2) And hex80) Then
ContainsUTF8 = True
Exit For
End If
End If
End If
Next i
End If
End Function
'---------------------------------------------------------------------------------------
' Procedure : pUTF8header
' DateTime : 2013-06-14
' Author : Arnoutdv
' Purpose :
' Reference : http://en.wikipedia.org/wiki/Byte_order_mark
'---------------------------------------------------------------------------------------
Private Function pUTF8header(Source() As Byte) As Boolean
If UBound(Source) >= 2 Then
If Source(0) = &HEF Then
If Source(1) = &HBB Then
If Source(2) = &HBF Then
pUTF8header = True
Exit Function
End If
End If
End If
End If
End Function
Private Const CP_UTF8 As Long = 65001
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpDefaultChar As Long, _
ByVal lpUsedDefaultChar As Long) As Long
Public Function ConvertToUTF8(ByRef Source As String) As Byte()
Dim Length As Long
Dim Pointer As Long
Dim Size As Long
Dim buffer() As Byte
If Len(Source) > 0 Then
Length = Len(Source)
Pointer = StrPtr(Source)
Size = WideCharToMultiByte(CP_UTF8, 0, Pointer, Length, 0, 0, 0, 0)
If Size > 0 Then
ReDim buffer(0 To Size - 1)
WideCharToMultiByte CP_UTF8, 0, Pointer, Length, VarPtr(buffer(0)), Size, 0, 0
ConvertToUTF8 = buffer
End If
End If
End Function
'---------------------------------------------------------------------------------------
' Procedure : ContainsUTF8
' DateTime : 2012-10-23
' Author : Arnoutdv
' Purpose : Recognize UTF8 characters in byte array
' : If byte1 >= "&C2" and byte1 <= "D3" and
' : byte2 >= "&80" and byte2 <= "BF" then UTF8 encoded character
' Source : http://www.utf8-chartable.de/unicode-utf8-table.pl
' Revision : 2014-04-02 - 3 bytes UTF8 : http://www.utf8-chartable.de/unicode-utf8-table.pl?start=2048
'---------------------------------------------------------------------------------------
Public Function ContainsUTF8(ByRef Source() As Byte) As Boolean
Dim i As Long, lUBound As Long
Dim hexC2 As Byte, hexDB As Byte, hexBF As Byte, hex80 As Byte, hexE0 As Byte
Dim CurByte As Byte
If pUTF8header(Source) Then
ContainsUTF8 = True
Else
hexC2 = &HC2
hexDB = &HDB
hex80 = &H80
hexBF = &HBF
hexE0 = &HE0
lUBound = UBound(Source)
For i = 0 To lUBound - 1
CurByte = Source(i)
If CurByte >= hexC2 And CurByte <= hexDB Then
If (Source(i + 1) And hex80) Then
ContainsUTF8 = True
Exit For
End If
End If
If i + 2 <= lUBound Then
If CurByte >= hexE0 Then
If (Source(i + 1) And hex80) And (Source(i + 2) And hex80) Then
ContainsUTF8 = True
Exit For
End If
End If
End If
Next i
End If
End Function
'---------------------------------------------------------------------------------------
' Procedure : pUTF8header
' DateTime : 2013-06-14
' Author : Arnoutdv
' Purpose :
' Reference : http://en.wikipedia.org/wiki/Byte_order_mark
'---------------------------------------------------------------------------------------
Private Function pUTF8header(Source() As Byte) As Boolean
If UBound(Source) >= 2 Then
If Source(0) = &HEF Then
If Source(1) = &HBB Then
If Source(2) = &HBF Then
pUTF8header = True
Exit Function
End If
End If
End If
End If
End Function
When the text file is dropped and released, we convert the file to "Windows-1256"
"Utf8" and "Ansi" and "unicode" are converted to "Windows-1256"
So you basically have text files in Ansi (1252?), Utf-8 and Unicode as input, and you want Ansi-1256 as output, right?
EDIT: I see that there is a Unicode.txt in your project folder. But that one is not Unicode, it's Utf-8. Which reduces the problem to "Convert UTF-8 to codepage 1256".
There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
Last edited by jj2007; Jul 12th, 2017 at 08:13 AM.
There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
This either trolling level 11 or just very hilarious :-))
@aghamali: Try this implementation of a universal `ReadTextFile` function:
Code:
'--- for WideCharToMultiByte
Private Const CP_UTF8 As Long = 65001
Private Declare Function IsTextUnicode Lib "advapi32" (lpBuffer As Any, ByVal cb As Long, lpi As Long) As Long
Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
Private Function ReadTextFile(sFile As String) As String
Const ForReading As Long = 1
Const BOM_UTF As String = "п»ї" '--- "\xEF\xBB\xBF"
Const BOM_UNICODE As String = "яю" '--- "\xFF\xFE"
Dim lSize As Long
Dim sPrefix As String
Dim nFile As Integer
Dim sCharset As String
Dim oStream As Object
On Error GoTo EH
'--- get file size
lSize = FileLen(sFile)
If lSize = 0 Then
Exit Function
End If
'--- read first 50 chars
nFile = FreeFile()
Open sFile For Binary Access Read As nFile
sPrefix = String$(IIf(lSize < 50, lSize, 50), 0)
Get nFile, , sPrefix
Close nFile
'--- figure out charset
If Left$(sPrefix, 3) = BOM_UTF Then
sCharset = "UTF-8"
ElseIf Left$(sPrefix, 2) = BOM_UNICODE Or IsTextUnicode(ByVal sPrefix, Len(sPrefix), &HFFFF& - 2) <> 0 Then
sCharset = "Unicode"
ElseIf InStr(1, sPrefix, "<?xml", vbTextCompare) > 0 And InStr(1, sPrefix, "utf-8", vbTextCompare) > 0 Then
'--- special xml encoding test
sCharset = "UTF-8"
Else
sCharset = "Windows-1256"
End If
'--- plain text: direct VB6 read
If LenB(ReadTextFile) = 0 And sCharset = "Windows-1256" Then
nFile = FreeFile()
Open sFile For Binary Access Read As nFile
ReadTextFile = String$(lSize, 0)
Get nFile, , ReadTextFile
Close nFile
End If
'--- plain text + unicode: use FileSystemObject
If LenB(ReadTextFile) = 0 And sCharset <> "UTF-8" Then
On Error Resume Next '--- checked
ReadTextFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(sFile, ForReading, False, sCharset = "Unicode").ReadAll()
On Error GoTo EH
End If
'--- plain text + unicode + utf-8: use ADODB.Stream
If LenB(ReadTextFile) = 0 Then
Set oStream = CreateObject("ADODB.Stream")
With oStream
.Open
.Charset = sCharset
.LoadFromFile sFile
ReadTextFile = .ReadText()
End With
End If
'--- pseudo-fix UTF-8
If sCharset = "Windows-1256" Then
For lSize = &H80 To &HFF
sPrefix = ToUtf8(Chr$(lSize))
If InStr(ReadTextFile, sPrefix) > 0 Then
ReadTextFile = Replace(ReadTextFile, sPrefix, Chr$(lSize))
End If
Next
End If
Exit Function
EH:
Debug.Print Err.Description
End Function
Private Function ToUtf8(sText As String) As String
Dim lSize As Long
ToUtf8 = String$(4 * Len(sText), 0)
lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), ByVal ToUtf8, Len(ToUtf8), 0, 0)
ToUtf8 = Left$(ToUtf8, lSize)
End Function
Private Sub Command1_Click()
Dim sText As String
sText = ReadTextFile(App.Path & "\utf8.txt")
Text1.SelText = "utf8.txt:" & vbCrLf & sText
sText = ReadTextFile(App.Path & "\unicode.txt")
Text1.SelText = "unicode.txt:" & vbCrLf & sText
sText = ReadTextFile(App.Path & "\ascii.txt")
Text1.SelText = "ascii.txt:" & vbCrLf & sText
End Sub
It should handle cp1256/utf-8/unicode text files in one go.
Edit: It tries to handle utf-8 w/o BOM files too (last "pseudo-fix" comment).
I need 5 lines of code to translate Unicode.txt to an exact copy of ASCII.txt (the cp1256 format). But I like complicated solutions, too.
Hey, I'm copy/pasting production code here -- the cruft is 10+ years old at least.
This is like a honey badger kind of read-every-format-of-text-file the end users manage to find kind of function. Works for me, probably will work for universal subtitles converter too (*and* supports utf-8 w/o BOM crazy popular format).
No hard feelings but requesting a second explanation when the OP is just fed up explaining everything is in post #5 is kind of trollish. . . Finally suggesting MasmBasic reimpl in VB6 forum thread is level 11 :-))
So you basically have text files in Ansi (1252?), Utf-8 and Unicode as input, and you want Ansi-1256 as output, right?
EDIT: I see that there is a Unicode.txt in your project folder. But that one is not Unicode, it's Utf-8. Which reduces the problem to "Convert UTF-8 to codepage 1256".
There is ConvertCp$, but you seem to have a reason to use VB6, can you explain why?
I want an app for converting movie subtitles to a playable version on TV with support from Persian and Arabic, which is UTF-8.
And I only know his language
So far, I've just tried to get the following results
If the file "Ansi" is turned on, it will be done
If the "utf8" file is corrupted, the file will be corrupted
If the file and "unicode" is not converted
No hard feelings but requesting a second explanation when the OP is just fed up explaining everything is in post #5 is kind of trollish.
I just misunderstood his quite cryptic original question.
I assumed he wanted UTF8 as output and detect whether the input file was already UTF8, hence the code I posted.
This either trolling level 11 or just very hilarious :-))
@aghamali: Try this implementation of a universal `ReadTextFile` function:
Code:
'--- for WideCharToMultiByte
Private Const CP_UTF8 As Long = 65001
Private Declare Function IsTextUnicode Lib "advapi32" (lpBuffer As Any, ByVal cb As Long, lpi As Long) As Long
Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, lpMultiByteStr As Any, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long
Private Function ReadTextFile(sFile As String) As String
Const ForReading As Long = 1
Const BOM_UTF As String = "п»ї" '--- "\xEF\xBB\xBF"
Const BOM_UNICODE As String = "яю" '--- "\xFF\xFE"
Dim lSize As Long
Dim sPrefix As String
Dim nFile As Integer
Dim sCharset As String
Dim oStream As Object
On Error GoTo EH
'--- get file size
lSize = FileLen(sFile)
If lSize = 0 Then
Exit Function
End If
'--- read first 50 chars
nFile = FreeFile()
Open sFile For Binary Access Read As nFile
sPrefix = String$(IIf(lSize < 50, lSize, 50), 0)
Get nFile, , sPrefix
Close nFile
'--- figure out charset
If Left$(sPrefix, 3) = BOM_UTF Then
sCharset = "UTF-8"
ElseIf Left$(sPrefix, 2) = BOM_UNICODE Or IsTextUnicode(ByVal sPrefix, Len(sPrefix), &HFFFF& - 2) <> 0 Then
sCharset = "Unicode"
ElseIf InStr(1, sPrefix, "<?xml", vbTextCompare) > 0 And InStr(1, sPrefix, "utf-8", vbTextCompare) > 0 Then
'--- special xml encoding test
sCharset = "UTF-8"
Else
sCharset = "Windows-1256"
End If
'--- plain text: direct VB6 read
If LenB(ReadTextFile) = 0 And sCharset = "Windows-1256" Then
nFile = FreeFile()
Open sFile For Binary Access Read As nFile
ReadTextFile = String$(lSize, 0)
Get nFile, , ReadTextFile
Close nFile
End If
'--- plain text + unicode: use FileSystemObject
If LenB(ReadTextFile) = 0 And sCharset <> "UTF-8" Then
On Error Resume Next '--- checked
ReadTextFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(sFile, ForReading, False, sCharset = "Unicode").ReadAll()
On Error GoTo EH
End If
'--- plain text + unicode + utf-8: use ADODB.Stream
If LenB(ReadTextFile) = 0 Then
Set oStream = CreateObject("ADODB.Stream")
With oStream
.Open
.Charset = sCharset
.LoadFromFile sFile
ReadTextFile = .ReadText()
End With
End If
'--- pseudo-fix UTF-8
If sCharset = "Windows-1256" Then
For lSize = &H80 To &HFF
sPrefix = ToUtf8(Chr$(lSize))
If InStr(ReadTextFile, sPrefix) > 0 Then
ReadTextFile = Replace(ReadTextFile, sPrefix, Chr$(lSize))
End If
Next
End If
Exit Function
EH:
Debug.Print Err.Description
End Function
Private Function ToUtf8(sText As String) As String
Dim lSize As Long
ToUtf8 = String$(4 * Len(sText), 0)
lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), ByVal ToUtf8, Len(ToUtf8), 0, 0)
ToUtf8 = Left$(ToUtf8, lSize)
End Function
Private Sub Command1_Click()
Dim sText As String
sText = ReadTextFile(App.Path & "\utf8.txt")
Text1.SelText = "utf8.txt:" & vbCrLf & sText
sText = ReadTextFile(App.Path & "\unicode.txt")
Text1.SelText = "unicode.txt:" & vbCrLf & sText
sText = ReadTextFile(App.Path & "\ascii.txt")
Text1.SelText = "ascii.txt:" & vbCrLf & sText
End Sub
It should handle cp1256/utf-8/unicode text files in one go.
Edit: It tries to handle utf-8 w/o BOM files too (last "pseudo-fix" comment).
I just want a code to "encode" a text file with the "ansi" and "unicode" program to "utf8" and if the text file "utf8" does not convert
Please do not feel sorry that the rest of the World does not understand your language ;-)
Btw Google translate understands it, perhaps you could write some very clear and simple phrases in Arabic or Urdu and let Google translate it. See what it does with this phrase:
شاید آپ عربی یا اردو میں بہت واضح اور آسان جملے لکھ سکتے ہیں اور گوگل کا ترجمہ کرتے ہیں.
ربما يمكنك كتابة بعض العبارات واضحة جدا وبسيطة باللغة العربية أو الأردية والسماح جوجل ترجمة ذلك.
Another approach - which situation is correct?
A: My source files are in Unicode format, but I need codepage 1256
B: My source files are in Utf-8 format, but I need codepage 1256
C: My source files are in codepage 1256 format, but I need UTF-8
D: My source files are in codepage 1256 format, but I need Unicode
If your files are not extremely large you could use the ADO Stream class for this (ADODB.Stream):
Example:
Code:
Option Explicit
'Requires a reference to:
'
' Microsoft ActiveX Data Objects 2.5 Library (or later).
Private Sub Main()
Dim StreamIn As ADODB.Stream
Dim Header() As Byte
Dim HasBOM As Boolean
Dim StreamOut As ADODB.Stream
Set StreamIn = New ADODB.Stream
With StreamIn
'Load "utf-8.txt" UTF-8 file, detect BOM and skip over if present:
.Open
.Type = adTypeBinary
.LoadFromFile "utf-8.txt"
If .Size >= 3 Then
Header = .Read(3)
HasBOM = Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF
End If
.Position = 0 'Allows us to change to text mode and assign Charset.
.Type = adTypeText
.Charset = "utf-8"
End With
Set StreamOut = New ADODB.Stream
With StreamOut
'Copy the text to "windows-1256.txt" ANSI 1252 file:
.Open
.Charset = "windows-1256"
.Type = adTypeText
'Move to beginning:
If HasBOM Then
StreamIn.Position = 3
Else
StreamIn.Position = 0
End If
.WriteText StreamIn.ReadText(adReadAll)
.SaveToFile "windows-1256.txt", adSaveCreateOverWrite
.Close
'Copy the text to "unicode.txt" UTF-16LE file:
.Open
.Charset = "unicode"
.Type = adTypeText
'Move to beginning:
If HasBOM Then
StreamIn.Position = 3
Else
StreamIn.Position = 0
End If
.WriteText StreamIn.ReadText(adReadAll)
.SaveToFile "unicode.txt", adSaveCreateOverWrite
.Close
End With
StreamIn.Close
MsgBox "Finished"
End Sub
Regarding the Charset property:
String value that specifies the character set into which the contents of the Stream will be translated. The default value is "Unicode". Allowed values are typical strings passed over the interface as Internet character set strings (for example, "iso-8859-1", "Windows-1252", etc.). For a list of the character set strings that is known by a system, see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows Registry.
Please do not feel sorry that the rest of the World does not understand your language ;-)
Btw Google translate understands it, perhaps you could write some very clear and simple phrases in Arabic or Urdu and let Google translate it. See what it does with this phrase:
شاید آپ عربی یا اردو میں بہت واضح اور آسان جملے لکھ سکتے ہیں اور گوگل کا ترجمہ کرتے ہیں.
ربما يمكنك كتابة بعض العبارات واضحة جدا وبسيطة باللغة العربية أو الأردية والسماح جوجل ترجمة ذلك.
Another approach - which situation is correct?
A: My source files are in Unicode format, but I need codepage 1256
B: My source files are in Utf-8 format, but I need codepage 1256
C: My source files are in codepage 1256 format, but I need UTF-8
D: My source files are in codepage 1256 format, but I need Unicode
Thank you for your thought
None of the above
It's not clear what the unicode text file is
I want the app to recognize it
If your files are not extremely large you could use the ADO Stream class for this (ADODB.Stream):
Example:
Code:
Option Explicit
'Requires a reference to:
'
' Microsoft ActiveX Data Objects 2.5 Library (or later).
Private Sub Main()
Dim StreamIn As ADODB.Stream
Dim Header() As Byte
Dim HasBOM As Boolean
Dim StreamOut As ADODB.Stream
Set StreamIn = New ADODB.Stream
With StreamIn
'Load "utf-8.txt" UTF-8 file, detect BOM and skip over if present:
.Open
.Type = adTypeBinary
.LoadFromFile "utf-8.txt"
If .Size >= 3 Then
Header = .Read(3)
HasBOM = Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF
End If
.Position = 0 'Allows us to change to text mode and assign Charset.
.Type = adTypeText
.Charset = "utf-8"
End With
Set StreamOut = New ADODB.Stream
With StreamOut
'Copy the text to "windows-1256.txt" ANSI 1252 file:
.Open
.Charset = "windows-1256"
.Type = adTypeText
'Move to beginning:
If HasBOM Then
StreamIn.Position = 3
Else
StreamIn.Position = 0
End If
.WriteText StreamIn.ReadText(adReadAll)
.SaveToFile "windows-1256.txt", adSaveCreateOverWrite
.Close
'Copy the text to "unicode.txt" UTF-16LE file:
.Open
.Charset = "unicode"
.Type = adTypeText
'Move to beginning:
If HasBOM Then
StreamIn.Position = 3
Else
StreamIn.Position = 0
End If
.WriteText StreamIn.ReadText(adReadAll)
.SaveToFile "unicode.txt", adSaveCreateOverWrite
.Close
End With
StreamIn.Close
MsgBox "Finished"
End Sub
Regarding the Charset property:
Thank you dear friend
My file is a movie subtitle file
Semi-related, couldn't help myself, I had the idea to duplicate Notepad's Open dialog where the user selects the encoding manually, and combined that with wqweto's ReadTextFile function; to manually force which option you use. Really I just wanted to do the dialog lol... but it works. The textbox on the form is API-created so it supports all characters. (Note: Doesn't work on XP and requires oleexp.tlb, the default folder code can be enabled-, it needs mIID.bas from the oleexp zip which otherwise wasn't needed so wasn't added)
Code:
Private Sub mnuOpen_Click()
On Error Resume Next
Dim isiRes As IShellItem, sPath As String
Dim nEncode As Long, sEncode As String
'Dim isiDef As IShellItem 'default folder
'Dim kfm As New KnownFolderManager
'Dim kfDocs As IKnownFolder
'kfm.GetFolder FOLDERID_Documents, kfDocs
'kfDocs.GetShellItem KF_FLAG_DEFAULT, IID_IShellItem, isiDef
Dim lPtr As Long
Dim lOptions As FILEOPENDIALOGOPTIONS
Dim fdc As IFileDialogCustomize
'Set up filter
Dim FileFilter() As COMDLG_FILTERSPEC
ReDim FileFilter(1)
FileFilter(0).pszName = "Text Files"
FileFilter(0).pszSpec = "*.txt"
FileFilter(1).pszName = "All Files"
FileFilter(1).pszSpec = "*.*"
Set fod = New FileOpenDialog
With fod
.SetTitle "Open"
.SetOptions FOS_FILEMUSTEXIST Or FOS_DONTADDTORECENT
' .SetDefaultFolder isiDef
.SetFileTypes 2, VarPtr(FileFilter(0).pszName)
Set fdc = fod
fdc.AddComboBox 100&
fdc.AddControlItem 100&, 1&, "Windows-1256"
fdc.AddControlItem 100&, 2&, "Unicode"
fdc.AddControlItem 100&, 3&, "UTF-8"
fdc.SetSelectedControlItem 100&, 1&
.Show Me.hwnd
.GetResult isiRes
If (isiRes Is Nothing) = False Then
isiRes.GetDisplayName SIGDN_FILESYSPATH, lPtr
sPath = BStrFromLPWStr(lPtr)
fdc.GetSelectedControlItem 100&, nEncode
Select Case nEncode
Case 1: sEncode = "Windows-1256"
Case 2: sEncode = "Unicode"
Case 3: sEncode = "UTF-8 "
End Select
Edit_SetText hEdit, ReadTextFile(sPath, sEncode)
End If
End With
End Sub
Last edited by fafalone; Jul 13th, 2017 at 01:25 AM.
@aghamali: This workflow diagram is not the way things work in VB6. It's actually much simpler -- you just need 2 simple routines to impl your conversion effort in 3 steps.
1. Select filename w/ std file open dialog
2. Read file contents and convert it to unicode string while reading
3. Save unicode string to file and convert it to utf-8 while saving
First step is something completely out of the scope of this thread but fafalone posted some snippet anyway.
Second step can be done w/ the universal `ReadTextFile` function from post #11
Third step can be done with `ADODB.Stream` as simple as this
Code:
Private Sub WriteUtf8File(sFile As String, sText As String)
Const adSaveCreateOverWrite = 2
On Error GoTo EH
With CreateObject("ADODB.Stream")
.Open
.Charset = "UTF-8"
.WriteText sText
.SaveToFile sFile, adSaveCreateOverWrite
End With
Exit Sub
EH:
Debug.Print Err.Description
End Sub
In VB6 native strings are already in unicode (16-bit UCS-2 obsolete encoding, not UTF-16) and it's very inconvenient (next to impossible) to use utf-8 strings in memory for some kind of processing, so better stick with "normal" unicode BSTR strings.
Also note that its best to convert ansi/utf-8/whatever/encoding to unicode VB6 string exactly when reading the input file. Conversion is not a separate process -- like first read all the file as a binary blog, then convert to unicode BSTR string in this case.
Semi-related, couldn't help myself, I had the idea to duplicate Notepad's Open dialog...
Chokes (not surprisingly) here: Private fod As FileOpenDialog
Missing oleexp, yes. And Installing oleexp is only slightly more complicated than installing GCC on Windows. You are dealing with a person who is not very fluent in English, and of unknown skills in VB6.
I won't post my five-liner because I know how allergic VB users are to other dialects, but if you really want to help aghamali, try to give him something short and crispy, such as
- get filename from commandline (Unicode)
- check which encoding by reading a handful of bytes (BOM? Wide chars?)
- translate and write to file.
Don't dump random things into system directories like that.
The more appropriate location is a subfolder you create under %ProgramFiles%\Common Files or %ProgramFiles(x86)%\Common Files and ideally such a type library would come wrapped in an installer that does just that.
Hmm I'm not sure I can even recall an installer that was just for a VB TLB. Although with all the other files and versions an installer for oleexp isn't a bad idea...