Classic VB - Does Visual Basic 6 support Unicode?-VBForums

  1. #1

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Classic VB - Does Visual Basic 6 support Unicode?

    Yes and no. Yes, in that VB strings hold 16-bit characters and are thus Unicode compatible. The problems come from several directions:
    • Strings passed to API functions are converted to ANSI, and strings coming back are converted from ANSI to Unicode
    • Reading a file into a string automatically converts the data from ANSI to Unicode
    • The same applies when saving: the string is converted to ANSI
    • Visual Basic controls are not Unicode aware

    Luckily we have byte arrays to help us with the reading and writing business: they are easy to convert to strings and string contents can be copied to byte arrays easily with native VB code. Thus when working with Unicode, a byte array is your best bet.
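
    As a rough sketch of the byte array approach, the code below writes a string to disk and reads it back without any ANSI conversion. ReadFileBytes, WriteFileBytes, DemoRawBytes and the file path are hypothetical names chosen for this example only. It assumes the file holds raw UTF-16LE data with no byte order mark.

```vb
Option Explicit

' Hypothetical helper: reads a whole file as raw bytes, so VB gets no
' chance to run its ANSI-to-Unicode conversion on the data.
Public Function ReadFileBytes(ByVal FileName As String) As Byte()
    Dim fileNum As Integer
    Dim buffer() As Byte

    fileNum = FreeFile
    Open FileName For Binary Access Read As #fileNum
    If LOF(fileNum) > 0 Then
        ReDim buffer(0 To LOF(fileNum) - 1)
        Get #fileNum, , buffer
    End If
    Close #fileNum

    ReadFileBytes = buffer
End Function

' Hypothetical helper: writes raw bytes without any conversion.
Public Sub WriteFileBytes(ByVal FileName As String, ByRef Data() As Byte)
    Dim fileNum As Integer

    ' Binary mode does not truncate, so remove an existing file first
    If Len(Dir$(FileName)) > 0 Then Kill FileName

    fileNum = FreeFile
    Open FileName For Binary Access Write As #fileNum
    Put #fileNum, , Data
    Close #fileNum
End Sub

Public Sub DemoRawBytes()
    Dim rawBytes() As Byte
    Dim original As String
    Dim restored As String

    ' ChrW$ builds characters that an ANSI codepage may not be able to hold
    original = "Unicode: " & ChrW$(&H3A9) & ChrW$(&H416)

    rawBytes = original                     ' string -> bytes, copied as-is (UTF-16LE)
    WriteFileBytes "C:\temp\unicode.txt", rawBytes   ' example path, adjust as needed

    rawBytes = ReadFileBytes("C:\temp\unicode.txt")
    restored = rawBytes                     ' bytes -> string, again copied as-is

    Debug.Assert restored = original
End Sub
```

    The plain assignments between String and Byte() are what make this work: VB copies the string's memory byte for byte instead of converting it.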


    Where are the controls?

    The worst news is that the VB6 controls are ANSI. VB property bags can't hold Unicode data, and the VB runtime passes data to the controls only after ANSI conversion. Thus none of the default controls can be used for Unicode. The only free options are to code the controls yourself or to look for controls made by others on the Internet. The easy solution costs money: people sell Unicode-aware controls, and for many of them there is no free version. Personally, I've made a UniLabel and a UniCommand and both are available for free (link).
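
    Not every piece of UI needs a custom control. As a minimal sketch of getting around the ANSI conversion on API calls, the example below declares the wide ("W") version of MessageBox with Long parameters and passes string pointers with StrPtr. ShowUnicodeMessage is a hypothetical name. This only illustrates the general technique; it is not how the UniLabel or UniCommand controls are built.

```vb
Option Explicit

' The string parameters are declared As Long so that StrPtr() can be
' passed; with no String argument, VB has nothing to convert to ANSI.
Private Declare Function MessageBoxW Lib "user32" ( _
    ByVal hWnd As Long, ByVal lpText As Long, _
    ByVal lpCaption As Long, ByVal uType As Long) As Long

Public Sub ShowUnicodeMessage()
    Dim msg As String
    Dim title As String

    ' Greek and Cyrillic letters built with ChrW$
    msg = ChrW$(&H3A9) & ChrW$(&H43C) & ChrW$(&H3B1)
    title = "Unicode test"

    ' vbOKOnly has the same value (0) as MB_OK
    MessageBoxW 0&, StrPtr(msg), StrPtr(title), vbOKOnly
End Sub
```

    Whether the characters actually show up still depends on the fonts installed on the system.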


    Switching codepages

    This code is still a work in progress: it is a simple module for converting between codepages, including UTF-8, which might be handy. The functions take in and return a byte array. So far I haven't found an error in the current code. Use GetACP to find out the system's default codepage (this is the codepage VB uses when it loads a file into a string).

    VB Code:
    Option Explicit

    Public Enum KnownCodePage
        CP_UNKNOWN = -1
        CP_ACP = 0
        CP_OEMCP = 1
        CP_MACCP = 2
        CP_THREAD_ACP = 3
        CP_SYMBOL = 42
    '   ARABIC
        CP_AWIN = 101   ' Bidi Windows codepage
        CP_709 = 102    ' MS-DOS Arabic Support CP 709
        CP_720 = 103    ' MS-DOS Arabic Support CP 720
        CP_A708 = 104   ' ASMO 708
        CP_A449 = 105   ' ASMO 449+
        CP_TARB = 106   ' MS Transparent Arabic
        CP_NAE = 107    ' Nafitha Enhanced Arabic Char Set
        CP_V4 = 108     ' Nafitha v 4.0
        CP_MA2 = 109    ' Mussaed Al Arabi (MA/2) CP 786
        CP_I864 = 110   ' IBM Arabic Supplement CP 864
        CP_A437 = 111   ' Ansi 437 codepage
        CP_AMAC = 112   ' Macintosh Code Page
    '   HEBREW
        CP_HWIN = 201   ' Bidi Windows codepage
        CP_862I = 202   ' IBM Hebrew Supplement CP 862
        CP_7BIT = 203   ' IBM Hebrew Supplement CP 862 Folded
        CP_ISO = 204    ' ISO Hebrew 8859-8 Character Set
        CP_H437 = 205   ' Ansi 437 codepage
        CP_HMAC = 206   ' Macintosh Code Page
    '   CODE PAGES
        CP_OEM_437 = 437
        CP_ARABICDOS = 708
        CP_DOS720 = 720
        CP_DOS737 = 737
        CP_DOS775 = 775
        CP_IBM850 = 850
        CP_IBM852 = 852
        CP_DOS861 = 861
        CP_DOS862 = 862
        CP_IBM866 = 866
        CP_DOS869 = 869
        CP_THAI = 874
        CP_EBCDIC = 875
        CP_JAPAN = 932
        CP_CHINA = 936
        CP_KOREA = 949
        CP_TAIWAN = 950
    '   UNICODE
        CP_UNICODELITTLE = 1200
        CP_UNICODEBIG = 1201
    '   CODE PAGES
        CP_EASTEUROPE = 1250
        CP_RUSSIAN = 1251
        CP_WESTEUROPE = 1252
        CP_GREEK = 1253
        CP_TURKISH = 1254
        CP_HEBREW = 1255
        CP_ARABIC = 1256
        CP_BALTIC = 1257
        CP_VIETNAMESE = 1258
    '   KOREAN
        CP_JOHAB = 1361
    '   MAC
        CP_MAC_ROMAN = 10000
        CP_MAC_JAPAN = 10001
        CP_MAC_ARABIC = 10004
        CP_MAC_GREEK = 10006
        CP_MAC_CYRILLIC = 10007
        CP_MAC_LATIN2 = 10029
        CP_MAC_TURKISH = 10081
    '   CODE PAGES
        CP_CHINESECNS = 20000
        CP_CHINESEETEN = 20002
        CP_IA5WEST = 20105
        CP_IA5GERMAN = 20106
        CP_IA5SWEDISH = 20107
        CP_IA5NORWEGIAN = 20108
        CP_ASCII = 20127
        CP_RUSSIANKOI8R = 20866
        CP_RUSSIANKOI8U = 21866
        CP_ISOLATIN1 = 28591
        CP_ISOEASTEUROPE = 28592
        CP_ISOTURKISH = 28593
        CP_ISOBALTIC = 28594
        CP_ISORUSSIAN = 28595
        CP_ISOARABIC = 28596
        CP_ISOGREEK = 28597
        CP_ISOHEBREW = 28598
        CP_ISOTURKISH2 = 28599
        CP_ISOLATIN9 = 28605
        CP_HEBREWLOG = 38598
        CP_USER = 50000
        CP_AUTOALL = 50001
        CP_JAPANNHK = 50220
        CP_JAPANESC = 50221
        CP_JAPANISO = 50222
        CP_KOREAISO = 50225
        CP_TAIWANISO = 50227
        CP_CHINAISO = 50229
        CP_AUTOJAPAN = 50932
        CP_AUTOCHINA = 50936
        CP_AUTOKOREA = 50949
        CP_AUTOTAIWAN = 50950
        CP_AUTORUSSIAN = 51251
        CP_AUTOGREEK = 51253
        CP_AUTOARABIC = 51256
        CP_JAPANEUC = 51932
        CP_CHINAEUC = 51936
        CP_KOREAEUC = 51949
        CP_TAIWANEUC = 51950
        CP_CHINAHZ = 52936
        CP_GB18030 = 54936
    '   UNICODE
        CP_UTF7 = 65000
        CP_UTF8 = 65001
    End Enum

    ' Flags
    Public Const MB_PRECOMPOSED = &H1
    Public Const MB_COMPOSITE = &H2
    Public Const MB_USEGLYPHCHARS = &H4
    Public Const MB_ERR_INVALID_CHARS = &H8

    Public Const WC_DEFAULTCHECK = &H100                ' check for default char
    Public Const WC_COMPOSITECHECK = &H200              ' convert composite to precomposed
    Public Const WC_DISCARDNS = &H10                    ' discard non-spacing chars
    Public Const WC_SEPCHARS = &H20                     ' generate separate chars
    Public Const WC_DEFAULTCHAR = &H40                  ' replace with default char

    Public Declare Function GetACP Lib "kernel32" () As Long
    Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, _
    ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
    Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, _
    ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, _
    lpUsedDefaultChar As Long) As Long

    ' Converts bytes in the given codepage (ANSI, UTF-8, ...) to UTF-16LE bytes
    Public Function ANSItoUTF16(ByRef Text() As Byte, Optional ByVal cPage As KnownCodePage = CP_UNKNOWN, _
                                Optional lFlags As Long) As Byte()
        Static tmpArr() As Byte
        Dim tmpLen As Long, textLen As Long, A As Long
        ' nothing to convert if the array is uninitialized
        If (Not Text) = True Then Exit Function
        ' set code page to a valid one
        If cPage = CP_UNKNOWN Then cPage = GetACP
        If cPage = CP_ACP Or cPage = CP_WESTEUROPE Then
            ' shortcut: widen each byte to 16 bits; note that this does not map the
            ' Windows-1252 characters in the range &H80-&H9F (such as the euro sign)
            textLen = UBound(Text)
            tmpLen = textLen + textLen + 1
            If (Not tmpArr) = True Then ReDim Preserve tmpArr(tmpLen)
            If UBound(tmpArr) <> tmpLen Then ReDim Preserve tmpArr(tmpLen)
            For A = 0 To textLen
                tmpArr(A + A) = Text(A)
                ' tmpArr is Static: clear any high byte left over from an earlier call
                tmpArr(A + A + 1) = 0
            Next A
        Else
            ' pass the byte array and its length directly, so embedded null bytes
            ' do not end the conversion early
            textLen = UBound(Text) + 1
            ' first call: ask how many UTF-16 characters the result needs
            tmpLen = MultiByteToWideChar(CLng(cPage), lFlags, VarPtr(Text(0)), textLen, 0&, 0&)
            If tmpLen = 0 Then Exit Function
            ReDim tmpArr(tmpLen + tmpLen - 1)
            ' second call: get the new string to tmpArr
            tmpLen = MultiByteToWideChar(CLng(cPage), lFlags, VarPtr(Text(0)), textLen, _
                                         VarPtr(tmpArr(0)), tmpLen)
            If tmpLen = 0 Then Exit Function
        End If
        ' return the result
        ANSItoUTF16 = tmpArr
    End Function

    ' Converts UTF-16LE bytes to bytes in the given codepage (ANSI, UTF-8, ...)
    Public Function UTF16toANSI(ByRef Text() As Byte, Optional ByVal cPage As KnownCodePage = CP_UNKNOWN, _
                                Optional lFlags As Long) As Byte()
        Static tmpArr() As Byte
        Dim tmpLen As Long, textLen As Long, A As Long
        ' nothing to convert if the array is uninitialized
        If (Not Text) = True Then Exit Function
        ' set code page to a valid one
        If cPage = CP_UNKNOWN Then cPage = GetACP
        If cPage = CP_ACP Or cPage = CP_WESTEUROPE Then
            ' shortcut: keep only the low byte of each character (lossy above &HFF)
            textLen = UBound(Text)
            tmpLen = (textLen + 1) \ 2 - 1
            If (Not tmpArr) = True Then ReDim Preserve tmpArr(tmpLen)
            If UBound(tmpArr) <> tmpLen Then ReDim Preserve tmpArr(tmpLen)
            For A = 0 To tmpLen
                tmpArr(A) = Text(A + A)
            Next A
        Else
            textLen = (UBound(Text) + 1) \ 2
            ' at maximum ANSI can be four bytes per character in the new Chinese encoding GB 18030-2000
            tmpLen = textLen + textLen + textLen + textLen + 1
            ReDim Preserve tmpArr(tmpLen - 1)
            ' get the new string to tmpArr
            tmpLen = WideCharToMultiByte(CLng(cPage), lFlags, ByVal VarPtr(Text(0)), textLen, ByVal VarPtr(tmpArr(0)), _
                                         tmpLen, ByVal 0&, ByVal 0&)
            If tmpLen = 0 Then Exit Function
            ' trim the buffer to the number of bytes actually written
            ReDim Preserve tmpArr(tmpLen - 1)
        End If
        ' return the result
        UTF16toANSI = tmpArr
    End Function
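
    A short usage sketch may help here. It assumes the module above is part of the same project; DemoUTF8RoundTrip is a hypothetical name. It encodes a VB string as UTF-8, decodes it again, and prints the codepage reported by GetACP.

```vb
Public Sub DemoUTF8RoundTrip()
    Dim original As String
    Dim restored As String
    Dim utf16() As Byte
    Dim utf8() As Byte

    ' Latin letters with umlaut and sharp s, plus a Greek capital omega
    original = "Gr" & ChrW$(&HFC) & ChrW$(&HDF) & "e " & ChrW$(&H3A9)

    utf16 = original                        ' raw UTF-16LE bytes of the string
    utf8 = UTF16toANSI(utf16, CP_UTF8)      ' encode as UTF-8
    utf16 = ANSItoUTF16(utf8, CP_UTF8)      ' decode back to UTF-16LE
    restored = utf16                        ' bytes -> string, copied as-is

    Debug.Print "UTF-8 byte count: " & (UBound(utf8) + 1)
    Debug.Print "System ANSI codepage: " & GetACP()
    Debug.Assert restored = original
End Sub
```

    Passing CP_UTF8 explicitly keeps both calls away from the Western European shortcut, so the conversion goes through the Windows API in each direction.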
    Last edited by si_the_geek; Jun 17th, 2008 at 07:23 AM. Reason: updated link to controls
