
Thread: [RESOLVED] em dash display

  1. #1

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    9,622

    Resolved [RESOLVED] em dash display

    A database (Access) I have includes em dashes (the long dash, as opposed to the short, en dash). When I retrieve a field with that character (within a sentence or phrase of normal text), how do I recognize it to display as an em dash in, say, a flexgrid, listbox or textbox? Currently a thick vertical bar is displayed in my controls.

  2. #2
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,892

    Re: em dash display

    Just a guess, but a Unicode capable listbox, flexgrid, or textbox would likely show the em dash.

  3. #3
    PowerPoster ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    3,129

    Re: em dash display

    hi,

    You could try InStr in a query. Are you loading from a table?

    I did a quick test and this worked
    Code:
    SELECT Empfaenger.EM_Name1, Empfaenger.EM_Name2, InStr(1,[EM_Name2],"b") AS Expr1
    FROM Empfaenger;
    it returns the position where b is

    hth

  4. #4

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    9,622

    Re: em dash display

    Finding the em dash in the database table is not the issue; displaying it in controls is. I'll look into making my controls Unicode capable in that project.

  5. #5
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    924

    Re: em dash display

    If it's only one character, I would use the Mid statement to replace it with a regular dash, or even loop through all records and eliminate that character. Some programs, like MS Word, use stylized characters when using numbered bullets. I think Outlook, which uses MS Word as its editor, changes double quotes to stylized ones (ANSI codes 147 and 148 in Windows-1252, instead of ASCII 34), so if you emailed someone a command line that includes paths with spaces, copying it into the Command Prompt would result in an error. I haven't used Outlook for years, so I don't know if this is still an issue.
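
    A minimal sketch of the replacement idea, assuming the stored character really is the Unicode em dash (U+2014) or en dash (U+2013). The function name NormalizeDashes is hypothetical, and it uses Replace rather than the Mid statement so that every occurrence is handled in one call:

    ```vb
    ' Hypothetical helper: swap Unicode em/en dashes for a plain ASCII hyphen.
    ' Assumes the text contains U+2014 and/or U+2013; other dash-like
    ' characters are left untouched.
    Public Function NormalizeDashes(ByVal Text As String) As String
        Text = Replace(Text, ChrW$(&H2014), "-")   ' U+2014 em dash
        Text = Replace(Text, ChrW$(&H2013), "-")   ' U+2013 en dash
        NormalizeDashes = Text
    End Function
    ```

    It could be applied to each field value just before assigning it to the control, so the database itself stays unchanged.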

  6. #6

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    9,622

    Re: em dash display

    Quote Originally Posted by qvb6 View Post
    If it's only one character, I would use the Mid statement to replace it with a regular dash, or even loop through all records and eliminate that character. Some programs, like MS Word, use stylized characters when using numbered bullets. I think Outlook, which uses MS Word as its editor, changes double quotes to stylized ones (ANSI codes 147 and 148 in Windows-1252, instead of ASCII 34), so if you emailed someone a command line that includes paths with spaces, copying it into the Command Prompt would result in an error. I haven't used Outlook for years, so I don't know if this is still an issue.
    That's a possibility. The db DOES have both en and em dashes, but for what I need to do, I guess it doesn't really matter which is displayed in my program.

    But, I 'solved' my issue by using Find/Replace in my source document (Excel file). So, for those of you looking to see 'how' the em dash is displayed in VB6, unless someone posts an example, I won't be attempting to show how to do it, as I 'solved' "MY" problem in other ways.

  7. #7
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: [RESOLVED] em dash display

    "Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?

  8. #8

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    9,622

    Re: [RESOLVED] em dash display

    Quote Originally Posted by dilettante View Post
    "Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?
    No, I'm not sure at all. It appears several times in a free Bible app downloaded from the Microsoft Store. I copied the contents from the app (via copy/paste), and it "appears" to be an em dash both in the source and in the Excel file into which I pasted it (I then imported the file into MS Access). I have no idea 'what' it is, but it didn't appear as a 'long dash' in my program. I got around it by replacing it in Excel: I copy/pasted the 'long dash' (whatever it really is) into the Search field and replaced it with a regular (en) dash. Good to know a 'real' em dash displays fine.
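
    One way to find out what the character really is would be to list the code points of everything outside plain ASCII in the retrieved string. This is only a sketch, and the function name NonAsciiCodePoints is hypothetical:

    ```vb
    ' Hypothetical diagnostic: return the code points of all characters
    ' above 127 in Text, formatted like "U+2014", separated by spaces.
    Public Function NonAsciiCodePoints(ByVal Text As String) As String
        Dim i As Long
        Dim cp As Long
        Dim result As String

        For i = 1 To Len(Text)
            ' AscW returns a signed Integer; mask it to get 0..65535
            cp = AscW(Mid$(Text, i, 1)) And &HFFFF&
            If cp > 127 Then
                result = result & "U+" & Right$("0000" & Hex$(cp), 4) & " "
            End If
        Next
        NonAsciiCodePoints = RTrim$(result)
    End Function
    ```

    Calling Debug.Print NonAsciiCodePoints(...) on a field value that shows the vertical bar would tell you whether it holds U+2014 (the em dash) or some other dash-like character.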

  9. #9
    Hyperactive Member
    Join Date
    Mar 2018
    Posts
    462

    Re: [RESOLVED] em dash display

    Quote Originally Posted by dilettante View Post
    "Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?
    Sounds like it might be Unicode u+2014
    Last edited by DllHell; Mar 20th, 2019 at 05:00 PM.

  10. #10
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: [RESOLVED] em dash display

    Quote Originally Posted by DllHell View Post
    Sounds like it might be Unicode u+2014
    That's the same character as &H97.

  11. #11
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,167

    Re: [RESOLVED] em dash display

    Quote Originally Posted by dilettante View Post
    That's the same character as &H97.
    The miracles of Windows code pages :-))

    Everything from &H80 upward is interpreted according to the current system code page in a non-Unicode application, but... en and em dashes are &H96 and &H97 in every code page except the far-eastern ones.
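
    A quick way to see what the current system code page does with a given character is to convert a string to ANSI bytes with StrConv and look at the result. This is a sketch only; the function name AnsiBytesHex is hypothetical, and the output depends on the machine's active code page:

    ```vb
    ' Hypothetical helper: convert Text to ANSI using the system code page
    ' and return the resulting bytes as two-digit hex values.
    Public Function AnsiBytesHex(ByVal Text As String) As String
        Dim b() As Byte
        Dim i As Long
        Dim result As String

        If Len(Text) = 0 Then Exit Function
        b = StrConv(Text, vbFromUnicode)    ' Unicode -> ANSI (system code page)
        For i = LBound(b) To UBound(b)
            result = result & Right$("0" & Hex$(b(i)), 2) & " "
        Next
        AnsiBytesHex = RTrim$(result)
    End Function
    ```

    With code page 1252 active, Debug.Print AnsiBytesHex(ChrW$(&H2014)) should print 97. On a far-eastern code page the result will differ.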

    cheers,
    </wqw>

  12. #12
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    924

    Re: [RESOLVED] em dash display

    Yep, anything over 127 is subject to code page translation. The default code page for US English is Windows-1252. What's more, when you create characters in the range 0 to 255 with Chr(), they are converted internally to Unicode through the active code page, so for some of them AscW() returns a value outside the range 0 to 255. ChrW(), by contrast, treats its argument as a Unicode code point, and in Unicode the characters in the range 128 to 255 are fixed and not subject to code page translation.

    Here is a simple loop going through 0 to 255 and printing the values for which Chr() and ChrW() produce different characters:

    VB Code:
    Option Explicit

    Private Declare Function GetACP Lib "kernel32" () As Long

    Private Sub Form_Load()
        Dim i As Long
        Dim s As String

        Debug.Print "Active Code Page = " & GetACP()
        Debug.Print "Chr(Dec)", "Chr(Hex)", "Asc(Hex)", "AscW(Hex)"
        For i = 0 To 255
            s = Chr(i)
            ' Compare Chr() with ChrW(), and print where they differ
            If s <> ChrW(i) Then
                Debug.Print i, Hex(i), Hex(Asc(s)), GetHex(AscW(s))
            End If
        Next
    End Sub

    ' Get Hex value padded with 0 to the left
    Public Function GetHex(ByVal i As Long) As String
        GetHex = Right("000" & Hex(i), 4)
    End Function

    Output:

    Code:
    Active Code Page = 1252
    Chr(Dec)      Chr(Hex)      Asc(Hex)      AscW(Hex)
     128          80            80            20AC
     130          82            82            201A
     131          83            83            0192
     132          84            84            201E
     133          85            85            2026
     134          86            86            2020
     135          87            87            2021
     136          88            88            02C6
     137          89            89            2030
     138          8A            8A            0160
     139          8B            8B            2039
     140          8C            8C            0152
     142          8E            8E            017D
     145          91            91            2018
     146          92            92            2019
     147          93            93            201C
     148          94            94            201D
     149          95            95            2022
     150          96            96            2013
     151          97            97            2014
     152          98            98            02DC
     153          99            99            2122
     154          9A            9A            0161
     155          9B            9B            203A
     156          9C            9C            0153
     158          9E            9E            017E
     159          9F            9F            0178

  13. #13
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: [RESOLVED] em dash display

    Quote Originally Posted by qvb6 View Post
    Yep, anything over 127 is subject to code page translation.
    Not really.

    If you are talking about the Chr$() function, it makes a call to MultiByteToWideChar() passing CP_ACP. What happens depends on the current ANSI codepage.

  14. #14
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: [RESOLVED] em dash display

    Code:
    Option Explicit
    
    Private Const CP_USASCII As Long = 20127
    
    Private Declare Function MultiByteToWideChar Lib "Kernel32" ( _
        ByVal CodePage As Long, _
        ByVal dwFlags As Long, _
        ByRef MultiByteStr As Byte, _
        ByVal cbMultiByte As Long, _
        ByVal lpWideCharStr As Long, _
        ByVal cchWideChar As Long) As Long
    
    Private Sub Dump()
        Dim Bytes(0 To 255) As Byte
        Dim I As Long
        Dim Chars As String
    
        For I = 0 To 255
            Bytes(I) = CByte(I)
        Next
        Chars = Space$(256)
        MultiByteToWideChar CP_USASCII, 0, Bytes(0), 256, StrPtr(Chars), 256
        For I = 1 To 64
            Debug.Print I; "="; AscW(Mid$(Chars, I, 1)), _
                        I + 64; "="; AscW(Mid$(Chars, I + 64, 1)), _
                        I + 128; "="; AscW(Mid$(Chars, I + 128, 1)), _
                        I + 192; "="; AscW(Mid$(Chars, I + 192, 1))
        Next
    End Sub
    Results in:

    Code:
     1 = 0         65 = 64       129 = 0       193 = 64 
     2 = 1         66 = 65       130 = 1       194 = 65 
     3 = 2         67 = 66       131 = 2       195 = 66 
     4 = 3         68 = 67       132 = 3       196 = 67 
     5 = 4         69 = 68       133 = 4       197 = 68 
     6 = 5         70 = 69       134 = 5       198 = 69 
     7 = 6         71 = 70       135 = 6       199 = 70 
     8 = 7         72 = 71       136 = 7       200 = 71 
     9 = 8         73 = 72       137 = 8       201 = 72 
     10 = 9        74 = 73       138 = 9       202 = 73 
     11 = 10       75 = 74       139 = 10      203 = 74 
     12 = 11       76 = 75       140 = 11      204 = 75 
     13 = 12       77 = 76       141 = 12      205 = 76 
     14 = 13       78 = 77       142 = 13      206 = 77 
     15 = 14       79 = 78       143 = 14      207 = 78 
     16 = 15       80 = 79       144 = 15      208 = 79 
     17 = 16       81 = 80       145 = 16      209 = 80 
     18 = 17       82 = 81       146 = 17      210 = 81 
     19 = 18       83 = 82       147 = 18      211 = 82 
     20 = 19       84 = 83       148 = 19      212 = 83 
     21 = 20       85 = 84       149 = 20      213 = 84 
     22 = 21       86 = 85       150 = 21      214 = 85 
     23 = 22       87 = 86       151 = 22      215 = 86 
     24 = 23       88 = 87       152 = 23      216 = 87 
     25 = 24       89 = 88       153 = 24      217 = 88 
     26 = 25       90 = 89       154 = 25      218 = 89 
     27 = 26       91 = 90       155 = 26      219 = 90 
     28 = 27       92 = 91       156 = 27      220 = 91 
     29 = 28       93 = 92       157 = 28      221 = 92 
     30 = 29       94 = 93       158 = 29      222 = 93 
     31 = 30       95 = 94       159 = 30      223 = 94 
     32 = 31       96 = 95       160 = 31      224 = 95 
     33 = 32       97 = 96       161 = 32      225 = 96 
     34 = 33       98 = 97       162 = 33      226 = 97 
     35 = 34       99 = 98       163 = 34      227 = 98 
     36 = 35       100 = 99      164 = 35      228 = 99 
     37 = 36       101 = 100     165 = 36      229 = 100 
     38 = 37       102 = 101     166 = 37      230 = 101 
     39 = 38       103 = 102     167 = 38      231 = 102 
     40 = 39       104 = 103     168 = 39      232 = 103 
     41 = 40       105 = 104     169 = 40      233 = 104 
     42 = 41       106 = 105     170 = 41      234 = 105 
     43 = 42       107 = 106     171 = 42      235 = 106 
     44 = 43       108 = 107     172 = 43      236 = 107 
     45 = 44       109 = 108     173 = 44      237 = 108 
     46 = 45       110 = 109     174 = 45      238 = 109 
     47 = 46       111 = 110     175 = 46      239 = 110 
     48 = 47       112 = 111     176 = 47      240 = 111 
     49 = 48       113 = 112     177 = 48      241 = 112 
     50 = 49       114 = 113     178 = 49      242 = 113 
     51 = 50       115 = 114     179 = 50      243 = 114 
     52 = 51       116 = 115     180 = 51      244 = 115 
     53 = 52       117 = 116     181 = 52      245 = 116 
     54 = 53       118 = 117     182 = 53      246 = 117 
     55 = 54       119 = 118     183 = 54      247 = 118 
     56 = 55       120 = 119     184 = 55      248 = 119 
     57 = 56       121 = 120     185 = 56      249 = 120 
     58 = 57       122 = 121     186 = 57      250 = 121 
     59 = 58       123 = 122     187 = 58      251 = 122 
     60 = 59       124 = 123     188 = 59      252 = 123 
     61 = 60       125 = 124     189 = 60      253 = 124 
     62 = 61       126 = 125     190 = 61      254 = 125 
     63 = 62       127 = 126     191 = 62      255 = 126 
     64 = 63       128 = 127     192 = 63      256 = 127
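
    To compare what a single byte such as &H97 becomes under different code pages, the same API call can be wrapped in a small function. This is a sketch under the same declaration as above; the name ByteToCodePoint is hypothetical:

    ```vb
    Option Explicit

    Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
        ByVal CodePage As Long, _
        ByVal dwFlags As Long, _
        ByRef MultiByteStr As Byte, _
        ByVal cbMultiByte As Long, _
        ByVal lpWideCharStr As Long, _
        ByVal cchWideChar As Long) As Long

    ' Hypothetical helper: decode one byte with the given code page and
    ' return the resulting UTF-16 code unit, or -1 if the call fails.
    Public Function ByteToCodePoint(ByVal b As Byte, ByVal CodePage As Long) As Long
        Dim s As String

        s = Space$(1)
        If MultiByteToWideChar(CodePage, 0, b, 1, StrPtr(s), 1) = 0 Then
            ByteToCodePoint = -1
        Else
            ' AscW returns a signed Integer; mask it to get 0..65535
            ByteToCodePoint = AscW(s) And &HFFFF&
        End If
    End Function
    ```

    Hex$(ByteToCodePoint(&H97, 1252)) should give 2014, the em dash, while ByteToCodePoint(&H97, 20127) should give 23, matching the "152 = 23" entry in the table above.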

  15. #15
    Hyperactive Member
    Join Date
    Mar 2018
    Posts
    462

    Re: [RESOLVED] em dash display

    Quote Originally Posted by wqweto View Post
    En and Em dashes are &H96 and &H97 in every code page except far-eastern
    This explains a lot. I have several clients in the area, so I tend to always test Unicode stuff using PRC settings.

  16. #16
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    924

    Re: [RESOLVED] em dash display

    Quote Originally Posted by dilettante View Post
    Private Const CP_USASCII As Long = 20127
    Not to pick a fight, but I found this page defines CP_USASCII as 1252, while this one shows that "20127" is the code page for "us-ascii". I am not surprised by the confusing information out there, even in MSDN. I find that specifying character codes without the encoding scheme (ANSI plus code page, other single-byte schemes, or Unicode) is like saying that the temperature is 40 degrees without saying the units (C/F). Even in VB6, char values (obtained with Asc, not AscW) are like a temperature without units (the code page). Naming the code page each time, and trying to be as accurate as possible, would mean writing walls of text, which not many have the time for.

  17. #17
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: [RESOLVED] em dash display

    My only point was that things are more complicated than any absolute rules might indicate. You could have an SBCS codepage that produces almost anything from inputs of 0-255.

    You are correct that Microsoft defines a CODEPAGEID of CP_USASCII as 1252 in one place, as a local constant only, in an interface definition for IMimeInternational. And they warn: "Do not use." In other words, this codepage symbolic name has no real definition and certainly no global definition in the Win32 API. Meanwhile "us-ascii" is not valid syntax in most programming languages anyway, aside from maybe COBOL, and the codepage value 20127 has no official symbolic name assigned anywhere.

    Would it have changed anything if I had named the constant US_ASCII or HIGH_BIT_STRIPPER_ASCII? No.
