-
Mar 20th, 2019, 11:21 AM
#1
[RESOLVED] em dash display
A database (Access) I have includes em dashes (the long dash, as opposed to the short, en dash). When I retrieve a field with that character (within a sentence or phrase of normal text), how do I recognize it to display as an em dash in, say, a flexgrid, listbox or textbox? Currently a thick vertical bar is displayed in my controls.
-
Mar 20th, 2019, 11:53 AM
#2
Re: em dash display
Just a guess, but a Unicode capable listbox, flexgrid, or textbox would likely show the em dash.
-
Mar 20th, 2019, 11:56 AM
#3
Re: em dash display
Hi,
You could try InStr in a query. Are you loading from a table?
I did a quick test and this worked:
Code:
SELECT Empfaenger.EM_Name1, Empfaenger.EM_Name2, InStr(1,[EM_Name2],"b") AS Expr1
FROM Empfaenger;
It returns the position where "b" is.
hth
-
Mar 20th, 2019, 12:25 PM
#4
Re: em dash display
Finding the em dash in the database table is not the issue; displaying it in controls is. I'll look into making the controls in that project Unicode capable.
-
Mar 20th, 2019, 12:40 PM
#5
Fanatic Member
Re: em dash display
If it's only one character, I would use the Mid statement to replace it with a regular dash, or even loop through all records and eliminate that character. Some programs, like MS Word, use stylized characters when using numbered bullets, and I think Outlook, which uses MS Word as an editor, changes double quotes to stylized ones (Windows-1252 codes 147 and 148, instead of ASCII 34), so if you were emailing someone a command line that includes paths with spaces, copying it to Command Prompt would result in an error. I haven't used Outlook for years, so I don't know if this is still an issue.
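A minimal sketch of the replacement idea, assuming the field value has already been read into a VB6 String, where the dashes would be stored as the Unicode characters U+2014 (em dash) and U+2013 (en dash). The function name NormalizeDashes is hypothetical; Replace$ and ChrW$ are standard VB6 functions.

```vb
Option Explicit

' Hypothetical helper: swaps em and en dashes for a plain hyphen.
Public Function NormalizeDashes(ByVal Text As String) As String
    ' ChrW$(&H2014) is the em dash, ChrW$(&H2013) is the en dash.
    Text = Replace$(Text, ChrW$(&H2014), "-")
    Text = Replace$(Text, ChrW$(&H2013), "-")
    NormalizeDashes = Text
End Function
```

It could be called on each field value just before the value is assigned to the flexgrid, listbox or textbox, which leaves the data in the table untouched.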
-
Mar 20th, 2019, 01:59 PM
#6
Re: em dash display
 Originally Posted by qvb6
If it's only one character, I would use the Mid statement to replace it with a regular dash, or even loop through all records and eliminate that character. Some programs, like MS Word, use stylized characters when using numbered bullets, and I think Outlook, which uses MS Word as an editor, changes double quotes to stylized ones (Windows-1252 codes 147 and 148, instead of ASCII 34), so if you were emailing someone a command line that includes paths with spaces, copying it to Command Prompt would result in an error. I haven't used Outlook for years, so I don't know if this is still an issue.
That's a possibility. The db DOES have both en and em dashes, but for what I need to do, I guess it doesn't really matter which is displayed in my program.
But, I 'solved' my issue by using Find/Replace in my source document (Excel file). So, for those of you looking to see 'how' the em dash is displayed in VB6, unless someone posts an example, I won't be attempting to show how to do it, as I 'solved' "MY" problem in other ways.
-
Mar 20th, 2019, 04:08 PM
#7
Re: [RESOLVED] em dash display
"Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?
-
Mar 20th, 2019, 04:37 PM
#8
Re: [RESOLVED] em dash display
 Originally Posted by dilettante
"Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?
No, I sure am not. It appears several times in a free Bible app downloaded from the Microsoft Store. As I copied the contents from the app (via copy/paste), it "appears" to be an em dash in the source as well as in the Excel file into which I pasted it (I then imported the file into MS Access). No idea 'what' it is, but it didn't appear as a 'long dash' in my program... but I got around it by replacing it in Excel (I did THAT by copy/pasting the 'long dash', whatever it really is, into the Search field and replacing it with a regular (en) dash). Good to know a 'real' em dash displays fine.
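One way to find out what such a character really is would be to list the Unicode code point of every character in the retrieved string. The sketch below assumes the text is already in a VB6 String; the function name CodePointList is hypothetical. A true em dash would show up as U+2014, an en dash as U+2013 and an ordinary hyphen as U+002D.

```vb
Option Explicit

' Hypothetical helper: returns the code points of Text as "U+XXXX U+XXXX ...".
Public Function CodePointList(ByVal Text As String) As String
    Dim i As Long
    Dim code As Long
    Dim result As String
    For i = 1 To Len(Text)
        ' AscW returns a signed Integer; the mask turns it into 0..65535.
        code = AscW(Mid$(Text, i, 1)) And &HFFFF&
        If i > 1 Then result = result & " "
        ' Pad the hex value to four digits.
        result = result & "U+" & Right$("0000" & Hex$(code), 4)
    Next i
    CodePointList = result
End Function
```

For example, Debug.Print CodePointList(rs.Fields("SomeField").Value) in the Immediate window, where rs and SomeField stand in for your own recordset and field name.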
-
Mar 20th, 2019, 04:56 PM
#9
Hyperactive Member
Re: [RESOLVED] em dash display
 Originally Posted by dilettante
"Em Dash" is ANSI &H97 and "En Dash" is ANSI &H96, both display just fine in ANSI controls for me. Are you sure you don't have some other dash instead?
Sounds like it might be Unicode U+2014
Last edited by DllHell; Mar 20th, 2019 at 05:00 PM.
-
Mar 20th, 2019, 05:02 PM
#10
Re: [RESOLVED] em dash display
 Originally Posted by DllHell
Sounds like it might be Unicode U+2014
That's the same character as &H97.
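A small sketch to check this on a given machine, on the assumption that the active ANSI code page is Windows-1252 or another code page that maps byte &H97 to the em dash. The names IsAnsiEmDash and ShowEmDashMapping are hypothetical; GetACP is the kernel32 function that returns the active ANSI code page.

```vb
Option Explicit

Private Declare Function GetACP Lib "kernel32" () As Long

' Hypothetical helper: True when ANSI byte &H97 converts to U+2014.
Public Function IsAnsiEmDash() As Boolean
    ' Chr$ converts through the active ANSI code page, ChrW$ does not.
    IsAnsiEmDash = (Chr$(&H97) = ChrW$(&H2014))
End Function

Public Sub ShowEmDashMapping()
    Debug.Print "Active code page = " & GetACP()
    Debug.Print "Chr$(&H97) is U+" & Hex$(AscW(Chr$(&H97)) And &HFFFF&)
    Debug.Print "Same as ChrW$(&H2014): " & IsAnsiEmDash()
End Sub
```

On other code pages the result may well differ, so the function is better treated as a check than as a guarantee.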
-
Mar 20th, 2019, 06:08 PM
#11
Re: [RESOLVED] em dash display
 Originally Posted by dilettante
That's the same character as &H97.
The miracles of Windows code pages :-))
Everything from &H80 up is subject to interpretation according to the current system code page for non-Unicode applications, but... en and em dashes are &H96 and &H97 in every code page except the Far Eastern ones.
cheers,
</wqw>
-
Mar 20th, 2019, 07:33 PM
#12
Fanatic Member
Re: [RESOLVED] em dash display
Yep, anything over 127 is subject to code page translation. The default code page for US English is Windows-1252. What's more, when you create characters in the range 0 to 255 with Chr(), they are converted internally to Unicode through the active code page. For some of them the Unicode value differs from the code page 1252 byte value, so AscW() can return numbers outside the range 0 to 255. In Unicode itself, the code points 128 to 255 are fixed and are not subject to code page translation.
Here is a simple loop going through 0 to 255 and printing the values for which Chr() and ChrW() return different characters, that is, where code page 1252 differs from Unicode:
VB Code:
Option Explicit

Private Declare Function GetACP Lib "kernel32" () As Long

Private Sub Form_Load()
    Dim i As Long
    Dim s As String

    Debug.Print "Active Code Page = " & GetACP()
    Debug.Print "Chr(Dec)", "Chr(Hex)", "Asc(Hex)", "AscW(Hex)"
    For i = 0 To 255
        s = Chr(i)
        ' Compare Chr() with ChrW(), and print where they differ
        If s <> ChrW(i) Then
            Debug.Print i, Hex(i), Hex(Asc(s)), GetHex(AscW(s))
        End If
    Next
End Sub

' Get Hex value padded with 0 to the left
Public Function GetHex(ByVal i As Long) As String
    GetHex = Right("000" & Hex(i), 4)
End Function
Output:
Code:
Active Code Page = 1252
Chr(Dec) Chr(Hex) Asc(Hex) AscW(Hex)
128 80 80 20AC
130 82 82 201A
131 83 83 0192
132 84 84 201E
133 85 85 2026
134 86 86 2020
135 87 87 2021
136 88 88 02C6
137 89 89 2030
138 8A 8A 0160
139 8B 8B 2039
140 8C 8C 0152
142 8E 8E 017D
145 91 91 2018
146 92 92 2019
147 93 93 201C
148 94 94 201D
149 95 95 2022
150 96 96 2013
151 97 97 2014
152 98 98 02DC
153 99 99 2122
154 9A 9A 0161
155 9B 9B 203A
156 9C 9C 0153
158 9E 9E 017E
159 9F 9F 0178
-
Mar 20th, 2019, 11:55 PM
#13
Re: [RESOLVED] em dash display
 Originally Posted by qvb6
Yep, anything over 127 is subject to code page translation.
Not really.
If you are talking about the Chr$() function, it makes a call to MultiByteToWideChar() passing CP_ACP. What happens depends on the current ANSI codepage.
-
Mar 21st, 2019, 12:30 AM
#14
Re: [RESOLVED] em dash display
Code:
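Option Explicit

' 20127 is the us-ascii code page; the constant name is local to this example.
Private Const CP_USASCII As Long = 20127

Private Declare Function MultiByteToWideChar Lib "Kernel32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByRef MultiByteStr As Byte, _
    ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long) As Long

Private Sub Dump()
    Dim Bytes(0 To 255) As Byte
    Dim I As Long
    Dim Chars As String

    ' Fill the buffer with every possible byte value, 0 to 255.
    For I = 0 To 255
        Bytes(I) = CByte(I)
    Next
    ' Convert all 256 bytes to UTF-16 using the chosen code page.
    Chars = Space$(256)
    MultiByteToWideChar CP_USASCII, 0, Bytes(0), 256, StrPtr(Chars), 256
    ' Print in four columns. The number left of "=" is the 1-based
    ' position in Chars (byte value + 1), the number right of it is
    ' the resulting Unicode code point.
    For I = 1 To 64
        Debug.Print I; "="; AscW(Mid$(Chars, I, 1)), _
                    I + 64; "="; AscW(Mid$(Chars, I + 64, 1)), _
                    I + 128; "="; AscW(Mid$(Chars, I + 128, 1)), _
                    I + 192; "="; AscW(Mid$(Chars, I + 192, 1))
    Next
End Sub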
Results in:
Code:
1 = 0 65 = 64 129 = 0 193 = 64
2 = 1 66 = 65 130 = 1 194 = 65
3 = 2 67 = 66 131 = 2 195 = 66
4 = 3 68 = 67 132 = 3 196 = 67
5 = 4 69 = 68 133 = 4 197 = 68
6 = 5 70 = 69 134 = 5 198 = 69
7 = 6 71 = 70 135 = 6 199 = 70
8 = 7 72 = 71 136 = 7 200 = 71
9 = 8 73 = 72 137 = 8 201 = 72
10 = 9 74 = 73 138 = 9 202 = 73
11 = 10 75 = 74 139 = 10 203 = 74
12 = 11 76 = 75 140 = 11 204 = 75
13 = 12 77 = 76 141 = 12 205 = 76
14 = 13 78 = 77 142 = 13 206 = 77
15 = 14 79 = 78 143 = 14 207 = 78
16 = 15 80 = 79 144 = 15 208 = 79
17 = 16 81 = 80 145 = 16 209 = 80
18 = 17 82 = 81 146 = 17 210 = 81
19 = 18 83 = 82 147 = 18 211 = 82
20 = 19 84 = 83 148 = 19 212 = 83
21 = 20 85 = 84 149 = 20 213 = 84
22 = 21 86 = 85 150 = 21 214 = 85
23 = 22 87 = 86 151 = 22 215 = 86
24 = 23 88 = 87 152 = 23 216 = 87
25 = 24 89 = 88 153 = 24 217 = 88
26 = 25 90 = 89 154 = 25 218 = 89
27 = 26 91 = 90 155 = 26 219 = 90
28 = 27 92 = 91 156 = 27 220 = 91
29 = 28 93 = 92 157 = 28 221 = 92
30 = 29 94 = 93 158 = 29 222 = 93
31 = 30 95 = 94 159 = 30 223 = 94
32 = 31 96 = 95 160 = 31 224 = 95
33 = 32 97 = 96 161 = 32 225 = 96
34 = 33 98 = 97 162 = 33 226 = 97
35 = 34 99 = 98 163 = 34 227 = 98
36 = 35 100 = 99 164 = 35 228 = 99
37 = 36 101 = 100 165 = 36 229 = 100
38 = 37 102 = 101 166 = 37 230 = 101
39 = 38 103 = 102 167 = 38 231 = 102
40 = 39 104 = 103 168 = 39 232 = 103
41 = 40 105 = 104 169 = 40 233 = 104
42 = 41 106 = 105 170 = 41 234 = 105
43 = 42 107 = 106 171 = 42 235 = 106
44 = 43 108 = 107 172 = 43 236 = 107
45 = 44 109 = 108 173 = 44 237 = 108
46 = 45 110 = 109 174 = 45 238 = 109
47 = 46 111 = 110 175 = 46 239 = 110
48 = 47 112 = 111 176 = 47 240 = 111
49 = 48 113 = 112 177 = 48 241 = 112
50 = 49 114 = 113 178 = 49 242 = 113
51 = 50 115 = 114 179 = 50 243 = 114
52 = 51 116 = 115 180 = 51 244 = 115
53 = 52 117 = 116 181 = 52 245 = 116
54 = 53 118 = 117 182 = 53 246 = 117
55 = 54 119 = 118 183 = 54 247 = 118
56 = 55 120 = 119 184 = 55 248 = 119
57 = 56 121 = 120 185 = 56 249 = 120
58 = 57 122 = 121 186 = 57 250 = 121
59 = 58 123 = 122 187 = 58 251 = 122
60 = 59 124 = 123 188 = 59 252 = 123
61 = 60 125 = 124 189 = 60 253 = 124
62 = 61 126 = 125 190 = 61 254 = 125
63 = 62 127 = 126 191 = 62 255 = 126
64 = 63 128 = 127 192 = 63 256 = 127
-
Mar 21st, 2019, 09:22 AM
#15
Hyperactive Member
Re: [RESOLVED] em dash display
 Originally Posted by wqweto
En and Em dashes are &H96 and &H97 in every code page except far-eastern
This explains a lot. I have several clients in that area, so I tend to always test Unicode stuff using PRC settings.
-
Mar 21st, 2019, 11:19 AM
#16
Fanatic Member
Re: [RESOLVED] em dash display
 Originally Posted by dilettante
Private Const CP_USASCII As Long = 20127
Not to pick a fight, but I found this page defines CP_USASCII as 1252, while this one shows that "20127" is the code page for "us-ascii". I am not surprised by the confusing information out there, even in MSDN. I find that specifying character codes without the encoding scheme (ANSI plus code page, other single-byte schemes, or Unicode) is like saying that the temperature is 40 degrees without saying the units (C/F). Even in VB6, char values (obtained with Asc, not AscW) are like a temperature without units (the code page). Stating the code page each time, and trying to be as accurate as possible, would mean writing walls of text, which not many have the time for.
-
Mar 21st, 2019, 04:24 PM
#17
Re: [RESOLVED] em dash display
My only point was that things are more complicated than any absolute rules might indicate. You could have an SBCS codepage that produces almost anything from inputs of 0-255.
You are correct that Microsoft defines a CODEPAGEID of CP_USASCII as 1252 in one place, as a local constant only, in an interface definition for IMimeInternational, and they warn "Do not use." In other words, this codepage symbolic name has no real definition and certainly no global definition in the Win32 API. Meanwhile, "us-ascii" is not valid syntax in most programming languages anyway, aside from maybe COBOL, and the codepage value 20127 has no official symbolic name assigned anywhere.
Would it have changed anything if I had named the constant US_ASCII or HIGH_BIT_STRIPPER_ASCII? No.