-
Apr 4th, 2024, 03:36 AM
#1
Thread Starter
Member
Displaying HTML in Rich Text Box
I want to display HTML content in a rich text box.
I am able to convert the HTML by loading it into a WebBrowser control, selecting all, copying, and pasting into the rich text box. That works, except that no images show up in the rich text box.
Is there any way to import/show the images as well?
Is there any third-party DLL that can convert HTML to RTF? If it is free, I am happy to use it.
I know Pandoc can do the conversion, but it is over 100 MB in size and overkill for my small application.
Code:
Public Function HtmlToRtf(HtmlFilePath As String) As Integer
    Try
        Dim f As New Form With {
            .Height = 500,
            .Width = 700,
            .Text = "HTML imported as RTF"
        }
        Dim x As String = My.Computer.FileSystem.ReadAllText(HtmlFilePath)
        Dim wb As New WebBrowser With {
            .DocumentText = x,
            .Dock = DockStyle.Fill
        }
        'DocumentText loads asynchronously, so wait until the browser reports completion
        While wb.ReadyState <> WebBrowserReadyState.Complete
            Application.DoEvents()
        End While
        f.Controls.Add(wb)
        'Copy the rendered document to the clipboard and paste it into the rich text box
        wb.Document.ExecCommand("SelectAll", False, Nothing)
        wb.Document.ExecCommand("Copy", False, Nothing)
        rtb.Paste()
        rtb.SelectionStart = 0
        Return 0
    Catch qq As Exception
        MessageBox.Show(qq.ToString, "Error HTML import", MessageBoxButtons.OK, MessageBoxIcon.Error)
        Return 1
    End Try
End Function

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    Dim ofd As New OpenFileDialog With {
        .Filter = "HTML files (*.html)|*.html",
        .FileName = ""
    }
    Dim HtmlFilePath As String = ""
    rtb.SelectionFont = New Font("Arial", 14, FontStyle.Regular)
    If ofd.ShowDialog = Windows.Forms.DialogResult.OK Then
        HtmlFilePath = ofd.FileName
    Else
        MessageBox.Show("You must specify a filename to open.", "Open Error", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
        Application.Exit()
        Return 'do not try to import an empty path
    End If
    HtmlToRtf(HtmlFilePath)
End Sub
Last edited by mobi12; Apr 4th, 2024 at 03:41 AM.
-
Apr 4th, 2024, 03:45 AM
#2
Re: Displaying HTML in Rich Text Box
Do you have access to Microsoft Word? It should be able to do the conversion. It's not an ideal solution but it should work. That said, it might have similar trouble with images, given the different ways they are handled in HTML and RTF.
-
Apr 4th, 2024, 04:05 AM
#3
Thread Starter
Member
Re: Displaying HTML in Rich Text Box
Yes, I can use Word Interop, but legally I can't distribute my code with it. So I would prefer a standalone solution or a free NuGet package.
-
Apr 4th, 2024, 06:21 AM
#4
Re: Displaying HTML in Rich Text Box
-
Apr 4th, 2024, 06:50 AM
#5
Re: Displaying HTML in Rich Text Box
 Originally Posted by Arnoutdv
Just be aware of the pricing and license for that.
-
Apr 4th, 2024, 06:52 AM
#6
Re: Displaying HTML in Rich Text Box
 Originally Posted by mobi12
Yes, I can use Word Interop, but legally I can't distribute my code with it. So I would prefer a standalone solution or a free NuGet package.
I thought that I'd responded to this but apparently not. You can deploy any Interop assembly with your app that you like. You wouldn't be able to deploy the original COM component though. If you deploy the Interop assembly, anyone with Word installed will be able to use your app. If you don't want to make Word a requirement though, that's another matter.
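For reference, a minimal sketch of the Word route, assuming Word is installed on the target machine. It uses late binding through `CreateObject`, so no Interop assembly is referenced at all. The module and method names are hypothetical, `6` is the value of `wdFormatRTF` in Word's `WdSaveFormat` enumeration, and whether images survive the conversion is untested here.

```vb
Option Strict Off ' late binding on Object requires this

Imports System.IO

Module WordHtmlToRtf
    ' wdFormatRTF from Word's WdSaveFormat enumeration.
    Private Const WdFormatRtf As Integer = 6

    ' Hypothetical helper: opens an HTML file in Word and saves it as RTF.
    Public Sub ConvertHtmlToRtf(htmlPath As String, rtfPath As String)
        Dim word As Object = CreateObject("Word.Application")
        Try
            word.Visible = False
            ' Arguments: FileName, ConfirmConversions, ReadOnly
            Dim doc As Object = word.Documents.Open(Path.GetFullPath(htmlPath), False, True)
            Try
                doc.SaveAs2(Path.GetFullPath(rtfPath), WdFormatRtf)
            Finally
                doc.Close(False) ' close without saving changes to the source
            End Try
        Finally
            word.Quit()
        End Try
    End Sub
End Module
```

The resulting file could then be loaded with `rtb.LoadFile(rtfPath, RichTextBoxStreamType.RichText)`.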
-
Apr 4th, 2024, 06:59 AM
#7
Re: Displaying HTML in Rich Text Box
As far as the RichTextBox goes, I have done something similar by putting an image on the clipboard and calling RichTextBox.Paste, along the same lines as using .AppendText. One issue, though, is that it replaces whatever data was on your clipboard.
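A rough sketch of that clipboard approach, with an attempt to put the previous clipboard contents back afterwards. `PasteImage` is a hypothetical helper name. It assumes it runs on the UI (STA) thread and that the RichTextBox is not read-only. Restoring is best effort: some clipboard formats cannot be read back and are simply skipped.

```vb
Imports System.Drawing
Imports System.Windows.Forms

Module ClipboardImagePaste
    ' Hypothetical helper: inserts an image at the caret of a RichTextBox via the
    ' clipboard, then tries to restore what was on the clipboard before.
    Public Sub PasteImage(rtb As RichTextBox, img As Image)
        ' Snapshot the current clipboard contents, format by format.
        Dim previous As IDataObject = Clipboard.GetDataObject()
        Dim backup As New DataObject()
        If previous IsNot Nothing Then
            For Each fmt As String In previous.GetFormats(False)
                Dim data As Object = Nothing
                Try
                    data = previous.GetData(fmt, False)
                Catch ex As Exception
                    ' Some formats cannot be retrieved; skip them.
                End Try
                If data IsNot Nothing Then backup.SetData(fmt, False, data)
            Next
        End If

        ' Put the image on the clipboard and paste it at the current selection.
        Clipboard.SetImage(img)
        rtb.Paste()

        ' Restore the saved data, or clear the clipboard if there was none.
        If backup.GetFormats(False).Length > 0 Then
            Clipboard.SetDataObject(backup, True)
        Else
            Clipboard.Clear()
        End If
    End Sub
End Module
```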
Tell me, Muse, of the man of many turns, who wandered far and wide
after he sacked the sacred citadel of Troy.
-
Apr 4th, 2024, 10:34 AM
#8
Thread Starter
Member
Re: Displaying HTML in Rich Text Box
The problem seems a lot more complex than I initially thought. I have written a very crude parser, though I still have not figured out a way to include images. Embedded images in HTML are base64-encoded inside the src attribute of the <img> tag, so I need to extract those bytes and insert the image into the RTF.
Code:
Imports System.IO
Imports System.Text.RegularExpressions

Public Function ReadHtmlAsRTF(ByVal HtmlFilePath As String) As String
    'EXPERIMENTAL
    Dim WordContent As String = ""
    Dim lines() As String = File.ReadAllLines(HtmlFilePath)
    Dim zb = New Text.StringBuilder()
    'RTF header
    zb.Append("{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Tahoma;}{\f1\fnil\fcharset0 Calibri;}{\f2\fnil\fcharset0 Times New Roman;}}")
    zb.Append("{\colortbl ;\red255\green140\blue0;\red0\green128\blue0;\red0\green0\blue255;\red203\green146\blue39;\red255\green0\blue0;\red255\green255\blue0;}")
    zb.Append("\pard ")
    'RTF styles (currently unused)
    Dim RtfTitle As String = "\pard\b\f2\fs36"
    Dim RtfHeading As String = "\pard\b\f2\fs32"
    Dim RtfParagraph As String = "\pard\b0\f1\fs28"
    Dim RtfBullet As String = "\pard\+ "
    Dim RtfHeader As String = ""
    Dim RtfFooter As String = "\par"
    zb.Append(RtfHeader)
    'REPLACE HTML TAGS WITH RTF TAGS
    'Each control word ends with a space so the text that follows is not read as part of it
    For Each h As String In lines
        h = Regex.Replace(h, "<html>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</html>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<body>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</body>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<head>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</head>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<h1>", "\f0\fs44 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</h1>", "\f1\fs28\b0\par ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<h2>", "\f2\fs36 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</h2>", "\f1\fs28\b0\par ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<p>", "\f1\fs28 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</p>", "\par ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<br>", "\line ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<b>", " \b ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</b>", " \b0 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<strong>", " \b ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</strong>", " \b0 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<em>", " \i ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</em>", " \i0 ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<li>", "{\pntext\f4\'B7 }", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</li>", "\par ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "&nbsp;", "\par ") 'non-breaking space entity (assumed; the pattern appeared as a plain space)
        h = Regex.Replace(h, "<tbody>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</tbody>", "\par ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<tr>", "\f1\fs28\b0\trowd ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</tr>", "\row ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<td>", "", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "</td>", "\cell ", RegexOptions.IgnoreCase)
        h = Regex.Replace(h, "<[^>]+>", "", RegexOptions.IgnoreCase) 'clean remaining tags - must be last line
        zb.Append(h)
    Next
    zb.Append("}")
    WordContent = zb.ToString()
    Return WordContent
End Function
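For the image part, one possible direction is to turn each base64 `<img>` into an RTF `{\pict ...}` group holding the image bytes as hex. This is only a sketch under several assumptions: the module and function names are hypothetical, only PNG and JPEG data URIs are handled, the whole tag sits on one line, and the RichTextBox in use can render `\pngblip` / `\jpegblip` pictures (older RichEdit versions reportedly only accept metafile or bitmap pictures, which I have not checked).

```vb
Imports System.Drawing
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions

Module DataUriToRtf
    ' Matches <img ... src="data:image/png;base64,..."> (PNG or JPEG only).
    Private ReadOnly ImgPattern As New Regex(
        "<img\b[^>]*?\bsrc\s*=\s*[""']data:image/(?<kind>png|jpe?g);base64,(?<data>[^""']+)[""'][^>]*>",
        RegexOptions.IgnoreCase)

    ' Hypothetical helper: replaces each embedded <img> with an RTF picture group.
    Public Function ReplaceEmbeddedImages(html As String) As String
        Return ImgPattern.Replace(html, AddressOf ToPictGroup)
    End Function

    Private Function ToPictGroup(m As Match) As String
        Dim bytes As Byte() = Convert.FromBase64String(m.Groups("data").Value)
        Dim blip As String = If(m.Groups("kind").Value.ToLowerInvariant() = "png", "\pngblip", "\jpegblip")

        ' Read the pixel size and resolution from the image itself.
        Dim widthPx, heightPx As Integer
        Dim dpiX, dpiY As Single
        Using ms As New MemoryStream(bytes), img As Image = Image.FromStream(ms)
            widthPx = img.Width
            heightPx = img.Height
            dpiX = img.HorizontalResolution
            dpiY = img.VerticalResolution
        End Using
        If dpiX <= 0 Then dpiX = 96
        If dpiY <= 0 Then dpiY = 96

        ' \picwgoal and \pichgoal are the display size in twips (1440 per inch).
        Dim wGoal As Integer = CInt(Math.Round(widthPx * 1440.0 / dpiX))
        Dim hGoal As Integer = CInt(Math.Round(heightPx * 1440.0 / dpiY))

        Dim sb As New StringBuilder(bytes.Length * 2 + 64)
        sb.Append("{\pict").Append(blip)
        sb.Append("\picw").Append(widthPx).Append("\pich").Append(heightPx)
        sb.Append("\picwgoal").Append(wGoal).Append("\pichgoal").Append(hGoal).Append(" ")
        ' The picture data is written as lowercase hex, two digits per byte.
        For Each b As Byte In bytes
            sb.Append(b.ToString("x2"))
        Next
        sb.Append("}")
        Return sb.ToString()
    End Function
End Module
```

In ReadHtmlAsRTF this would be called as `h = ReplaceEmbeddedImages(h)` just before the final replacement that strips the remaining tags, since that replacement would otherwise delete the `<img>` tag first. The result can then be assigned with `rtb.Rtf = ReadHtmlAsRTF(HtmlFilePath)`.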