Is there a way to get plain text from the Microsoft Internet Control or Internet Transfer Control?
Is there a way to get plain text from the Microsoft Internet Control or the Microsoft Internet Transfer Control?
For example, if you go to a page, press Ctrl+A and then Ctrl+C, open Notepad and press Ctrl+V, you get plain text with no HTML in it. Can that be done with the Microsoft Internet Control or the Microsoft Internet Transfer Control?
Re: is there a way to get plain text from Microsoft Internet Control or transfer cont
You will need to strip the HTML tags out of the page source...
You can do this with string functions such as Left$(), Mid$(), Right$() and InStr(), but it might be better to use something like the XHTML or XML object. There is code on this forum for parsing HTML/XML with that object, but it might take a bit of searching...
Someone named MarkT has posted it a few times, I think.
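As a rough illustration of the string-manipulation route, here is a minimal sketch built only on InStr() and Mid$(). The name StripTags is a hypothetical helper, not something posted in this thread. It simply drops everything between "<" and ">", so it does not decode entities such as &amp; and it keeps the contents of script and style blocks.

```vb
' Hypothetical helper (an assumption, not code from this thread):
' removes anything between "<" and ">" and returns what is left.
Public Function StripTags(ByVal html As String) As String
    Dim result As String
    Dim pos As Long
    Dim openPos As Long
    Dim closePos As Long

    pos = 1
    Do
        openPos = InStr(pos, html, "<")
        If openPos = 0 Then
            ' No more tags: keep the remainder and stop.
            result = result & Mid$(html, pos)
            Exit Do
        End If
        ' Keep the text that precedes the tag.
        result = result & Mid$(html, pos, openPos - pos)
        closePos = InStr(openPos, html, ">")
        If closePos = 0 Then Exit Do   ' Unterminated tag: drop the rest.
        pos = closePos + 1
    Loop

    StripTags = result
End Function
```

For real-world pages a proper parser is usually the safer choice, since this sketch knows nothing about comments, scripts or malformed markup.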
Re: is there a way to get plain text from Microsoft Internet Control or transfer cont
Using the Microsoft Internet Control (a.k.a. WebBrowser control) this should put all text from the loaded page into a text box (Text1).
Code:
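Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' DocumentComplete fires once per frame; wait for the top-level document.
    If Not (pDisp Is WebBrowser1.Object) Then Exit Sub
    On Error Resume Next ' Document.body can be Nothing for non-HTML content
    Text1.Text = WebBrowser1.Document.body.innerText
End Sub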
Re: is there a way to get plain text from Microsoft Internet Control or transfer control
I suspect SomeQuick is talking about the Internet Transfer Control (ITC) which people often refer to as Inet.
When you copy from IE to the clipboard it places both HTML and Text versions there. Notepad only pastes from the Text. Thus IE is really doing the work of stripping out HTML.
A WebBrowser control is one way to do this, but it can be a little "heavy." A lighter control that can also do this is the TriEdit control, typically referred to as the DHTMLEdit control.
Normally people use the DHTMLEdit control as a sort of "super RichTextBox" to enable the display and editing of HTML documents. It can also be used as a hidden control though, which gives you a more compact way of manipulating HTML: it offers a DOM property much like that in IE (and the WebBrowser control).
It can also be loaded via its LoadURL method or you can set its DocumentHTML property to a string. There is a SourceCodePreservation design-time property that keeps all source document whitespace, and a FilterSourceCode method that cleans up extraneous markup produced when the document is parsed into the DOM.
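A minimal sketch of the hidden-control idea might look like the following. It assumes a form with a DHTMLEdit control named DHTMLEdit1 (Visible = False), a multi-line TextBox named Text1 and a CommandButton named Command1. Those names and the example URL are placeholders, and this is not the demo mentioned in this post.

```vb
' Assumed controls: DHTMLEdit1 (hidden), Text1 (MultiLine = True), Command1.
Private Sub Command1_Click()
    ' The load is not instantaneous, so the text is read in the
    ' DocumentComplete event below. The URL is a placeholder.
    DHTMLEdit1.LoadURL "http://www.example.com/"
End Sub

Private Sub DHTMLEdit1_DocumentComplete()
    On Error Resume Next  ' DOM.body can be Nothing if nothing usable loaded
    ' DOM exposes the parsed document; innerText is the text without markup.
    Text1.Text = DHTMLEdit1.DOM.body.innerText
End Sub
```

If the HTML is already in a string, assigning it to DocumentHTML instead of calling LoadURL should lead to the same DocumentComplete handler.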
However:
Due to a number of exploits found in this control over the years, Microsoft decided to pull it from Vista. When this quickly became a problem, Microsoft had a new, secure work-alike created, which can be downloaded as the DHTML Editing Control for Applications Redistributable Package (x86). It works great.
The hard part now is finding the companion SDK, in particular the documentation for this product. However, I have found the old documents at DHTML Editing Component SDK.
I've also made a small demo showing its use as an "HTML TextBox" control.
Re: is there a way to get plain text from Microsoft Internet Control or transfer cont
Thank you, Edgemeal, that works nicely!
I was wondering if you also know of a way to do this with the Inet control.
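One possible approach for the Inet control, offered as a sketch and not as a tested answer from this thread: the Internet Transfer Control only returns the raw page source, so something else still has to remove the markup. The code below assumes an Inet control named Inet1, a TextBox named Text1 and a CommandButton named Command1, uses a placeholder URL, and hands the downloaded HTML to a late-bound MSHTML document created with CreateObject("htmlfile"), which relies on Internet Explorer's MSHTML being installed.

```vb
' Assumed controls: Inet1 (Internet Transfer Control), Text1, Command1.
Private Sub Command1_Click()
    Dim html As String
    Dim doc As Object

    ' OpenURL waits for the download and returns the page source as a string.
    html = Inet1.OpenURL("http://www.example.com/", icString)

    ' Let MSHTML parse the source, then read the text without the tags.
    Set doc = CreateObject("htmlfile")
    doc.write html
    doc.Close

    Text1.Text = doc.body.innerText
End Sub
```

This keeps the form free of a visible browser, but the parsing is still done by IE's engine, so it is not any lighter than the WebBrowser route in terms of dependencies.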