
Thread: [RESOLVED] Disable adobe reader plugin in Webbrowser

  1. #1

    Thread Starter
    Hyperactive Member JXDOS's Avatar
    Join Date
    Aug 2006
    Location
    Mars...
    Posts
    423

    [RESOLVED] Disable adobe reader plugin in Webbrowser

    Hi all,

    I am currently trying to write a web scraping program. The site works fine when I search using Internet Explorer or Chrome, since the results appear as text (HTML). However, when I load it in the WebBrowser control in VB.NET 2010, the results always load in the Adobe Reader plugin embedded within the webpage (PDF). Is there a way I can disable this plugin, or tell the WebBrowser that it does not have this plugin?

    Thanks in advance
    If my post has been helpful, please rate it!

  2. #2
    Bad man! ident's Avatar
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,401

    Re: Disable adobe reader plugin in Webbrowser

    Is there a particular reason why you are using the browser control if you are simply scraping the site?
    My Github - 1d3nt

  3. #3

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Quote Originally Posted by ident View Post
    Is there a particular reason why you are using the browser control if you are simply scraping the site?
    Yes, because I am using the Document.Body.InnerText property to avoid some messy highlighting/link code in the HTML source.

  4. #4
    Bad man! ident's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Yes, but why don't you simply use the WebClient class?

  5. #5

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Sorry, what do you mean? Can you give an example? I think it's because my program performs a search that involves extracting links from the first results page, browsing through these links, and scraping the results from the second set of pages. Hence the second set of results depends on the links scraped from the first, rather than on a fixed set of URLs that can be scraped using the WebClient class.

    Not sure if the above makes sense? O.o Sorry for the confusion.

  6. #6
    Bad man! ident's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    A web browser is a UI element. You are not using it as such. If all you want is the page's HTML, then use the WebClient class.

    vb Code:
    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object,
                               ByVal e As System.EventArgs) Handles MyBase.Load
            Dim source As String = Nothing
            ' WebClient is disposable, so wrap it in a Using block.
            Using wClient As New Net.WebClient
                Try
                    ' "url" is a placeholder; substitute the address of the page to download.
                    source = wClient.DownloadString(New Uri("url"))
                Catch ex As Net.WebException
                    ' Network or HTTP failure: report it and leave source as Nothing.
                    MessageBox.Show(ex.Message)
                End Try
            End Using

            ' do whatever with the page's source....
        End Sub
    End Class

    In practice you would want to download the page using the async method, because the code above blocks the calling thread. But it's enough to give you an idea.
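
    For reference, the asynchronous variant could look roughly like this. It is only a sketch: it assumes .NET 4 (VB.NET 2010, so no Async/Await) and uses WebClient.DownloadStringAsync together with the DownloadStringCompleted event. The URL is a placeholder and not taken from this thread.

```vb
Imports System
Imports System.Net
Imports System.Windows.Forms

Public Class Form1

    ' Kept as a field so the client stays alive until the download completes.
    Private WithEvents wClient As New WebClient()

    Private Sub Form1_Load(ByVal sender As System.Object,
                           ByVal e As System.EventArgs) Handles MyBase.Load
        ' Placeholder address (assumption). The call returns immediately,
        ' so the UI thread is not blocked while the page downloads.
        wClient.DownloadStringAsync(New Uri("http://example.com/"))
    End Sub

    Private Sub wClient_DownloadStringCompleted(ByVal sender As Object,
                                                ByVal e As DownloadStringCompletedEventArgs) _
                                                Handles wClient.DownloadStringCompleted
        If e.Cancelled Then Return

        If e.Error IsNot Nothing Then
            ' e.Result would throw if read after a failed download, so stop here.
            MessageBox.Show(e.Error.Message)
            Return
        End If

        Dim source As String = e.Result
        ' do whatever with the page's source...
    End Sub

End Class
```

    The completion handler is where the scraping would continue, for example by queuing the next URL once the current page has arrived.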

  7. #7

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Yeah, there are a few issues with using that. The main one is that I have existing code in place that uses the browser, mainly because the online database requires a login password on the first search.

    The second issue is that I need to inject variables systematically (names and dates) to perform the search on the website, which gives me a list of links. I then browse through that list of links and extract only WebBrowser1.Document.Body.InnerText, as an easier way to capture the wanted extract in a readable format without all the leftover HTML code.

    I don't know why the results show in a pdf reader when used in my program but not through any other browsers.

  8. #8
    PowerPoster dunfiddlin's Avatar
    Join Date
    Jun 2012
    Posts
    8,245

    Re: Disable adobe reader plugin in Webbrowser

    I am currently trying to program a web scrapping program
    How very Luddite!

    I don't know why the results show in a pdf reader when used in my program but not through any other browsers.
    I suspect that you need to find out before any progress can be made. It could be a browser recognition problem. Unlike Internet Explorer, the WebBrowser control does not announce itself as an advanced browser. Having said that, it seems a little counter-intuitive for the site to default to the more complicated format if it cannot determine the browser's capabilities.
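
    If the site really is sniffing the browser, one thing that could be tried is sending a different User-Agent header with the request. The following is only a sketch under that assumption, and nothing in this thread confirms it is the cause. The URL is a placeholder, WebBrowser1 is assumed to be a control placed on the form in the designer, and the user-agent string is the one Internet Explorer 9 sends. The header applies only to this one navigation, not to links followed afterwards. It relies on the WebBrowser.Navigate overload that accepts additional headers.

```vb
Imports System
Imports System.Windows.Forms

Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object,
                           ByVal e As System.EventArgs) Handles MyBase.Load
        ' Assumption: the site chooses between HTML and PDF based on the User-Agent.
        Dim headers As String =
            "User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)" & vbCrLf

        ' Placeholder URL; no target frame and no POST data are needed here.
        WebBrowser1.Navigate("http://example.com/search", Nothing, Nothing, headers)
    End Sub

End Class
```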

    As I seem to have said a lot recently, the control of plug-ins etc. is handled in Windows by Internet Options, a separation of powers which is intended to make it impossible for a programmer to interfere with the user's personal choices. That means that there is no way (or at least none that I know of) of changing settings on the fly (which, on balance, is probably a good thing!).
    As the 6-dimensional mathematics professor said to the brain surgeon, "It ain't Rocket Science!"

    Reviews: "dunfiddlin likes his DataTables" - jmcilhinney

    Please be aware that whilst I will read private messages (one day!) I am unlikely to reply to anything that does not contain offers of cash, fame or marriage!

  9. #9

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Is there some setting in either Internet Explorer or Adobe Reader that I can change to handle this?

  10. #10
    Addicted Member
    Join Date
    Nov 2006
    Posts
    129

    Re: Disable adobe reader plugin in Webbrowser

    It would be impossible to scrape a PDF page anyway, wouldn't it? It's not HTML. It's like an exe file: if you try to open that in Internet Explorer, you'll get a bunch of symbols.

    If you open a PDF page in Internet Explorer with view-source:http://to.com/file.pdf, what do you get?

    That's the answer too. Just use

    Code:
    view-source:http://www.bapio.co.uk/uploads/publications/1342172154.pdf
    if the file extension is PDF instead of html/php etc.

    Here is what your scraper will see; it's not HTML code:
    Code:
    %PDF-1.2 
    %âãÏÓ
     
    9 0 obj
    <<
    /Length 10 0 R
    /Filter /FlateDecode 
    >>
    stream
    H‰ÍÑJÃ0†Ÿ ïð{§²fç$M“ínÒ-‚[&jeŠâÛÛ¤ñ~‚$ÉÉÿ}ÉÉ…¬Ij«¬ÌsÀ—‚Ç~€XÖ-],÷‚$Y—÷Ó)ü'N«u*1!œ„ÀVÙ?ŸÁ?
    žb1RbbœÒ‰ÉH²[¹™TD:#ž&Ø*ÙÌX®¦øiç»$qnf¬ƒ¿†¶]»ÀõËîãaÿ¶{ÿÂØ£‰›×q|JªLs]™QÒI¸¬jî„%¯Œ9Øé`ß঺¼ÅU»itezÛ$›’Ú¿OeBÆÄ’Ò¯á¸Råþ@zÜ—úóÿgª¼ø<õ¡ª
    endstream
    endobj
    10 0 obj
    246
    endobj
    4 0 obj
    <<
    /Type /Page
    /Parent 5 0 R
    /Resources <<
    /Font <<
    /F0 6 0 R 
    /F1 7 0 R 
    >>
    /ProcSet 2 0 R
    >>
    /Contents 9 0 R

  11. #11

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Quote Originally Posted by sspoke View Post
    It would be impossible to scrape a PDF page anyway, wouldn't it? It's not HTML. It's like an exe file: if you try to open that in Internet Explorer, you'll get a bunch of symbols.

    If you open a PDF page in Internet Explorer with view-source:http://to.com/file.pdf, what do you get?
    I don't just mean the actual file, but rather an embedded PDF viewer within the webpage. I'm guessing their code has something to detect whether or not the PDF viewer plugin is enabled, and then feeds results either in HTML as text or through PDF viewer as an embedded PDF.

  12. #12
    Addicted Member

    Re: Disable adobe reader plugin in Webbrowser

    Well, the best you can do is call this once WebBrowser1 is done loading, when the DocumentCompleted event fires:
    webBrowser1.Stop()

    It may stop the PDF viewer from loading.

    Or is it an iframe that contains the PDF? Just delete the iframe and the problem is solved. Just detect whether the iframe has a PDF first.
    Here is code that removes all iframes.

    Code:
    ' Run inside the DocumentCompleted handler, where sender is the WebBrowser.
    For Each x As HtmlElement In DirectCast(sender, WebBrowser).Document.GetElementsByTagName("iframe")
        x.OuterHtml = String.Empty
    Next
    But if it's not an iframe but an embed element instead, you can try

    Code:
    For Each x As HtmlElement In DirectCast(sender, WebBrowser).Document.GetElementsByTagName("embed")
        x.OuterHtml = String.Empty
        ' or
        ' x.SetAttribute("src", String.Empty)
    Next
    In Chrome it normally looks like this:
    Code:
    <embed width="100%" height="100%" name="plugin" src="http://example.com/pdf.pdf" type="application/pdf">
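
    To detect whether an iframe or embed actually holds a PDF before removing it, a check along these lines might work. This is a sketch and has not been tested against the site in question. The helper names LooksLikePdf and RemovePdfElements are made up for illustration, and it assumes the PDF can be recognised from the element's src or type attribute.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Public Module PdfElementFilter

    ' Hypothetical helper: guesses from the src/type attributes whether an element shows a PDF.
    Public Function LooksLikePdf(ByVal src As String, ByVal mimeType As String) As Boolean
        If Not String.IsNullOrEmpty(mimeType) AndAlso
           mimeType.Trim().Equals("application/pdf", StringComparison.OrdinalIgnoreCase) Then
            Return True
        End If
        If String.IsNullOrEmpty(src) Then Return False

        ' Ignore any query string or fragment before checking the extension.
        Dim path As String = src.Split("?"c, "#"c)(0)
        Return path.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase)
    End Function

    ' Hypothetical helper: intended to be called from the DocumentCompleted handler.
    Public Sub RemovePdfElements(ByVal browser As WebBrowser)
        If browser.Document Is Nothing Then Return

        For Each tagName As String In New String() {"iframe", "embed"}
            ' Collect matches first so the collection is not changed while it is enumerated.
            Dim matches As New List(Of HtmlElement)
            For Each el As HtmlElement In browser.Document.GetElementsByTagName(tagName)
                If LooksLikePdf(el.GetAttribute("src"), el.GetAttribute("type")) Then
                    matches.Add(el)
                End If
            Next
            For Each el As HtmlElement In matches
                el.OuterHtml = String.Empty
            Next
        Next
    End Sub

End Module
```

    Inside the DocumentCompleted handler this would be called as RemovePdfElements(DirectCast(sender, WebBrowser)), leaving iframes that do not point at a PDF untouched.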
    Last edited by sspoke; Jul 22nd, 2013 at 01:54 AM.

  13. #13
    PowerPoster dunfiddlin's Avatar
    Join Date
    Jun 2012
    Posts
    8,245

    Re: Disable adobe reader plugin in Webbrowser

    Quote Originally Posted by JXDOS View Post
    Is there some setting in either internet explorer or adobe reader that I can change to handle this?
    Was I not clear?

    the control of plug-ins etc. is handled
    entirely, solely and exclusively
    in Windows by Internet Options
    Better?

  14. #14

    Thread Starter
    Hyperactive Member JXDOS's Avatar

    Re: Disable adobe reader plugin in Webbrowser

    Thanks for the help, guys. sspoke's solution seems to do the trick.
