[RESOLVED] Working with WebBrowser Control
I am trying to search a page for certain links only, but can't seem to find a way to have it 'search' for a 'keyword' in the links, and only list those in a ListBox - What I have now works to grab ALL the links on the page, but I need to only find SPECIFIC ones:
Code:
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, _
                              ByVal e As System.EventArgs) Handles Button1.Click
        WebBrowser1.Navigate(TextBox1.Text)
    End Sub

    Private Sub WebBrowser1_DocumentCompleted( _
            ByVal sender As Object, _
            ByVal e As WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        If (WebBrowser1.ReadyState = WebBrowserReadyState.Complete) Then
            For Each ClientControl As HtmlElement In WebBrowser1.Document.Links
                Debug.Print(ClientControl.GetAttribute("href"))
            Next
        End If
    End Sub
End Class
Any help would be greatly appreciated - I spent the last hour Googling this, and no mention anywhere on how to find only certain links...
Re: Working with WebBrowser Control
You can use some LINQ to create a collection of only the links that match your criteria.
For example, this code would find all links on the vbforums.com homepage that contain f=25 in the URL (which is the forum ID for the VB.NET forum section on here):
Code:
' Keep only the links whose href contains the keyword.
Dim myLinks = From Link As HtmlElement In WebBrowser1.Document.Links.OfType(Of HtmlElement)() _
              Where Link.GetAttribute("href").Contains("f=25") _
              Select Link

For Each Link As HtmlElement In myLinks
    Debug.WriteLine(Link.GetAttribute("href"))
Next
Re: Working with WebBrowser Control
Hmmm... It's still grabbing ALL the links on the page, even with a 'filter' set *scratches head*
Re: Working with WebBrowser Control
How would it be possible to get a link based on an attribute of that link, without checking each link to see if the given attribute meets your criteria?
If I had 3 coins under 3 cups, and I told you the coin under each could possibly be a penny, a dime, or a quarter, and asked how many quarters were there, how could you tell without looking under each cup?
Re: Working with WebBrowser Control
I'm trying to have just the specified links stored in a ListBox, with any others disregarded if they do not meet the criteria - I know ALL the links need to be loaded and checked, but they are ALL going to my ListBox, not just the user-defined ones...
EDIT:
I just had a random thought - I have ALL gathered links loaded into ListBox1 - Would it be easier to search through ListBox1 for the 'filters' I am looking for, and have them sent to ListBox2?
Re: Working with WebBrowser Control
Using the code I posted to you that uses LINQ to grab all links that fall within your specified criteria, you don't need to do that.
Code:
Dim myLinks = From Link As HtmlElement In WebBrowser1.Document.Links.OfType(Of HtmlElement)() _
              Where Link.GetAttribute("href").Contains("f=25") _
              Select Link
Listbox1.Items.AddRange(myLinks.ToList) 'only will put links from vbforums.com that contain 'f=25' in the URL
Re: Working with WebBrowser Control
I get this now:
'Error 1 Overload resolution failed because no accessible 'AddRange' can be called with these arguments:
'Public Sub AddRange(items() As Object)': Value of type 'System.Collections.Generic.List(Of System.Windows.Forms.HtmlElement)' cannot be converted to '1-dimensional array of Object'.
'Public Sub AddRange(value As System.Windows.Forms.ListBox.ObjectCollection)': Value of type 'System.Collections.Generic.List(Of System.Windows.Forms.HtmlElement)' cannot be converted to 'System.Windows.Forms.ListBox.ObjectCollection'.'
Re: Working with WebBrowser Control
Sorry. Happens sometimes when I just type it out in the forum instead of confirming in the IDE.
Since each link is actually just an HtmlElement object, and what you really want is its href value, you need to loop through the result set and grab that attribute from each element.
Try this:
Code:
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    If (WebBrowser1.ReadyState = WebBrowserReadyState.Complete) Then
        Dim myLinks = From Link As HtmlElement In WebBrowser1.Document.Links.OfType(Of HtmlElement)() _
                      Where Link.GetAttribute("href").Contains("f=25") _
                      Select Link

        For Each Link As HtmlElement In myLinks 'only loops through the links that you want, not all.
            ListBox1.Items.Add(Link.GetAttribute("href"))
        Next
    End If
End Sub
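
If you would rather keep a single AddRange call instead of the loop, the overload error goes away once the argument is an Object array instead of a List(Of HtmlElement). Here is a rough sketch of that idea, not tested against your project: FilterHrefs and the LinkFilterSketch module are made-up names for illustration, and the helper works on plain href strings so the keyword can come from anywhere (for example a TextBox).

```vb
Imports System.Collections.Generic
Imports System.Linq

Module LinkFilterSketch
    ' Hypothetical helper: keeps only the URLs that contain the keyword and
    ' returns them as Object(), which is what ListBox.Items.AddRange accepts.
    Public Function FilterHrefs(ByVal hrefs As IEnumerable(Of String), _
                                ByVal keyword As String) As Object()
        Return hrefs.Where(Function(h) h IsNot Nothing AndAlso h.Contains(keyword)) _
                    .Cast(Of Object)() _
                    .ToArray()
    End Function
End Module
```

Inside DocumentCompleted you would first collect the href strings from WebBrowser1.Document.Links (with GetAttribute("href"), as in the loop above) and then pass the result of FilterHrefs to ListBox1.Items.AddRange.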
Re: Working with WebBrowser Control
LOL - After playing around a bit, I came up with the exact code you just posted - I guess I was on the right track - Still new to .NET, and this LINQ thing is throwing me for a loop - I spent over 6 years programming in VB6, so switching hasn't been easy :-P
Thank you for all your help - Another thing I need to do at some point is figure out how to have the ListBox NOT fill up with repeat links when the WebBrowser control navigates to a different page - I know it has something to do with where I have the code placed, probably...
Re: Working with WebBrowser Control
You could do this, which would avoid duplicates in the ListBox. It should not cause any major performance impact unless the ListBox has a ton of items.
Code:
For Each Link As HtmlElement In myLinks
    ' Only add the URL if it is not already in the list.
    If Not ListBox1.Items.Contains(Link.GetAttribute("href")) Then
        ListBox1.Items.Add(Link.GetAttribute("href"))
    End If
Next
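
If the list does get large, one possible alternative is to track what has already been added in a HashSet, so each check does not have to scan the whole ListBox. This is only a sketch under that assumption: NewHrefs and DedupeSketch are hypothetical names, and the helper takes plain strings instead of touching the controls directly.

```vb
Imports System.Collections.Generic

Module DedupeSketch
    ' Hypothetical helper: returns the candidates that are not already in
    ' "existing", in their original order. Repeats among the candidates
    ' themselves are dropped as well.
    Public Function NewHrefs(ByVal existing As IEnumerable(Of String), _
                             ByVal candidates As IEnumerable(Of String)) As List(Of String)
        Dim seen As New HashSet(Of String)(existing)
        Dim result As New List(Of String)
        For Each href As String In candidates
            ' HashSet.Add returns False when the value is already present.
            If seen.Add(href) Then
                result.Add(href)
            End If
        Next
        Return result
    End Function
End Module
```

You would pass in the strings currently in ListBox1 as "existing" and the filtered hrefs as "candidates", then add each returned string to the ListBox. If you only ever want the links from the current page, calling ListBox1.Items.Clear() before filling the list may be simpler. Also keep in mind that DocumentCompleted can fire more than once for a page that contains frames, which is one way repeats end up in the list.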
Re: Working with WebBrowser Control
Damn - 'Link' is not declared. It may be inaccessible due to its protection level.
I must have put something in the wrong spot :-(
EDIT:
Never mind the previous post - I just placed it wrong ...