[RESOLVED] [2005] Webbrowser - Waiting for form to load
Greetings
I'm trying to create a VB application that logs in to a website and, once logged in, dumps the HTML source to a local text file. I'm using the WebBrowser control because requests made with HttpWebRequest are not recognized as coming from a supported browser when I try to fetch the initial login page; the website simply redirects me to an "unsupported browser" page. Below is my code:
Code:
Option Strict Off
Option Explicit On
Imports System.Net
Imports System.IO
Imports System.Threading
Friend Class Form1
Inherits System.Windows.Forms.Form
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
wWeb.Navigate("http://www.website.com")
MessageBox.Show("page loaded")
End Sub
Private Sub butLogin_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles butLogin.Click
wWeb.Document.All.Item("email").InnerText = "user@domain"
wWeb.Document.All.Item("pass").InnerText = "xxxxx"
wWeb.Document.All.Item("doquicklogin").InvokeMember("click")
End Sub
Private Sub butExit_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles butExit.Click
Application.Exit()
End Sub
Private Sub butdump_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles butdump.Click
'Dim html As String = wWeb.Document.Body.InnerHtml
Dim html As String = wWeb.DocumentText
Dim fs As FileStream
fs = New FileStream("dump.txt", FileMode.Create, FileAccess.Write)
Dim s As New StreamWriter(fs)
s.Write(html)
s.Close()
fs.Close()
End Sub
End Class
In this configuration the code works fine, but user interaction is required after the form loads. The form has three buttons: login, dump, and exit. I would like to remove the buttons and let everything happen in the Form1_Load sub, but when I try that I get the following error:
Object reference not set to an instance of an object.
Even when I call Thread.Sleep(5000) before filling out the web form (to give the page time to load), I still get the error. I also tried moving the initial wWeb.Navigate("http://www.website.com") call, the login code, and the dump code into their own subs and calling each from Form1_Load, but I ran into the same error. By stepping through my code, I noticed that my Windows form doesn't become visible until Form1_Load completes, so in essence the browser doesn't have a page loaded yet. To me this means the web form elements I'm trying to fill out and submit don't exist until after Form1_Load completes.
So my problem now is how to get around this. It was suggested that I use multithreading, but I've never attempted it before and it seems too complicated for this problem. Any thoughts?
Use the WebBrowser control's DocumentCompleted event (called DocumentComplete in the older ActiveX control). It fires when the page has finished loading in the browser. In IE this is normally the point at which the status bar (bottom left corner of the IE window) says "Done".
Note that this event can fire multiple times for a single page if the page uses frames.
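If frames do turn out to be a problem, one common way to skip the per-frame firings is to compare the URL reported by the event with the browser's top-level URL. Here is a minimal sketch of that idea, not a definitive implementation. It assumes the control is named wWeb as in the original post; the FrameAwareForm class name and the in-code creation of the control are my own additions so the sample stands alone (set it as the startup form of a Windows Forms project). Redirects and unusual URLs may need extra handling.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical form used only for illustration.
Public Class FrameAwareForm
    Inherits Form

    ' Created in code so the sample is self-contained;
    ' in the original post this is the designer-created wWeb control.
    Private WithEvents wWeb As New WebBrowser()

    Public Sub New()
        wWeb.Dock = DockStyle.Fill
        Controls.Add(wWeb)
    End Sub

    Private Sub FrameAwareForm_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
        ' Navigate returns immediately; the page loads afterwards.
        wWeb.Navigate("http://www.website.com")
    End Sub

    Private Sub wWeb_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles wWeb.DocumentCompleted
        ' Each frame raises this event with its own URL; ignore those firings.
        If Not e.Url.Equals(wWeb.Url) Then Return

        ' Only the top-level document reaches this point.
        Text = "Loaded: " & e.Url.ToString()
    End Sub
End Class
```

On a page without frames the check is harmless, since the event URL and the browser URL are the same.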
Below is the code I tried as per your advice. (I think this is what you meant.)
Code:
Private Sub Form1_Load(...) Handles MyBase.Load
wWeb.Navigate("http://www.website.com")
Dim i As Integer
Do While wWeb.ReadyState <> WebBrowserReadyState.Complete
i = i + 1
Loop
MessageBox.Show(i)
End Sub
I put this loop in after calling the Navigate method. I used i as a counter to get an idea of how long the page takes to load, but execution never leaves the loop. It seems the WebBrowser control doesn't navigate until after the Form1_Load sub finishes. In fact, the form never appears on screen or on the taskbar until Form1_Load finishes.
Controls fire events, and the WebBrowser control has one called DocumentCompleted that fires when the page has been loaded. You can select it from the drop-down lists at the top of the editor when you are in code view.
That worked great. I should have known to look for event handlers on the WebBrowser control. However, I do have one last question. Below is the current state of my code with the revisions:
Code:
Option Strict Off
Option Explicit On
Imports System.Net
Imports System.IO
Imports System.Threading
Friend Class Form1
Inherits System.Windows.Forms.Form
Dim boo As Boolean
Private Sub Form1_Load(...) Handles MyBase.Load
wWeb.Navigate("http://www.website.com")
End Sub
Private Sub butExit_Click(...) Handles butExit.Click
Application.Exit()
End Sub
Private Sub wWeb_DocumentCompleted(...) Handles wWeb.DocumentCompleted
If boo = False Then
wWeb.Document.All.Item("email").InnerText = "username"
wWeb.Document.All.Item("pass").InnerText = "xxxx"
wWeb.Document.All.Item("doquicklogin").InvokeMember("click")
boo = True
ElseIf boo = True Then
Dim html As String = wWeb.DocumentText
Dim fs As FileStream
fs = New FileStream("dump.txt", FileMode.Create, FileAccess.Write)
Dim s As New StreamWriter(fs)
s.Write(html)
s.Close()
fs.Close()
End If
End Sub
End Class
In the wWeb_DocumentCompleted sub (the DocumentCompleted event handler) I used an If/ElseIf statement to differentiate between the two DocumentCompleted events. Is there a better way to handle the same event while still executing different code for each case (without using an If/Else statement)?
Well, I am going to guess that your reason for using the Boolean is to determine when to take which action, correct?
Use the other properties of the WebBrowser control to determine this. For example, if the login page has a different URL than the page you want to save to a text file, you can check the WebBrowser's current URL (its Url property) to decide which action to take.
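As a rough illustration of the URL check, the helper below compares only the path part of the current URL. The IsLoginPage function, the PageChecks module, and the "/login.php" path in the usage note are hypothetical names of my own, since the thread never gives the site's real URLs.

```vbnet
Imports System

' Hypothetical helper module, for illustration only.
Module PageChecks
    ' Returns True when the path of the current URL matches the login page's path.
    ' loginPath is an assumption; substitute the real path of the login page.
    Public Function IsLoginPage(ByVal current As Uri, ByVal loginPath As String) As Boolean
        ' WebBrowser.Url is Nothing until a document has been loaded.
        If current Is Nothing Then Return False
        Return String.Equals(current.AbsolutePath, loginPath, StringComparison.OrdinalIgnoreCase)
    End Function
End Module
```

Inside the DocumentCompleted handler this could be used as If IsLoginPage(wWeb.Url, "/login.php") Then ... Else ... End If, which replaces the Boolean flag with a test on what is actually loaded.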
If the website is set up so that the login page and the content you want to save are both served from the same URL, then you can look for certain things in the page to decide which action to take.
For example, look in the document for the "email" or "pass" text fields. If they exist, you need to fill in the login info and trigger the button click. If they don't, you must already be logged in, and you can save the data to your text file.
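Here is a minimal sketch of the element-presence approach, reusing the element names from the original post ("email", "pass", "doquicklogin"). The AutoLoginForm class name and the in-code creation of the control are my own additions so the sample stands alone. It also sets the fields with SetAttribute("value", ...) rather than InnerText; that is my choice, and the InnerText version in the original post reportedly worked too. As far as I know, looking up a name that is not in Document.All returns Nothing, which is what the check relies on.

```vbnet
Imports System
Imports System.IO
Imports System.Windows.Forms

' Hypothetical form used only for illustration.
Public Class AutoLoginForm
    Inherits Form

    ' Created in code so the sample is self-contained;
    ' in the original post this is the designer-created wWeb control.
    Private WithEvents wWeb As New WebBrowser()

    Public Sub New()
        wWeb.Dock = DockStyle.Fill
        Controls.Add(wWeb)
    End Sub

    Private Sub AutoLoginForm_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
        wWeb.Navigate("http://www.website.com")
    End Sub

    Private Sub wWeb_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles wWeb.DocumentCompleted
        Dim doc As HtmlDocument = wWeb.Document
        If doc Is Nothing Then Return

        ' Look up the login fields by id or name.
        Dim email As HtmlElement = doc.All("email")
        Dim pass As HtmlElement = doc.All("pass")
        Dim button As HtmlElement = doc.All("doquicklogin")

        If email IsNot Nothing AndAlso pass IsNot Nothing AndAlso button IsNot Nothing Then
            ' Login form is present: fill it in and submit it.
            email.SetAttribute("value", "user@domain")
            pass.SetAttribute("value", "xxxxx")
            button.InvokeMember("click")
        Else
            ' No login form: assume we are logged in and dump the page source.
            File.WriteAllText("dump.txt", wWeb.DocumentText)
        End If
    End Sub
End Class
```

One thing to watch for: if the login is rejected and the site shows the login form again, this handler will keep resubmitting, so you may want to count attempts and give up after the first failure.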