Grabbing contents off a webpage.
Hello there.
I was wondering if anyone could help me with grabbing data from a webpage in VB 6, looking for particular elements, and showing only those elements, e.g. a date. I've looked at a few guides and tutorials, but found nothing substantial that shows how to get the content. Also, is there any way to refresh the data every few minutes?
- Kind Regards
Adam
Re: Grabbing contents off a webpage.
This is what I use to get HTML source. Put all of this code into a standard module (.bas), since Public Declare statements aren't allowed in a form module:
VB Code:
Option Explicit
Public Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" (ByVal sAgent As String, ByVal lAccessType As Long, ByVal sProxyName As String, ByVal sProxyBypass As String, ByVal lFlags As Long) As Long
Public Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" (ByVal hInternetSession As Long, ByVal sURL As String, ByVal sHeaders As String, ByVal lHeadersLength As Long, ByVal lFlags As Long, ByVal lContext As Long) As Long
Public Declare Function InternetReadFile Lib "wininet.dll" (ByVal hFile As Long, ByVal sBuffer As String, ByVal lNumBytesToRead As Long, lNumberOfBytesRead As Long) As Long
Public Declare Function InternetCloseHandle Lib "wininet.dll" (ByVal hInet As Long) As Long
Public Const IF_FROM_CACHE = &H1000000
Public Const IF_MAKE_PERSISTENT = &H2000000
Public Const IF_NO_CACHE_WRITE = &H4000000
Private Const BUFFER_LEN = 256
Public Function GetUrlSource(sURL As String) As String
    Dim sBuffer As String * BUFFER_LEN, iResult As Long, sData As String
    Dim hInternet As Long, hSession As Long, lReturn As Long
    'open a WinINet session (1 = INTERNET_OPEN_TYPE_DIRECT, no proxy)
    hSession = InternetOpen("vb wininet", 1, vbNullString, vbNullString, 0)
    'get the handle of the url
    If hSession Then hInternet = InternetOpenUrl(hSession, sURL, vbNullString, 0, IF_NO_CACHE_WRITE, 0)
    'if we have the handle, then start reading the web page
    If hInternet Then
        'read a chunk at a time; a read that fails or returns 0 bytes ends the loop
        Do
            lReturn = 0
            iResult = InternetReadFile(hInternet, sBuffer, BUFFER_LEN, lReturn)
            If iResult = 0 Or lReturn = 0 Then Exit Do
            'append only the bytes actually read, not the whole fixed-length buffer
            sData = sData & Left$(sBuffer, lReturn)
        Loop
        'close the URL
        iResult = InternetCloseHandle(hInternet)
    End If
    'close the session too, so the handle isn't leaked
    If hSession Then iResult = InternetCloseHandle(hSession)
    GetUrlSource = sData
End Function
And use it like this:
VB Code:
Private Sub Command1_Click()
    'Get the HTML source.
    Dim strHTML As String
    strHTML = GetUrlSource("http://www.google.com")
End Sub
As for extracting information (a.k.a. "parsing"), you'll have to use string manipulation, which usually involves a combination of InStr(), Left$(), Mid$(), Right$(), and maybe InStrRev().
The exact approach depends on what kind of information you're trying to get. If you don't already know those functions, I would learn them first.
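For example, here's a rough sketch of pulling out the text between two markers. GetBetween is just a name I made up, and the HTML snippet and date in the usage line are sample data, so adjust the markers to whatever surrounds your value in the real page.
VB Code:
'Hypothetical helper: returns the text between sStart and sEnd,
'or an empty string if either marker is missing.
Public Function GetBetween(ByVal sText As String, ByVal sStart As String, ByVal sEnd As String) As String
    Dim lStart As Long, lEnd As Long
    'Find the start marker (case-insensitive).
    lStart = InStr(1, sText, sStart, vbTextCompare)
    If lStart = 0 Then Exit Function
    'Move past the start marker to the first character of the value.
    lStart = lStart + Len(sStart)
    'Find the end marker after that point.
    lEnd = InStr(lStart, sText, sEnd, vbTextCompare)
    If lEnd = 0 Then Exit Function
    GetBetween = Mid$(sText, lStart, lEnd - lStart)
End Function
Something like Trim$(GetBetween("<b>Date:</b> 12/05/2005<br>", "<b>Date:</b>", "<br>")) should then give you 12/05/2005.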
Re: Grabbing contents off a webpage.
Using the web browser control you can interact with a webpage or change its display etc. Here is a good thread with a few of my code examples.
http://www.vbforums.com/showthread.php?t=330341
Re: Grabbing contents off a webpage.
Thanks guys. DigiRev, that's a big help. Is there any way I can refresh the data at a set interval?
Re: Grabbing contents off a webpage.
Quote:
Originally Posted by DarkDemon
Thanks guys. DigiRev, that's a big help. Is there any way I can refresh the data at a set interval?
You can put the code in a Timer event instead of a Command Button.
Something like this (I haven't tested it, but it should work). Add a timer to your form.
VB Code:
Option Explicit
Dim intCounter As Integer 'Number of seconds passed.
Dim strHTML As String 'HTML source.
'5 second pause time.
'Change to whatever you want.
Private Const PAUSE_TIME As Integer = 5
Private Sub Command1_Click()
    Timer1.Enabled = False 'Stop timer.
    intCounter = 0 'Reset counter (seconds).
    Timer1.Enabled = True 'Start timer.
End Sub
Private Sub Form_Load()
    Timer1.Enabled = False
    Timer1.Interval = 1000 'Timer_Timer() event will fire every 1000 milliseconds (1 second).
End Sub
Private Sub Timer1_Timer()
    Timer1.Enabled = False
    intCounter = intCounter + 1 'Increase counter because a second has passed.
    'Check if we've paused long enough (5 seconds in this example).
    If intCounter = PAUSE_TIME Then
        strHTML = GetUrlSource("http://www.google.com")
        'Process HTML here.
        'Reset counter.
        intCounter = 0
    End If
    Timer1.Enabled = True
End Sub
Re: Grabbing contents off a webpage.
Thanks again, that's a big help and it has pointed me in the right direction.
My final question: suppose the webpage contents are as follows:
Name: Bobby
Age: 23
Message: Hello everyone
How would I split this data up, grabbing each value and showing it in a separate label? The main reason I'm asking is that there is other content around this data.
Re: Grabbing contents off a webpage.
VB Code:
Dim arr() As String
arr() = Split(WebBrowser1.Document.body.innerText, vbCrLf)
Label1.Caption = arr(0)
Label2.Caption = arr(1)
Label3.Caption = arr(2)
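If there's other content around the data, the fixed indexes above only work when those three lines happen to come first. Here's a rough, untested sketch that searches for each label instead. GetField is a made-up helper name, and it assumes each value sits on its own line starting with its label:
VB Code:
'Hypothetical helper: returns the value that follows sLabel on the
'first line starting with that label, or an empty string if none does.
Private Function GetField(ByVal sText As String, ByVal sLabel As String) As String
    Dim sLines() As String, sLine As String, i As Long
    'Normalise line endings first, in case the text uses vbLf or vbCr alone.
    sText = Replace(sText, vbCrLf, vbLf)
    sText = Replace(sText, vbCr, vbLf)
    sLines = Split(sText, vbLf)
    For i = LBound(sLines) To UBound(sLines)
        sLine = Trim$(sLines(i))
        'Case-insensitive check that the line begins with the label.
        If StrComp(Left$(sLine, Len(sLabel)), sLabel, vbTextCompare) = 0 Then
            GetField = Trim$(Mid$(sLine, Len(sLabel) + 1))
            Exit Function
        End If
    Next i
End Function
Then fill the labels by name rather than by position:
VB Code:
Dim sPage As String
sPage = WebBrowser1.Document.body.innerText
Label1.Caption = GetField(sPage, "Name:")
Label2.Caption = GetField(sPage, "Age:")
Label3.Caption = GetField(sPage, "Message:")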
Re: Grabbing contents off a webpage.
Using the webbrowser control is boring.
Re: Grabbing contents off a webpage.
Parsing is easy if the source is short, but after a while it can get out of hand. Using the WebBrowser control may be boring, but it gives you an HTMLDocument object, which has getElementsByTagName() and other useful methods.
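As a rough sketch of that approach (untested, and the "td" tag is only an assumption about how the page is laid out), you could wait for the page to finish loading and then loop over the matching elements. It uses late binding, so it shouldn't need a reference to the HTML object library:
VB Code:
Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    Dim oDoc As Object, oItems As Object, i As Long
    'DocumentComplete also fires for each frame; only handle the top-level page.
    If Not (pDisp Is WebBrowser1.Object) Then Exit Sub
    Set oDoc = WebBrowser1.Document
    'Collect every <td> element ("td" is an assumed tag; use whatever holds your data).
    Set oItems = oDoc.getElementsByTagName("td")
    For i = 0 To oItems.length - 1
        'Print each cell's visible text to the Immediate window.
        Debug.Print oItems.Item(i).innerText
    Next i
End Sub
Call WebBrowser1.Navigate with your URL to kick it off. The event fires once the document is ready.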