Get HTML from webpage w/o Inet
Hi there,
What is another way to download the page source code w/o using the Inet control?
Can a webbrowser control be used to do this? If so, please provide an example.
ex.:
strURL = "http://yahoo.com"
All I need is the source code of strURL put into strHTML, without using the Inet control.
Thanks in advance!
Re: Get HTML from webpage w/o Inet
Besides XML, you can use Winsock to get the HTML code.
VB Code:
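Option Explicit

' Requires a Winsock control named "Winsock" and a button named Command1 on the form
Private Sub Command1_Click()
    With Winsock
        .Close                              ' reset any previous connection
        .RemoteHost = "finance.lycos.com"
        .RemotePort = 80
        .Connect
    End With
End Sub

Private Sub Winsock_Connect()
    ' A complete HTTP/1.0 request: request line, Host header, then a blank line
    Winsock.SendData "GET /qc/stocks/quotes.aspx?symbols=INTC HTTP/1.0" & vbCrLf & _
                     "Host: finance.lycos.com" & vbCrLf & vbCrLf
End Sub

Private Sub Winsock_DataArrival(ByVal bytesTotal As Long)
    ' Fires once per chunk; the response headers arrive first, then the HTML
    Dim X As String
    Winsock.GetData X, vbString
    Debug.Print X
End Sub

Private Sub Winsock_Error(ByVal Number As Integer, Description As String, ByVal Scode As Long, ByVal Source As String, ByVal HelpFile As String, ByVal HelpContext As Long, CancelDisplay As Boolean)
    Debug.Print "Error "; Description
End Sub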
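The "XML" route is not shown above. Assuming it refers to the XMLHTTP object that ships with MSXML, a minimal sketch could look like the following. GetHTML is a hypothetical helper name, and the late-bound "MSXML2.XMLHTTP" ProgID assumes MSXML is installed on the machine. It can be called as strHTML = GetHTML(strURL).
VB Code:
Option Explicit

' Hypothetical helper: returns the source of strURL, or "" if the request did not succeed
Public Function GetHTML(ByVal strURL As String) As String
    Dim objHTTP As Object
    Set objHTTP = CreateObject("MSXML2.XMLHTTP")   ' late bound, no project reference needed
    objHTTP.Open "GET", strURL, False              ' False = synchronous call
    objHTTP.send
    If objHTTP.Status = 200 Then GetHTML = objHTTP.responseText
    Set objHTTP = Nothing
End Function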
Re: Get HTML from webpage w/o Inet
You see, the webpage I'm trying to parse data from is special, lol...
When I try to use Inet to download the HTML from it, the site detects it as data mining or something and redirects me to an error page, so my program ends up displaying the HTML of that error page...
So what I need to do is create a WebBrowser control, load the page in it, and only then get the HTML from the page already loaded in the WebBrowser control.
Re: Get HTML from webpage w/o Inet
Yeah, I'm still trying to figure out how to use it.
In your example above, is this the part that actually gets the HTML?
VB Code:
Private Sub Winsock_DataArrival(ByVal bytesTotal As Long)
    Dim X As String
    Winsock.GetData X
End Sub
Maybe there is an easier way? How would I get the InnerHTML from WebBrowser1? I'm trying to do some research on this but haven't found anything useful yet.
- lol.. my laptop is dying, hope I figure this out before the last 8% of my battery runs out
Re: Get HTML from webpage w/o Inet
WebBrowser1.Document.Body.parentElement.innerHTML
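That property is only meaningful once the page has finished loading. A minimal sketch of how it could be wired up, assuming a form with a WebBrowser control named WebBrowser1 (Microsoft Internet Controls) and the strURL / strHTML names from the first post; the variables are declared here only for illustration:
VB Code:
Option Explicit

Private strHTML As String

Private Sub Form_Load()
    Dim strURL As String
    strURL = "http://yahoo.com"
    WebBrowser1.Navigate strURL
End Sub

Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' DocumentComplete fires once per frame; only act on the top-level document
    If pDisp Is WebBrowser1.Object Then
        strHTML = WebBrowser1.Document.Body.parentElement.innerHTML
        Debug.Print strHTML
    End If
End Sub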
Re: Get HTML from webpage w/o Inet
VB Code:
'' Jamie Plenderleith
'' [url]http://www.plenderj.com[/url]
''
'' VB.NET code. Needs a reference to System.Web.dll; put the functions inside a Module or Class.
Imports System.IO
Imports System.Net
Imports System.Text
Imports System.Web

Public Function strGetHTMLBasic(ByVal strURI As String, ByVal strCustomPOSTData As String) As String
    Dim objWebClient As HttpWebRequest = DirectCast(WebRequest.Create(strURI), HttpWebRequest)
    ' Send a POST only when form data was supplied; otherwise a plain GET is issued
    If Not strCustomPOSTData = vbNullString Then
        strCustomPOSTData = EnsureHTTPEncoded(strCustomPOSTData)
        Dim bteData As Byte() = Encoding.ASCII.GetBytes(strCustomPOSTData)
        With objWebClient
            .Method = "POST"
            .ContentType = "application/x-www-form-urlencoded"
            .ContentLength = bteData.Length
            Dim stmRequestWriter As New StreamWriter(.GetRequestStream(), Encoding.ASCII)
            stmRequestWriter.Write(strCustomPOSTData)
            stmRequestWriter.Close()
        End With
    End If
    Dim stmStreamReader As New StreamReader(objWebClient.GetResponse().GetResponseStream())
    Dim strRetVal As String = stmStreamReader.ReadToEnd()
    stmStreamReader.Close()
    Return strRetVal
End Function

' URL-encodes each name and value of a "name=value&name=value" string.
' Assumes every pair contains an "=" sign.
Function EnsureHTTPEncoded(ByVal strString As String) As String
    Dim strRetVal As String = ""
    If Not strString = vbNullString Then
        If Not InStr(strString, "&") = 0 Then
            Dim strArr() As String = Split(strString, "&")
            Dim strTemp As String, strTempArr() As String
            For Each strTemp In strArr
                strTempArr = Split(strTemp, "=")
                strRetVal &= HTTPEncode(strTempArr(0)) & "=" & HTTPEncode(strTempArr(1)) & "&"
            Next
            ' Drop the trailing "&"
            If Right(strRetVal, 1) = "&" Then strRetVal = Left(strRetVal, Len(strRetVal) - 1)
        Else
            Dim strArr() As String = Split(strString, "=")
            strRetVal = HTTPEncode(strArr(0)) & "=" & HTTPEncode(strArr(1))
        End If
    End If
    Return strRetVal
End Function

Function HTTPEncode(ByVal strString As String) As String
    Return HttpUtility.UrlEncode(strString)
End Function
Re: Get HTML from webpage w/o Inet
Jamie, there's no StreamReader in Classic VB ;)
Re: Get HTML from webpage w/o Inet
lol. Assumed this was .NET because the last post I saw about screen scraping had you in it too :D
Re: Get HTML from webpage w/o Inet
Erm. Why not just use URLDownloadToFile? Search the forums; it's an API. Download the source to your computer and load it locally.
chem
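For reference, a rough Classic VB sketch of the URLDownloadToFile suggestion. The Declare matches the ANSI export of urlmon.dll; GetHTMLViaFile and the temp file name page.htm are hypothetical names chosen for illustration. It can be called as strHTML = GetHTMLViaFile(strURL).
VB Code:
Option Explicit

Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
    ByVal pCaller As Long, ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

' Hypothetical helper: downloads strURL to a temp file and returns its contents, or "" on failure
Public Function GetHTMLViaFile(ByVal strURL As String) As String
    Dim strPath As String, strData As String, intFile As Integer
    strPath = Environ$("TEMP") & "\page.htm"          ' assumed scratch location
    If URLDownloadToFile(0, strURL, strPath, 0, 0) = 0 Then   ' 0 = S_OK
        intFile = FreeFile
        Open strPath For Binary Access Read As #intFile
        strData = Space$(LOF(intFile))                ' size the buffer to the file length
        Get #intFile, , strData
        Close #intFile
        Kill strPath                                  ' remove the temp file
        GetHTMLViaFile = strData
    End If
End Function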