Ok, wow, not sure how to go about asking the question ...
So let me explain my dilemma .... I am getting two types of HTML source ... then again, I just might be going about this wrong ...
Ok I started out by going to http://www.edmunds.com/flipper/do/Me...rstNav=Gallery
Then I grabbed the page source ... by right clicking and choosing view page source from the context menu. So far so good, right?
Well looking at the page source ... I see right off the data I am needing which is this
Code:
{
    id: '20245103',
    thumb: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245103-ST.jpg',
    full: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245103-E.jpg',
    caption: '2009 Ford F-150 FX4 Extended Cab Shown',
    credits: '(Photo courtesy of Ford Motor Company)',
    desc: '2009 Ford F-150 FX4 Extended Cab Shown',
    title: '2010 Ford F-150'
}
,{
    id: '20245121',
    thumb: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245121-ST.jpg',
    full: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245121-E.jpg',
    caption: '2009 Ford F-150 Platinum Crew Cab Shown',
    credits: '(Photo courtesy of Ford Motor Company)',
    desc: '2009 Ford F-150 Platinum Crew Cab Shown',
    title: '2010 Ford F-150'

The lines that begin with full: well I need to get the image url ... well ultimately I want to download into a folder every one of the full: images in the source code
But this source code is different from what I am used to ...
Normally I would use a webbrowser control
Then basically do something like this
Code:
web1.Navigate("http://www.edmunds.com/flipper/do/MediaNav/make=50/model=F-150/firstNav=Gallery")
Do Until web1.ReadyState = WebBrowserReadyState.Complete
    Application.DoEvents()
Loop

Then once the page is loaded I would try to get the table in which the data was located like this for example
Find the table
Code:
Dim myTable As HtmlElement = wb2.Document.All("dgCurrent")

dgCurrent in this case is from another page I parsed over a year ago
Once I have the table then long story short
Code:
'MAKE SURE WE GOT A VALID TABLE OBJECT
If myTable IsNot Nothing Then
    'LOOP ALL ROW ELEMENTS IN TABLE
    For Each MyElement As HtmlElement In myTable.GetElementsByTagName("TR")
        'THERE SHOULD BE 2 TD ELEMENTS PER ROW
        Dim myTDTags As HtmlElementCollection = MyElement.GetElementsByTagName("TD")
        'IF WE GOT 2 TD ELEMENTS, WRITE THEM TO 2 ARRAYLIST
        If myTDTags.Count = 2 Then
            myPubAr.Add(myTDTags(0).InnerText)
            myLocAr.Add(myTDTags(1).InnerText)
        End If
    Next
End If
' THEN DISPLAY IT IN A RICHTEXTBOX

However ... the above code was used to parse a page that had the HTML tags and such
In this case I am not sure how to get at the data, since it is not inside a table
Although I did stumble onto something that changed the structure of the HTML source, simply by looking at the InnerHtml
But I am not sure how to parse the InnerHtml
Looking at this page in a webbrowser control and simply doing this
Code:
Dim myHtml As String
myHtml = web1.Document.Body.InnerHtml
rtb1.Text = myHtml

What is displayed in the RTB is the same page, but the source code is more proper and looks like something I could parse. For the life of me, though, I can't figure out how to parse the InnerHtml
Guys I need help
What I am wanting to accomplish here is to get all the full size images and download them to a folder
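
To make it clearer what I am after, here is a rough sketch of the kind of thing I imagine, though I have not tried it against the live page. It assumes the entries always look like full: '...' in the raw source, and that the paths are relative to the site root. The names ExtractFullPaths and DownloadAll, the baseUrl parameter, and the target folder are all made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Net
Imports System.Text.RegularExpressions

Module FullImageSketch
    ' Hypothetical helper: returns every quoted path that follows "full:" in the source text.
    Function ExtractFullPaths(ByVal source As String) As List(Of String)
        Dim paths As New List(Of String)
        ' Matches full: '...' and captures the text between the single quotes.
        For Each m As Match In Regex.Matches(source, "full:\s*'([^']+)'")
            paths.Add(m.Groups(1).Value)
        Next
        Return paths
    End Function

    ' Hypothetical helper: downloads each path into targetFolder.
    ' baseUrl is an assumption, since the paths in the source are site-relative.
    Sub DownloadAll(ByVal paths As List(Of String), ByVal baseUrl As String, ByVal targetFolder As String)
        Directory.CreateDirectory(targetFolder)
        Using client As New WebClient()
            For Each p As String In paths
                Dim localFile As String = Path.Combine(targetFolder, Path.GetFileName(p))
                client.DownloadFile(baseUrl & p, localFile)
            Next
        End Using
    End Sub

    Sub Main()
        ' Sample text copied from the page source shown earlier in this post.
        Dim sample As String = "{ id: '20245103', thumb: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245103-ST.jpg', full: '/pictures/VEHICLE/2009/Ford/2009.ford.f-150.20245103-E.jpg', caption: '2009 Ford F-150 FX4 Extended Cab Shown' }"
        For Each p As String In ExtractFullPaths(sample)
            Console.WriteLine(p)
        Next
    End Sub
End Module
```

The idea would be to feed it either the raw page source or the InnerHtml string and then call something like DownloadAll(paths, "http://www.edmunds.com", "C:\images"), where both the base URL and the folder are guesses on my part.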
Any help is much appreciated

