[RESOLVED] Pulling an Element From an HTML Array Using WebBrowser
Hello,
I am working on a program that generates a batch file partly from Excel data and partly from an online database with a web interface. I'm using WebBrowser rather than WebClient because the database is on the company intranet, so the user has to go through a company global login page to reach it.
For most of the fields it has been fairly straightforward to pull data from the page using GetAttribute. However, I've found that using this method to pull the County doesn't return the county as the user sees it. For example, the page displays a county called "Accomack", but GetElementById("county").GetAttribute("value") returns an integer instead of the county name, "2814" in this case.
I discovered that embedded in the HTML is an array for every county in the US, each county being assigned an identifier code. The HTML code is as follows:
HTML Code:
var arrCounty = new Array (
new Array( '2310','Abbeville','SC'),
new Array( '1111','Acadia','LA'),
new Array( '2814','Accomack','VA'),
So what method can I use to return the county name, and possibly state too, rather than pulling the useless identifier code?
Thanks,
Michael
Edit: I should clarify that the array created is actually a javascript array which I believe is then pulled by the HTML field.
Re: Pulling an Element From an HTML Array Using WebBrowser
Did you try adding a watch to GetElementById("county") to see if any of the properties return what you are looking for? Perhaps look at .InnerText?
Re: Pulling an Element From an HTML Array Using WebBrowser
Quote:
Edit: I should clarify that the array created is actually a javascript array which I believe is then pulled by the HTML field.
Well yes, we kinda assumed that since there's no such thing as an HTML array. It's not much help though without the relevant HTML section. Evidently you're addressing the wrong attribute, but without the HTML, or better yet the page URL so we can see it in context, we have absolutely no way of knowing what the right one might be!
Re: Pulling an Element From an HTML Array Using WebBrowser
Yeah I apologize, I'm not extremely well-versed in HTML. Umm well providing the URL wouldn't help because it's hosted on the company intranet, so you wouldn't be able to access it. But I can provide the relevant HTML section.
HTML Code:
<select name="county" class="select1"><option value="">SELECT</option></select>
and here is the method that populates the county array:
fnpopulateCounty(document.getElementById('state'),'county');
Re: Pulling an Element From an HTML Array Using WebBrowser
Quote:
Originally Posted by jayinthe813
did you try adding a watch to GetElementById("county") to see if any of the properties return what you are looking for? Perhaps look at .innertext?
.innertext seems to return the selected index of the "class" string. I'm going to goof around with a couple others though to see if I get anything useful.
Re: Pulling an Element From an HTML Array Using WebBrowser
I feared as much. The problem is that you have a selection control which displays one thing but returns another (just as the standard ComboBox does when databound). If you use this select tag as your source you will therefore always get the value (the ... er ... "useless identifier code"). There simply is no other option. You therefore have little choice but to create a Dictionary which will enable you to reference the aforementioned useless identifier code and return the county name once you have obtained this value.
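One way to build such a Dictionary, sketched on the assumption that the script text containing arrCounty is available to your program as a string (how you obtain it is not shown here). ParseCountyArray and its parameter name are hypothetical, and the pattern only covers entries shaped like the three sample lines in the first post, so names containing an escaped apostrophe would be skipped.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CountyArrayParser
    ' Hypothetical helper: scriptText is assumed to hold the page's script source,
    ' with entries such as:  new Array( '2814','Accomack','VA')
    Function ParseCountyArray(scriptText As String) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        ' Group 1 = identifier code, group 2 = county name, group 3 = two-letter state
        Dim pattern As String = "new\s+Array\s*\(\s*'(\d+)'\s*,\s*'([^']*)'\s*,\s*'([A-Z]{2})'\s*\)"
        For Each m As Match In Regex.Matches(scriptText, pattern)
            ' Indexer assignment overwrites rather than throwing on a repeated code
            result(m.Groups(1).Value) = m.Groups(2).Value & ", " & m.Groups(3).Value
        Next
        Return result
    End Function
End Module
```

With the sample lines above, looking up "2814" in the returned dictionary should give "Accomack, VA".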
Re: Pulling an Element From an HTML Array Using WebBrowser
I see that the innerhtml contains this dictionary of values. It looks as follows:
InnerHtml = "<OPTION value="">SELECT</OPTION><OPTION selected value=549>Ada</OPTION><OPTION value=550>Adams</OPTION><OPTION value=551>Bannock</OPTION><OPTION value=552>Bear Lake</OPTION><OPTION value=553>Benewah</OPTION><OPTION value=554>Bingham</OPTION><OPTION value=5...
It also may be of note that the first selected value returned is the one I had used as a test example (value = 549, Ada). This is NOT the first one on the whole list, however this was the first one pulled.
Re: Pulling an Element From an HTML Array Using WebBrowser
It should be possible to create a Dictionary by parsing this then, though I have to say that the VS Html editor chewed up and spat out that HTML section so if it's literally what's in the source there may be trouble ahead. Using a correctly formatted version ...
HTML Code:
<select name="county" class="select1"><option value="">SELECT</option>
<option value="549">Ada</option><option value="550">Adams</option><option value="551">Bannock</option><option value="552">Bear Lake</option><option value="553">Benewah</option><option value="554">Bingham</option>
</select>
.... this is what I suggest.
vb.net Code:
Dim Counties As New Dictionary(Of String, String)
' Build the dictionary: option value (identifier code) -> displayed county name
For Each OptElement As HtmlElement In WebBrowser1.Document.Body.GetElementsByTagName("option")
    If OptElement.GetAttribute("value") <> "" Then
        Counties.Add(OptElement.GetAttribute("value"), OptElement.InnerText)
    End If
Next
' To obtain the county name after selection
Dim CountyName As String = Counties(WebBrowser1.Document.GetElementById("county").GetAttribute("value"))
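One caveat with the last line: if no county has been selected yet, the select's value is the empty string, which the loop deliberately leaves out of the dictionary, so the indexer would throw a KeyNotFoundException. A minimal sketch of a safer lookup follows. LookupCounty is a hypothetical helper name, and returning an empty string for an unknown code is just one possible choice.

```vbnet
Imports System.Collections.Generic

Module CountyLookup
    ' Hypothetical helper: returns the county name for a code,
    ' or an empty string when the code is blank or not in the dictionary.
    Function LookupCounty(counties As Dictionary(Of String, String), code As String) As String
        Dim name As String = Nothing
        ' TryGetValue reports a missing key instead of throwing
        If Not String.IsNullOrEmpty(code) AndAlso counties.TryGetValue(code, name) Then
            Return name
        End If
        Return String.Empty
    End Function
End Module
```

It would be called as LookupCounty(Counties, WebBrowser1.Document.GetElementById("county").GetAttribute("value")).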
Re: Pulling an Element From an HTML Array Using WebBrowser
Well this appears to be working, I can see that it's first adding all of the states but then the counties to the dictionary. The only issue is it's getting about 50 counties in before it encounters a repeated county and breaks.
At least this is what I'm assuming. It gives the error "An item with the same key has already been added." at the line Counties.Add(OptElement.GetAttribute("value"), OptElement.InnerText). Looking at "Counties", I can see the list that had been started, first with the states (I'm assuming they create a similar array for the states) and then the counties.
Now the interesting thing is that once it gets to the counties the very first county it lists is the one I want (Ada). So it doesn't start from the beginning of the counties list, it starts from the one I want and then continues adding all of the counties with subsequent identifiers. So it seems to me like the whole dictionary isn't necessary since it starts recording counties right where I need it.
Edit: Or I suppose another way I could do it is just have it stop adding after the 58th indexed item (the first county after all of the states/provinces are added, i.e. the one I want) and then just index that desired item. What do you think?
Re: Pulling an Element From an HTML Array Using WebBrowser
I took your code and just added a stop so that once it reaches the first county it exits the For loop, and it works great now that there are no repeats. Perhaps there is a more elegant way of doing this, but the main thing is it works!
Code:
Dim i As Integer = 0
Try
    ' To parse the dictionary
    For Each OptElement As HtmlElement In WebBrowser.Document.Body.GetElementsByTagName("option")
        If OptElement.GetAttribute("value") <> "" Then
            Counties.Add(OptElement.GetAttribute("value"), OptElement.InnerText)
            i += 1
        End If
        If i > 58 Then
            Exit Try
        End If
    Next
Finally
End Try
Thanks for the help!
Re: Pulling an Element From an HTML Array Using WebBrowser
Instead of counting you could do:
Code:
For Each OptElement As HtmlElement In WebBrowser1.Document.Body.GetElementsByTagName("option")
    ' AndAlso short-circuits, so ContainsKey is only called for non-empty values
    If OptElement.GetAttribute("value") <> "" AndAlso Not Counties.ContainsKey(OptElement.GetAttribute("value")) Then
        Counties.Add(OptElement.GetAttribute("value"), OptElement.InnerText)
    End If
Next OptElement
If an empty value can never turn up before the entries you need, you could also stop at the first empty or repeated value:
Code:
For Each OptElement As HtmlElement In WebBrowser1.Document.Body.GetElementsByTagName("option")
    If OptElement.GetAttribute("value") <> "" AndAlso Not Counties.ContainsKey(OptElement.GetAttribute("value")) Then
        Counties.Add(OptElement.GetAttribute("value"), OptElement.InnerText)
    Else
        ' First empty or already-seen value: stop collecting
        Exit For
    End If
Next OptElement
Not sure if 'Exit Try' is a good habit to get into compared to 'Exit For' when looping, but I guess that's more a matter of opinion. In this example it should have the same effect; I would just make sure you don't intend to run any more code in the Try block after that point.
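Another option, since the repeats seem to come from Document.Body returning the options of every select on the page (states first, then counties), is to read only the county select's own options. This is a rough sketch rather than tested code: ReadSelectOptions is a hypothetical helper, and it assumes GetElementById("county") resolves to the select as it did earlier in the thread, even though the markup shows a name attribute rather than an id.

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

Module CountySelectReader
    ' Hypothetical helper: maps option value -> displayed text for ONE select only.
    Function ReadSelectOptions(browser As WebBrowser, selectId As String) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        If browser.Document Is Nothing Then Return result

        Dim selectElement As HtmlElement = browser.Document.GetElementById(selectId)
        If selectElement Is Nothing Then Return result

        ' Only the options nested inside this select are visited
        For Each opt As HtmlElement In selectElement.GetElementsByTagName("option")
            Dim value As String = opt.GetAttribute("value")
            If value <> "" Then
                ' Indexer assignment cannot raise the duplicate-key error
                result(value) = opt.InnerText
            End If
        Next
        Return result
    End Function
End Module
```

Calling ReadSelectOptions(WebBrowser1, "county") after the page has populated the list should then need neither the counter nor the early exit.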
Re: Pulling an Element From an HTML Array Using WebBrowser
I appreciate the suggestions. I agree that these are definitely more refined ways to accomplish my goal.