Hello, I have an HTML file which contains:
Code:
<div class="bookmarks">
<p>
<b>Kategoriler<br></b><a href="#anchor1">Yüklü Yazılımlar</a><br>
<a href="#anchor2">Active Setup</a><br>
<a href="#anchor3">Installed Programs</a><br>
<a href="#anchor4"> Tools for .Net 3.5</a><br>
<a href="#anchor5">7-Zip 16.04 (x64 edition)</a><br>
<a href="#anchor6">Active Directory Authentication Library for SQL Server</a><br>
<a href="#anchor7">Active Directory Authentication Library for SQL Server (x86)</a><br>
<a href="#anchor8">Adobe Acrobat 7.0 Professional</a><br>
<a href="#anchor9">Adobe Acrobat Reader DC - Turkish</a><br>
...
</p>
</div>
and I want to get the inner text of every link from #anchor4 onward (including #anchor4 itself). I know this is a general question, but I couldn't understand the logic of this kind of iteration. For example, I tried this:
C# Code:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load("installed_apps.html");
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@class='bookmarks']"))
{
    string str = "";
    foreach (HtmlNode node2 in node.SelectNodes("//p"))
    {
        foreach (HtmlNode node3 in node2.SelectNodes("//a[@href='anchor5']"))
        {
            str = node3.InnerText;
        }
    }
    installedItems.Add(str);
}
However, I got a System.NullReferenceException.
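
For reference, here is a minimal sketch of one way the iteration could work. It is not a definitive fix. Two things in the attempt above seem to explain the exception: HtmlAgilityPack's SelectNodes returns null (not an empty collection) when nothing matches, and the href values in the file start with "#", so @href='anchor5' matches nothing. Also, an XPath beginning with "//" searches the whole document even when called on a child node; ".//" keeps the search relative to that node. The sketch assumes installedItems is a List<string> (its declaration is not shown in the post) and that the links are siblings of each other, as in the HTML sample.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.Load("installed_apps.html");

        // Assumption: installedItems is a plain list of strings.
        var installedItems = new List<string>();

        // Find the starting link. Note the '#' in the href value.
        // SelectSingleNode returns null when nothing matches.
        HtmlNode start = doc.DocumentNode.SelectSingleNode(
            "//div[@class='bookmarks']//a[@href='#anchor4']");

        if (start != null)
        {
            // Walk from the starting link across its following siblings,
            // keeping only <a> elements (this skips <br> and text nodes).
            for (HtmlNode n = start; n != null; n = n.NextSibling)
            {
                if (n.Name == "a")
                {
                    // Decode HTML entities and trim stray whitespace.
                    installedItems.Add(HtmlEntity.DeEntitize(n.InnerText).Trim());
                }
            }
        }

        foreach (string item in installedItems)
        {
            Console.WriteLine(item);
        }
    }
}
```

With the sample HTML above, the list should start at "Tools for .Net 3.5" and continue through the remaining links in document order. The same null check applies anywhere SelectNodes is used in a foreach.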