-
Apr 18th, 2006, 06:33 AM
#1
Thread Starter
Fanatic Member
string/html parsing in c#
Hello!
I'm building a string parser for my friend and I need some help...
The parser will receive a string containing the contents of an HTML file.
From that HTML, the parser needs to extract the
'content' attribute of the 'meta name="keywords"' element.
This is for a search function...
I thought of converting the string to an XML file and then extracting the attribute,
but I don't know how...
Can anyone help?
Thanks!
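For reference, the XML idea from the question can be tried with System.Xml. The following is only a minimal sketch: the class name XmlAttempt and the method TryGetKeywords are hypothetical names chosen for illustration, and the approach assumes the markup is well-formed XML (for example XHTML without a namespace declaration). Typical HTML, where the meta tag is not self-closed, makes LoadXml throw an XmlException, so the method simply reports failure in that case.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of any library.
static class XmlAttempt
{
    // Tries to read the keywords by treating the markup as XML.
    // Returns false when the markup is not well-formed XML
    // or when no matching meta element is found.
    public static bool TryGetKeywords(string markup, out string keywords)
    {
        keywords = null;
        XmlDocument doc = new XmlDocument();
        try
        {
            doc.LoadXml(markup);
        }
        catch (XmlException)
        {
            // Ordinary HTML (unclosed tags, undeclared entities) ends up here.
            return false;
        }

        // Assumption: the elements are in no XML namespace. With an
        // xmlns declaration on <html>, this XPath would not match.
        XmlNode node = doc.SelectSingleNode("//meta[@name='keywords']/@content");
        if (node == null)
        {
            return false;
        }

        keywords = node.Value;
        return true;
    }
}
```

This works only when the page happens to be valid XML, so it is probably not reliable for arbitrary pages.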
-
Apr 18th, 2006, 08:13 AM
#2
Re: string/html parsing in c#
Are you using .NET 2.0? Please use the radio buttons provided to specify your IDE/Framework version when creating a thread.
-
Apr 18th, 2006, 08:34 AM
#3
Hyperactive Member
Re: string/html parsing in c#
I don't think converting HTML to XML is really a good idea, or even going to work for that matter. Why not use normal string functions like IndexOf and Substring?
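The IndexOf/Substring suggestion could look roughly like the sketch below. The class name MetaKeywords and the helpers Extract and GetAttribute are hypothetical names chosen for illustration. The code assumes attribute values are quoted, that no '>' appears inside an attribute value, and that the text "name=" does not occur inside another attribute of the same tag; it is not a general HTML parser.

```csharp
using System;

// Hypothetical helper, not part of any library.
static class MetaKeywords
{
    // Returns the content attribute of the first <meta name="keywords" ...>
    // tag, or null if no such tag is found.
    public static string Extract(string html)
    {
        if (string.IsNullOrEmpty(html)) return null;

        int searchFrom = 0;
        while (true)
        {
            // Find the next meta tag, ignoring case (<META ...> is common).
            int tagStart = html.IndexOf("<meta", searchFrom, StringComparison.OrdinalIgnoreCase);
            if (tagStart < 0) return null;

            int tagEnd = html.IndexOf('>', tagStart);
            if (tagEnd < 0) return null;

            string tag = html.Substring(tagStart, tagEnd - tagStart + 1);
            searchFrom = tagEnd + 1;

            string name = GetAttribute(tag, "name");
            if (name != null && name.Equals("keywords", StringComparison.OrdinalIgnoreCase))
            {
                return GetAttribute(tag, "content");
            }
        }
    }

    // Reads a quoted attribute value out of a single tag, or returns null.
    static string GetAttribute(string tag, string attribute)
    {
        int pos = tag.IndexOf(attribute + "=", StringComparison.OrdinalIgnoreCase);
        if (pos < 0) return null;

        int valueStart = pos + attribute.Length + 1;
        if (valueStart >= tag.Length) return null;

        // Only single- or double-quoted values are handled in this sketch.
        char quote = tag[valueStart];
        if (quote != '"' && quote != '\'') return null;

        int valueEnd = tag.IndexOf(quote, valueStart + 1);
        if (valueEnd < 0) return null;

        return tag.Substring(valueStart + 1, valueEnd - valueStart - 1);
    }
}
```

Calling MetaKeywords.Extract(html) with the page source as a string returns the keywords text, which can then be split on commas for the search function.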
-
Apr 19th, 2006, 02:18 PM
#4
Hyperactive Member
Re: string/html parsing in c#
When you say the HTML file is passed as a string, do you mean that the contents of the HTML file (the actual HTML code) are passed as a string, or that the file name of the HTML file is passed as a string?
You should look at the MSHTML library. It lets you automate an Internet Explorer window.
So you can navigate to a URL or file on disk, and iterate through all the elements. The most important part is that you can get back a collection of any type of element you want.
Check out this site.
http://www.csharphelp.com/archives/archive146.html
I can also post some of my own code if you want additional reference material.
-
Apr 20th, 2006, 10:50 AM
#5
Thread Starter
Fanatic Member
Re: string/html parsing in c#
I meant that the HTML code itself is passed as a string,
not the URL...