agmorgan
Aug 16th, 2007, 11:54 AM
I am writing an IRC bot that will respond to stock requests with the current price.
To do this I need a screen scraper.
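I download the page, then I need to look for the relevant content.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Download the feed and collect it into a single string.
URL url = new URL("http://bluebones.net/ticker/feed/?s=vod.L&n=uk");
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String xml = "";
String line = in.readLine();
while (line != null)
{
    xml = xml + line;
    line = in.readLine();
}
in.close();
// Unescape the entities so the markup becomes real tags again.
xml = xml.replaceAll("&lt;", "<").replaceAll("&gt;", ">").replaceAll("&quot;", "\"");
//Parse HTML here.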
I just can't seem to get my string containing the HTML into any DOM.
I have seen references to javax.swing.text.html.HTMLEditorKit and javax.xml.parsers.DocumentBuilder but I am struggling to make it work.
I don't want to use regex or any basic string manipulation to do it.
It seems so much harder to do stuff in Java than VB :(
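
For reference, here is a minimal sketch of how a string might be loaded into a DOM with javax.xml.parsers.DocumentBuilder. It assumes the unescaped content is well-formed XML; DocumentBuilder will throw on ordinary sloppy HTML. The class name FeedParser, the helper firstText, and the sample element names (quote, symbol, price) are made up for illustration and are not taken from the real feed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FeedParser {
    // Parse a string of well-formed XML into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // InputSource + StringReader lets the parser read from a String
        // instead of a file or stream.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Return the text of the first element with the given tag name,
    // or null if no such element exists.
    static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data; the real feed's structure is an assumption.
        String xml = "<quote><symbol>VOD.L</symbol><price>162.5</price></quote>";
        Document doc = parse(xml);
        System.out.println(firstText(doc, "symbol") + " " + firstText(doc, "price"));
    }
}
```

The string built by the download loop could be passed to parse() in place of the sample, with the tag names swapped for whatever the feed actually uses. If the content turns out not to be well-formed, this approach will fail with a SAXException and an HTML-tolerant parser would be needed instead.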