Hello.
I am trying to write an app that will read in an HTML document and extract all the URLs from it.
I currently have the HTML document being read in line by line and I need to be able to identify if there are any URLs in the string.
Someone suggested using Regular Expressions?
I am having a bit of trouble doing this.
I am trying something like this, but it doesn't work. Any help would be great! Thanks!
Code:
String test = new String("bla bla bla http://somesite.com/tmp/page.html bla bla");
String regex = "@\"http(s)?://([\\w-]+\\.)+[\\w-]+(/[\\w- ./?%&=]*)?\\b\")";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(test);
if (m.find()) {
    System.out.println(m.group(1));
} else {
    System.out.println("Not found!");
}
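For comparison, here is a minimal sketch of one way the snippet above might be made to work. This is a guess at the cause, not a definitive fix. The leading `@\"` and trailing `\")` in the pattern look like leftovers from a C# verbatim string, so they would be treated as literal characters (plus an unmatched parenthesis) rather than as part of the URL pattern. The space inside the path character class would let a match run on past the end of the URL. Also, `m.group(1)` refers only to the optional `(s)` capture, whereas `m.group()` returns the whole match. The class name `UrlExtractor`, the method `extractUrls`, and the simplified pattern are assumptions made for illustration. The pattern only covers plain http/https URLs, not every valid URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class UrlExtractor {
    // Simplified http/https pattern (an assumption, not a full URL grammar):
    // scheme, dotted host name, then an optional path/query part.
    private static final Pattern URL_PATTERN = Pattern.compile(
        "https?://([\\w-]+\\.)+[\\w-]+(/[\\w./?%&=-]*)?\\b");

    // Returns every URL found in one line of text, in order of appearance.
    static List<String> extractUrls(String line) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_PATTERN.matcher(line);
        // find() advances to the next match on each call, so a loop
        // collects all URLs instead of only the first one.
        while (m.find()) {
            urls.add(m.group()); // group() with no argument is the whole match
        }
        return urls;
    }

    public static void main(String[] args) {
        String test = "bla bla bla http://somesite.com/tmp/page.html bla bla";
        for (String url : extractUrls(test)) {
            System.out.println(url); // prints http://somesite.com/tmp/page.html
        }
    }
}
```

Calling `extractUrls` once per line fits the line-by-line reading described above. A URL that is split across two lines would be missed by this approach.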



