Simple Web Spider
I need a basic web spider written. Here is a quick rundown of what it needs to do:
1. Go to a URL.
2. Read all URLs from that page and log just the URL in the database.
3. Go to each of these URLs and log all the URLs found on that page in the database (without logging any duplicate URLs).
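To illustrate the three steps above, here is a rough sketch in Python using only the standard library. It is not a finished implementation. The SQLite table `urls`, the helper names (`fetch_page`, `extract_links`, `crawl`) and the `max_pages` safety limit are all assumptions made for the example; the real project could use any language and database.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(base_url, html):
    """Return absolute http(s) URLs found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.links:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def crawl(start_url, conn, fetch=fetch_page, max_pages=50):
    """Breadth-first crawl that logs each distinct URL once.

    max_pages is an assumed limit so the sketch cannot run forever.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY)")
    queue = deque([start_url])
    seen = {start_url}      # URLs already queued for a visit
    visited = set()         # URLs actually fetched
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.add(url)
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to load
        for link in extract_links(url, html):
            # The PRIMARY KEY plus INSERT OR IGNORE keeps duplicates out.
            conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (link,))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    conn.commit()
    return visited
```

Something like `crawl("http://example.com/", sqlite3.connect("spider.db"))` would start it. Passing `fetch` as a parameter is just a convenience so the crawler can be tried against canned pages without touching the network.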
For each page it pulls up, it needs to check whether the page is an RSS/XML feed. The program can tell it is an XML feed because the first line should read one of the following:
<?xml version="1.0" encoding="UTF-8" ?>
<?xml version="1.0" encoding="ISO-8859-1" ?>
or something along these lines.
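As a rough idea of how that check might look, the Python sketch below tests whether the first line of a page starts with an XML declaration. The function name `looks_like_xml_feed` and the pattern are assumptions for illustration. Note that an XML declaration only shows the page is XML; other XML documents (XHTML, for example) can start the same way, so a stricter check for an RSS root element may be wanted later.

```python
import re

# Matches the start of an XML declaration such as <?xml version="1.0" ...
XML_DECLARATION = re.compile(r"^<\?xml\s+version\s*=\s*[\"'][^\"']+[\"']")


def looks_like_xml_feed(text):
    """Return True if the first non-blank line begins with an XML declaration."""
    # Ignore a byte-order mark and leading whitespace before the declaration.
    stripped = text.lstrip("\ufeff \t\r\n")
    first_line = stripped.split("\n", 1)[0]
    return XML_DECLARATION.match(first_line) is not None
```

Because it only looks at the `version` part, the check accepts any encoding value, which covers both declarations listed above and similar variants.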