-
Jun 14th, 2010, 07:47 AM
#1
Thread Starter
Hyperactive Member
Website Scraper
Hi,
Just wondering if someone can help me with the following query.
I have hundreds of URLs that I need to crawl, returning the <title> tag for each page.
I have no idea how to achieve this and would greatly appreciate some assistance.
All my URLs are in a spreadsheet. When I get the data back, I need each <title> tag placed next to its URL in the spreadsheet, so that I can do the further manipulation required in Excel.
I thought about using =importXML("URL";"query") in a Google spreadsheet; however, this will only really work on tables, and I can only display 50 records in one spreadsheet, so that is not a viable option.
If someone can assist me with a PHP script I would be grateful; however, I will need it in simple steps, including how to run it on the web server if required.
Thanks
-
Jun 14th, 2010, 10:04 AM
#2
Re: Website Scraper
This is what you want --> http://simplehtmldom.sourceforge.net/
You should be able to use it to load a given URL and get any element from that page's DOM, including the <title> element!
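As a rough illustration of that suggestion, here is a minimal sketch of pulling one page's title with the library. It assumes the downloaded simple_html_dom.php sits in the same folder as the script; the function name page_title is made up for this example and is not part of the library.

```php
<?php
// Assumption: simple_html_dom.php (from the download linked above)
// is in the same folder as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical helper: load a page (URL or local file path) and return
// the text of its <title> element, or null if none is found.
function page_title($location) {
    $dom = file_get_html($location);   // library call: fetch and parse the page
    if (!$dom) {
        return null;                   // page could not be loaded
    }
    $node = $dom->find('title', 0);    // first <title> element, or null
    $title = $node ? trim($node->plaintext) : null;
    $dom->clear();                     // free the parsed tree
    return $title;
}

// Only runs when a location is passed on the command line.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    echo page_title($argv[1]), "\n";
}
```

Saved under a name of your choice, say title.php, it could be tried from a shell with `php title.php http://www.example.com/`. Fetching a remote URL this way relies on allow_url_fopen being enabled in PHP.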
-
Jun 15th, 2010, 03:43 AM
#3
Thread Starter
Hyperactive Member
Re: Website Scraper
Hi there,
That looks great and is just what I needed.
With regard to the script itself, it shows all the elements; however, how would I run it? Could you help me with what I need to do with it, and how best to load it onto the server and run it, please?
Thanks
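One possible way to run the whole job, sketched under some assumptions: the URLs are exported from the spreadsheet as a CSV file with one URL in the first column, the script and simple_html_dom.php are uploaded to the same folder, and you have command-line (for example SSH) access to the server. The names titles.php, urls.csv, titles.csv, title_from_html and write_titles are all made up for this example.

```php
<?php
// titles.php (hypothetical name): reads URLs from a CSV file and writes
// "url,title" rows to a second CSV file that Excel can open.
// Assumption: simple_html_dom.php is in the same folder as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Return the trimmed <title> text of an HTML string, or '' if there is none.
function title_from_html($markup) {
    $dom = str_get_html($markup);      // library call: parse an HTML string
    if (!$dom) {
        return '';
    }
    $node = $dom->find('title', 0);    // first <title> element, or null
    $title = $node ? trim($node->plaintext) : '';
    $dom->clear();
    return $title;
}

// Copy each URL from $inPath to $outPath with its title in the next column.
function write_titles($inPath, $outPath) {
    $in = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    while (($row = fgetcsv($in)) !== false) {
        $url = isset($row[0]) ? trim($row[0]) : '';
        if ($url === '') {
            continue;                  // skip blank lines
        }
        $markup = @file_get_contents($url);   // false if the fetch fails
        $title = ($markup === false) ? 'FETCH FAILED' : title_from_html($markup);
        fputcsv($out, array($url, $title));
    }
    fclose($in);
    fclose($out);
}

// Only runs from the command line with both file names given.
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    write_titles($argv[1], $argv[2]);
}
```

With those assumptions, the steps would be roughly: save the URL column as urls.csv, upload it alongside the two PHP files, run `php titles.php urls.csv titles.csv` in that folder, then download titles.csv and open it in Excel. Remote fetching needs allow_url_fopen enabled. On hosting without shell access the command-line check at the bottom would need adapting, and with hundreds of URLs a browser request may hit PHP's time limit.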