
Thread: Website Scraper

  1. #1

    Thread Starter
    Hyperactive Member Olly79's Avatar
    Join Date
    May 2005
    Posts
    264

    Website Scraper

    Hi,

    Just wondering if someone can help me with the following query.

    I have hundreds of URLs that I need to crawl, returning the <title> tag for each page.

    I have no idea how to achieve this and would greatly appreciate some assistance.

    All my URLs are in a spreadsheet. When I get the data back, I need each <title> tag placed next to its URL in the spreadsheet, so that I can do the further manipulation in Excel that is required.

    I thought about using Google's =importXML("URL";"query") function in a Google spreadsheet; however, this will only really work on tables and I can only display 50 records in one spreadsheet, so that is not a viable option.

    If someone can assist me with a PHP script I would be grateful; however, I will need simple steps, including how to run it on the web server if required.

    Thanks

  2. #2
    Frenzied Member I_Love_My_Vans's Avatar
    Join Date
    Jan 2005
    Location
    In the PHP compiler
    Posts
    1,275

    Re: Website Scraper

    This is what you want --> http://simplehtmldom.sourceforge.net/

    You should be able to use it to load a given URL and get any element from that page's DOM, including the <title> element!

  3. #3

    Thread Starter
    Hyperactive Member Olly79's Avatar
    Join Date
    May 2005
    Posts
    264

    Re: Website Scraper

    Hi there,

    That looks great and is just what I needed.

    With regard to the script itself, it shows all the elements; however, how would I run it? Could you help me with what I need to do with it, and how best to load it onto the server and run it, please?

    Thanks
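
    For anyone following along, here is a rough sketch of one way the whole job could be run. It is not the Simple HTML DOM approach suggested above: to stay dependency-free it pulls the <title> out with a plain regular expression, which is a simplification and may miss unusual markup. The file names (titles.php, urls.csv, titles.csv), the function names, and the one-URL-per-row CSV layout are all assumptions made for illustration.

```php
<?php
// Sketch only: names and CSV layout are assumptions, not part of any library.

/** Return the text of the first <title> element in $html, or '' if none. */
function page_title(string $html): string
{
    // Regex shortcut instead of a real HTML parser; fine for simple pages.
    if (!preg_match('~<title\b[^>]*>(.*?)</title>~is', $html, $m)) {
        return '';
    }
    $text = html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse newlines and runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

/** Download a page; returns '' on failure (needs allow_url_fopen enabled). */
function fetch_html(string $url): string
{
    $context = stream_context_create(['http' => ['timeout' => 10]]);
    $html = @file_get_contents($url, false, $context);
    return $html === false ? '' : $html;
}

/**
 * Read URLs from the first column of $inPath and write "url,title" rows
 * to $outPath. $fetch is any callable that maps a URL to an HTML string.
 * Returns the number of rows written.
 */
function add_titles(string $inPath, string $outPath, callable $fetch): int
{
    $in = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file');
    }
    $count = 0;
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        $url = trim((string) ($row[0] ?? ''));
        if ($url === '') {
            continue; // skip blank lines
        }
        fputcsv($out, [$url, page_title($fetch($url))], ',', '"', '\\');
        $count++;
    }
    fclose($in);
    fclose($out);
    return $count;
}

// Command-line entry point: php titles.php urls.csv titles.csv
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    $n = add_titles($argv[1], $argv[2], 'fetch_html');
    echo "Wrote $n rows to {$argv[2]}\n";
}
```

    To try it, save the URL column from the spreadsheet as urls.csv, upload it next to titles.php, and run "php titles.php urls.csv titles.csv" from a shell on the server. The resulting titles.csv should open directly in Excel with the title beside each URL. With hundreds of URLs this is better run from the command line than through a browser, since a web request may time out before the crawl finishes.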
