Re: Parsing URLs from a file
This is what you are looking for:
preg_match
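Note that preg_match stops at the first match, so preg_match_all is the one to use when you want every link. A minimal sketch follows; the sample $html string and the pattern are my own assumptions, and the pattern only handles quoted href values:

```php
<?php
// Hypothetical sample input; in practice this would be the contents of the fetched file.
$html = '<a href="/tutorials/Maya/1">Maya</a> <a href="/forum/">Forum</a>';

// Capture the value of a double- or single-quoted href attribute inside an <a> tag.
// Unquoted values and other unusual markup are deliberately not handled here.
preg_match_all('/<a\s[^>]*href\s*=\s*(["\'])(.*?)\1/i', $html, $matches);

// Group 2 holds the href values, one entry per matched link.
$links = $matches[2];
print_r($links);
```

This is fine for tidy markup, but it will probably miss links written in less regular ways.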
Re: Parsing URLs from a file
Quote:
Originally Posted by VBlee
Hey,
I am having a problem trying to find a way of getting URLs from a file. I have no problems getting the HTML file or saving the information; it's just getting the information from inside the file.
All I want to do is search through the file to find links, e.g. find /tutorials/Maya/1 from a line that may be:
<a href="/tutorials/Maya/1">Maya</a>
I am also trying to find a way so that, when it gets a list of links, it then checks whether they contain the phrase:
/tutorials/ (I would be able to do that if I knew how to do the first bit, I think!).
Can anyone help at all?
Thanks,
Lee.
Although a regular expression can be used, it would be rather large because of the wide variety of ways in which HTML is written.
HTML Code:
<P><A HREF=www.vbforums.com>My Link</p>
<P><A HREF="www.vbforums.com" >My Link<p>
<P><A HREF='http://www.vbforums.com' >My Link<p>
<P><A HREF="www.vbforums.com" >My Link</a><p>
<p><a href=
"www.vbforums.com"
>My Link</a><p>
And any combination of the above. The best way of doing this is to use the loadHTML() method of the DOMDocument object. It copes with poorly formed HTML and inconsistencies in the markup, so you can be relatively sure that every link has been captured.
If the HTML document is XHTML, you can just load the document using DOMDocument->load().
You can then use the getElementsByTagName() and getAttribute() methods to get the values of the links' attributes.
PHP Code:
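$dom = new DOMDocument();
// $html is assumed to hold the page source; @ hides warnings about malformed markup.
@$dom->loadHTML($html);
$anchors = $dom->getElementsByTagName('a');
foreach ($anchors as $anchor) {
    echo $anchor->getAttribute('href'), "\n";
}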
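To cover the second part of the question (keeping only links that contain /tutorials/), the same loop can filter each href with strpos(). This is only a sketch: the sample $html string and the $tutorialLinks variable are made up for illustration, and libxml_use_internal_errors() is used instead of @ to keep the parser warnings quiet.

```php
<?php
// Hypothetical sample input standing in for the fetched HTML file.
$html = '<p><a href="/tutorials/Maya/1">Maya</a></p>'
      . '<p><a href="/forum/">Forum</a>'
      . '<p><A HREF=\'/tutorials/PHP/2\'>PHP</A>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$tutorialLinks = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $href = $anchor->getAttribute('href');
    // Keep only links whose href contains the phrase "/tutorials/".
    if (strpos($href, '/tutorials/') !== false) {
        $tutorialLinks[] = $href;
    }
}

print_r($tutorialLinks);
```

With a real page you would fill $html from the file you already have, for example with file_get_contents().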
Re: Parsing URLs from a file
Thank you both so much. I had been looking at regular expressions and am just getting my head round them, but I was looking for another option such as visualAd's method.
Thanks again, I will test/investigate them :)
Lee.