-
May 27th, 2010, 05:40 PM
#1
Thread Starter
Frenzied Member
[RESOLVED] Optimizing regular expression
Hi,
I have this simple expression that retrieves the content from the HTML head.
Code:
/.*<head>(.*)<\/head>.*/
Since I don't have much experience with expressions, I'm not sure whether this is all that efficient.
Any suggestions?
Thanks.
Delete it. They just clutter threads anyway.
-
May 27th, 2010, 06:41 PM
#2
Re: Optimizing regular expression
you only really need this:
Code:
/<head>(.*)<\/head>/si
I added two modifiers: the 'i' modifier makes the match case-insensitive (e.g. HEAD, Head), and the 's' modifier turns on single-line (dotall) mode so that the dot (".") also matches line breaks.
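To see what the two modifiers change, here is a minimal sketch. The sample document in $html is made up for illustration; only preg_match() and the pattern come from the thread.

```php
<?php
// Hypothetical sample: the tag is upper-case and the head spans several lines.
$html = "<html>\n<HEAD>\n<title>Demo</title>\n</HEAD>\n<body>text</body>\n</html>";

// No modifiers: "<head>" is case-sensitive and "." stops at line breaks.
$plain = preg_match('/<head>(.*)<\/head>/', $html, $m1);

// With /si: case-insensitive, and "." also matches line breaks.
$flagged = preg_match('/<head>(.*)<\/head>/si', $html, $m2);

var_dump($plain);        // int(0)
var_dump($flagged);      // int(1)
echo trim($m2[1]), "\n"; // <title>Demo</title>
```

Without the modifiers the pattern finds nothing in this input, which is the usual failure on real pages.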
-
May 27th, 2010, 07:00 PM
#3
Re: Optimizing regular expression
Avoid .* where it isn't necessary, or where there's a practical lazy alternative (as opposed to greedy; prefer patterns that exclude what you don't want over ones that include everything). Instead of the .* in kows' sample, maybe you could match "anything that isn't </head>". I couldn't get my own example of that working, though.
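For what it's worth, a rough sketch of what the lazy and "not </head>" variants could look like. This is one possible reading of the idea, not a tested recommendation; the input string is invented so that a second </head> shows up later in the page.

```php
<?php
// Hypothetical input: a second "</head>" appears inside a script string.
$html = '<head><title>A</title></head><body><script>var s = "</head>";</script></body>';

// Greedy: .* runs on to the LAST </head> in the input.
preg_match('/<head>(.*)<\/head>/si', $html, $greedy);

// Lazy: .*? stops at the FIRST </head>.
preg_match('/<head>(.*?)<\/head>/si', $html, $lazy);

// "Not </head>": a character is consumed only if </head> does not start there.
preg_match('/<head>((?:(?!<\/head>).)*)<\/head>/si', $html, $tempered);

echo $greedy[1], "\n";
echo $lazy[1], "\n";
echo $tempered[1], "\n";
```

On this input the lazy and lookahead versions both capture just the title element, while the greedy one swallows everything up to the </head> inside the script.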
-
May 28th, 2010, 11:35 AM
#4
Frenzied Member
Re: Optimizing regular expression
Or you could do it without regex:
PHP Code:
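function getHead($HTML) {
    // stripos() returns false when the tag is missing, so test before adding the offset
    $start = stripos($HTML, "<HEAD>");
    if ($start !== false) {
        $start += 6; // skip past "<head>" itself (6 characters)
        $stop = stripos($HTML, "</HEAD>", $start);
        if ($stop !== false) {
            return substr($HTML, $start, $stop - $start);
        }
    }
    return "";
}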
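One thing to watch with stripos(): it returns false when nothing is found, and false + 6 evaluates to 6, so a truthiness test made after adding the offset can't detect a missing tag. A small sketch with a made-up input:

```php
<?php
// Hypothetical input with no <head> element at all.
$html = "<html><body>no head here</body></html>";

$pos = stripos($html, "<HEAD>");
var_dump($pos);     // bool(false)
var_dump($pos + 6); // int(6)

// Comparing against false first keeps "not found" distinct from a real offset.
$start = ($pos === false) ? null : $pos + 6;
var_dump($start);   // NULL
```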
-
Jun 3rd, 2010, 06:07 AM
#5
Re: Optimizing regular expression
Depending on how you want it done you could even use the strip_tags PHP function.
-
Jun 3rd, 2010, 07:16 AM
#6
Re: Optimizing regular expression
 Originally Posted by sciguyryan
Depending on how you want it done you could even use the strip_tags PHP function.
uhh? all strip_tags() does is remove HTML tags. it doesn't remove content, which means that you wouldn't be able to use it to parse anything. it's used for sanitization of user input, usually.
regular expressions are the way to go in this case.
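A quick sketch of that behaviour, using a made-up page:

```php
<?php
// Hypothetical input: a tiny page with a head and a body.
$html = "<head><title>My page</title></head><body><p>Hello</p></body>";

// strip_tags() drops the tags but keeps the text between them.
$text = strip_tags($html);
echo $text, "\n"; // My pageHello
```

The head and body text come back glued together, so there's no way to tell afterwards which part came from the head.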
-
Jun 3rd, 2010, 07:52 AM
#7
Re: Optimizing regular expression
 Originally Posted by kows
uhh? all strip_tags() does is remove HTML tags. it doesn't remove content, which means that you wouldn't be able to use it to parse anything. it's used for sanitization of user input, usually.
regular expressions are the way to go in this case.
No. But had you looked at the strip_tags() manual page, there is a function posted there that does, such as:
PHP Code:
function strip_selected_tags($str, $tags = "", $stripContent = false) {
    // Collect the tag names listed in $tags, e.g. "<b><i>" gives "b" and "i"
    preg_match_all("/<([^>]+)>/i", $tags, $allTags, PREG_PATTERN_ORDER);
    foreach ($allTags[1] as $tag) {
        $replace = "%(<$tag.*?>)(.*?)(<\/$tag.*?>)%is"; // opening tag, content, closing tag
        $replace2 = "%(<$tag.*?>)%is";                   // a lone opening tag
        echo $replace;
        if ($stripContent) {
            $str = preg_replace($replace, '', $str);
            $str = preg_replace($replace2, '', $str);
        }
        $str = preg_replace($replace, '${2}', $str);
        $str = preg_replace($replace2, '${2}', $str);
    }
    return $str;
}
... and it also makes for a pretty interesting demo of RegExp too.
-
Jun 3rd, 2010, 08:54 AM
#8
Re: Optimizing regular expression
 Originally Posted by sciguyryan
No. But had you looked on the page, there is a function on there that does.
What you said -- that you could use strip_tags() -- is still incorrect.
If you're going to point out that a function posted in the comments of the strip_tags() documentation might be useful, then you should probably say that it actually has nothing to do with strip_tags() and link to the comment itself. Otherwise, you're just misinforming.
Should I even mention that the function you posted doesn't even do what was needed (or what you suggested it did), anyway? It was written that way to support self-closing tags (like <input />).
-
Jun 5th, 2010, 10:52 AM
#9
Thread Starter
Frenzied Member
Re: Optimizing regular expression
 Originally Posted by kows
Code:
/<head>(.*)<\/head>/si
This one also stores the head content, including the <head> tags themselves, in $matches[0] (instead of the whole input), which is actually something I also needed.
As the Dutch say, two flies in one swat.
I also appreciate the other suggestions made. Thanks, all.
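For anyone who finds this later, a small sketch of the difference, using a made-up single-line page. $original holds the result of my first pattern and $matches the result of kows' version.

```php
<?php
// Hypothetical single-line input.
$html = '<html><head><title>A</title></head><body>B</body></html>';

// First pattern: the leading and trailing .* make $original[0] the whole line.
preg_match('/.*<head>(.*)<\/head>.*/', $html, $original);

// kows' pattern: index 0 is the head element with its tags, index 1 only the content.
preg_match('/<head>(.*)<\/head>/si', $html, $matches);

echo $original[0], "\n"; // the whole input
echo $matches[0], "\n";  // <head><title>A</title></head>
echo $matches[1], "\n";  // <title>A</title>
```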