Thread: [RESOLVED] Optimizing regular expression

  1. #1

    Thread Starter
    Frenzied Member TheBigB's Avatar
    Join Date
    Mar 2006
    Location
    *Stack Trace*
    Posts
    1,511

    [RESOLVED] Optimizing regular expression

    Hi,

    I have this simple expression that retrieves the content from the HTML head.
    Code:
    /.*<head>(.*)<\/head>.*/
    Since I don't have much experience with expressions, I'm not sure whether this is all that efficient.
    Any suggestions?

    Thanks.
    Delete it. They just clutter threads anyway.

  2. #2
    PowerPoster
    Join Date
    Sep 2003
    Location
    Edmonton, AB, Canada
    Posts
    2,629

    Re: Optimizing regular expression

    you only really need this:
    Code:
    /<head>(.*)<\/head>/si
    I added two modifiers: the 'i' modifier makes the match case-insensitive (e.g. HEAD, Head), and the 's' modifier turns on single-line (dot-all) mode so that the dot (".") also matches line breaks.
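
    To see what each of the two modifiers contributes, here is a minimal sketch. The sample $html string and the variable names are made up for illustration; only preg_match() and the patterns come from the thread.

```php
<?php
// Hypothetical sample input: uppercase tags and a head that spans several lines.
$html = "<html>\n<HEAD>\n<title>Demo</title>\n</HEAD>\n<body></body>\n</html>";

// No modifiers: "<head>" is case-sensitive, so the uppercase tags do not match.
$plain = preg_match('/<head>(.*)<\/head>/', $html, $m1);

// 'i' only: the tags match now, but "." still stops at a line break,
// so a head that spans several lines fails to match.
$insensitive = preg_match('/<head>(.*)<\/head>/i', $html, $m2);

// 's' and 'i' together: "." also matches line breaks, so the head is captured.
$both = preg_match('/<head>(.*)<\/head>/si', $html, $m3);

var_dump($plain, $insensitive, $both); // int(0), int(0), int(1)
var_dump($m3[1]);                      // "\n<title>Demo</title>\n"
```

    On input like this, neither modifier is enough on its own; the pattern only matches once both are set.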

  3. #3
    Frenzied Member
    Join Date
    Apr 2009
    Location
    CA, USA
    Posts
    1,516

    Re: Optimizing regular expression

    Avoid using .* where it isn't necessary, or where there's a practical lazy alternative (lazy as opposed to greedy; use excluders, not includers). Instead of the .* in kows' sample, maybe you could use a "not </head>" pattern. I couldn't get my own example of that working, though.
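
    For reference, here is a sketch of how the greedy, lazy, and "not </head>" variants differ. The sample input is hypothetical, and the negative-lookahead pattern is only one possible reading of the "not </head>" idea, not necessarily what the poster had in mind.

```php
<?php
// Hypothetical input containing "</head>" twice, to expose the difference.
$html = "<head><title>A</title></head><body>The text </head> appears again here.</body>";

// Greedy: ".*" runs to the end of the input and backtracks to the LAST "</head>".
preg_match('/<head>(.*)<\/head>/si', $html, $greedy);

// Lazy: ".*?" grows one character at a time and stops at the FIRST "</head>".
preg_match('/<head>(.*?)<\/head>/si', $html, $lazy);

// "Not </head>": before consuming each character, a negative lookahead
// checks that "</head>" does not start at that position.
preg_match('/<head>((?:(?!<\/head>).)*)<\/head>/si', $html, $excluded);

echo $greedy[1], "\n";   // <title>A</title></head><body>The text 
echo $lazy[1], "\n";     // <title>A</title>
echo $excluded[1], "\n"; // <title>A</title>
```

    A well-formed page has only one </head>, so all three capture the same text there; the variants only diverge when the closing tag text shows up again later in the input.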

  4. #4
    Frenzied Member
    Join Date
    Dec 2007
    Posts
    1,072

    Re: Optimizing regular expression

    Or you could do it without regex:
    PHP Code:
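    function getHead($HTML) {
        // Find the opening tag; stripos() returns false when it is absent.
        $start = stripos($HTML, "<HEAD>");
        if ($start !== false) {
            $start += 6; // skip past "<HEAD>" (6 characters)
            $stop = stripos($HTML, "</HEAD>", $start);
            if ($stop !== false) {
                return substr($HTML, $start, $stop - $start);
            }
        }
        return "";
    }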

  5. #5
    Frenzied Member sciguyryan's Avatar
    Join Date
    Sep 2003
    Location
    Wales
    Posts
    1,763

    Re: Optimizing regular expression

    Depending on how you want it done you could even use the strip_tags PHP function.
    My Blog.

    Ryan Jones.

  6. #6
    PowerPoster
    Join Date
    Sep 2003
    Location
    Edmonton, AB, Canada
    Posts
    2,629

    Re: Optimizing regular expression

    Quote (originally posted by sciguyryan):
    Depending on how you want it done you could even use the strip_tags PHP function.
    uhh? all strip_tags() does is remove HTML tags. it doesn't remove content, which means that you wouldn't be able to use it to parse anything. it's used for sanitizing user input, usually.

    regular expressions are the way to go in this case.
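
    A quick check of that point, as a minimal sketch with a made-up sample document: strip_tags() drops the tags but keeps the text between them, so the head content ends up mixed in with the body text.

```php
<?php
// Hypothetical sample document.
$html = "<html><head><title>My page</title></head><body><p>Hello</p></body></html>";

// strip_tags() removes the tags but keeps the text between them.
$stripped = strip_tags($html);

echo $stripped; // My pageHello
```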

  7. #7
    Frenzied Member sciguyryan's Avatar
    Join Date
    Sep 2003
    Location
    Wales
    Posts
    1,763

    Re: Optimizing regular expression

    Quote (originally posted by kows):
    uhh? all strip_tags() does is remove HTML tags. it doesn't remove content, which means that you wouldn't be able to use it to parse anything. it's used for sanitizing user input, usually.

    regular expressions are the way to go in this case.
    No, it doesn't. But had you looked at the strip_tags() manual page, there is a function posted there that does, such as:

    PHP Code:
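    function strip_selected_tags($str, $tags = "", $stripContent = false)
    {
        // Collect the tag names listed in $tags, e.g. "<head><title>".
        preg_match_all("/<([^>]+)>/i", $tags, $allTags, PREG_PATTERN_ORDER);
        foreach ($allTags[1] as $tag) {
            // Matches an opening tag, its content, and the closing tag.
            $replace = "%(<$tag.*?>)(.*?)(<\/$tag.*?>)%is";
            // Matches a lone tag, such as a self-closing one.
            $replace2 = "%(<$tag.*?>)%is";
            if ($stripContent) {
                // Remove the tags together with their content.
                $str = preg_replace($replace, '', $str);
                $str = preg_replace($replace2, '', $str);
            } else {
                // Remove only the tags and keep the content (group 2).
                $str = preg_replace($replace, '${2}', $str);
                $str = preg_replace($replace2, '', $str);
            }
        }
        return $str;
    }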

    ... and it also makes for a pretty interesting demo of RegExp too.

  8. #8
    PowerPoster
    Join Date
    Sep 2003
    Location
    Edmonton, AB, Canada
    Posts
    2,629

    Re: Optimizing regular expression

    Quote (originally posted by sciguyryan):
    No. But had you looked on the page, there is a function on there that does.
    What you said -- that you could use strip_tags() -- is still incorrect.

    If you're going to point out that a function posted in the comments of the strip_tags() documentation might be useful, then you should probably say that it actually has nothing to do with strip_tags() and link to the comment itself. Otherwise, you're just misinforming.

    Should I even mention that the function you posted doesn't even do what was needed (or what you suggested it did), anyway? It was written that way to support self-closing tags (like <input />).

  9. #9

    Thread Starter
    Frenzied Member TheBigB's Avatar
    Join Date
    Mar 2006
    Location
    *Stack Trace*
    Posts
    1,511

    Re: Optimizing regular expression

    Quote (originally posted by kows):
    Code:
    /<head>(.*)<\/head>/si
    This one also stores the head content including the <head> tags in $matches[0] (instead of the whole input), which is actually something I also needed.
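
    The difference in $matches[0] can be checked with a short sketch like this one. The single-line sample input and the variable names are made up for illustration.

```php
<?php
// Hypothetical single-line sample document.
$html = "<html><head><title>Demo</title></head><body>Hi</body></html>";

// Original pattern: the leading and trailing ".*" swallow everything around
// the head, so the full match ($original[0]) is the whole input line.
preg_match('/.*<head>(.*)<\/head>.*/', $html, $original);

// Trimmed pattern: the full match ($trimmed[0]) is only the head element,
// tags included, while group 1 holds just the content.
preg_match('/<head>(.*)<\/head>/si', $html, $trimmed);

echo $original[0], "\n"; // the whole input
echo $trimmed[0], "\n";  // <head><title>Demo</title></head>
echo $trimmed[1], "\n";  // <title>Demo</title>
```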

    As the Dutch say, two flies in one swat.

    I also appreciate the other suggestions made. Thanks all
