Results 1 to 19 of 19

Thread: Greedy,Reluctant,Possessive Quantifers?

  1. #1

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Resolved Greedy,Reluctant,Possessive Quantifers?

    Anyone know what the similarities or differences are between these quantifers? Under the docs the descriptions are all the same.

    Greedy quantifiers
    X? X, once or not at all
    X* X, zero or more times
    X+ X, one or more times
    X{n} X, exactly n times
    X{n,} X, at least n times
    X{n,m} X, at least n but not more than m times

    Reluctant quantifiers
    X?? X, once or not at all
    X*? X, zero or more times
    X+? X, one or more times
    X{n}? X, exactly n times
    X{n,}? X, at least n times
    X{n,m}? X, at least n but not more than m times

    Possessive quantifiers
    X?+ X, once or not at all
    X*+ X, zero or more times
    X++ X, one or more times
    X{n}+ X, exactly n times
    X{n,}+ X, at least n times
    X{n,m}+ X, at least n but not more than m times

    Also how are these quantifers used? I made a simple test program using the greedy,reluctant,possessive quantifiers(X, once or not at all) quantifers
    but i don't understand the output.
    Code:
     import java.util.*;
     
     public class X{
      public static void main(String[] args){
     
       String data = new String("mxpxr");
       Scanner s = new Scanner(data);
      // s.useDelimiter("x"); //  returns mpr
      // s.useDelimiter("x?"); // returns mpr{greedy quantifer}
      // s.useDelimiter("x??");  // returns mxpxr{reluctant quantifer}
      // s.useDelimiter("x?+"); // returns mpr{possessive quantifer}
       while(s.hasNext()){
         System.out.println(s.next()); 
       }
      }
     }

  2. #2
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    I've never heard of possessive and reluctant qualifiers. But the little documentation we have is enough that I can see what they mean.

    example string: "abababab"
    The difference between greedy, possessive and reluctant (reluctant is usually called non-greedy) qualifiers is in the matching strategy.
    Greedy is the default: the quantifier will try to match as much as possible, so long as the overall pattern still matches.
    "(ab)*(ab)+"
    In this case, the first part is greedy. The second part only needs a single "ab" to succeed, so the first can greedily grab "ababab".

    Reluctant matches as little as possible to make the pattern succeed.
    "(ab)*?(ab)+"
    Since the second part can match the whole string and the first needs match nothing, the reluctant first part will indeed match the empty string "". The second part gets to take everything else, i.e. "abababab".

    Possessive matches take as much as they can, even if that prevents a complete match of the pattern.
    "(ab)*+(ab)+"
    In this case, the first part grabs the whole string, "abababab", leaving nothing for the second part. Since the second part needs at least "ab", the overall pattern fails.

    Non-greedy patterns are usually used in delimiter situations. Take, for example, this HTML snippet:
    Code:
    <DIV>Hello<P>Blabla</P></DIV>
    Now you want to do a search & replace to make all tag names lowercase. You devise this regexp:
    "<(/?)(.*)>"
    to be replaced with an epression like this:
    "<"+match1+match2.toLowerCase()">"
    When you match this against the above a single time, you'll find that the result is
    Code:
    <div>hello<p>blabla</p></div>
    This is not intended. The problem is that .* matches everything - really everything. In this case, it matched "DIV>Hello<P>Blabla</P></DIV".
    Making the group non-greedy solves this problem. Since the pattern CAN match only "DIV", it will.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  3. #3

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    Ok im starting to understand what the quantifers mean. I wish sun would
    have a little better documentation on them. CornedBee could you
    go over this pattern expression when you get a chance? "<(/?)(.*)>".
    (/?) Is a Greedy quantifer which matches once or not at all. Since
    <DIV>Hello<P>Blabla</P></DIV> has at least one / we should be fine.
    Now what is the use of the . in (.*)? It seems like it means match anything
    after the / but i dont see it in the docs. Also shouldn't we end up with
    <DIV>Hello<P>Blabla</p></div> since the pattern is looking for forward slashes?
    Thanks for the help.

  4. #4
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    /? means zero or one slashes, so it finds both <something> and </something>.

    The . in .* is the match-all. . matches any character (except a newline unless a special flag is set).
    * is the any-amount.
    So .* means "match any amount of any characters".
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  5. #5

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    If /? means zero or one slashes then why does /? match both instances of the / in <HTML><P>Hello</P></HTML>. I thought /* would be used instead.

  6. #6

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    Stuff like this is throwing me off. If a greedy quantifer is supposed to grab as much as possible then why is the output ab?
    Code:
      import java.util.*; 
      import java.util.regex.*; 
     
      public class X{
       public static void main(String[] args){
        
       Scanner s = new Scanner("ababab");  
       Pattern p = Pattern.compile("(ab)*");// shouldn't it be ababab?
       s.findInLine(p); 
     
       MatchResult result = s.match();
        for (int i=1; i<=result.groupCount(); i++)
         System.out.println(result.group(i)); 
          s.close(); 
        }
       }

  7. #7
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    Quote Originally Posted by Dilenger4
    If /? means zero or one slashes then why does /? match both instances of the / in <HTML><P>Hello</P></HTML>. I thought /* would be used instead.
    It doesn't. What gives you the impression?
    In the faulty case I've shown, the /? matches nothing and the .* everything, including the two slashes.

    As for your code, I don't know. I think the Scanner is making things more complicated:
    Code:
    public class X
    {
    	public static void main(String[] args) {
    		Pattern p = Pattern.compile("(ab)*");// shouldn't it be ababab?
    
    		Matcher m = p.matcher("ababa");
    
    		if(m.find()) {
    			System.out.println("Matched: " + m.group());
    		}
    	}
    }
    Matched: abab
    Last edited by CornedBee; Dec 13th, 2004 at 03:47 PM.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  8. #8

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    Yes the Scanner is definitely making things more difficult. Ive been trying to use the Scanner with a Pattern but i still can't figure it out.
    Code:
    /*
      runs with no output 
      import java.util.*; 
      import java.util.regex.*; 
     
      public class X{
       public static void main(String[] args){
        
       Scanner s = new Scanner("ababa");  
       Pattern p = Pattern.compile("(ab)*");// should be abab
       
       while(s.hasNext(p)){
         System.out.print(s.next()); 
       }
      }
     }
    */
    /*
      throws java.lang.IllegalStateException
      import java.util.*; 
      import java.util.regex.*; 
     
      public class X{
       public static void main(String[] args){
        
       Scanner s = new Scanner("ababa");  
       Pattern p = Pattern.compile("(ab)*");// should be abab
       MatchResult mr = s.match(); // if match took a Pattern as an arg but it dosen't
       System.out.print(mr.group()); 
       
      }
     }
    */

  9. #9

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    This is the only way ive been able to get a Scanner and Pattern to work together so far.
    Code:
      import java.util.*; 
      import java.util.regex.*; 
     
      public class X{
       public static void main(String[] args){
        
       Scanner s = new Scanner("ababa");  
       Pattern p = Pattern.compile("(ab)*");// should be abab
       String result = s.findInLine(p); 
       System.out.println(result); 
      }
     }

  10. #10
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    I just found this:
    http://www.weitz.de/regex-coach/

    Looks pretty cool.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  11. #11

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    That does look pretty cool. Im gonna dl it when i get a chance. Perhaps tommorow. The semester is almost over and i have a terrorism and civil liberties paper due tommorow so i doubt ill get to play with it today.

  12. #12

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    Ive been working with the Regex Coach but im pretty much lost. Take the following target string <DIV>Hello<P>Blabla</P></DIV>. I tried to match every instance of "<". Using a reg ex such as (<)* only matches <DIV>Hello<P>Blabla</P></DIV>. Shouldn't the greedy quantifer try to match as much as possible which would be every instance of < in the html string?

  13. #13
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    No, the quantifier expects the characters to be next to each other in order to match them.

    To match disjoint similar patterns, you need to make your pattern a global one. This won't result in a single continous match, though, it will result in several disjoint matches, and it depends on the matcher how you retrieve them.

    Example (using JavaScript RegEx syntax):
    Code:
    var base = "<DIV>Hello<P>Blabla</P></DIV>";
    var re = /</; // This pattern is not global.
    var input = base;
    input = input.replace(re, "|"); // Thus it matches once here.
    alert(input); // Gives "|DIV>Hello<P>Blabla</P></DIV>".
    re = /</g; // This pattern is global.
    input = base;
    input = input.replace(re, "|"); // Thus it matches multiple times here.
    alert(input); // Gives "|DIV>Hello|P>Blabla|/P>|/DIV>".
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  14. #14

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    I understand the semantics but the syntax confusing. . /</ and /</g i can't find in the docs. Plus everything uses backslashes and not forwardslashes.
    Are /</ and /</g from the POSIX character classes?

  15. #15
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    No, in JavaScript, the forward slashes delimit the pattern, like in Perl, and the letters after the second slash indicate options.
    So /</ is the RegEx pattern "<". /</g is the pattern "<" with the global match option.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  16. #16

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    JavaScript I was wondering why there were no types assigned to your variables.

  17. #17

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    The Regex Coach does have a global option but it only affects the replacement option. There is no effect on the match itself. Are you sure /g makes a pattern global? Regex dosen't recognize /g and there is no /g in the docs for the Pattern class.

  18. #18
    Kitten CornedBee's Avatar
    Join Date
    Aug 2001
    Location
    In a microchip!
    Posts
    11,594

    Re: Greedy,Reluctant,Possessive Quantifers?

    The Java docs have this to say about the g flag:
    Perl uses the g flag to request a match that resumes where the last match left off. This functionality is provided implicitly by the Matcher class: Repeated invocations of the find method will resume where the last match left off, unless the matcher is reset.
    A normal find is used when you want to find one match and take some action. With the g flag, you can then call find again and it will find the next match, where you can take some action again. If the g flag is not specified, it will find the same match again.

    The replace function has an implicit action, so if the g flag is there, it will replace all occurences, without it it will only replace the first.

    In Java, the g flag is always there. This means that find remembers the position where it left of, and replace is split into replaceAll and replaceFirst.
    All the buzzt
    CornedBee

    "Writing specifications is like writing a novel. Writing code is like writing poetry."
    - Anonymous, published by Raymond Chen

    Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.

  19. #19

    Thread Starter
    Dazed Member
    Join Date
    Oct 1999
    Location
    Ridgefield Park, NJ
    Posts
    3,418

    Re: Greedy,Reluctant,Possessive Quantifers?

    Ah ok. I didn't know the g flag is always there.

    Yup works like a dream.
    Code:
     String html = "<DIV>Hello<P>Blabla</P></DIV>";
     Scanner s = new Scanner(html);
     Pattern p = Pattern.compile("<");
     html = html.replace(s.findInLine(p),"[");
     System.out.println(html);

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width