-
Jul 30th, 2009, 03:51 AM
#1
[RESOLVED] Regular Expression for A NOT B
Using regular expressions, how can we find all paragraphs that match A but NOT B?
E.g. find all paragraphs that contain the word "pattern" but do not contain the word "regular".
So in the following text, 2 and 3 should be selected, while 1 should be omitted:
1. With regular expressions you can describe almost any text pattern
2. including a pattern that matches two words near each other.
3. This pattern is relatively simple, consisting of three parts.
-
Jul 30th, 2009, 07:58 AM
#2
Re: Regular Expression for A NOT B
Hi, have a look at MSDN for the right syntax to use: MSDN regular expressions
I think you could do it in two steps. First, check against "^((?!regular).)*$", which matches lines that do NOT contain "regular" (with the Multiline option enabled, so that ^ and $ anchor at line boundaries). Then, for each match, check whether it matches something like ".*pattern.*", and you're done!
Have a look at the link I posted to better understand these regular expressions.
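For anyone who wants to try the two-step idea, here is a minimal sketch. It uses Python's re module rather than .NET (an assumption made purely for illustration; the lookahead syntax is the same in both), and the names TEXT, NO_REGULAR, HAS_PATTERN and two_step are hypothetical.

```python
import re

# Sample text from the first post.
TEXT = (
    "1. With regular expressions you can describe almost any text pattern\n"
    "2. including a pattern that matches two words near each other.\n"
    "3. This pattern is relatively simple, consisting of three parts."
)

# Step 1: whole lines in which "regular" never starts at any position.
# re.MULTILINE makes ^ and $ anchor at line boundaries.
NO_REGULAR = re.compile(r"^((?!regular).)*$", re.MULTILINE)

# Step 2: the surviving line must contain "pattern".
HAS_PATTERN = re.compile(r".*pattern.*")


def two_step(text):
    """Return lines without "regular" that also contain "pattern"."""
    candidates = [m.group(0) for m in NO_REGULAR.finditer(text)]
    return [line for line in candidates if HAS_PATTERN.match(line)]


for line in two_step(TEXT):
    print(line)
```

On the sample text this keeps lines 2 and 3 and drops line 1.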
Alex
.NET developer
"No. Not even in the face of Armageddon. Never compromise." (Walter Kovacs/Rorschach)
Things to consider before posting.
Don't forget to rate the posts if they helped and mark thread as resolved when they are.
.Net Regex Syntax (Scripting) | .Net Regex Language Element | .Net Regex Class | DateTime format | Framework 4.0: what's new
My fresh new blog : writingthecode, even if I don't post much.
System: Intel i7 920, Kingston SSDNow V100 64gig, HDD WD Caviar Black 1TB, External WD "My Book" 500GB, XFX Radeon 4890 XT 1GB, 12 GBs Tri-Channel RAM, 1x27" and 1x23" LCDs, Windows 10 x64, VS2015, Framework 3.5 and 4.0
-
Jul 31st, 2009, 03:23 AM
#3
Re: Regular Expression for A NOT B
That requires two passes through the data. I was looking for something that can do this in a single pass, as performance is critical to the application.
-
Jul 31st, 2009, 03:34 AM
#4
Re: Regular Expression for A NOT B
Is Regex faster than reading and comparing strings in memory?
If not, I think this is a good way: split the text at line breaks, then for each line, if the string contains "pattern" and does not contain "regular", write it to a new file, array, or whatever...
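A rough sketch of that plain string approach, for comparison. It is in Python rather than .NET (an assumption for illustration only), and filter_lines, required and forbidden are hypothetical names, not anything from the thread.

```python
# Sample text from the first post.
TEXT = (
    "1. With regular expressions you can describe almost any text pattern\n"
    "2. including a pattern that matches two words near each other.\n"
    "3. This pattern is relatively simple, consisting of three parts."
)


def filter_lines(text, required="pattern", forbidden="regular"):
    """Keep lines that contain `required` and do not contain `forbidden`."""
    return [
        line
        for line in text.splitlines()  # split the text at line breaks
        if required in line and forbidden not in line
    ]


for line in filter_lines(TEXT):
    print(line)
```

Whether this is faster or slower than a regex depends on the input and the runtime, so it is worth measuring both on real data.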
Rate People That Helped You
Mark Thread Resolved When Resolved
-
Jul 31st, 2009, 03:43 AM
#5
Re: Regular Expression for A NOT B
Sorry, I took a look at some posts on the web, and yes, Regex is faster in most cases...
-
Jul 31st, 2009, 06:27 AM
#6
Re: Regular Expression for A NOT B
 Originally Posted by Pradeep1210
That requires two passes through the data. I was looking for something that can do this in a single pass, as performance is critical to the application.
Then this would be it:
^((?!regular).)*pattern((?!regular).)*$
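A small sketch of how this single pattern behaves on the sample text. It uses Python's re module as a stand-in for .NET (an assumption for illustration), with re.MULTILINE switched on so that ^ and $ anchor at each line; TEXT and SINGLE_PASS are hypothetical names.

```python
import re

# Sample text from the first post.
TEXT = (
    "1. With regular expressions you can describe almost any text pattern\n"
    "2. including a pattern that matches two words near each other.\n"
    "3. This pattern is relatively simple, consisting of three parts."
)

# ((?!regular).)* consumes one character at a time, but only where
# "regular" does not start, so the whole line is checked in one pass.
SINGLE_PASS = re.compile(
    r"^((?!regular).)*pattern((?!regular).)*$",
    re.MULTILINE,
)

matches = [m.group(0) for m in SINGLE_PASS.finditer(TEXT)]
for line in matches:
    print(line)
```

Line 1 is rejected because "regular" appears before "pattern", while lines 2 and 3 match in full.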
Alex
-
Jul 31st, 2009, 08:05 AM
#7
Re: Regular Expression for A NOT B
 Originally Posted by stlaural
Then this would be it:
^((?!regular).)*pattern((?!regular).)*$
Very close, but not quite. The pattern should not include the beginning- and end-of-string anchors, because they require the whole input string to match. The OP wants to match substrings (lines) within the input string, so the pattern should be like this:
Code:
(?<=\n)((?!regular).)*pattern((?!regular).)*
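Two edge cases are worth checking with this pattern: the very first line has no preceding newline, and without an end-of-line check a match can stop just before a "regular" that comes after "pattern". The sketch below uses Python's re module as a stand-in for .NET (an assumption for illustration). POSTED is the pattern exactly as given above; HARDENED, EDGE and find_lines are hypothetical, and HARDENED is only one possible variant, not something from the thread.

```python
import re

# Sample text from the first post.
TEXT = (
    "1. With regular expressions you can describe almost any text pattern\n"
    "2. including a pattern that matches two words near each other.\n"
    "3. This pattern is relatively simple, consisting of three parts."
)

# Made-up input that exercises both edge cases.
EDGE = "a pattern first\nthe pattern is regular\nlast pattern"

# The pattern from the post, unchanged.
POSTED = re.compile(r"(?<=\n)((?!regular).)*pattern((?!regular).)*")

# Hypothetical variant: also accept the start of the input, and require
# the match to reach a newline or the end of the input.
HARDENED = re.compile(
    r"(?:\A|(?<=\n))((?!regular).)*pattern((?!regular).)*(?=\n|\Z)"
)


def find_lines(rx, text):
    """Return the full text of every match of rx in text."""
    return [m.group(0) for m in rx.finditer(text)]


print(find_lines(POSTED, TEXT))
print(find_lines(POSTED, EDGE))
print(find_lines(HARDENED, EDGE))
```

On the sample text both patterns select lines 2 and 3. On EDGE, the posted pattern skips the first line and returns a partial match for the second, while the variant returns the first and last lines only.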
Let us have faith that right makes might, and in that faith, let us, to the end, dare to do our duty as we understand it.
- Abraham Lincoln -
-
Jul 31st, 2009, 08:22 AM
#8
Re: Regular Expression for A NOT B
 Originally Posted by stanav
Very close, but not quite. The pattern should not include the beginning- and end-of-string anchors, because they require the whole input string to match. The OP wants to match substrings (lines) within the input string, so the pattern should be like this:
Code:
(?<=\n)((?!regular).)*pattern((?!regular).)*
Well, it does work perfectly with the set of examples that were given. If I use them all at once as my input string, I get lines 2 and 3 as matches. But just to be sure:
^ : Matches the position at the beginning of the input string. If the RegExp object's Multiline property is set, ^ also matches the position following '\n' or '\r'.
$ : Matches the position at the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position preceding '\n' or '\r'.
So with the Multiline property enabled, it's the same thing, right?
Alex
-
Jul 31st, 2009, 08:37 AM
#9
Re: Regular Expression for A NOT B
 Originally Posted by stlaural
So with the Multiline property enabled, it's the same thing, right?
Yes, that's true. But the default regex option in VS is None, so unless the OP turns Multiline on, that pattern won't work. On the other hand, if we match a newline character at the beginning of the pattern as I did, it will work regardless of the Multiline setting.
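To make the difference concrete, here is a quick check of both patterns with and without the multiline flag. It is a sketch in Python's re module, not .NET (an assumption for illustration; re.MULTILINE plays the role of the Multiline option), and ANCHORED, LOOKBEHIND and count_matches are hypothetical names.

```python
import re

# Sample text from the first post.
TEXT = (
    "1. With regular expressions you can describe almost any text pattern\n"
    "2. including a pattern that matches two words near each other.\n"
    "3. This pattern is relatively simple, consisting of three parts."
)

ANCHORED = r"^((?!regular).)*pattern((?!regular).)*$"
LOOKBEHIND = r"(?<=\n)((?!regular).)*pattern((?!regular).)*"


def count_matches(pattern, flags=0):
    """Count non-overlapping matches of pattern in TEXT."""
    return sum(1 for _ in re.finditer(pattern, TEXT, flags))


print("anchored, no flags:  ", count_matches(ANCHORED))
print("anchored, multiline: ", count_matches(ANCHORED, re.MULTILINE))
print("lookbehind, no flags:", count_matches(LOOKBEHIND))
print("lookbehind, multiline:", count_matches(LOOKBEHIND, re.MULTILINE))
```

With the sample text, the anchored pattern finds nothing until the multiline flag is set, while the lookbehind pattern finds lines 2 and 3 either way.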
-
Jul 31st, 2009, 09:31 AM
#10
Re: Regular Expression for A NOT B
 Originally Posted by stanav
Yes, that's true. But the default regex option in VS is None, so unless the OP turns Multiline on, that pattern won't work. On the other hand, if we match a newline character at the beginning of the pattern as I did, it will work regardless of the Multiline setting.
Good point! Thanks for the clarification. So Pradeep1210 now has two ways to accomplish what he was trying to do.
Alex
-
Jul 31st, 2009, 11:35 AM
#11
Re: Regular Expression for A NOT B
Thanks for the great help