regex help i'm crap

**sagey** · Oct 17th, 2006, 09:19 AM

hello,

i'm trying to parse the following string with regex, but i'm really crap at it.

[server eventlog] [17/10/2006 14:31:39] ServerOne="swiRetrieve : 17/10/2006 14:25:51" ServerTwo="swiQuote : 17/10/2006 14:26:12"

what i'd like to do is match the following as 3 groups:

[server eventlog] [17/10/2006 14:31:39]
ServerOne="swiRetrieve : 17/10/2006 14:25:51"
ServerTwo="swiQuote : 17/10/2006 14:26:12"

if someone could help and talk me thru the regex so i can hopefully get it thru my thick head i'd really appreciate it

**penagate** · Oct 17th, 2006, 09:30 AM

The main difficulty with regular expressions is matching arbitrary text. What I usually do is use a negated character class, which matches everything up to a known delimiter and is more precise than matching any character.

[ and ] are special chars used to denote a character class so they will need to be escaped using a backslash.

+ means 1 or more of the class.

Brackets ( ) are used to capture text.

Probably something like this would work for you:
#(\[server eventlog\] \[[^\]]+\]) (ServerOne="[^"]+") (ServerTwo="[^"]+")#

The three match groups are the three parenthesised parts: the two bracketed header fields, the ServerOne pair, and the ServerTwo pair.
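To see the three groups come out, here is a rough sketch using Python's `re` module. Python is only an assumption here, picked because it is quick to run; the pattern itself uses nothing that differs from .NET. The variable names are made up for illustration.

```python
import re

# The sample line from the first post.
line = ('[server eventlog] [17/10/2006 14:31:39] '
        'ServerOne="swiRetrieve : 17/10/2006 14:25:51" '
        'ServerTwo="swiQuote : 17/10/2006 14:26:12"')

# The same pattern as above, written without the # delimiters.
pattern = re.compile(
    r'(\[server eventlog\] \[[^\]]+\]) (ServerOne="[^"]+") (ServerTwo="[^"]+")'
)

match = pattern.search(line)
# groups() returns the text captured by each pair of round brackets, in order.
header, server_one, server_two = match.groups()

print(header)      # [server eventlog] [17/10/2006 14:31:39]
print(server_one)  # ServerOne="swiRetrieve : 17/10/2006 14:25:51"
print(server_two)  # ServerTwo="swiQuote : 17/10/2006 14:26:12"
```

In .NET the equivalent would be to read the captures from the match's groups collection, but the idea is the same.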

**sagey** · Oct 17th, 2006, 09:36 AM

hello penagate,

thanks for the reply, could you put that into a regex string as i'm getting warnings about the [ and "

if you could explain it a bit more that'd also be great.

**penagate** · Oct 17th, 2006, 09:46 AM

I am not very good at explaining regular expressions, they are like a kind of subconscious voodoo magic.

Nevertheless!

# and # are arbitrary characters used to denote the start and end of the expression. After the trailing # you may place any modifiers. This convention belongs to other languages rather than .NET: the .NET Regex class takes modifiers as a separate RegexOptions argument, so leave the # delimiters out of the pattern string there, otherwise they are treated as part of the pattern.

Round brackets are used to denote captures. Everything between a ( and ) will be captured. You can nest brackets to form sub-captures, or for convenience.
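A small sketch of nested captures, again in Python purely as an assumption for illustration. The sample text is one of the pairs from the original string; the group layout is my own example, not part of the pattern above. Groups are numbered by the position of their opening bracket.

```python
import re

text = 'ServerOne="swiRetrieve : 17/10/2006 14:25:51"'

# Group 1 is the whole pair; groups 2 and 3 are nested inside it
# and capture the name and the quoted value separately.
m = re.search(r'(([a-zA-Z]+)="([^"]+)")', text)

whole = m.group(1)
name = m.group(2)
value = m.group(3)

print(whole)  # ServerOne="swiRetrieve : 17/10/2006 14:25:51"
print(name)   # ServerOne
print(value)  # swiRetrieve : 17/10/2006 14:25:51
```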

Square brackets are used to denote a character class. For example, [a] will match a lowercase 'a' character. [az] will match lowercase 'a' and 'z'; [a-z] will match all lowercase alphabetical characters. You can have multiple character groups within a class: [a-z0-9], for example. There is also a case-insensitivity modifier that you can use instead of having to write [a-zA-Z].
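The character class examples can be tried out with something like the following Python sketch (Python is an assumption here, and the test string is made up). `re.findall` returns every non-overlapping match in order.

```python
import re

text = "a z m A 7"

only_a = re.findall(r'[a]', text)            # just 'a'
a_or_z = re.findall(r'[az]', text)           # 'a' or 'z'
lower = re.findall(r'[a-z]', text)           # any lowercase letter
lower_digit = re.findall(r'[a-z0-9]', text)  # lowercase letters and digits
# The case-insensitivity modifier makes [a-z] match 'A' as well.
any_case = re.findall(r'[a-z]', text, re.IGNORECASE)

print(only_a)       # ['a']
print(a_or_z)       # ['a', 'z']
print(lower)        # ['a', 'z', 'm']
print(lower_digit)  # ['a', 'z', 'm', '7']
print(any_case)     # ['a', 'z', 'm', 'A']
```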

+, as I say, means 1 or more of the item. So [a] will match one 'a' character and [a]+ will match one or more.

The ^ operator is logical NOT when it is the first character inside a character class: the class then matches any single character that is NOT listed. So [^a] matches any character that is not 'a'; [^"]+ matches a run of one or more characters up to, but not including, the next ".
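Here is a quick sketch of + and the negated class together, in Python as an assumed stand-in for .NET, with made-up sample strings:

```python
import re

text = 'aaa "quoted text" b'

singles = re.findall(r'[a]', text)   # each 'a' is a separate match
runs = re.findall(r'[a]+', text)     # + groups the run into one match
# A quote, then one or more non-quote characters, then the closing quote.
quoted = re.search(r'"[^"]+"', text).group()
# Every single character of 'banana' that is not 'a'.
not_a = re.findall(r'[^a]', 'banana')

print(singles)  # ['a', 'a', 'a']
print(runs)     # ['aaa']
print(quoted)   # "quoted text"
print(not_a)    # ['b', 'n', 'n']
```

The `"[^"]+"` piece is exactly how the ServerOne and ServerTwo values are picked out in the full pattern: the match cannot run past the closing quote because the class refuses to match one.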

That's about all you need to know about the syntax for this particular case. For more info and some tutorials head to regular-expressions.info.

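Note that in C# it is easiest to write patterns as verbatim string literals, which are prefixed with @, so that backslashes do not have to be doubled. Inside a verbatim literal a double quote is written as "". In VB.NET there are no backslash escape sequences, so the backslash has no special meaning, but a double quote inside a string literal must likewise be doubled (""). That may well be what the warnings about the " are complaining about.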

If you have problems running that expression then try the ECMAScript compliance flag, RegexOptions.ECMAScript, which is passed as the options argument of the Regex constructor.
