Now I need RegEx help please
I hope this is really simple for someone. I've been working on this for two hours and everything I've done so far is worse than what I was provided in my previous post.
What I want is to return anything at all that's between the colon and the > from a tag.
I don't care if it's symbols, null chars, numbers or anything else.
I want it to find it and return it and let the caller decide what to do with it.
My current pattern fails if there are any symbols or space characters in the tag. I haven't tried numbers, but eventually I'm going to want to do something like:
<PlaySound: 30,400,25>
This is what I'm using:
Code:
Const STRING_PATTERN As String = "\s*(\w+)\s*>"
<SeeSound: Pew> ' Returns Pew.
<SeeSound: Pew!> ' Fails. (Returns <SeeSound: Pew!>)
<SeeSound: Pew Pew> ' Also fails.
The failures don't raise errors; the pattern just doesn't return what I want.
Anyway, if someone who knows what they're doing can sort this for me I will be very grateful.
Thanks.
Re: Now I need RegEx help please
\w matches only word characters (letters, digits, and underscores); any other character will fail the match.
For your example above, a simple Mid$ with a couple of InStr calls would do the job just fine.
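For what it's worth, here is a minimal sketch of that Mid$/InStr approach. The function name TagArguments is made up for illustration, and it assumes the tag has a single colon before the closing >.

```vb
' Hypothetical helper (not from the thread): returns the text between the
' first ":" and the next ">" in sTag, or "" if either delimiter is missing.
Public Function TagArguments(ByVal sTag As String) As String
    Dim lColon As Long
    Dim lClose As Long

    lColon = InStr(1, sTag, ":")
    If lColon = 0 Then Exit Function

    lClose = InStr(lColon + 1, sTag, ">")
    If lClose = 0 Then Exit Function

    ' Take everything between the two delimiters and drop surrounding blanks.
    TagArguments = Trim$(Mid$(sTag, lColon + 1, lClose - lColon - 1))
End Function
```

With this, TagArguments("<PlaySound: 30,400,25>") should give 30,400,25, and symbols or spaces in the tag make no difference because nothing is being pattern-matched.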
Re: Now I need RegEx help please
I'd rather not rewrite it all if I can just swap out the pattern instead.
Re: Now I need RegEx help please
A copy and paste from Copilot:
Why it works for Pew but fails for Pew! or Pew Pew
\w+ matches only letters, digits, and underscore. It does not match punctuation like ! or spaces.
So Pew! and Pew Pew don't match because ! and the second word are outside the \w+ range.
Code:
Option Explicit

Public Function ParseTag(ByVal inputText As String) As Collection
    Dim regex As Object
    Dim matches As Object
    Dim result As New Collection

    ' Pattern: <Command: Arguments>
    '   Command   = letters only
    '   Arguments = everything until >
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "<\s*([A-Za-z]+)\s*:\s*([^>]+)\s*>"
    ' Explanation:
    '   \s*     -> optional whitespace
    '   ([^>]+) -> capture any characters except > (so it includes spaces and punctuation)
    '   \s*>    -> optional whitespace before the closing >
    regex.IgnoreCase = True
    regex.Global = False

    If regex.Test(inputText) Then
        Set matches = regex.Execute(inputText)
        result.Add matches(0).SubMatches(0)        ' Command
        result.Add Trim(matches(0).SubMatches(1))  ' Arguments
    Else
        result.Add ""   ' Command empty
        result.Add ""   ' Arguments empty
    End If

    Set ParseTag = result
End Function
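A short usage sketch, assuming the ParseTag function above is pasted into the same module. DemoParseTag is just an illustrative name. Note that a VB Collection is 1-based, so the command is item 1 and the arguments are item 2.

```vb
' Hypothetical caller for ParseTag; run it from the Immediate window.
Public Sub DemoParseTag()
    Dim parts As Collection

    Set parts = ParseTag("<PlaySound: 30,400,25>")

    Debug.Print parts(1)   ' the command part of the tag
    Debug.Print parts(2)   ' the argument text, already trimmed by ParseTag
End Sub
```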
Re: Now I need RegEx help please
https://regex101.com/r/oTMG6W/1
Catches all 3
<SeeSound:\s*(.+)\s*>
Beware: (.+) is greedy, so it returns everything between the colon (and optional whitespace) and the last ">" on the line, including any trailing blanks.
You may also want to Trim the result to remove leading/trailing blanks.
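To see what that caveat means in practice, here is a small sketch that runs the pattern against two made-up strings. The sub name and the sample inputs are my own; it assumes VBScript.RegExp is available through CreateObject, as in the earlier post.

```vb
' Sketch only: prints what (.+) captures, wrapped in [ ] so blanks are visible.
Public Sub DemoGreedyCapture()
    Dim oRegex As Object
    Dim oMatches As Object

    Set oRegex = CreateObject("VBScript.RegExp")
    oRegex.Pattern = "<SeeSound:\s*(.+)\s*>"

    ' One tag with a blank before ">": the blank ends up inside the capture.
    Set oMatches = oRegex.Execute("<SeeSound: Pew Pew >")
    Debug.Print "[" & oMatches(0).SubMatches(0) & "]"          ' [Pew Pew ]
    Debug.Print "[" & Trim$(oMatches(0).SubMatches(0)) & "]"   ' [Pew Pew]

    ' Two tags on one line: the capture runs through to the last ">".
    Set oMatches = oRegex.Execute("<SeeSound: Pew> <SeeSound: Bang>")
    Debug.Print "[" & oMatches(0).SubMatches(0) & "]"
End Sub
```

The last line prints the first tag's text plus most of the second tag, which is why this pattern is only safe when there is one tag per line.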
Re: Now I need RegEx help please
You can use ([^>]*?)> pattern. The expression in parens basically says “capture everything that is not a > and stop on first >”
The *? quantifier differs from simple * by being non-greedy, i.e. it will stop on first match, not the last one, which is better in your case.
You have to learn how to deal with capturing groups. You don’t need to use InStr to post-process your regex results; that looks silly given that you already have the power of regex at your disposal (VB’s string processing functions are puny in comparison).
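As a rough sketch of reading the capturing group directly: the function name ExtractTagArgs and the leading ":\s*" and trailing "\s*" around the group are my own additions, not part of the pattern above. With them, the lazy *? also keeps trailing blanks out of the capture, so no Trim or InStr is needed afterwards.

```vb
' Hypothetical helper: returns the argument text of a "<Name: args>" tag,
' or "" when the tag does not match.
Public Function ExtractTagArgs(ByVal sTag As String) As String
    Dim oRegex As Object
    Dim oMatches As Object

    Set oRegex = CreateObject("VBScript.RegExp")
    ' :\s*      skip the colon and any blanks after it
    ' ([^>]*?)  lazily capture everything that is not ">"
    ' \s*>      let blanks before ">" fall outside the capture
    oRegex.Pattern = ":\s*([^>]*?)\s*>"

    Set oMatches = oRegex.Execute(sTag)
    If oMatches.Count > 0 Then
        ' SubMatches(0) is the first (and only) capturing group.
        ExtractTagArgs = oMatches(0).SubMatches(0)
    End If
End Function
```

For example, ExtractTagArgs("<SeeSound: Pew Pew >") should come back as Pew Pew, and ExtractTagArgs("<PlaySound: 30,400,25>") as 30,400,25, leaving it to the caller to decide what to do with the text.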
Re: Now I need RegEx help please
wqweto to the rescue.
I totally forgot about that one.
Re: Now I need RegEx help please
Quote: Originally Posted by wqweto
You can use ([^>]*?)> pattern. The expression in parens basically says “capture everything that is not a > and stop on first >”
The *? quantifier differs from simple * by being non-greedy, i.e. it will stop on first match, not the last one, which is better in your case.
You have to learn how to deal with capturing groups. You don’t need to use InStr to post-process your regex results; that looks silly given that you already have the power of regex at your disposal (VB’s string processing functions are puny in comparison).
That is a beautiful thing. Does exactly what I want and other than changing the pattern, I didn't have to change a line of code.
Thanks!