can we do this in regular expression?
I have a sentence:
I have a test.
Now I want to know how many 2-word groups are in this sentence. I can get that count using New Regex("\b([a-zA-Z]+\s+(?=[a-zA-Z]+))\b").
Is there a way to capture these 2-word groups?
In this example the result should be:
I have
have a
a test
Can we do this directly with a regular expression?
Thanks
Re: can we do this in regular expression?
Check out the Regex help. It shows that you have the option to use Match, which gives you the results of the expression. Those results include the extracted substrings, which you get by using parentheses ( ) as you already do.
There is a good example there that uses captures (reached through the Match.Groups property).
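As a rough illustration of reading groups from a match, here is a minimal sketch. The helper name FirstPairParts and the two-group pattern are assumptions made for this example, not something taken from the Regex help.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupsDemo
    ' Hypothetical helper: returns {whole match, first word, second word},
    ' or an empty array when the input has no two adjacent words.
    Function FirstPairParts(input As String) As String()
        Dim m As Match = Regex.Match(input, "\b([a-zA-Z]+)\s+([a-zA-Z]+)\b")
        If Not m.Success Then Return New String() {}
        ' Groups(0) is the whole match; Groups(1) and Groups(2) are the
        ' parenthesised sub-expressions, numbered left to right.
        Return New String() {m.Groups(0).Value, m.Groups(1).Value, m.Groups(2).Value}
    End Function

    Sub Main()
        For Each part As String In FirstPairParts("I have a test.")
            Console.WriteLine(part)
        Next
        ' Prints "I have", then "I", then "have"
    End Sub
End Module
```

Note that this only reads the first match. Both words are consumed here, so repeating the match would continue after "have" rather than at it.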
Re: can we do this in regular expression?
Thanks for the reply. The problem is that I need to check 2-word phrases, but each phrase should advance to the next word, not to the next phrase. I'm not sure whether I'm using the correct terms here.
For example, if we just group the words, the sentence:
this is a sentence
will give 2 groups:
this is
a sentence
but I need 3 groups:
this is
is a
a sentence
That's why the regex uses (?=...).
But because the second word sits inside (?=...), the capture does not include it.
So the captures will be:
this
is
a
which is not what I want.
thanks
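One possible way around this, sketched here under assumptions: a capturing group can be placed inside the lookahead itself, so the second word is captured without being consumed and the next match can start on it. The helper name GetWordPairs is hypothetical, and the pairs are rejoined with a single space regardless of the original whitespace.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module OverlappingPairs
    ' Hypothetical helper: returns every pair of adjacent words, overlapping.
    Function GetWordPairs(input As String) As List(Of String)
        ' Group 1 consumes the first word. Group 2 sits inside the lookahead,
        ' so the second word is captured but not consumed.
        Dim pattern As String = "\b([a-zA-Z]+)\s+(?=([a-zA-Z]+)\b)"
        Dim pairs As New List(Of String)
        For Each m As Match In Regex.Matches(input, pattern)
            pairs.Add(m.Groups(1).Value & " " & m.Groups(2).Value)
        Next
        Return pairs
    End Function

    Sub Main()
        For Each pair As String In GetWordPairs("this is a sentence")
            Console.WriteLine(pair)
        Next
        ' Prints "this is", then "is a", then "a sentence"
    End Sub
End Module
```

The count of 2-word groups is then simply the Count of the returned list.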
Re: can we do this in regular expression?
Hi,
You will struggle to do this without going over your sentence more than once. Regex matches do not overlap: once a substring has been consumed by one match, the next match starts after it rather than stepping back into it.
If you would like to stick with Regex then you could do something like this:
VB.NET Code:
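' Requires: Imports System.Text.RegularExpressions
Dim str As String = "This is a test"
' Second copy with the first word removed, so its pairs start one word later.
Dim str2 As String = str.Remove(0, str.IndexOf(" ") + 1)
' \s+ (instead of \s?) requires whitespace between the two words.
Dim Results As MatchCollection = Regex.Matches(str, "\b\w+\s+\w+\b")
Dim Results2 As MatchCollection = Regex.Matches(str2, "\b\w+\s+\w+\b")
Dim output As String = String.Empty
' Interleave the two result sets to get the pairs in sentence order.
For i As Integer = 0 To Results.Count - 1
    output &= Results.Item(i).Value & vbCrLf
    If i <= Results2.Count - 1 Then
        output &= Results2.Item(i).Value & vbCrLf
    End If
Next
MsgBox(output)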
That said, you should be able to write a better function that achieves the same goal by other means, e.g. String.Split, a For...Next loop, and concatenation.
Hope this helps :)
Re: can we do this in regular expression?
You can use plain string manipulation for this. Just split the text into words and run a loop to get 2 words out of each iteration. Something like this:
Code:
Dim txt As String = "This is a random sentence"
' RemoveEmptyEntries keeps repeated spaces from producing empty "words".
Dim words() As String = txt.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
Dim twoWordList As New List(Of String)
' Pair each word with the one that follows it; stop one short of the end.
For i As Integer = 0 To words.Length - 2
    twoWordList.Add(String.Format("{0} {1}", words(i), words(i + 1)))
Next
Dim results As New System.Text.StringBuilder
results.AppendLine(String.Format("There are ({0}) 2-word phrases found in the text. They are:", twoWordList.Count))
For Each s As String In twoWordList
    results.AppendLine(s)
Next
MessageBox.Show(results.ToString())