-
Apr 8th, 2015, 03:04 AM
#1
Thread Starter
Junior Member
[RESOLVED] Regex Help
Hello, I don't fully understand regex, but I found a great JavaScript function that does exactly what I need. How would I convert it properly to VB.NET? I've tried a few things but could not get the exact same regex string to work the VB.NET way. Any thoughts? Any help would be appreciated.
Code:
function maxRepeat(input) {
var reg = /(?=((.+)(?:.*?\2)+))/g;
var sub = "";
var maxstr = "";
reg.lastIndex = 0;
sub = reg.exec(input);
while (!(sub == null)) {
if ((!(sub == null)) && (sub[2].length > maxstr.length)) {
maxstr = sub[2];
}
sub = reg.exec(input);
reg.lastIndex++;
}
return maxstr;
}
This function returns the longest sequence of characters that appears at least twice. "one two one three one four" returns "one t" (with the space), and "onetwoonethreeonefour" returns "onet".
"324234241122332211345435311223322112342345541122332211234234324" returns "1122332211234234"
So far I've had no luck getting the same results as in JavaScript, even after tweaking the regex string a bit. There must be something I don't understand.
Code:
Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")
Dim match = regex.Matches("one two three one four",0)
MsgBox(match.Count) ' <--- does it find anything?
What is the vb.net regex engine equivalent of "/(?=((.+)(?:.*?\2)+))/g"?
Last edited by paulmak10; Apr 8th, 2015 at 04:46 PM.
-
Apr 8th, 2015, 08:28 AM
#2
Re: Regex Help
I'm not sure how that regex works, but the equivalent function appears to be this:
Code:
Public Class Form1
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        MessageBox.Show(CountWords("one two three four five one two two one"))
    End Sub

    ' Returns the first word with the highest number of occurrences.
    Public Function CountWords(ByVal s As String) As String
        ' Local lists, so repeated calls do not carry counts over from earlier input.
        Dim Words As New List(Of String)
        Dim Occurrence As New List(Of Integer)
        For Each str As String In s.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
            If Words.Contains(str) Then
                Occurrence(Words.IndexOf(str)) += 1
            Else
                Words.Add(str)
                Occurrence.Add(1)
            End If
        Next
        ' No words at all (empty or whitespace-only input).
        If Words.Count = 0 Then Return String.Empty
        Return Words(Occurrence.IndexOf(Occurrence.Max()))
    End Function
End Class
This will show the first word (there could be more than one) that has the highest number of occurrences.
- If my post helped you, please Rate it
- If your problem is solved please also mark the thread resolved
I use VS2015 (unless otherwise stated).
_________________________________________________________________________________
B.Sc(Hons), AUS.P, C.Eng, MIET, MIEEE, MBCS / MCSE+Sec, MCSA+Sec, MCP, A+, Net+, Sec+, MCIWD, CIWP, CIWA
I wrote my very first program in 1979, using machine code on a mechanical Olivetti teletype connected to an 8-bit, 78 instruction, 1MHz, Motorola 6800 multi-user system with 2k of memory. Using Windows, I dont think my situation has improved.
-
Apr 8th, 2015, 09:02 AM
#3
Re: Regex Help
My attempt at this is rather clunky, but it's still something.
Code:
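' Requires: Imports System.Text.RegularExpressions
Private Function MostOccuringWord(value As String) As String
    Dim originalWord As String = value
    Dim words() As String = originalWord.Split(" "c)
    ' Regex.Escape keeps characters such as "." or "+" in a word from being read as pattern syntax.
    ' Note: this counts substring hits, so "one" is also counted inside "money".
    Dim highestFound As Integer = Regex.Matches(originalWord, Regex.Escape(words(0))).Count
    Dim highestWord As String = words(0)
    For i As Integer = 1 To words.Length - 1
        Dim wordCount As Integer = Regex.Matches(originalWord, Regex.Escape(words(i))).Count
        If wordCount > highestFound Then
            highestFound = wordCount
            highestWord = words(i)
        End If
    Next
    Return highestWord
End Function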
If you find my contributions helpful then rate them.
-
Apr 8th, 2015, 10:32 AM
#4
Re: Regex Help
LINQ would work in this case too:
Code:
Private Function GetMostOccuringWord(ByVal input As String) As String
Dim sentence() As String = input.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
Return sentence.GroupBy(Function(n) n).OrderByDescending(Function(g) g.Count).Select(Function(g) g.Key).FirstOrDefault
End Function
-
Apr 8th, 2015, 10:42 AM
#5
Re: Regex Help
Originally Posted by dday9
LINQ would work in this case too:
Code:
Private Function GetMostOccuringWord(ByVal input As String) As String
Dim sentence() As String = input.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
Return sentence.GroupBy(Function(n) n).OrderByDescending(Function(g) g.Count).Select(Function(g) g.Key).FirstOrDefault
End Function
I was waiting for someone to whip up a Linq version. Very nice. I really need to learn Linq well.
-
Apr 8th, 2015, 02:28 PM
#6
Thread Starter
Junior Member
Re: Regex Help
Thanks for the responses, everyone! All of these solutions are clever, and LINQ seems like something I should dive into more. I should have mentioned that the string to be searched does not (in my case) contain a delimiter. So a string of "onetwoonethreeonefour" would also return "one"; the regex method in JavaScript seems to work as desired in either case (with or without a delimiter). Does anyone have a clever way of doing this without splitting the string on a delimiter and counting each piece?
Honestly, this is an interesting problem, and I am lost as to how the magic of regex solves it without a delimiter. If I had not found the JavaScript function above, I would probably be doing crazy array work and breaking my mind.
Last edited by paulmak10; Apr 8th, 2015 at 02:33 PM.
-
Apr 8th, 2015, 02:32 PM
#7
Re: Regex Help
Originally Posted by paulmak10
Thanks for the responses, everyone! All of these solutions are clever, and LINQ seems like something I should dive into more. I should have mentioned that the string to be searched does not (in my case) contain a delimiter. So a string of "onetwoonethreeonefour" would also return "one"; the regex method in JavaScript seems to work as desired in either case (with or without a delimiter). Does anyone have a clever way of doing this without splitting the string on a delimiter and counting each piece?
A workaround would be to turn the JavaScript version into an API: you send it the data, it returns the result, and you simply read that from your VB.NET application.
-
Apr 8th, 2015, 02:51 PM
#8
Thread Starter
Junior Member
Re: Regex Help
Thanks Toph but I plan to use this for a critical part of the server side and would want it isolated....
Is the key "/(?=((.+)(?:.*?\2)+))/g" valid for vb.net regex? Does anyone know how to translate this regex string to a vb.net regex?
-
Apr 8th, 2015, 03:07 PM
#9
Re: Regex Help
Regex is universal. Regex written is not dependent on the programming language. The thing is, when I tested your Regex pattern with a test string, nothing matched. So I don't think the regex actually does anything.
Quick question, without a delimiter, how would the program know when a word has finished or started?
How does it know that onetwothree is actually three words? Without using a dictionary which is just slow.
-
Apr 8th, 2015, 03:19 PM
#10
Thread Starter
Junior Member
Re: Regex Help
You know what, I have noticed that the pattern does not seem to match in online tools, but I assure you it works in JavaScript, and I have no idea how it works from the code that I see.
As for the delimiter question, I'm not sure, but it seems to find a way. I had the same question when thinking about how I would approach this problem, and I still do now that I've found the JavaScript function.
And btw side note - "onetwoonethreeonefourone" returns with the longest repeating string being "onet"
"one two one three one four one" would return "one t" <-- with a space
It's a very handy function and I find it interesting: "1122332211542340543534243245934583458911223322114345345" would return "1122332211"
"324234241122332211345435311223322112342345541122332211234234324" returns "1122332211234234"
Very useful for encrypting/compressing data
Last edited by paulmak10; Apr 8th, 2015 at 03:33 PM.
-
Apr 8th, 2015, 03:27 PM
#11
Re: Regex Help
I know a lot about regex but this one has me stumped. But I think I see how this works and it doesn't do what I thought, it's not looking for words.
It starts at an index of zero and looks for repetitions of the characters from there, using letter 1, then 1,2, then 1,2,3 and so on. The forward and back references are there because it will search the whole string regardless of index, but will exclude the first search term (which is why it's grabbing \2). It stores any match. It then advances the index and repeats this process until the string is exhausted. So for example if we have "onetwothreefouronetwelve" it will report "onetw", since that is the longest letter sequence that repeats (other matches are "on", "one", "onet").
I'd like to see that done in one line of linq
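The behaviour described above can be watched directly by listing what group 2 captures at each start position. This is a minimal JavaScript sketch, not part of the original function; `repeatsByPosition` is a hypothetical helper name, and it relies on `String.prototype.matchAll` stepping past zero-length matches on its own. One detail it makes visible: because `.+` is greedy, the engine tries the longest candidate first and backtracks to shorter ones until a later copy is found.

```javascript
// For every start position where the lookahead succeeds, report group 2,
// i.e. the longest text starting there that occurs again later (non-overlapping).
function repeatsByPosition(input) {
  const pattern = /(?=((.+)(?:.*?\2)+))/g;
  return Array.from(input.matchAll(pattern), (m) => ({
    index: m.index,
    repeat: m[2],
  }));
}

console.log(repeatsByPosition("abcabc"));
// [ { index: 0, repeat: 'abc' }, { index: 1, repeat: 'bc' }, { index: 2, repeat: 'c' } ]
```

Positions 3 to 5 are missing from the output because nothing starting there occurs again further right, so the lookahead fails at those positions.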
-
Apr 8th, 2015, 03:36 PM
#12
Thread Starter
Junior Member
Re: Regex Help
Bulldog thanks for the insight, the break down at least has some of the magic cleared up for me, but I'm still lost as to how to express the correct regex...Would you mind experimenting? I normally like to rely on myself for solutions, but with little experience in regex and the fact that most troubleshooting regex tools do not recognize the string I see as "correct" has me at a roadblock.....I cannot seem to get any feedback the trial and error way..
For more specific context: I have this function repeatedly compressing encrypted data (replacing each found string with a letter from G-Z) until the only match left is a string less than 2 characters long. I then have a method for encoding the key (all of the matched strings and their corresponding G-Z letters) along with what's left of the hex data. The way I'm doing it, the hex data can have up to 20 "keys" (G-Z, since those letters are unused in hex) to help compress things.
Just need to convert to server side code..ugh
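The compression loop described above might look roughly like the following. This is only a sketch under assumptions: the poster's actual code is not shown in the thread, the names `compress` and `decompress` are hypothetical, the input is assumed to be uppercase hex (so G-Z never appear in it), and the key is kept as a plain array rather than encoded into the output.

```javascript
// Longest text that occurs at least twice (same regex as the thread's maxRepeat).
function maxRepeat(input) {
  let best = "";
  for (const m of input.matchAll(/(?=((.+)(?:.*?\2)+))/g)) {
    if (m[2].length > best.length) best = m[2];
  }
  return best;
}

// Replace the longest repeat with the next unused letter until nothing
// of length 2 or more repeats, or all 20 letters are used up.
function compress(hex) {
  const symbols = "GHIJKLMNOPQRSTUVWXYZ"; // 20 letters that never occur in hex
  const keys = [];
  let data = hex;
  for (const symbol of symbols) {
    const repeat = maxRepeat(data);
    if (repeat.length < 2) break;
    keys.push([symbol, repeat]);
    data = data.split(repeat).join(symbol);
  }
  return { data, keys };
}

// Undo the substitutions in reverse order, since later keys may contain earlier symbols.
function decompress({ data, keys }) {
  let out = data;
  for (let i = keys.length - 1; i >= 0; i--) {
    const [symbol, repeat] = keys[i];
    out = out.split(symbol).join(repeat);
  }
  return out;
}

const packed = compress("ABAB");
console.log(packed.data, packed.keys); // GG [ [ 'G', 'AB' ] ]
console.log(decompress(packed));       // ABAB
```

Whether a substitution actually saves space depends on how the key is stored alongside the data, which this sketch leaves out.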
Last edited by paulmak10; Apr 8th, 2015 at 03:50 PM.
-
Apr 8th, 2015, 03:42 PM
#13
Re: Regex Help
Originally Posted by paulmak10
Bulldog thanks for the insight, the break down at least has some of the magic cleared up for me, but I'm still lost as to how to express the correct regex...Would you mind experimenting? I normally like to rely on myself for solutions, but with little experience in regex and the fact that most troubleshooting regex tools do not recognize the string I see as "correct" has me at a roadblock.....I cannot seem to get any feedback the trial and error way..
I don't think you can implement such a regex to do the WHOLE thing. It would be easier to use a programming language to help you solve the problem. You can however use Regex as part of the solution.
But like Bulldog said, this has me stumped, and I have no idea how to do this either.
-
Apr 8th, 2015, 03:54 PM
#14
Thread Starter
Junior Member
Re: Regex Help
Why do these not seem to mean the same thing?
var reg = /(?=((.+)(?:.*?\2)+))/g;
Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")
I get how the rest of the code goes through the results and does that logic; I'm just not getting the expected results when running the regex string in VB.NET.
-
Apr 8th, 2015, 03:57 PM
#15
Re: Regex Help
It depends on the engine, the VB.Net Regex engine is different to other implementations. I'll have a think about this, now I can see how it works, I can decompose it and reconstruct it.
-
Apr 8th, 2015, 04:00 PM
#16
Thread Starter
Junior Member
Re: Regex Help
Oh and let me add that I would actually want to use
Code:
Dim match = regex.Matches("onetwoonethreeonefour", 0)
instead of a single Match, but I've played around with this too, and match.Count shows 0, so I assume my efforts fail.
-
Apr 8th, 2015, 04:03 PM
#17
Thread Starter
Junior Member
Re: Regex Help
Bulldog thanks for taking a deep thought on this, please do let me know if you come to find anything, much appreciated !
-
Apr 8th, 2015, 04:07 PM
#18
Thread Starter
Junior Member
Re: Regex Help
Stumped indeed Toph
-
Apr 8th, 2015, 04:27 PM
#19
Re: Regex Help
Originally Posted by paulmak10
Why do these not seem to mean the same thing?
var reg = /(?=((.+)(?:.*?\2)+))/g;
Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")
I get how the rest of the code goes through the results and does that logic; I'm just not getting the expected results when running the regex string in VB.NET.
Because you have not learnt regex.
-
Apr 8th, 2015, 04:33 PM
#20
Thread Starter
Junior Member
Re: Regex Help
Originally Posted by ident
Because you have not learnt regex.
Correct, this is the first practical exposure I have had to regex. I understand how regex is used for input verification such as email and phone formats, but the specifics of this application are over my head at the moment. And as discussed in previous posts, the regex tools I've tried to use to "take a better look" have not been able to find matches (although it clearly works, as I've run the code). That, coupled with my lack of experience, makes it hard to debug exactly what's not working.
Last edited by paulmak10; Apr 8th, 2015 at 04:37 PM.
-
Apr 8th, 2015, 05:32 PM
#21
Re: Regex Help
How can you understand email validation? Have you even started anchors?
-
Apr 8th, 2015, 06:16 PM
#22
Re: Regex Help
Well... my interpretation is below. I tested this on quite a few examples (but I'm not sure it's 100% correct, so read the code and check it for yourself).
I'm sure this could be cut down but having done the hard bit, someone else can do the easy bit
Code:
Imports System.Text.RegularExpressions
Public Class Form1
Dim FoundString As String = String.Empty
Dim SearchString As String = String.Empty
Dim Remainder As String = String.Empty
Dim FindIt As Regex
Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) _
Handles MyBase.Load
MessageBox.Show("Longest Repeating String = " & _
MaxString("ffourfiveonetffourfiveonett"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("fourfiveonetffourfiveonett"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("ivffourfiveonetttffourfiveonettt"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("onettwoonett"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("812345678901234569"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("1122332211542340543534243245934583458911223322114345345"))
MessageBox.Show("Longest Repeating String = " & _
MaxString("324234241122332211345435311223322112342345541122332211234234324"))
End Sub
Private Function MaxString(str As String) As String
FoundString = String.Empty
For Offset As Integer = 0 To str.Length - 1
For SearchLength As Integer = 1 To str.Length
If Offset < str.Length - SearchLength And Offset < SearchLength Then
SearchString = str.Substring(Offset, SearchLength)
Remainder = str.Substring(SearchLength, str.Length - SearchLength)
If SearchString.Length < Remainder.Length Then
FindIt = New Regex(SearchString)
If FindIt.Matches(Remainder).Count > 0 Then
If SearchString.Length > FoundString.Length Then
FoundString = SearchString
End If
End If
End If
End If
Next
Next
Return FoundString
End Function
End Class
Last edited by Bulldog; Apr 8th, 2015 at 06:21 PM.
-
Apr 8th, 2015, 08:15 PM
#23
Thread Starter
Junior Member
Re: Regex Help
Thanks Bulldog, that is great, only it doesn't catch a larger string if it happens later, such as onetwoonefour11223344332211onefiveon11223344332211esixone112233221176112233221198. After stepping away for a bit and then looking at your example, I realize I was just confusing myself with the regex string. I was misunderstanding regex and getting caught up in thought. I understand the process now, and with your inspiration and a clear head I've come up with this function. Thanks a lot! Now I feel dumb.
Code:
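' Requires: Imports System.Text.RegularExpressions
Function maxRepeat(str As String) As String
    Dim start As Integer = 0
    Dim length As Integer = 2
    Dim found As String = ""
    While start + length < str.Length
        Dim search As String = str.Substring(start, length)
        ' Escape the candidate so characters such as "." or "(" are matched literally.
        Dim results = New Regex(Regex.Escape(search))
        ' More than one non-overlapping hit means the candidate repeats.
        If results.Matches(str).Count > 1 Then
            If search.Length > found.Length Then
                found = search
            End If
            ' Try to grow the candidate by one character.
            length = length + 1
        Else
            ' Move the start forward and begin again with a 2-character candidate.
            start = start + length - 1
            length = 2
        End If
    End While
    Return found
End Function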
...still not exactly clear on how the regex string from the javascript example literally works but same concept I guess
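For cross-checking results, here is a brute-force version with no regex at all. It is a sketch, not the thread's code: `longestRepeatBruteForce` is a hypothetical name, and it assumes the same rules the JavaScript regex appears to follow, namely that the second copy must not overlap the first and that, on a tie, the earliest start position wins.

```javascript
// Longest substring that occurs at least twice without overlapping.
function longestRepeatBruteForce(input) {
  let best = "";
  for (let start = 0; start < input.length; start++) {
    // A candidate can be at most half of what is left, or a second copy cannot fit.
    const maxLen = Math.floor((input.length - start) / 2);
    // Only lengths strictly greater than the current best are worth checking.
    for (let len = maxLen; len > best.length; len--) {
      const candidate = input.slice(start, start + len);
      // Look for another copy that begins after this one ends.
      if (input.indexOf(candidate, start + len) !== -1) {
        best = candidate;
        break;
      }
    }
  }
  return best;
}

console.log(longestRepeatBruteForce("onetwoonethreeonefour"));      // onet
console.log(longestRepeatBruteForce("one two one three one four")); // one t
```

It checks every start position and every length, so it is slow on long input, but it is easy to reason about when a faster version gives a surprising answer.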
Last edited by paulmak10; Apr 8th, 2015 at 08:24 PM.
-
Apr 8th, 2015, 11:01 PM
#24
Re: Regex Help
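I thought I would take a stab at this and think I finally got it. The first thing you should look at is the "JavaScript RegExp Reference", where you will see that the leading / and the trailing /g are JavaScript literal syntax (the delimiters and the global flag), not part of the pattern. You passed them to the .NET Regex engine as extra literal characters, so it looked for an actual "/" and "/g" in the input.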
Code:
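Private Function LongestFirstPatternRepeated(input As String) As String
    ' Start from an empty string, like the JavaScript version, which returns "" when nothing repeats.
    Dim ret As String = String.Empty
    ' Same pattern as the JavaScript literal, minus the / delimiters and the g flag.
    Dim rgx As Regex = New Regex("(?=((.+)(?:.*?\2)+))")
    Dim len As Int32
    ' Matches() already returns every match, which is the job the g flag did in JavaScript.
    For Each m As Match In rgx.Matches(input)
        ' Group 2 holds the repeated text found at this position.
        If m.Groups(2).Value.Length > len Then
            len = m.Groups(2).Value.Length
            ret = m.Groups(2).Value
        End If
    Next
    Return ret
End Function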
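The same mistake can be reproduced in JavaScript itself by building the regex from a string, which is closer to what `New Regex("...")` does in VB.NET. A small sketch; the variable names are only for illustration:

```javascript
// In a literal, the slashes are delimiters and g is a flag; neither is part of the pattern.
const literal = /(?=((.+)(?:.*?\2)+))/g;
console.log(literal.source); // (?=((.+)(?:.*?\2)+))
console.log(literal.flags);  // g

// Built from a string: the pattern alone, with the flag passed separately.
const fromString = new RegExp("(?=((.+)(?:.*?\\2)+))", "g");

// Built from a string WITH the delimiters left in: "/" and "/g" become literal characters.
const withDelimiters = new RegExp("/(?=((.+)(?:.*?\\2)+))/g");

console.log(fromString.test("onetwoonethreeonefour"));     // true
console.log(withDelimiters.test("onetwoonethreeonefour")); // false
```

The doubled backslash in `"\\2"` is only needed because JavaScript string literals treat the backslash as an escape character. VB.NET string literals do not, which is why `"\2"` works unchanged in the function above.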
-
Apr 9th, 2015, 09:38 AM
#25
Thread Starter
Junior Member
Re: Regex Help
Right on the money. I thought I had played around with removing the / and /g, so I'm not sure what happened. Thank you very much, it makes sense now.