
Thread: [RESOLVED] Regex Help

  1. #1

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    [RESOLVED] Regex Help

    Hello, I don't fully understand regex, but I found a great JavaScript function that does exactly what I need. How would I convert it properly to VB.NET? I've tried a few things but could not get the exact same regex string to work the VB.NET way. Any thoughts? Any help would be appreciated.

    Code:
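    function maxRepeat(input) {
        // The whole pattern is a lookahead, so every match is zero-width;
        // group 2 holds the repeated text found at that position.
        var reg = /(?=((.+)(?:.*?\2)+))/g;
        var sub = "";
        var maxstr = "";
        reg.lastIndex = 0;
        sub = reg.exec(input);
        while (sub !== null) {
            if (sub[2].length > maxstr.length) {
                maxstr = sub[2];
            }
            sub = reg.exec(input);
            // A zero-width match does not advance lastIndex, so step it manually.
            reg.lastIndex++;
        }
        return maxstr;
    }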

    This function returns the longest sequence of characters that appears at least twice. "one two one three one four" returns "one t" (including the space); "onetwoonethreeonefour" returns "onet".

    "324234241122332211345435311223322112342345541122332211234234324" returns "1122332211234234"
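
    For reference, the same search can be written without the manual lastIndex bookkeeping. This is only a sketch: it assumes a JavaScript environment that supports String.prototype.matchAll, and the names maxRepeatAll and longest are made up for this example.

```javascript
// Sketch: same pattern as above, but matchAll steps past the
// zero-width matches on its own, so lastIndex is never touched.
function maxRepeatAll(input) {
  const pattern = /(?=((.+)(?:.*?\2)+))/g;
  let longest = "";
  for (const match of input.matchAll(pattern)) {
    // match[2] is the repeated text found at this position.
    if (match[2].length > longest.length) {
      longest = match[2];
    }
  }
  return longest;
}

console.log(maxRepeatAll("onetwoonethreeonefour"));      // "onet"
console.log(maxRepeatAll("one two one three one four")); // "one t"
```

    Ties go to the first candidate found, because a later repeat only replaces the current one when it is strictly longer.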


    So far I've had no luck getting the same results as with JavaScript, even after tweaking the regex string a bit. There must be something I don't understand.
    Code:
    Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")
            Dim match  = regex.Matches("one two three one four",0)
                MsgBox(match.count)   <---find anything?
    What is the vb.net regex engine equivalent of "/(?=((.+)(?:.*?\2)+))/g"?
    Last edited by paulmak10; Apr 8th, 2015 at 04:46 PM.

  2. #2
    Frenzied Member Bulldog's Avatar
    Join Date
    Jun 2005
    Location
    South UK
    Posts
    1,950

    Re: Regex Help

    I'm not sure how that regex is working but the equivalent function appears to be this;

    Code:
    Public Class Form1
    
        Dim Words As New List(Of String)
        Dim Occurrence As New List(Of Integer)
    
        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
        Handles MyBase.Load
            MessageBox.Show(CountWords("one two three four five one two two one"))
        End Sub
    
        Public Function CountWords(ByVal s As String) As String
            For Each str As String In s.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
                If Words.Contains(str) Then
                    Occurrence(Words.IndexOf(str)) += 1
                Else
                    Words.Add(str)
                    Occurrence.Add(1)
                End If
            Next
            Return Words(Occurrence.IndexOf(Occurrence.Max()))
        End Function
    
    End Class
    This will show the first word (there could be more than one) that has the highest number of occurrences.


    • If my post helped you, please Rate it
    • If your problem is solved please also mark the thread resolved

    I use VS2015 (unless otherwise stated).
    _________________________________________________________________________________
    B.Sc(Hons), AUS.P, C.Eng, MIET, MIEEE, MBCS / MCSE+Sec, MCSA+Sec, MCP, A+, Net+, Sec+, MCIWD, CIWP, CIWA
    I wrote my very first program in 1979, using machine code on a mechanical Olivetti teletype connected to an 8-bit, 78 instruction, 1MHz, Motorola 6800 multi-user system with 2k of memory. Using Windows, I dont think my situation has improved.

  3. #3
    Fanatic Member Toph's Avatar
    Join Date
    Oct 2014
    Posts
    655

    Re: Regex Help

    My attempt at this is rather clunky, but it's still something.

    Code:
        Private Function MostOccuringWord(value As String) As String
    
            Dim originalWord As String = value
            Dim words() As String = originalWord.Split(" "c)
    
            Dim highestFound As Integer = Regex.Matches(originalWord, Regex.Escape(words(0))).Count
            Dim highestWord As String = words(0)
    
            For i = 1 To words.Length - 1
                Dim wordCount As Integer = Regex.Matches(originalWord, Regex.Escape(words(i))).Count
    
                If wordCount > highestFound Then
                    highestFound = wordCount
                    highestWord = words(i)
                End If
    
            Next
    
            Return highestWord
        End Function
    If you find my contributions helpful then rate them.

  4. #4
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,760

    Re: Regex Help

    LINQ would work in this case too:
    Code:
    Private Function GetMostOccuringWord(ByVal input As String) As String
        Dim sentence() As String = input.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
    
        Return sentence.GroupBy(Function(n) n).OrderByDescending(Function(g) g.Count).Select(Function(g) g.Key).FirstOrDefault
    End Function
    "Code is like humor. When you have to explain it, it is bad." - Cory House
    VbLessons | Code Tags | Sword of Fury - Jameram

  5. #5
    Fanatic Member Toph's Avatar
    Join Date
    Oct 2014
    Posts
    655

    Re: Regex Help

    Quote Originally Posted by dday9
    LINQ would work in this case too:
    Code:
    Private Function GetMostOccuringWord(ByVal input As String) As String
        Dim sentence() As String = input.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
    
        Return sentence.GroupBy(Function(n) n).OrderByDescending(Function(g) g.Count).Select(Function(g) g.Key).FirstOrDefault
    End Function
    I was waiting for someone to whip up a LINQ version. Very nice. I really need to learn LINQ well.

  6. #6

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Thanks for the responses, everyone! All of these solutions are clever, and LINQ seems like something I should dive into more. I should have mentioned that the string to be searched does not (in my case) contain a delimiter, so a string of "onetwoonethreeonefour" would also return "one". The regex method in JavaScript seems to work as desired in either case (with or without a delimiter). Does anyone have a clever method of doing this without splitting the string on a delimiter and counting each piece?

    Honestly, this is an interesting problem, and I am lost as to how the magic of regex solves it without a delimiter. If I had not found the JavaScript function above, I would probably be doing crazy array work and breaking my mind.
    Last edited by paulmak10; Apr 8th, 2015 at 02:33 PM.

  7. #7
    Fanatic Member Toph's Avatar
    Join Date
    Oct 2014
    Posts
    655

    Re: Regex Help

    Quote Originally Posted by paulmak10
    Thanks for the responses, everyone! All of these solutions are clever, and LINQ seems like something I should dive into more. I should have mentioned that the string to be searched does not (in my case) contain a delimiter, so a string of "onetwoonethreeonefour" would also return "one". The regex method in JavaScript seems to work as desired in either case (with or without a delimiter). Does anyone have a clever method of doing this without splitting the string on a delimiter and counting each piece?
    A workaround would be to turn the JavaScript version into an API: you send it the data, it returns the result, and your VB.NET application simply reads it.

  8. #8

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Thanks Toph, but I plan to use this for a critical part of the server side and want it isolated.

    Is the key "/(?=((.+)(?:.*?\2)+))/g" valid for vb.net regex? Does anyone know how to translate this regex string to a vb.net regex?

  9. #9
    Fanatic Member Toph's Avatar
    Join Date
    Oct 2014
    Posts
    655

    Re: Regex Help

    Regex is universal; a pattern is not dependent on the programming language. The thing is, when I tested your regex pattern with a test string, nothing matched, so I don't think the regex actually does anything.




    Quick question, without a delimiter, how would the program know when a word has finished or started?

    How does it know that onetwothree is actually three words? Without using a dictionary which is just slow.

  10. #10

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    You know what, I have noticed that the pattern does not seem to match in online tools, but I assure you it works in JavaScript, and I have no idea how it works from the code that I see.
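
    One likely reason the tools appear to find nothing (an assumption, since the thread does not say which tools were used): the whole pattern is a lookahead, so each match is an empty string and the useful text sits in the capture groups. A small sketch in JavaScript, with variable names chosen only for this example:

```javascript
const pattern = /(?=((.+)(?:.*?\2)+))/g;
const input = "onetwoonethreeonefour";

const first = pattern.exec(input);

// The match itself is empty because a lookahead consumes no characters.
console.log(JSON.stringify(first[0])); // ""
console.log(first.index);              // 0
// Group 2 is the text that repeats; group 1 runs from the start of
// that text through the last repetition the pattern matched.
console.log(first[2]);                 // "onet"
console.log(first[1]);                 // "onetwoonet"
```

    A tool that only highlights the matched text has nothing to highlight here, even though the match succeeded.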

    As for the delimiter question, I'm not sure, but it seems to find a way. I had the same question when thinking about how I would approach this problem, and I still do after finding the JavaScript function.

    And by the way, a side note: "onetwoonethreeonefourone" returns "onet" as the longest repeating string.
    "one two one three one four one" returns "one t" (with a space).


    It's a very handy function and I find it interesting: "1122332211542340543534243245934583458911223322114345345" returns "1122332211".

    "324234241122332211345435311223322112342345541122332211234234324" returns "1122332211234234"

    Very useful for encrypting/compressing data
    Last edited by paulmak10; Apr 8th, 2015 at 03:33 PM.

  11. #11
    Frenzied Member Bulldog's Avatar
    Join Date
    Jun 2005
    Location
    South UK
    Posts
    1,950

    Re: Regex Help

    I know a lot about regex, but this one had me stumped. I think I now see how it works, and it doesn't do what I thought: it's not looking for words.

    It starts at an index of zero and looks for repetitions of characters from there, using letters 1, then 1,2, then 1,2,3 and so on. The lookahead and backreference are there because it will search the whole string regardless of index, but will exclude the first search term (which is why it's grabbing \2). It stores any match. It then advances the index and repeats this process until the string is exhausted. So, for example, if we have "onetwothreefouronetwelve" it will report "onetw", since that is the longest letter sequence that repeats (other matches are "on", "one", "onet").

    I'd like to see that done in one line of linq
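
    The description above can be approximated with plain loops. This is a sketch, not a line-for-line translation of the regex: longestRepeat is a hypothetical name, and it assumes a repeat only counts when the second copy starts after the first one ends, which is how the lookahead with \2 behaves. The regex engine tries the longest candidate at each position first and shortens it until a later copy is found, and the inner loop mirrors that.

```javascript
function longestRepeat(input) {
  let best = "";
  for (let start = 0; start < input.length; start++) {
    // Only candidates longer than the current best are worth checking,
    // so the first longest repeat wins ties.
    for (let len = input.length - start; len > best.length; len--) {
      const candidate = input.slice(start, start + len);
      // Look for another copy that begins after this one ends.
      if (input.indexOf(candidate, start + len) !== -1) {
        best = candidate;
        break;
      }
    }
  }
  return best;
}

console.log(longestRepeat("onetwothreefouronetwelve")); // "onetw"
```

    One difference worth knowing about: the dot in the regex does not match line breaks, while this loop treats every character the same.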



  12. #12

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Bulldog, thanks for the insight. The breakdown has cleared up some of the magic for me, but I'm still lost as to how to express the correct regex. Would you mind experimenting? I normally like to rely on myself for solutions, but with little experience in regex, and with most regex troubleshooting tools not recognizing the string as correct, I'm at a roadblock. I cannot seem to get any feedback by trial and error.

    For more specific context: I have this function repeatedly compress encrypted data (replacing each found string with a letter from G-Z) until the only match is a string less than 2 characters long. I then have a method for encoding the key (all of the matched strings and their corresponding G-Z letters) along with what's left of the hex data. The way I'm doing it, the hex data can have up to 20 "keys" (G-Z, since those letters are unused in hex) to help compress things.
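
    A rough sketch of that loop, in JavaScript. Everything here is an assumption made for illustration: compressHex, expandHex and longestRepeat are hypothetical names, the key is kept as a plain array because the real key encoding is not described, and the input is assumed to be hex text that never contains the letters G-Z.

```javascript
// Stand-in for maxRepeat: longest text that occurs again later without overlap.
function longestRepeat(input) {
  let best = "";
  for (let start = 0; start < input.length; start++) {
    for (let len = input.length - start; len > best.length; len--) {
      const candidate = input.slice(start, start + len);
      if (input.indexOf(candidate, start + len) !== -1) {
        best = candidate;
        break;
      }
    }
  }
  return best;
}

// Hypothetical compressor: swap each longest repeat for the next unused letter.
function compressHex(hex) {
  const key = []; // [letter, replaced text] pairs, in the order they were applied
  let data = hex;
  for (const letter of "GHIJKLMNOPQRSTUVWXYZ") {
    const repeat = longestRepeat(data);
    if (repeat.length < 2) break; // nothing worth replacing is left
    data = data.split(repeat).join(letter);
    key.push([letter, repeat]);
  }
  return { data, key };
}

// Undo the replacements in reverse order to recover the original hex.
function expandHex(data, key) {
  let hex = data;
  for (let i = key.length - 1; i >= 0; i--) {
    const [letter, repeat] = key[i];
    hex = hex.split(letter).join(repeat);
  }
  return hex;
}

const original = "1122332211AB1122332211CD1122332211";
const packed = compressHex(original);
console.log(packed.data);                                     // "GABGCDG"
console.log(expandHex(packed.data, packed.key) === original); // true
```

    Expanding in reverse order matters because a later replacement can contain letters introduced by an earlier one.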

    Just need to convert it to server-side code... ugh
    Last edited by paulmak10; Apr 8th, 2015 at 03:50 PM.

  13. #13
    Fanatic Member Toph's Avatar
    Join Date
    Oct 2014
    Posts
    655

    Re: Regex Help

    Quote Originally Posted by paulmak10
    Bulldog, thanks for the insight. The breakdown has cleared up some of the magic for me, but I'm still lost as to how to express the correct regex. Would you mind experimenting? I normally like to rely on myself for solutions, but with little experience in regex, and with most regex troubleshooting tools not recognizing the string as correct, I'm at a roadblock. I cannot seem to get any feedback by trial and error.
    I don't think you can implement a regex that does the WHOLE thing. It would be easier to use a programming language to help you solve the problem. You can, however, use regex as part of the solution.

    But like Bulldog said, this has me stumped and I have no idea how to do it either.

  14. #14

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Why do these not seem to mean the same thing?

    var reg = /(?=((.+)(?:.*?\2)+))/g;

    Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")

    I get how the rest of the code goes through the results and applies that logic; I'm just not getting the expected results when running the regex string in VB.NET.

  15. #15
    Frenzied Member Bulldog's Avatar
    Join Date
    Jun 2005
    Location
    South UK
    Posts
    1,950

    Re: Regex Help

    It depends on the engine; the VB.NET regex engine is different from other implementations. I'll have a think about this. Now that I can see how it works, I can decompose it and reconstruct it.



  16. #16

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Oh and let me add that I would actually want to use
    Code:
    Dim match = regex.Matches("onetwoonethreeonefour", 0)
    instead of a single match, but I've played around with this too, and match.Count shows 0, so I assume my efforts fail.

  17. #17

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Bulldog, thanks for giving this some deep thought. Please do let me know if you find anything; much appreciated!

  18. #18

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Stumped indeed Toph

  19. #19
    Bad man! ident's Avatar
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,398

    Re: Regex Help

    Quote Originally Posted by paulmak10
    Why do these not seem to mean the same thing?

    var reg = /(?=((.+)(?:.*?\2)+))/g;

    Dim regex As Regex = New Regex("/(?=((.+)(?:.*?\2)+))/g")

    I get how the rest of the code goes through the results and applies that logic; I'm just not getting the expected results when running the regex string in VB.NET.
    Because you have not learned regex.

  20. #20

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Quote Originally Posted by ident
    Because you have not learned regex.
    Correct, this is the first practical exposure I have had to regex. I understand how regex is used for input verification, such as email and phone formats, but the specifics of this application are over my head at the moment. And as discussed in previous posts, the regex tools I've tried in order to take a better look have not been able to find matches (although it clearly works, as I've run the code). That, coupled with my lack of experience, makes it hard to debug exactly what's not working.
    Last edited by paulmak10; Apr 8th, 2015 at 04:37 PM.

  21. #21
    Bad man! ident's Avatar
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,398

    Re: Regex Help

    How can you understand email validation? Have you even started anchors?

  22. #22
    Frenzied Member Bulldog's Avatar
    Join Date
    Jun 2005
    Location
    South UK
    Posts
    1,950

    Re: Regex Help

    Well... my interpretation is below. I tested this on quite a few examples (but I'm not sure it's 100% correct, so read the code and check it for yourself).

    I'm sure this could be cut down but having done the hard bit, someone else can do the easy bit

    Code:
    Imports System.Text.RegularExpressions
    Public Class Form1
    
        Dim FoundString As String = String.Empty
        Dim SearchString As String = String.Empty
        Dim Remainder As String = String.Empty
        Dim FindIt As Regex
    
        Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) _
        Handles MyBase.Load
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("ffourfiveonetffourfiveonett"))
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("fourfiveonetffourfiveonett"))
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("ivffourfiveonetttffourfiveonettt"))
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("onettwoonett"))
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("812345678901234569"))
    
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("1122332211542340543534243245934583458911223322114345345"))
    
     
            MessageBox.Show("Longest Repeating String = " & _
            MaxString("324234241122332211345435311223322112342345541122332211234234324"))
        End Sub
    
        Private Function MaxString(str As String) As String
            FoundString = String.Empty
            For Offset As Integer = 0 To str.Length - 1
                For SearchLength As Integer = 1 To str.Length
                    If Offset < str.Length - SearchLength And Offset < SearchLength Then
                        SearchString = str.Substring(Offset, SearchLength)
                        Remainder = str.Substring(SearchLength, str.Length - SearchLength)
                        If SearchString.Length < Remainder.Length Then
                            FindIt = New Regex(SearchString)
                            If FindIt.Matches(Remainder).Count > 0 Then
                                If SearchString.Length > FoundString.Length Then
                                    FoundString = SearchString
                                End If
                            End If
                        End If
                    End If
                Next
            Next
            Return FoundString
        End Function
    
    End Class
    Last edited by Bulldog; Apr 8th, 2015 at 06:21 PM.



  23. #23

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Thanks Bulldog, that is great, only it doesn't catch a larger string if it occurs later, such as in onetwoonefour11223344332211onefiveon11223344332211esixone112233221176112233221198. After stepping away for a bit and then looking at your example, I realize I was just confusing myself with the regex string; I was misunderstanding regex and getting caught up in thought. I understand the process now, and with your inspiration and a clear head I've come up with this function. Thanks a lot! Now I feel dumb.

    Code:
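    Function maxRepeat(str As String) As String
        Dim start As Integer = 0
        Dim length As Integer = 2
        Dim found As String = ""
        While start + length < str.Length
            Dim search As String = str.Substring(start, length)
            ' Escape the candidate so regex metacharacters are matched literally.
            Dim results = New Regex(Regex.Escape(search))
            If results.Matches(str).Count > 1 Then
                ' The candidate occurs at least twice: keep it if it is the
                ' longest so far, then try to extend it by one character.
                If search.Length > found.Length Then
                    found = search
                End If
                length = length + 1
            Else
                ' No repeat: restart from the last character of the failed candidate.
                start = start + length - 1
                length = 2
            End If
        End While
        Return found
    End Function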

    ...I'm still not exactly clear on how the regex string from the JavaScript example literally works, but it's the same concept, I guess.
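
    For anyone else puzzling over it, here is the pattern taken apart piece by piece. This is a reading aid rather than anything from the original function: the pieces array and the variable names are made up for this sketch, and the pattern is rebuilt from a string only so that each part can carry its own comment.

```javascript
// Each piece of the pattern, in order.
const pieces = [
  "(?=",         // lookahead: test what follows without consuming it
  "(",           // group 1: the repeated text plus everything up to its last repeat
  "(.+)",        // group 2: one or more characters, as many as possible
  "(?:.*?\\2)+", // one or more times: as few characters as possible, then a copy of group 2
  ")",           // end of group 1
  ")",           // end of the lookahead
];
const rebuilt = new RegExp(pieces.join(""), "g");
console.log(rebuilt.source); // (?=((.+)(?:.*?\2)+))

const found = rebuilt.exec("onetwoonethreeonefour");
console.log(found[2]); // "onet"
```

    Group 2 starts by grabbing the rest of the string and gives characters back one at a time until a later copy of itself exists, which is why it ends up holding the longest repeat that starts at that position.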
    Last edited by paulmak10; Apr 8th, 2015 at 08:24 PM.

  24. #24
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: Regex Help

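    I thought I would take a stab at this, and I think I finally got it. The first thing to look at is the "JavaScript RegExp Reference", where you will see that you included extra literals in the pattern you sent to the Regex engine: the leading / and trailing /g are JavaScript's regex-literal delimiters and its global flag, not part of the pattern.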

    Code:
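    Imports System.Text.RegularExpressions

    Private Function LongestFirstPatternRepeated(input As String) As String
       Dim ret As String = Nothing
       ' Same pattern as the JavaScript literal, minus the / delimiters and the g flag.
       Dim rgx As Regex = New Regex("(?=((.+)(?:.*?\2)+))")
       Dim len As Int32
       ' Matches returns every match, so no global flag is needed.
       For Each m As Match In rgx.Matches(input)
          ' Group 2 holds the repeated text found at this position.
          If m.Groups(2).Value.Length > len Then
             len = m.Groups(2).Value.Length
             ret = m.Groups(2).Value
          End If
       Next
       Return ret
    End Function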
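
    The same mistake can be reproduced in JavaScript by passing the literal's text, slashes and all, to the RegExp constructor, which takes the bare pattern much as the .NET Regex constructor does. A small sketch; the variable names are only for illustration:

```javascript
const input = "onetwoonethreeonefour";

// Pattern text copied with the literal's / delimiters and g flag left in:
// the engine now looks for an actual "/" and "/g" in the input.
const withDelimiters = new RegExp("/(?=((.+)(?:.*?\\2)+))/g");
// Pattern text only; the g flag is passed separately.
const patternOnly = new RegExp("(?=((.+)(?:.*?\\2)+))", "g");

console.log(withDelimiters.test(input)); // false
console.log(patternOnly.test(input));    // true
```

    The first pattern is still valid, which is why no error is raised; it simply can never match a string that contains no slash.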

  25. #25

    Thread Starter
    Junior Member
    Join Date
    Feb 2012
    Posts
    18

    Re: Regex Help

    Right on the money. I thought I had played around with removing the / and /g. Hmm, I'm not sure what happened. Thank you very much, it makes sense now.
