Thread: Some help from REGEX experts please

  1. #1

    Thread Starter
    PowerPoster kaliman79912's Avatar
    Join Date
    Jan 2009
    Location
    Ciudad Juarez, Chihuahua. Mexico
    Posts
    2,593

    Some help from REGEX experts please

    Hello,

    I am just learning Regex and find it fascinating. I have created a few patterns to validate the data coming from a string, but I can't find a way to test for these cases:

    dates like: Jul-01-2015
    file names like: test.txt

    Can you help me please?
    More important than the will to succeed, is the will to prepare for success.

    Please rate the posts, your comments are the fuel to keep helping people

  2. #2

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    OK, got the first one (including the time):

    (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dic)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2}

    The second one appears easier, but I am stuck. Please help.

  3. #3
    Frenzied Member
    Join Date
    May 2014
    Location
    Central Europe
    Posts
    1,372

    Re: Some help from REGEX experts please

    Not an expert on regex here, but you can improve the first one by making sure the day is [0-3]\d. December is also misspelled. The same goes for the time: you only want [0-2]\d for the hour and [0-5]\d for the minutes and seconds.

    What is the requirement for test.txt? If it just has to end in .txt, then "\.txt$" should do it.
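
    A minimal VB.NET sketch of the tightening suggested above, assuming System.Text.RegularExpressions. The module name, the sample strings, and the exact day and hour alternations are illustrative and not from the thread. A pattern like this only constrains digit ranges, so it still accepts impossible dates such as Feb-31-2015.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DatePatternSketch
    ' Day limited to 01-31, hour to 00-23, minutes and seconds to 00-59.
    Private Const Pattern As String =
        "^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-(0[1-9]|[12]\d|3[01])-\d{4}\s+([01]\d|2[0-3]):[0-5]\d:[0-5]\d$"

    Sub Main()
        ' Hypothetical samples: one valid, one with a bad day, one with a bad hour.
        For Each sample As String In {"Jul-01-2015 10:46:46", "Jul-41-2015 10:46:46", "Jul-01-2015 25:00:00"}
            Console.WriteLine("{0} -> {1}", sample, Regex.IsMatch(sample, Pattern))
        Next
    End Sub
End Module
```

    Only the first sample should be reported as a match; the other two fail on the day and the hour respectively.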

  4. #4

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    Thanks, mixing Spanish with English LOL (Diciembre).
    No, that is an example, any file name will come in the string.

  5. #5
    Frenzied Member

    Re: Some help from REGEX experts please

    any file name will come in the string.
    You can't check whether it is an actual file, but if you want to check that it contains only alphanumeric characters and underscores, then a dot, then a three-character extension drawn from the same set, this would do:
    Code:
    ^[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}$

  6. #6

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    Thank you again.
    What are the ^ and $ characters for?

  7. #7
    Frenzied Member

    Re: Some help from REGEX experts please

    They are the start and end anchors. If you leave them out, your input string can have anything before and after the part that matches.
    E.g.
    "this should not be valid$$$$/@filename.txt%%%//\\"
    against
    ^[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}$
    ->false
    against
    [0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}
    ->true

  8. #8
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,715

    Re: Some help from REGEX experts please

    The ^ matches the beginning of the string and the $ matches the end of the string. Together they are used to accept only exact matches rather than partial matches. For example, this code:
    Code:
    ^[a-c]*$
    Will match abc but not abcd, whereas this code:
    Code:
    [a-c]*
    Will match the abc in abcd.
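
    A small sketch of the same comparison in VB.NET, assuming System.Text.RegularExpressions; the module name is made up for illustration. One detail worth keeping in mind: because * allows zero repetitions, the unanchored pattern succeeds on any input at all, so looking at the matched value is more informative than IsMatch.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorSketch
    Sub Main()
        Dim input As String = "abcd"

        ' Anchored: the whole string must consist only of a, b or c.
        Console.WriteLine(Regex.IsMatch(input, "^[a-c]*$"))   ' False

        ' Unanchored: the first match is the leading "abc".
        Console.WriteLine(Regex.Match(input, "[a-c]*").Value) ' abc
    End Sub
End Module
```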
    "Code is like humor. When you have to explain it, it is bad." - Cory House
    VbLessons | Code Tags | Sword of Fury - Jameram

  9. #9
    Super Moderator dday9's Avatar

    Re: Some help from REGEX experts please

    I have also moved this thread from VB.Net to Other Languages.

  10. #10

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    OK, I am getting it now. In this case it should not include the anchors, then. I am trying to parse a whole line that has the file info. It goes something like:

    12 Jan-01-1980 00:00:02 Config.bin

    And also, the issue is a bit more complex because I am using a Match object to define "fields"

    the line that calls the match is this:

    Code:
    Dim m as Match = GetMatch(strLine)
    where strLine is the string

    GetMatch is this:
    Code:
    Private Function GetMatch(ByVal strLine As String) As Match
        Dim rx As Regex, m As Match
        rx = New Regex("(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{1,3})")
        m = rx.Match(strLine)
        Return m
    End Function
    So far, it appears to work and it gives me 3 fields for m (size, timestamp and name).
    But there are some lines that have directory (subfolder) entries. What should I do about those?

    here is an example of a few lines from the source:
    Code:
     2558245    Dec-12-2014  13:22:36   source.dat       
      130944    Jul-21-2015  10:46:46   retain.bin       
         904    Nov-07-2007  09:57:08   PlcVxW.ini       
        2048    Oct-23-2008  13:45:18   HMIConfiguration  <DIR>
        2048    Jul-29-2010  10:13:54   WEBROOT           <DIR>
    I would like to have a fourth field (Dir) stating if it is a directory.
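
    One possible way to get that fourth field, sketched in VB.NET under a few assumptions: the name never contains spaces, directories are always marked with a literal <DIR> at the end of the line, and \S+ is an acceptable replacement for the stricter name pattern so that extensionless names also match. The module name and the two sample lines (taken from the listing above) are only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DirFieldSketch
    ' \S+ accepts names with or without an extension; the optional
    ' trailing group captures the literal <DIR> marker when present.
    Private ReadOnly LinePattern As New Regex(
        "^\s*(?<size>\d+)\s+" &
        "(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-\d{2}-\d{4}\s+\d{2}:\d{2}:\d{2})\s+" &
        "(?<name>\S+)(\s+(?<dir><DIR>))?\s*$")

    Sub Main()
        Dim lines As String() = {
            "     904    Nov-07-2007  09:57:08   PlcVxW.ini       ",
            "    2048    Jul-29-2010  10:13:54   WEBROOT           <DIR>"}

        For Each line As String In lines
            Dim m As Match = LinePattern.Match(line)
            If m.Success Then
                ' Groups("dir").Success is True only when <DIR> was captured.
                Console.WriteLine("{0} | dir={1}", m.Groups("name").Value, m.Groups("dir").Success)
            End If
        Next
    End Sub
End Module
```

    With this shape a single pattern covers both files and directories, and the caller reads m.Groups("dir").Success instead of comparing text.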

  11. #11

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    Could you change it back to VB.Net please?

  12. #12
    Super Moderator dday9's Avatar

    Re: Some help from REGEX experts please

    Quote Originally Posted by kaliman79912 View Post
    Could you change it back to VB.Net please?
    Yeah, it was just that up to this point it was strictly RegEx and not VB.Net. That was why I moved it in the first place.

  13. #13

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    Quote Originally Posted by dday9 View Post
    Yeah, it was just that up to this point it was strictly RegEx and not VB.Net. That was why I moved it in the first place.
    Yes I get it, I should have explained the whole issue from the beginning. Thank you

  14. #14
    Super Moderator dday9's Avatar

    Re: Some help from REGEX experts please

    I feel like you're going about this the wrong way. Don't get me wrong, RegEx is powerful and in some cases great, but in this case you can use the .NET methods to accomplish what you want much more easily. Take a look at this example:
    Code:
    Imports System
    Imports System.Collections.Generic
    
    Public Module Module1
    
        Public Sub Main()
            Dim input As String = String.Format("2558245    Dec-12-2014  13:22:36   source.dat{0}130944    Jul-21-2015  10:46:46   retain.bin{0}904    Nov-07-2007  09:57:08   PlcVxW.ini{0}2048    Oct-23-2008  13:45:18   HMIConfiguration  <DIR>{0}2048    Jul-29-2010  10:13:54   WEBROOT           <DIR>", Environment.NewLine())
            Dim dataCollection As New List(Of Data)
            For Each line As String In input.Split({Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)
                dataCollection.Add(Data.CreateData(line))
            Next
    
            dataCollection.ForEach(Sub(d) Console.WriteLine(String.Format("{0}|{1}|{2}|{3}|{4}", d.Date.ToString("MMM-dd-yyyy"), d.Directory, d.Name, d.Size, d.Time)))
        End Sub
    End Module
    
    Public Class Data
    
        Public Property [Date] As Date
        Public Property Directory As Boolean
        Public Property Name As String
        Public Property [Size] As Integer
        Public Property Time As TimeSpan
    
        Public Shared Function CreateData(ByVal line As String) As Data
            Dim splitData() As String = line.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
            Dim d As New Data
            If splitData.Length < 1 OrElse Not Integer.TryParse(splitData(0), d.Size) Then
                Throw New Exception("Invalid size value.")
            End If
    
            If splitData.Length < 2 OrElse Not Date.TryParseExact(splitData(1), "MMM-dd-yyyy", New Globalization.CultureInfo("en-US"), Globalization.DateTimeStyles.None, d.[Date]) Then
                Throw New Exception("Invalid date value.")
            End If
    
            If splitData.Length < 3 OrElse Not TimeSpan.TryParse(splitData(2), d.Time) Then
                Throw New Exception("Invalid time value.")
            End If
    
            If splitData.Length < 4 Then
                Throw New Exception("Invalid name value.")
            Else
                d.Name = splitData(3)
            End If
    
            d.Directory = Not (splitData.Length = 4) ' a fifth token is the trailing <DIR> marker
            Return d
        End Function
    End Class
    Last edited by dday9; Jul 29th, 2015 at 03:09 PM. Reason: splitData.Length = 4

  15. #15

    Thread Starter
    PowerPoster kaliman79912's Avatar

    Re: Some help from REGEX experts please

    Very interesting (there is a typo in line 41; it should be "Invalid Name value"). It does work great, except that the example I posted was just part of the solution. The source string comes from an FTP server located at a PLC and could have many different formats. So what I am doing is this:

    Code:
        Private Function GetMatchingRegex(ByVal line As String) As Match
            ' Candidate listing formats, tried in order; the first one that matches wins.
            Dim formats As String() = { _
                        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
                        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
                        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
                        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
                        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)", _
                        "(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[AaPp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)", _
                        "(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{1,3})\s+(?<dir>.+)", _
                        "(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,})\s+(?<dir>.+)"}
            Dim rx As Regex, m As Match
            m = Nothing
            For i As Integer = 0 To formats.Length - 1
                rx = New Regex(formats(i))
                m = rx.Match(line)
                If m.Success Then
                    Return m
                End If
            Next
            Return m
        End Function
    I will explore the way I could use your code with all the different file info formats that I have detected so far.
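
    A possible variation on the loop above, offered only as a sketch: the Regex objects are built once and reused across calls, and the function returns Nothing when no format fits instead of the last failed Match. The names FormatListSketch, Formats and FirstMatch are hypothetical, just two anchored formats are shown, and the Unix-style sample line is invented for illustration rather than taken from a real server.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FormatListSketch
    ' Built once and reused; only two illustrative formats are listed here.
    Private ReadOnly Formats As Regex() = {
        New Regex("^(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)$"),
        New Regex("^\s*(?<size>\d+)\s+(?<timestamp>\w{3}-\d{2}-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>\S+)(\s+(?<dir><DIR>))?\s*$")}

    ' Returns the first successful match, or Nothing when no format fits.
    Function FirstMatch(ByVal line As String) As Match
        For Each rx As Regex In Formats
            Dim m As Match = rx.Match(line)
            If m.Success Then Return m
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Invented sample line in the first (Unix-style) format.
        Dim m As Match = FirstMatch("-rw-r--r--   1 user group   904 Nov 07 2007 PlcVxW.ini")
        If m IsNot Nothing Then
            Console.WriteLine("{0} ({1} bytes)", m.Groups("name").Value, m.Groups("size").Value)
        End If
    End Sub
End Module
```

    Returning Nothing forces the caller to handle unrecognised lines explicitly; whether that is preferable to checking m.Success depends on how the rest of the program is written.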

    Thank you
