-
Jul 29th, 2015, 01:18 PM
#1
Some help from REGEX experts please
Hello,
I am just learning Regex and find it fascinating. I have created a few patterns to validate data that arrives in a string, but I can't find a way to test for these cases:
dates like: Jul-01-2015
file names like: test.txt
Can you help me please?
More important than the will to succeed, is the will to prepare for success.
Please rate the posts, your comments are the fuel to keep helping people
-
Jul 29th, 2015, 01:39 PM
#2
Re: Some help from REGEX experts please
OK, got the first one (including the time):
(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dic)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2}
The second one appears easier, but I am stuck. Please help.
-
Jul 29th, 2015, 01:50 PM
#3
Re: Some help from REGEX experts please
Not an expert on regex here, but you can improve the first one by making sure the day is [0-3]\d. Also, December is misspelled (Dic should be Dec). Same with the time: you only want [0-2]\d for the hours, [0-5]\d for the minutes and seconds, etc.
What is the requirement for test.txt? If it just has to end in .txt, then "\.txt$" should do it.
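A minimal sketch of that tightening, assuming the input is only the timestamp (hence the ^ and $) and that .NET's Regex class is used. The module and function names are made up for illustration. Ranges like [0-3]\d would still accept 00 or 39, so the alternations below go one step further; a date such as Feb-31 still passes, because a regex does not know the calendar.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module TimestampCheck
    ' Day limited to 01-31, hour to 00-23, minutes and seconds to 00-59.
    ' Hypothetical pattern: it validates shape and ranges, not real calendar dates.
    Public Const TimestampPattern As String =
        "^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-" &
        "(0[1-9]|[12]\d|3[01])-\d{4}\s+" &
        "([01]\d|2[0-3]):[0-5]\d:[0-5]\d$"

    Public Function IsValidTimestamp(ByVal value As String) As Boolean
        Return Regex.IsMatch(value, TimestampPattern)
    End Function

    Public Sub Main()
        Console.WriteLine(IsValidTimestamp("Jul-01-2015 13:18:00")) ' True
        Console.WriteLine(IsValidTimestamp("Jul-41-2015 13:18:00")) ' False (day out of range)
        Console.WriteLine(IsValidTimestamp("Jul-01-2015 25:18:00")) ' False (hour out of range)
    End Sub
End Module
```

If real calendar validation is needed, Date.TryParseExact is probably a better fit than a longer pattern.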
-
Jul 29th, 2015, 01:54 PM
#4
Re: Some help from REGEX experts please
Thanks, mixing Spanish with English LOL (Diciembre).
No, that is just an example; any file name can come in the string.
-
Jul 29th, 2015, 01:59 PM
#5
Re: Some help from REGEX experts please
"any file name will come in the string."
You can't check whether it is an actual file, but if you want to check that it contains only alphanumeric characters and underscores, followed by a dot and an extension of exactly 3 alphanumeric or underscore characters, then this would do:
Code:
^[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}$
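A quick way to see what this pattern accepts and rejects, as a sketch using Regex.IsMatch. The module name, function name, and sample names are invented for the test.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module FileNameCheck
    ' The pattern from the post above, unchanged.
    Public Const FileNamePattern As String = "^[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}$"

    Public Function LooksLikeFileName(ByVal value As String) As Boolean
        Return Regex.IsMatch(value, FileNamePattern)
    End Function

    Public Sub Main()
        ' Hypothetical sample names chosen to show the limits of the pattern.
        For Each candidate As String In {"test.txt", "Config.bin", "my file.txt", "notes.md", "archive.tar.gz"}
            Console.WriteLine("{0} -> {1}", candidate, LooksLikeFileName(candidate))
        Next
    End Sub
End Module
```

Only test.txt and Config.bin pass. The other three fail because of the space, the 2-character extension, and the second dot, so the pattern would need loosening if such names can occur.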
-
Jul 29th, 2015, 02:06 PM
#6
Re: Some help from REGEX experts please
Thank you again.
What are the ^ and $ characters for?
-
Jul 29th, 2015, 02:11 PM
#7
Re: Some help from REGEX experts please
They are the start-of-string and end-of-string anchors. If you leave them out, the input string can have anything before and after the part that matches.
E.g. testing the input
"this should not be valid$$$$/@filename.txt%%%//\\"
against
^[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}$
->false
against
[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{3}
->true
-
Jul 29th, 2015, 02:13 PM
#8
Re: Some help from REGEX experts please
The ^ matches the beginning of the string and the $ matches the end of the string. Together they restrict the pattern to exact matches rather than partial matches. For example, the pattern ^abc$ will match abc but not abcd, whereas the pattern abc will match the abc in abcd.
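The same comparison as a small runnable sketch, assuming the .NET Regex class. The module and function names are only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module AnchorDemo
    ' Anchored: the whole input has to be exactly "abc".
    Public Function MatchesWhole(ByVal input As String) As Boolean
        Return Regex.IsMatch(input, "^abc$")
    End Function

    ' Unanchored: "abc" may appear anywhere inside the input.
    Public Function MatchesAnywhere(ByVal input As String) As Boolean
        Return Regex.IsMatch(input, "abc")
    End Function

    Public Sub Main()
        Console.WriteLine(MatchesWhole("abc"))              ' True
        Console.WriteLine(MatchesWhole("abcd"))             ' False
        Console.WriteLine(MatchesAnywhere("abcd"))          ' True
        Console.WriteLine(Regex.Match("abcd", "abc").Value) ' abc
    End Sub
End Module
```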
-
Jul 29th, 2015, 02:14 PM
#9
Re: Some help from REGEX experts please
I have also moved this thread from VB.Net to Other Languages.
-
Jul 29th, 2015, 02:25 PM
#10
Re: Some help from REGEX experts please
OK, I am getting it now. In this case the pattern should not include the anchors, then, because I am trying to parse a whole line that contains the file info. It goes something like:
12 Jan-01-1980 00:00:02 Config.bin
Also, the issue is a bit more complex because I am using a Match object to define "fields".
The line that calls the match is this:
Code:
Dim m As Match = GetMatch(strLine)
where strLine is the line being parsed.
GetMatch is this:
Code:
' Requires: Imports System.Text.RegularExpressions
Private Function GetMatch(ByVal strLine As String) As Match
    Dim rx As Regex, m As Match
    ' The named groups (size, timestamp, name) become the "fields" of the match.
    rx = New Regex("(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{1,3})")
    m = rx.Match(strLine)
    Return m
End Function
So far, it appears to work and it gives me 3 fields for m (size, timestamp and name).
But some lines are directory (subfolder) entries. What should I do about those?
Here is an example of a few lines from the source:
Code:
2558245 Dec-12-2014 13:22:36 source.dat
130944 Jul-21-2015 10:46:46 retain.bin
904 Nov-07-2007 09:57:08 PlcVxW.ini
2048 Oct-23-2008 13:45:18 HMIConfiguration <DIR>
2048 Jul-29-2010 10:13:54 WEBROOT <DIR>
I would like to have a fourth field (Dir) stating if it is a directory.
-
Jul 29th, 2015, 02:26 PM
#11
Re: Some help from REGEX experts please
Could you change it back to VB.Net please?
-
Jul 29th, 2015, 02:38 PM
#12
Re: Some help from REGEX experts please
Originally Posted by kaliman79912
Could you change it back to VB.Net please?
Yeah, it was just that up to this point it was strictly RegEx and not VB.Net. That was why I moved it in the first place.
-
Jul 29th, 2015, 02:39 PM
#13
Re: Some help from REGEX experts please
Originally Posted by dday9
Yeah, it was just that up to this point it was strictly RegEx and not VB.Net. That was why I moved it in the first place.
Yes, I get it. I should have explained the whole issue from the beginning. Thank you.
-
Jul 29th, 2015, 03:06 PM
#14
Re: Some help from REGEX experts please
I feel like you're going about this the wrong way. Don't get me wrong, RegEx is powerful and in some cases great, but in this case you can use built-in .NET methods to accomplish what you want much more easily. Take a look at this example:
Code:
Imports System
Imports System.Collections.Generic

Public Module Module1
    Public Sub Main()
        Dim input As String = String.Format("2558245 Dec-12-2014 13:22:36 source.dat{0}130944 Jul-21-2015 10:46:46 retain.bin{0}904 Nov-07-2007 09:57:08 PlcVxW.ini{0}2048 Oct-23-2008 13:45:18 HMIConfiguration <DIR>{0}2048 Jul-29-2010 10:13:54 WEBROOT <DIR>", Environment.NewLine)
        Dim dataCollection As New List(Of Data)
        ' Split on the full newline sequence (CR+LF on Windows), not on a single character.
        For Each line As String In input.Split({Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)
            dataCollection.Add(Data.CreateData(line))
        Next
        dataCollection.ForEach(Sub(d) Console.WriteLine(String.Format("{0}|{1}|{2}|{3}|{4}", d.Date.ToString("MMM-dd-yyyy"), d.Directory, d.Name, d.Size, d.Time)))
    End Sub
End Module

Public Class Data
    Public Property [Date] As Date
    Public Property Directory As Boolean
    Public Property Name As String
    Public Property [Size] As Integer
    Public Property Time As TimeSpan

    ' Parses one listing line: size, date, time, name and an optional <DIR> marker.
    Public Shared Function CreateData(ByVal line As String) As Data
        Dim splitData() As String = line.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
        Dim d As New Data
        ' Each length check must cover the index read right after it.
        If splitData.Length < 1 OrElse Not Integer.TryParse(splitData(0), d.Size) Then
            Throw New Exception("Invalid size value.")
        End If
        If splitData.Length < 2 OrElse Not Date.TryParseExact(splitData(1), "MMM-dd-yyyy", New Globalization.CultureInfo("en-US"), Globalization.DateTimeStyles.None, d.[Date]) Then
            Throw New Exception("Invalid date value.")
        End If
        If splitData.Length < 3 OrElse Not TimeSpan.TryParse(splitData(2), d.Time) Then
            Throw New Exception("Invalid time value.")
        End If
        If splitData.Length < 4 Then
            Throw New Exception("Invalid name value.")
        Else
            d.Name = splitData(3)
        End If
        ' A plain file has exactly 4 parts; an extra part (<DIR>) marks a directory.
        d.Directory = (splitData.Length <> 4)
        Return d
    End Function
End Class
Last edited by dday9; Jul 29th, 2015 at 03:09 PM.
Reason: splitData.Length = 4
-
Jul 29th, 2015, 03:17 PM
#15
Re: Some help from REGEX experts please
Very interesting (there is a typo in line 41; it should be "Invalid name value"). It does work great, except that the example I posted was just part of the problem. The source string comes from an FTP server located on a PLC and could have many different formats. So what I am doing is this:
Code:
' Requires: Imports System.Text.RegularExpressions
Private Function GetMatchingRegex(ByVal line As String) As Match
    ' Known listing formats, tried in order; the first one that matches wins.
    Dim formats As String() = { _
        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
        "(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)", _
        "(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[AaPp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)", _
        "(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,}\.[0-9a-zA-Z_]{1,3})(\s+(?<dir>.+))?", _
        "(?<size>\d+)\s+(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\-\d{2}\-\d{4}\s+\d{2}:\d{2}:\d{2})\s+(?<name>[0-9a-zA-Z_]{1,})\s+(?<dir>.+)"}
    ' Notes: [AaPp] replaces [Aa|Pp], which also accepted a literal "|".
    ' The dir group in the seventh format is optional so plain file lines with nothing after the name still match.
    Dim rx As Regex, m As Match
    m = Nothing
    For i As Integer = 0 To formats.Length - 1
        rx = New Regex(formats(i))
        m = rx.Match(line)
        If m.Success Then
            Return m
        End If
    Next
    ' No format matched: m is the last failed match, so callers should check m.Success.
    Return m
End Function
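Once a format matches, the fields can be read back by group name. A minimal sketch follows, covering only the PLC-style lines shown earlier in the thread. The combined pattern, the module name, and the Describe function are assumptions made for illustration, not one of the formats above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module ListingFields
    ' Hypothetical single pattern: the extension and the <DIR> marker are both optional.
    Private ReadOnly LinePattern As New Regex(
        "^\s*(?<size>\d+)\s+" &
        "(?<timestamp>(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-\d{2}-\d{4}\s+\d{2}:\d{2}:\d{2})\s+" &
        "(?<name>[0-9a-zA-Z_]+(\.[0-9a-zA-Z_]{1,3})?)" &
        "(\s+(?<dir><DIR>))?\s*$")

    ' Returns size|timestamp|name|isDirectory, or "no match" when the line does not fit.
    Public Function Describe(ByVal line As String) As String
        Dim m As Match = LinePattern.Match(line)
        If Not m.Success Then Return "no match"
        ' An optional group that did not take part in the match reports Success = False.
        Dim isDir As Boolean = m.Groups("dir").Success
        Return String.Format("{0}|{1}|{2}|{3}", m.Groups("size").Value, m.Groups("timestamp").Value, m.Groups("name").Value, isDir)
    End Function

    Public Sub Main()
        Console.WriteLine(Describe("904 Nov-07-2007 09:57:08 PlcVxW.ini"))     ' 904|Nov-07-2007 09:57:08|PlcVxW.ini|False
        Console.WriteLine(Describe("2048 Jul-29-2010 10:13:54 WEBROOT <DIR>")) ' 2048|Jul-29-2010 10:13:54|WEBROOT|True
    End Sub
End Module
```

Checking Groups("dir").Success instead of comparing the text keeps the directory test independent of how each format spells the marker.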
I will explore the way I could use your code with all the different file info formats that I have detected so far.
Thank you