Hello... I am need of some help on this. I cannot figure out how to filter a file where I want to extract the last line of a persons badge number for a given day. I have attached a screenshot of what I am working with. Each hi-lighted lines are the ones that I need to extract from the file and the file can usually be fairly large. The only change in the file will be "passed" or "failed" for each badge number. I have not been successful searching for an example of what I want to do. I can split the file and get values, but that's as far as it goes. Thanks for any examples someone can help me with.
Re: Filter Lines of Text File based on Value AND Date
Try this…
Code:
Dim lines() As String = IO.File.ReadAllLines(path)
Dim distinctBadges() As String = lines.Distinct(Function(l) l.SubString(0, 13)).ToArray
distinctBadges = Array.ConvertAll(distinctBadges, Function(l) l.SubString(0, 13))
Dim lastLines As New List(Of String)
For each db As String in distinctBadges
For x As Integer = 0 To lines.GetUpperBound(0)
If lines(x) < lines.GetUpperBound(0) AndAlso lines(x).StartsWith(db) And Not lines(x + 1).StartsWith(db) Then
lastLines.Add(lines(x))
End If
Next
Next
Re: Filter Lines of Text File based on Value AND Date
You will need to parse the line. While it doesn't appear as though you have a common delimiter, it isn't really all that bad.
It looks like the line always starts with Badge# (notice the space). The next value is always numeric. The next value is always a comma followed by a space. The next value is always Test [PASSED|FAILED] ON: (notice the space). The next value is always a date in MM-dd-yyyy hh:mm:ss tt format.
Take a look at this example:
Code:
Public Class Line
Private _badge As String
Public Property Badge() As String
Get
Return _badge
End Get
Set(value As String)
If (_badge <> value) Then
_badge = value
OnBadgeChanged()
End If
End Set
End Property
Protected Overridable Sub OnBadgeChanged()
RaiseEvent BadgeChanged(Me, EventArgs.Empty)
End Sub
Public Event BadgeChanged(sender As Object, e As EventArgs)
Private _result As String
Public Property Result() As String
Get
Return _result
End Get
Set(value As String)
If (_result <> value) Then
_result = value
OnResultChanged()
End If
End Set
End Property
Protected Overridable Sub OnResultChanged()
RaiseEvent ResultChanged(Me, EventArgs.Empty)
End Sub
Public Event ResultChanged(sender As Object, e As EventArgs)
Private _timestamp As DateTime
Public Property Timestamp() As DateTime
Get
Return _timestamp
End Get
Set(value As DateTime)
If (_timestamp <> value) Then
_timestamp = value
OnTimestampChanged()
End If
End Set
End Property
Protected Overridable Sub OnTimestampChanged()
RaiseEvent TimestampChanged(Me, EventArgs.Empty)
End Sub
Public Event TimestampChanged(sender As Object, e As EventArgs)
Public Shared Function TryParse(input As String, ByRef output As Line) As Boolean
output = Nothing
' cannot be empty
If (String.IsNullOrWhiteSpace(input)) Then
Return False
End If
' must start with Badge#
If (Not input.StartsWith("Badge# ")) Then
Return False
End If
' must have a numeric badge number
input = input.Substring("Badge# ".Length)
Dim index As Integer = 0
Dim badge As String = String.Empty
Do While index < input.Length AndAlso Char.IsDigit(input(index))
badge &= input(index).ToString()
index += 1
Loop
If (String.IsNullOrWhiteSpace(badge)) Then
Return False
End If
' must have a comma followed by a space follwed by Test
input = input.Substring(badge.Length)
If (Not input.StartsWith(", Test ")) Then
Return False
End If
' must have PASSED or FAILED
input = input.Substring(", Test ".Length)
Dim result As String = String.Empty
If (input.StartsWith("PASSED")) Then
result = "PASSED"
ElseIf (input.StartsWith("FAILED")) Then
result = "FAILED"
Else
Return False
End If
' must have a space, followed by ON:, followed by a space
input = input.Substring(result.Length)
If (Not input.StartsWith(" ON: ")) Then
Return False
End If
' the remainder must be a valid DateTime in MM/DD/YYYY hh:mm:ss tt format
input = input.Substring(" ON: ".Length)
Dim timestamp As DateTime
If (Not DateTime.TryParseExact(input, "MM-dd-yyyy hh:mm:ss tt", Nothing, Nothing, timestamp)) Then
Return False
End If
' everything passes
output = New Line() With {
.Badge = badge,
.Result = result,
.Timestamp = timestamp
}
Return True
End Function
End Class
Here is an example of parsing one of the lines:
Code:
Dim input As String = "Badge# 0040274, Test PASSED ON: 10-19-2021 05:17:44 AM"
Dim output As Line = Nothing
If (Line.TryParse(input, output)) Then
Console.WriteLine("Badge#: {0}, Result: {1}, Timestamp: {2}", output.Badge, output.Result, output.Timestamp)
Else
Console.WriteLine("Unable to parse the line.")
End If
Console.ReadLine()
From here, you would parse each line in your file to a Line reference and use LINQ to group by the appropriate data.
Re: Filter Lines of Text File based on Value AND Date
You would use IO.File.ReadAllLines (documentation) and then call the TryParse method on each line:
Code:
Dim parsedLines As New List(Of Line)
For Each input As String In IO.File.ReadAllLines("my-file.txt")
Dim output As Line = Nothing
If (Line.TryParse(input, output)) Then
parsedLines.Add(output)
End If
Next
Then What you would do is use LINQ to group by the BadgeNumber and select the newest record in the group:
Code:
Dim results As IEnumerable(Of Line) = parsedLines.GroupBy(Function(l) l.Badge).Select(Function(g) parsedLines.Where(Function(l) l.Badge = g.Key).OrderByDescending(Function(l) l.Timestamp).First())
Re: Filter Lines of Text File based on Value AND Date
thanks dday9 - I was able to get your code working, thanks for the help. The only issue I am seeing is when the date changes and the same badge# is found, it becomes very confused. I have attached the file I am using if you want to check it out. This file starts and ends with the same badge# as if it were a different date. The top section badge "0005269" ends with a PASS, and the bottom section on different date ends with a FAIL. The data output runs and shows 10/24/2021 as PASS. I need each date change to show PASS or FAIL for each Badge#. Thanks for your help
Code:
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Public Module Module1
Public Sub Main()
Dim parsedLines As New List(Of Line)
For Each input As String In IO.File.ReadAllLines("C:\temp\esd.txt")
Dim output As Line = Nothing
If (Line.TryParse(input, output)) Then
parsedLines.Add(output)
End If
Next
' actual code here
Dim results As IEnumerable(Of Line) = parsedLines.GroupBy(Function(l) l.Badge).Select(Function(g) parsedLines.Where(Function(l) l.Badge = g.Key).OrderByDescending(Function(l) l.Timestamp).First())
For Each item In results
Console.WriteLine("Badge: {0}, Result: {1}, Timestamp: {2}", item.Badge, item.Result, item.Timestamp)
Dim strFile As String = "C:\temp\results.txt"
Dim objWriter As New System.IO.StreamWriter(strFile, True)
objWriter.WriteLine("Badge: {0}, Result: {1}, Timestamp: {2}", item.Badge, item.Result, item.Timestamp)
objWriter.Close()
Next
Console.ReadLine()
End Sub
End Module
Public Class Line
Private _badge As String
Public Property Badge() As String
Get
Return _badge
End Get
Set(value As String)
If (_badge <> value) Then
_badge = value
OnBadgeChanged()
End If
End Set
End Property
Protected Overridable Sub OnBadgeChanged()
RaiseEvent BadgeChanged(Me, EventArgs.Empty)
End Sub
Public Event BadgeChanged(sender As Object, e As EventArgs)
Private _result As String
Public Property Result() As String
Get
Return _result
End Get
Set(value As String)
If (_result <> value) Then
_result = value
OnResultChanged()
End If
End Set
End Property
Protected Overridable Sub OnResultChanged()
RaiseEvent ResultChanged(Me, EventArgs.Empty)
End Sub
Public Event ResultChanged(sender As Object, e As EventArgs)
Private _timestamp As DateTime
Public Property Timestamp() As DateTime
Get
Return _timestamp
End Get
Set(value As DateTime)
If (_timestamp <> value) Then
_timestamp = value
OnTimestampChanged()
End If
End Set
End Property
Protected Overridable Sub OnTimestampChanged()
RaiseEvent TimestampChanged(Me, EventArgs.Empty)
End Sub
Public Event TimestampChanged(sender As Object, e As EventArgs)
Public Shared Function TryParse(input As String, ByRef output As Line) As Boolean
output = Nothing
' cannot be empty
If (String.IsNullOrWhiteSpace(input)) Then
Return False
End If
' must start with Badge#
If (Not input.StartsWith("Badge# ")) Then
Return False
End If
' must have a numeric badge number
input = input.Substring("Badge# ".Length)
Dim index As Integer = 0
Dim badge As String = String.Empty
Do While index < input.Length AndAlso Char.IsDigit(input(index))
badge &= input(index).ToString()
index += 1
Loop
If (String.IsNullOrWhiteSpace(badge)) Then
Return False
End If
' must have a comma followed by a space follwed by Test
input = input.Substring(badge.Length)
If (Not input.StartsWith(", Test ")) Then
Return False
End If
' must have PASSED or FAILED
input = input.Substring(", Test ".Length)
Dim result As String = String.Empty
If (input.StartsWith("PASSED")) Then
result = "PASSED"
ElseIf (input.StartsWith("FAILED")) Then
result = "FAILED"
Else
Return False
End If
' must have a space, followed by ON:, followed by a space
input = input.Substring(result.Length)
If (Not input.StartsWith(" ON: ")) Then
Return False
End If
' the remainder must be a valid DateTime in MM/DD/YYYY hh:mm:ss tt format
input = input.Substring(" ON: ".Length)
Dim timestamp As DateTime
If (Not DateTime.TryParseExact(input, "MM-dd-yyyy hh:mm:ss tt", Nothing, Nothing, timestamp)) Then
Return False
End If
' everything passes
output = New Line() With {
.Badge = badge,
.Result = result,
.Timestamp = timestamp
}
Return True
End Function
End Class
Re: Filter Lines of Text File based on Value AND Date
Try this. Tested now...
Code:
Dim lines() As String = IO.File.ReadAllLines(path)
Dim distinctBadges() As String = lines.Select(Function(l) l.Substring(0, 13)).Distinct.ToArray
Dim lastLines As New List(Of String)
For Each db As String In distinctBadges
For x As Integer = 0 To lines.GetUpperBound(0) - 1
If lines(x).StartsWith(db) And Not lines(x + 1).StartsWith(db) Then
lastLines.Add(lines(x))
End If
Next
Next
This is assuming you want the last entry for every distinct badge. Using your text file, i found 12 lines...
Last edited by .paul.; Oct 24th, 2021 at 04:00 PM.
Re: Filter Lines of Text File based on Value AND Date
Thanks paul... this does work now. My issue now is, how do I deal with separate dates? In my attached file to dday9 has Badge# 0005269 at the top and bottom of the file with different dates. One ended in PASS and the other ended in FAIL. How can that be dealt with?
Re: Filter Lines of Text File based on Value AND Date
Originally Posted by mikeg71
Thanks paul... this does work now. My issue now is, how do I deal with separate dates?
Can you give an example of the text file? Do you want to take just the latest last line for each badge number, or the last line for each date for each badge number?
Edit... I see the multiple dates in the text file. I still need to know which lines to select
Last edited by .paul.; Oct 24th, 2021 at 06:11 PM.
Re: Filter Lines of Text File based on Value AND Date
Try this. I found 13 distinct badges over 3 different dates...
Code:
Dim lines() As String = IO.File.ReadAllLines("C:\Users\Paul\Desktop\esd - Copy.txt")
Dim distinctBadges() As String = lines.Select(Function(l) l.Substring(0, l.IndexOf(","c))).Distinct.ToArray
Dim days = lines.GroupBy(Function(l) Date.ParseExact(l.Substring(l.IndexOf(":"c) + 2, 10), "MM-dd-yyyy", Nothing)).ToArray
Dim lastLines As New List(Of String)
For Each distinctDay In days
Dim dayLines() As String = distinctDay.ToArray
For Each db As String In distinctBadges
For x As Integer = 0 To dayLines.GetUpperBound(0) - 1
If dayLines(x).StartsWith(db) And Not dayLines(x + 1).StartsWith(db) Then
lastLines.Add(dayLines(x))
End If
Next
Next
If Not lastLines.Last = distinctDay.Last Then
lastLines.Add(distinctDay.Last)
End If
Next