Results 1 to 16 of 16

Thread: Filter Lines of Text File based on Value AND Date

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Filter Lines of Text File based on Value AND Date

    Hello... I am need of some help on this. I cannot figure out how to filter a file where I want to extract the last line of a persons badge number for a given day. I have attached a screenshot of what I am working with. Each hi-lighted lines are the ones that I need to extract from the file and the file can usually be fairly large. The only change in the file will be "passed" or "failed" for each badge number. I have not been successful searching for an example of what I want to do. I can split the file and get values, but that's as far as it goes. Thanks for any examples someone can help me with.

    Name:  Filter-Test - Copy.JPG
Views: 298
Size:  154.2 KB

  2. #2
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    So you want to extract the last line for each distinct badge number?

  3. #3
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    Try this…

    Code:
    Dim lines() As String = IO.File.ReadAllLines(path)
    Dim distinctBadges() As String = lines.Distinct(Function(l) l.SubString(0, 13)).ToArray
    distinctBadges = Array.ConvertAll(distinctBadges, Function(l) l.SubString(0, 13))
    
    Dim lastLines As New List(Of String)
    For each db As String in distinctBadges
        For x As Integer = 0 To lines.GetUpperBound(0)
            If lines(x) < lines.GetUpperBound(0) AndAlso lines(x).StartsWith(db) And Not lines(x + 1).StartsWith(db) Then
                lastLines.Add(lines(x))
            End If
        Next
    Next

  4. #4
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,762

    Re: Filter Lines of Text File based on Value AND Date

    You will need to parse the line. While it doesn't appear as though you have a common delimiter, it isn't really all that bad.

    It looks like the line always starts with Badge# (notice the space). The next value is always numeric. The next value is always a comma followed by a space. The next value is always Test [PASSED|FAILED] ON: (notice the space). The next value is always a date in MM-dd-yyyy hh:mm:ss tt format.

    Take a look at this example:
    Code:
    Public Class Line
    
        Private _badge As String
        Public Property Badge() As String
            Get
                Return _badge
            End Get
            Set(value As String)
                If (_badge <> value) Then
                    _badge = value
                    OnBadgeChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnBadgeChanged()
            RaiseEvent BadgeChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event BadgeChanged(sender As Object, e As EventArgs)
    
        Private _result As String
        Public Property Result() As String
            Get
                Return _result
            End Get
            Set(value As String)
                If (_result <> value) Then
                    _result = value
                    OnResultChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnResultChanged()
            RaiseEvent ResultChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event ResultChanged(sender As Object, e As EventArgs)
    
        Private _timestamp As DateTime
        Public Property Timestamp() As DateTime
            Get
                Return _timestamp
            End Get
            Set(value As DateTime)
                If (_timestamp <> value) Then
                    _timestamp = value
                    OnTimestampChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnTimestampChanged()
            RaiseEvent TimestampChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event TimestampChanged(sender As Object, e As EventArgs)
    
        Public Shared Function TryParse(input As String, ByRef output As Line) As Boolean
            output = Nothing
    
            ' cannot be empty
            If (String.IsNullOrWhiteSpace(input)) Then
                Return False
            End If
    
            ' must start with Badge# 
            If (Not input.StartsWith("Badge# ")) Then
                Return False
            End If
    
            ' must have a numeric badge number
            input = input.Substring("Badge# ".Length)
            Dim index As Integer = 0
            Dim badge As String = String.Empty
            Do While index < input.Length AndAlso Char.IsDigit(input(index))
                badge &= input(index).ToString()
                index += 1
            Loop
            If (String.IsNullOrWhiteSpace(badge)) Then
                Return False
            End If
    
            ' must have a comma followed by a space follwed by Test 
            input = input.Substring(badge.Length)
            If (Not input.StartsWith(", Test ")) Then
                Return False
            End If
    
            ' must have PASSED or FAILED
            input = input.Substring(", Test ".Length)
            Dim result As String = String.Empty
            If (input.StartsWith("PASSED")) Then
                result = "PASSED"
            ElseIf (input.StartsWith("FAILED")) Then
                result = "FAILED"
            Else
                Return False
            End If
    
            ' must have a space, followed by ON:, followed by a space
            input = input.Substring(result.Length)
            If (Not input.StartsWith(" ON: ")) Then
                Return False
            End If
    
            ' the remainder must be a valid DateTime in MM/DD/YYYY hh:mm:ss tt format
            input = input.Substring(" ON: ".Length)
            Dim timestamp As DateTime
            If (Not DateTime.TryParseExact(input, "MM-dd-yyyy hh:mm:ss tt", Nothing, Nothing, timestamp)) Then
                Return False
            End If
    
            ' everything passes
            output = New Line() With {
                .Badge = badge,
                .Result = result,
                .Timestamp = timestamp
            }
            Return True
        End Function
    
    End Class
    Here is an example of parsing one of the lines:
    Code:
    Dim input As String = "Badge# 0040274, Test PASSED ON: 10-19-2021 05:17:44 AM"
    Dim output As Line = Nothing
    If (Line.TryParse(input, output)) Then
        Console.WriteLine("Badge#: {0}, Result: {1}, Timestamp: {2}", output.Badge, output.Result, output.Timestamp)
    Else
        Console.WriteLine("Unable to parse the line.")
    End If
    Console.ReadLine()
    From here, you would parse each line in your file to a Line reference and use LINQ to group by the appropriate data.
    "Code is like humor. When you have to explain it, it is bad." - Cory House
    VbLessons | Code Tags | Sword of Fury - Jameram

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    thanks Paul.. I am getting an error on line 2 about Lambda.

  6. #6

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    thanks dday9... this is a little over my head. How would I run this for my text file? I have also not used LINQ

  7. #7
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,762

    Re: Filter Lines of Text File based on Value AND Date

    You would use IO.File.ReadAllLines (documentation) and then call the TryParse method on each line:
    Code:
    Dim parsedLines As New List(Of Line)
    For Each input As String In IO.File.ReadAllLines("my-file.txt")
        Dim output As Line = Nothing
        If (Line.TryParse(input, output)) Then
            parsedLines.Add(output)
        End If
    Next
    Then What you would do is use LINQ to group by the BadgeNumber and select the newest record in the group:
    Code:
    Dim results As IEnumerable(Of Line) = parsedLines.GroupBy(Function(l) l.Badge).Select(Function(g) parsedLines.Where(Function(l) l.Badge = g.Key).OrderByDescending(Function(l) l.Timestamp).First())
    Fiddle: https://dotnetfiddle.net/fV673t
    "Code is like humor. When you have to explain it, it is bad." - Cory House
    VbLessons | Code Tags | Sword of Fury - Jameram

  8. #8
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    Change line 2 to…

    Code:
    Dim distinctBadges() As String = lines.Select(Function(l) l.SubString(0, 13)).Distinct.ToArray
    Then you can delete the next line…

    Code:
    distinctBadges = Array.ConvertAll(distinctBadges, Function(l) l.SubString(0, 13))

  9. #9

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    thanks paul.. I am still getting errors running the data above. I am getting an error about converting string to double.

  10. #10

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    thanks dday9 - I was able to get your code working, thanks for the help. The only issue I am seeing is when the date changes and the same badge# is found, it becomes very confused. I have attached the file I am using if you want to check it out. This file starts and ends with the same badge# as if it were a different date. The top section badge "0005269" ends with a PASS, and the bottom section on different date ends with a FAIL. The data output runs and shows 10/24/2021 as PASS. I need each date change to show PASS or FAIL for each Badge#. Thanks for your help

    Code:
    Imports System
    Imports System.Collections.Generic
    Imports System.IO
    Imports System.Linq
    Public Module Module1
        Public Sub Main()
    
            Dim parsedLines As New List(Of Line)
            For Each input As String In IO.File.ReadAllLines("C:\temp\esd.txt")
                Dim output As Line = Nothing
                If (Line.TryParse(input, output)) Then
                    parsedLines.Add(output)
                End If
            Next
    
            ' actual code here
            Dim results As IEnumerable(Of Line) = parsedLines.GroupBy(Function(l) l.Badge).Select(Function(g) parsedLines.Where(Function(l) l.Badge = g.Key).OrderByDescending(Function(l) l.Timestamp).First())
            For Each item In results
                Console.WriteLine("Badge: {0}, Result: {1}, Timestamp: {2}", item.Badge, item.Result, item.Timestamp)
    
                Dim strFile As String = "C:\temp\results.txt"
                Dim objWriter As New System.IO.StreamWriter(strFile, True)
                objWriter.WriteLine("Badge: {0}, Result: {1}, Timestamp: {2}", item.Badge, item.Result, item.Timestamp)
                objWriter.Close()
            Next
            Console.ReadLine()
        End Sub
    End Module
    
    Public Class Line
    
        Private _badge As String
        Public Property Badge() As String
            Get
                Return _badge
            End Get
            Set(value As String)
                If (_badge <> value) Then
                    _badge = value
                    OnBadgeChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnBadgeChanged()
            RaiseEvent BadgeChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event BadgeChanged(sender As Object, e As EventArgs)
    
        Private _result As String
        Public Property Result() As String
            Get
                Return _result
            End Get
            Set(value As String)
                If (_result <> value) Then
                    _result = value
                    OnResultChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnResultChanged()
            RaiseEvent ResultChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event ResultChanged(sender As Object, e As EventArgs)
    
        Private _timestamp As DateTime
        Public Property Timestamp() As DateTime
            Get
                Return _timestamp
            End Get
            Set(value As DateTime)
                If (_timestamp <> value) Then
                    _timestamp = value
                    OnTimestampChanged()
                End If
            End Set
        End Property
    
        Protected Overridable Sub OnTimestampChanged()
            RaiseEvent TimestampChanged(Me, EventArgs.Empty)
        End Sub
    
        Public Event TimestampChanged(sender As Object, e As EventArgs)
    
        Public Shared Function TryParse(input As String, ByRef output As Line) As Boolean
            output = Nothing
    
            ' cannot be empty
            If (String.IsNullOrWhiteSpace(input)) Then
                Return False
            End If
    
            ' must start with Badge# 
            If (Not input.StartsWith("Badge# ")) Then
                Return False
            End If
    
            ' must have a numeric badge number
            input = input.Substring("Badge# ".Length)
            Dim index As Integer = 0
            Dim badge As String = String.Empty
            Do While index < input.Length AndAlso Char.IsDigit(input(index))
                badge &= input(index).ToString()
                index += 1
            Loop
            If (String.IsNullOrWhiteSpace(badge)) Then
                Return False
            End If
    
            ' must have a comma followed by a space follwed by Test 
            input = input.Substring(badge.Length)
            If (Not input.StartsWith(", Test ")) Then
                Return False
            End If
    
            ' must have PASSED or FAILED
            input = input.Substring(", Test ".Length)
            Dim result As String = String.Empty
            If (input.StartsWith("PASSED")) Then
                result = "PASSED"
            ElseIf (input.StartsWith("FAILED")) Then
                result = "FAILED"
            Else
                Return False
            End If
    
            ' must have a space, followed by ON:, followed by a space
            input = input.Substring(result.Length)
            If (Not input.StartsWith(" ON: ")) Then
                Return False
            End If
    
            ' the remainder must be a valid DateTime in MM/DD/YYYY hh:mm:ss tt format
            input = input.Substring(" ON: ".Length)
            Dim timestamp As DateTime
            If (Not DateTime.TryParseExact(input, "MM-dd-yyyy hh:mm:ss tt", Nothing, Nothing, timestamp)) Then
                Return False
            End If
    
            ' everything passes
            output = New Line() With {
                .Badge = badge,
                .Result = result,
                .Timestamp = timestamp
            }
            Return True
        End Function
    
    End Class
    Attached Files Attached Files

  11. #11
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    Try this. Tested now...

    Code:
    Dim lines() As String = IO.File.ReadAllLines(path)
    Dim distinctBadges() As String = lines.Select(Function(l) l.Substring(0, 13)).Distinct.ToArray
    
    Dim lastLines As New List(Of String)
    For Each db As String In distinctBadges
        For x As Integer = 0 To lines.GetUpperBound(0) - 1
            If lines(x).StartsWith(db) And Not lines(x + 1).StartsWith(db) Then
                lastLines.Add(lines(x))
            End If
        Next
    Next
    This is assuming you want the last entry for every distinct badge. Using your text file, i found 12 lines...
    Last edited by .paul.; Oct 24th, 2021 at 04:00 PM.

  12. #12

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    Thanks paul... this does work now. My issue now is, how do I deal with separate dates? In my attached file to dday9 has Badge# 0005269 at the top and bottom of the file with different dates. One ended in PASS and the other ended in FAIL. How can that be dealt with?

  13. #13
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    Quote Originally Posted by mikeg71 View Post
    Thanks paul... this does work now. My issue now is, how do I deal with separate dates?
    Can you give an example of the text file? Do you want to take just the latest last line for each badge number, or the last line for each date for each badge number?

    Edit... I see the multiple dates in the text file. I still need to know which lines to select
    Last edited by .paul.; Oct 24th, 2021 at 06:11 PM.

  14. #14

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    Yes, in the text file I will need the last line of each badge # for each day.

  15. #15
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,481

    Re: Filter Lines of Text File based on Value AND Date

    Try this. I found 13 distinct badges over 3 different dates...

    Code:
    Dim lines() As String = IO.File.ReadAllLines("C:\Users\Paul\Desktop\esd - Copy.txt")
    Dim distinctBadges() As String = lines.Select(Function(l) l.Substring(0, l.IndexOf(","c))).Distinct.ToArray
    Dim days = lines.GroupBy(Function(l) Date.ParseExact(l.Substring(l.IndexOf(":"c) + 2, 10), "MM-dd-yyyy", Nothing)).ToArray
    
    Dim lastLines As New List(Of String)
    
    For Each distinctDay In days
        Dim dayLines() As String = distinctDay.ToArray
        For Each db As String In distinctBadges
            For x As Integer = 0 To dayLines.GetUpperBound(0) - 1
                If dayLines(x).StartsWith(db) And Not dayLines(x + 1).StartsWith(db) Then
                    lastLines.Add(dayLines(x))
                End If
            Next
        Next
        If Not lastLines.Last = distinctDay.Last Then
            lastLines.Add(distinctDay.Last)
        End If
    Next

  16. #16

    Thread Starter
    Addicted Member
    Join Date
    Jan 2021
    Posts
    178

    Re: Filter Lines of Text File based on Value AND Date

    ahhh thanks paul... this seems to be working well. thanks for the help and for your time, I greatly appreciate it.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width