-
Jan 25th, 2007, 09:40 AM
#1
Thread Starter
New Member
[02/03] Problem Sorting 2 D Array
I have a log file that I am trying to parse and sort. Eventually I will print it out as XML. I have already read the file into memory and parsed it into a two-dimensional array so that I can sort it. The reason I have to parse it into a 2-D array is this:
07-01-15 00:00:00 00116: < RCPT TO:<[email protected]>
07-01-15 00:00:00 00120: Connection opened by pool-55-555-55-55. <blah>
07-01-15 00:00:00 00116: > 551 user does not exist
As you can see, the third set of numbers in the file is out of order:
00116
00120
00116
So what I have done is put those values into one column of the array and the full line into the other. I now want to sort on column 0 and then write the result back out to the file.
This is what the array would look like:
00116 | 07-01-15 00:00:00 00116: < RCPT TO:<[email protected]>
00120 | 07-01-15 00:00:00 00120: Connection opened by pool-55-555-55-5
00116 | 07-01-15 00:00:00 00116: > 551 user does not exist
The result would look like:
00116 | 07-01-15 00:00:00 00116: < RCPT TO:<[email protected]>
00116 | 07-01-15 00:00:00 00116: > 551 user does not exist
00120 | 07-01-15 00:00:00 00120: Connection opened by pool-55-555-55-5
The Code:
This function reads the file in and splits it into the 2-D array:
VB Code:
Private Function OrderFile(ByVal fileLocation As String)
    Dim fs As System.IO.FileStream
    Dim sr As StreamReader
    Try
        'We have our variables, let's attempt to open it
        fs = New System.IO.FileStream(fileLocation, _
            FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite)
    Catch
        'Make sure that it exists, and if it doesn't, display an error
        MessageBox.Show("Could not open the file. Make sure '" & fileLocation & _
            "' exists in the location shown.")
    Finally
        'It works, but we are going to try one last method
        If fs.CanRead = True Then
            'This is where we would read from the file
            txtFileDisplay.Text = "File opened!"
            'We are done with the file now, so close it
            fs.Close()
        End If
    End Try
    ''Create the StreamReader and associate the filestream with it
    Dim myReader As New System.IO.StreamReader(fileLocation)
    ''Read the entire text, and set it to a string
    Dim sFileContents As String = myReader.ReadToEnd()
    ''Print it to the textbox
    txtFileDisplay.Text = sFileContents
    ''Close everything when you are finished
    myReader.Close()
    ' Split file data into array of lines
    Dim textReader As New System.IO.StreamReader(fileLocation)
    Dim WholeLine As String
    Dim lines(,) As String
    Dim r As Integer
    Dim Count = 0
    WholeLine = textReader.ReadLine()
    While Not WholeLine.Length <= 0
        ReDim Preserve lines(1, Count)
        lines(0, Count) = WholeLine.Substring(18, 5)
        r = WholeLine.Length
        lines(1, Count) = WholeLine.Substring(0, r)
        Count = Count + 1
        WholeLine = textReader.ReadLine()
    End While
    Call QuickSort(lines, 0, 0, 0)
End Function
This is the QuickSort that sorts the 2-D array:
VB Code:
Sub QuickSort(ByVal vec, ByVal loBound, ByVal hiBound, ByVal SortField)
    '==--------------------------------------------------------==
    '== Sort a 2 dimensional array on SortField ==
    '== ==
    '== This procedure is adapted from the algorithm given in: ==
    '== ~ Data Abstractions & Structures using C++ by ~ ==
    '== ~ Mark Headington and David Riley, pg. 586 ~ ==
    '== Quicksort is the fastest array sorting routine for ==
    '== unordered arrays. Its big O is n log n ==
    '== ==
    '== Parameters: ==
    '== vec - array to be sorted ==
    '== SortField - The field to sort on (2nd dimension value) ==
    '== loBound and hiBound are simply the upper and lower ==
    '== bounds of the array's 1st dimension. It's probably ==
    '== easiest to use the LBound and UBound functions to ==
    '== set these. ==
    '==--------------------------------------------------------==
    Dim pivot(), loSwap, hiSwap, temp, counter
    ReDim pivot(UBound(vec, 2))
    loSwap = LBound(vec)
    hiSwap = UBound(vec, 2)
    '== Two items to sort
    If hiBound - loBound = 1 Then
        If vec(loBound, SortField) > vec(hiBound, SortField) Then Call SwapRows(vec, hiBound, loBound)
    End If
    '== Three or more items to sort
    For counter = 0 To UBound(vec, 2)
        pivot(counter) = vec(Int((loBound + hiBound) / 2), counter)
        vec(Int((loBound + hiBound) / 2), counter) = vec(loBound, counter)
        vec(loBound, counter) = pivot(counter)
    Next
    loSwap = loBound + 1
    hiSwap = hiBound
    Do
        '== Find the right loSwap
        While loSwap < hiSwap And vec(loSwap, Int(SortField)) <= pivot(SortField)
            loSwap = loSwap + 1
        End While
        '== Find the right hiSwap
        While vec(hiSwap, SortField) > pivot(Int(SortField))
            hiSwap = hiSwap - 1
        End While
        '== Swap values if loSwap is less than hiSwap
        If loSwap < hiSwap Then Call SwapRows(vec, loSwap, hiSwap)
    Loop While loSwap < hiSwap
    For counter = 0 To UBound(vec, 2)
        vec(loBound, counter) = vec(hiSwap, counter)
        vec(hiSwap, counter) = pivot(counter)
    Next
    '== Recursively call function .. the beauty of Quicksort
    '== 2 or more items in first section
    If loBound < (hiSwap - 1) Then Call QuickSort(vec, loBound, hiSwap - 1, Int(SortField))
    '== 2 or more items in second section
    If hiSwap + 1 < hiBound Then Call QuickSort(vec, hiSwap + 1, hiBound, Int(SortField))
    Call PrintArray(pivot, 0, 0)
End Sub 'QuickSort
SwapRows, used by QuickSort:
VB Code:
Sub SwapRows(ByVal ary, ByVal row1, ByVal row2)
    '== This proc swaps two rows of an array
    Dim x, tempvar
    For x = 0 To UBound(ary, 2)
        tempvar = ary(row1, x)
        ary(row1, x) = ary(row2, x)
        ary(row2, x) = tempvar
    Next
End Sub 'SwapRows
PrintArray, which prints the new array. This is where I am getting an error; it looks like the array being passed to this function isn't being created or filled correctly.
VB Code:
Sub PrintArray(ByVal vec, ByVal lo, ByVal hi)
    '==-----------------------------------------==
    '== Print out an array from the lo bound ==
    '== to the hi bound. ==
    '== ==
    '==-----------------------------------------==
    Dim sw As StreamWriter = File.CreateText("input.txt")
    Dim i, j
    lo = LBound(vec)
    hi = UBound(vec)
    For i = lo To hi
        For j = 0 To UBound(vec)
            sw.Write(vec(i, j))
        Next
    Next
    sw.Close()
End Sub 'PrintArray
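A reading of the code above suggests some likely causes, though nothing here has been run against the real log. The lines array is dimensioned as (1, Count), so its first index is the column and its second the row, while QuickSort and SwapRows index it as vec(row, field). OrderFile calls QuickSort(lines, 0, 0, 0), so loBound and hiBound are both 0. QuickSort then calls PrintArray(pivot, 0, 0) with the one-dimensional pivot array, which PrintArray indexes as vec(i, j); that mismatch is a plausible source of the error. The sketch below sidesteps the hand-written sort by using the Array.Sort(keys, items) overload, which sorts one array and reorders a second one to match. SortByConnectionId and the sample lines are made up for illustration, and the ID is assumed to sit at offset 18 as in the post.
VB Code:
```vbnet
Imports System

Module SortByIdSketch
    ' Hypothetical helper, not from the thread: sorts log lines in place
    ' by the 5-character ID that starts at offset 18.
    Sub SortByConnectionId(ByVal lines() As String)
        Dim keys(lines.Length - 1) As String
        Dim i As Integer
        For i = 0 To lines.Length - 1
            Dim id As String = ""
            If lines(i).Length >= 23 Then id = lines(i).Substring(18, 5)
            ' Appending the zero-padded original position keeps lines with
            ' the same ID in file order, because Array.Sort is not stable.
            keys(i) = id & i.ToString("D8")
        Next
        ' Sorts keys and rearranges lines to match.
        Array.Sort(keys, lines)
    End Sub

    Sub Main()
        Dim lines() As String = { _
            "07-01-15 00:00:00 00116: first", _
            "07-01-15 00:00:00 00120: second", _
            "07-01-15 00:00:00 00116: third"}
        SortByConnectionId(lines)
        Dim i As Integer
        For i = 0 To lines.Length - 1
            Console.WriteLine(lines(i))   ' first, third, second
        Next
    End Sub
End Module
```
Keeping the key in a separate one-dimensional array also avoids ReDim Preserve on every line read, which copies the whole array each time.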
-
Jan 25th, 2007, 12:01 PM
#2
Re: [02/03] Problem Sorting 2 D Array
You could save yourself a lot of code...
First off - I don't have VB installed, so the code sample will be in C#.
The code makes use of the System.Collections.IComparer interface to compare two values - in this case, the string arrays. To do the actual sorting, the Array.Sort method has an overload that accepts an IComparer.
The comparer:
C# Code:
using System.Collections;
/// <summary>
/// Can be used by the Array.Sort method to sort string array values by a specified index.
/// </summary>
public class StringArrayComparer: IComparer
{
    private int _compareIndex;
    /// <summary>
    /// Gets or sets the array index that will be used to sort.
    /// </summary>
    public int CompareIndex
    {
        get { return _compareIndex; }
        set { _compareIndex = value; }
    }
    /// <summary>
    /// Compares the values at the <see cref="CompareIndex"/> index.
    /// </summary>
    /// <remarks>Assumes the <paramref name="x" /> and <paramref name="y" /> parameters are string arrays.</remarks>
    /// <param name="x"></param>
    /// <param name="y"></param>
    /// <returns></returns>
    public int Compare(object x, object y)
    {
        string[] xValues = (string[]) x;
        string[] yValues = (string[]) y;
        return (xValues[this.CompareIndex].CompareTo(yValues[this.CompareIndex]));
    }
}
The test code:
C# Code:
using System;
/// <summary>
/// Summary description for Class1.
/// </summary>
class Class1
{
    /// <summary>
    /// The main entry point for the application.
    /// </summary>
    [STAThread]
    static void Main(string[] args)
    {
        string[][] values = new string[][]
        {
            new string[] {"0105", "Pom"},
            new string[] {"0015", "Foo"},
            new string[] {"0016", "Abc"},
            new string[] {"0012", "Lkj"},
            new string[] {"0001", "Tyu"}
        };
        StringArrayComparer comparer = new StringArrayComparer();
        comparer.CompareIndex = 0;
        Array.Sort(values, comparer);
        for (int i = 0; i < values.Length; i++)
        {
            Console.Write("Row [{0}]: ", i.ToString());
            for (int j = 0; j < values[i].Length; j++)
                Console.Write("{0}\t", values[i][j]);
            Console.Write("\n");
        }
        Console.WriteLine("\nDone!");
        Console.ReadLine();
    }
}
-
Jan 25th, 2007, 12:35 PM
#3
Re: [02/03] Problem Sorting 2 D Array
Maybe this isn't very accurate or even something you want to explore but why don't you put the numbers at the beginning of the string?? Load them into a list and use the Sort method that comes with it? You may want to check this out. I don't know if it would always be correct but maybe you could adapt it to help.
VB Code:
Dim myString As New List(Of String)
myString.Add("00120: 07-01-15 00:00:00: Connection opened by pool-55-555-55-55. <blah>")
myString.Add("00116: 07-01-15 00:00:00: > 551 user does not exist")
myString.Sort()
ListBox1.Items.AddRange(myString.ToArray)
-
Jan 25th, 2007, 12:50 PM
#4
Re: [02/03] Problem Sorting 2 D Array
With the text format you have in the text file and what you want to do, I think just a 1-d string array will get the job done.
Try this:
VB Code:
'Open the file & read all the contents
Dim reader As IO.StreamReader = IO.File.OpenText("path here")
Dim strContent As String = reader.ReadToEnd
reader.Close()
'Split the file contents by line to an array then sort it
Dim lines() As String = strContent.Split(ChrW(13))
Array.Sort(lines)
'Write back to the file
Dim writer As IO.StreamWriter = IO.File.CreateText("path here")
For i As Integer = 0 To lines.GetUpperBound(0)
    writer.WriteLine(lines(i))
Next
writer.Close()
-
Jan 25th, 2007, 12:56 PM
#5
Re: [02/03] Problem Sorting 2 D Array
 Originally Posted by stimbo
Maybe this isn't very accurate or even something you want to explore but why don't you put the numbers at the beginning of the string?? Load them into a list and use the Sort method that comes with it? You may want to check this out. I don't know if it would always be correct but maybe you could adapt it to help.
VB Code:
Dim myString As New List(Of String)
myString.Add("00120: 07-01-15 00:00:00: Connection opened by pool-55-555-55-55. <blah>")
myString.Add("00116: 07-01-15 00:00:00: > 551 user does not exist")
myString.Sort()
ListBox1.Items.AddRange(myString.ToArray)
To Stimbo:
Oops... He's using [02/03], so the generic List isn't an option.
To Pricejt:
Regarding Stimbo's code, you would use an ArrayList instead of a List in [02/03].
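For reference, a rough ArrayList version of stimbo's snippet might look like the following. It is only a sketch that assumes .NET 1.x, where ArrayList.Sort() orders strings with their default comparison; the module name and the console output are there just to make it self-contained.
VB Code:
```vbnet
Imports System
Imports System.Collections

Module ArrayListSketch
    Sub Main()
        ' ArrayList is the non-generic collection available in VB.NET 2002/2003.
        Dim items As New ArrayList
        items.Add("00120: 07-01-15 00:00:00: Connection opened")
        items.Add("00116: 07-01-15 00:00:00: > 551 user does not exist")
        ' Sorts using the elements' own IComparable implementation (String here).
        items.Sort()
        Dim i As Integer
        For i = 0 To items.Count - 1
            Console.WriteLine(CStr(items(i)))   ' the 00116 line, then the 00120 line
        Next
    End Sub
End Module
```
In a form, ListBox1.Items.AddRange(items.ToArray()) would fill the list box as in the original snippet.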
-
Jan 25th, 2007, 01:21 PM
#6
Thread Starter
New Member
Re: [02/03] Problem Sorting 2 D Array
Thank you guys for all your feedback. axion_sa: I am not very good with C#, so I don't know how easy that will be for me.
Stanav: I tried implementing your code but it didn't sort it. I am thinking that it's not splitting on the right part of the file.
stimbo: I don't know about your method. The log file I have probably has somewhere around 64,000 lines that I would have to loop through and add. But I don't know, maybe that is the best way.
-
Jan 25th, 2007, 02:43 PM
#7
Re: [02/03] Problem Sorting 2 D Array
Oops indeed, I didn't even see the 02/03.
If there are that many lines then IComparer would definitely be better.
There's a really nice example of it (IComparer) somewhere on this forum. Do a search for it and you will probably come across it. I think one of the moderators posted it. You should be able to adapt it from that.
-
Jan 25th, 2007, 02:47 PM
#8
Re: [02/03] Problem Sorting 2 D Array
 Originally Posted by pricejt
Stanav: I tried implementing your code but it didn't sort it. I am thinking that it's not splitting on the right part of the file.
That could be because the lines in your text file are terminated with a new line character instead of a carriage return character.
Try replacing this line
VB Code:
Dim lines() As String = strContent.Split(ChrW(13))
With these lines and see if it works
VB Code:
Dim seperators As Char() = {ChrW(10), ChrW(13)}
Dim lines() As String = strContent.Split(seperators)
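One caveat with splitting on both characters: when a file uses CR LF line endings, Split returns an empty string between each CR and LF, and those empty strings sort to the front of the array. .NET 1.x has no Split option to drop them, so one alternative is to let a reader do the line splitting, since ReadLine accepts CR, LF, or CR LF. The sketch below uses a hypothetical SplitLines helper and an in-memory sample string rather than the real log file.
VB Code:
```vbnet
Imports System
Imports System.Collections
Imports System.IO
Imports Microsoft.VisualBasic

Module SplitLinesSketch
    ' Hypothetical helper: splits text into non-empty lines,
    ' accepting CR, LF or CR LF as the line ending.
    Function SplitLines(ByVal text As String) As String()
        Dim result As New ArrayList
        Dim reader As New StringReader(text)
        Dim line As String = reader.ReadLine()
        While Not line Is Nothing
            If line.Length > 0 Then result.Add(line)
            line = reader.ReadLine()
        End While
        Return CType(result.ToArray(GetType(String)), String())
    End Function

    Sub Main()
        Dim text As String = "b" & vbCrLf & "a" & vbLf & "c" & vbCr
        Dim viaSplit() As String = text.Split(New Char() {ChrW(10), ChrW(13)})
        Dim viaReader() As String = SplitLines(text)
        Console.WriteLine(viaSplit.Length)    ' 5: includes two empty strings
        Console.WriteLine(viaReader.Length)   ' 3
    End Sub
End Module
```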
-
Jan 25th, 2007, 02:56 PM
#9
Thread Starter
New Member
Re: [02/03] Problem Sorting 2 D Array
Stanav: you are right, that worked. However, with the 64,000 lines it is very slow. Do you think that IComparer would be a lot faster than this way?
-
Jan 25th, 2007, 02:57 PM
#10
Re: [02/03] Problem Sorting 2 D Array
**sigh
http://www.kamalpatel.net/ConvertCSharp2VB.aspx
VB Code:
Imports System.Collections
Public Class StringArrayComparer
    Implements IComparer
    Private _compareIndex As Integer
    Public Property CompareIndex() As Integer
        Get
            Return _compareIndex
        End Get
        Set(ByVal Value As Integer)
            _compareIndex = Value
        End Set
    End Property
    'The Implements clause is required for Array.Sort to find this method
    Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
        Implements IComparer.Compare
        Dim xValues() As String = CType(x, String())
        Dim yValues() As String = CType(y, String())
        Return xValues(Me.CompareIndex).CompareTo(yValues(Me.CompareIndex))
    End Function
End Class
'----------------------------------------------------------------
' Converted from C# to VB .NET using CSharpToVBConverter(1.2).
' Developed by: Kamal Patel (http://www.KamalPatel.net)
'----------------------------------------------------------------
VB Code:
Class Class1
    <STAThread()> _
    Shared Sub Main(ByVal args() As String)
        'Jagged array: each row is its own String() so the comparer can cast it
        Dim values()() As String = New String()() { _
            New String() {"0105", "Pom"}, _
            New String() {"0015", "Foo"}, _
            New String() {"0016", "Abc"}, _
            New String() {"0012", "Lkj"}, _
            New String() {"0001", "Tyu"}}
        Dim comparer As StringArrayComparer = New StringArrayComparer()
        comparer.CompareIndex = 0
        Array.Sort(values, comparer)
        Dim i As Integer
        For i = 0 To values.Length - 1
            Console.Write("Row [{0}]: ", i.ToString())
            Dim j As Integer
            For j = 0 To values(i).Length - 1
                'VB has no "\t" escape, so append vbTab instead
                Console.Write("{0}" & vbTab, values(i)(j))
            Next
            Console.WriteLine()
        Next
        Console.WriteLine(vbLf & "Done!")
        Console.ReadLine()
    End Sub
End Class
'----------------------------------------------------------------
' Converted from C# to VB .NET using CSharpToVBConverter(1.2).
' Developed by: Kamal Patel (http://www.KamalPatel.net)
'----------------------------------------------------------------
-
Jan 25th, 2007, 03:30 PM
#11
Thread Starter
New Member
Re: [02/03] Problem Sorting 2 D Array
Ok Stanav: I take that back, it works really fast, but the problem is it doesn't sort it all correctly.
07-01-15 00:00:00 00116:
07-01-15 00:00:00 00116:
07-01-15 00:00:00 00120:
07-01-15 00:00:01 00109:
07-01-15 00:00:01 00109:
07-01-15 00:00:01 00116:
07-01-15 00:00:01 00116:
07-01-15 00:00:02 00109:
07-01-15 00:00:02 00109:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:02 00120:
07-01-15 00:00:05 00109:
07-01-15 00:00:05 00109:
07-01-15 00:00:05 00119:
07-01-15 00:00:05 00120:
07-01-15 00:00:06 00121:
07-01-15 00:00:07 00119:
07-01-15 00:00:07 00119:
07-01-15 00:00:07 00119:
07-01-15 00:00:07 00120:
07-01-15 00:00:07 00120:
07-01-15 00:00:07 00120:
You can see that it sorted them all, but I need it to ignore the second set of numbers (the "00:00:07") and just sort on the third set. Maybe concatenate the third number onto the front of the string, but I guess it would still compare the second number too.
E.g. 00109 is all over the place in this file.
Last edited by pricejt; Jan 25th, 2007 at 03:37 PM.
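Sorting only on the third field can also be done without moving it to the front of the string, by giving Array.Sort an IComparer that looks at nothing but that field. This is a minimal sketch rather than a tested solution: LogIdComparer is a made-up name, the ID is assumed to be the 5 characters at offset 18, and shorter lines are treated as having an empty ID. Array.Sort is not a stable sort, so lines that share an ID are not guaranteed to keep their original relative order.
VB Code:
```vbnet
Imports System
Imports System.Collections

' Hypothetical comparer: orders whole log lines by the 5-character ID at
' offset 18 and ignores everything else on the line.
Public Class LogIdComparer
    Implements IComparer

    ' Returns the ID field, or an empty string for lines too short to have one.
    Private Shared Function IdOf(ByVal line As String) As String
        If line Is Nothing OrElse line.Length < 23 Then Return ""
        Return line.Substring(18, 5)
    End Function

    Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
        Implements IComparer.Compare
        ' Ordinal comparison is enough because the IDs are fixed-width digits.
        Return String.CompareOrdinal(IdOf(CStr(x)), IdOf(CStr(y)))
    End Function
End Class

Module CompareOnIdSketch
    Sub Main()
        Dim lines() As String = { _
            "07-01-15 00:00:07 00120: c", _
            "07-01-15 00:00:01 00109: a", _
            "07-01-15 00:00:05 00119: b"}
        Array.Sort(lines, New LogIdComparer)
        Dim i As Integer
        For i = 0 To lines.Length - 1
            Console.WriteLine(lines(i))   ' the 00109 line, then 00119, then 00120
        Next
    End Sub
End Module
```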
-
Jan 25th, 2007, 03:42 PM
#12
Re: [02/03] Problem Sorting 2 D Array
 Originally Posted by pricejt
Ok Stanav: I take that back, it works really fast, but the problem is it doesn't sort it all correctly.
You can see that it sorted them all, but I need it to ignore the second set of numbers (the "00:00:07") and just sort on the third set. Maybe concatenate the third number onto the front of the string, but I guess it would still compare the second number too.
E.g. 00109 is all over the place in this file.
With a little manipulation, you can get it to sort the way you want... Otherwise, just give me a few minutes to fix it
-
Jan 25th, 2007, 04:42 PM
#13
Re: [02/03] Problem Sorting 2 D Array
Try this now... I didn't test it (because I don't have the text file), but it should work though.
VB Code:
Private Function Foo(ByVal strInputFilePath As String, Optional ByVal strOutputFilePath As String = "") As Boolean
    Try
        'Open the file & read all the contents
        Dim reader As IO.StreamReader = IO.File.OpenText(strInputFilePath)
        Dim strContent As String = reader.ReadToEnd
        reader.Close()
        'Split the file contents by line to an array then sort it
        Dim seperators As Char() = {Chr(10), Chr(13)}
        Dim temps() As String = strContent.Split(seperators)
        Dim strBldr As New System.Text.StringBuilder
        Dim i As Integer = 0
        For i = 0 To temps.GetUpperBound(0)
            If temps(i).Trim().Length >= 23 Then
                strBldr.Append(temps(i).Substring(18, 5) & temps(i) & ChrW(13))
            End If
        Next
        Dim lines() As String = strBldr.ToString.Split(seperators)
        Array.Sort(lines)
        'Write back to the file
        Dim outputPath As String = strInputFilePath
        If strOutputFilePath <> "" Then
            outputPath = strOutputFilePath
        End If
        Dim writer As IO.StreamWriter = IO.File.CreateText(outputPath)
        For i = 0 To lines.GetUpperBound(0)
            writer.WriteLine(lines(i).Substring(5))
        Next
        writer.Close()
    Catch ex As Exception
        MsgBox(ex.Message)
        Return False
    End Try
    Return True
End Function
Last edited by stanav; Jan 25th, 2007 at 04:48 PM.
-
Jan 25th, 2007, 04:56 PM
#14
Thread Starter
New Member
Re: [02/03] Problem Sorting 2 D Array
I got the original one working by just moving the third number to the front, but when we threw an even bigger file at it we had memory errors.
I then tried the function you just sent and it threw an error that length cannot be less than zero.
The guy I'm making this for also gave me a new requirement, haha. After it groups those third numbers together, it then has to order each group by the second number, which is a time code. Oh the joy.
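Two hedged observations on the problems above. In the function from post #13 every appended line ends with ChrW(13), so the second Split produces a trailing empty string, and calling Substring(5) on an empty string throws; that may be the error reported here. The memory errors are plausibly related to holding several copies of the file at once (the ReadToEnd string plus the arrays split from it). The sketch below reads line by line instead and sorts by ID, then by timestamp, then by original position, so lines with the same ID and time stay in file order. LogEntry, IdThenTimeComparer, and SortLog are hypothetical names, and the fixed offsets (timestamp in the first 17 characters, ID at offset 18) are assumed from the sample lines.
VB Code:
```vbnet
Imports System
Imports System.Collections
Imports System.IO
Imports Microsoft.VisualBasic

' Hypothetical types for illustration; none of these names come from the thread.
Public Class LogEntry
    Public Line As String
    Public Position As Integer   ' original line order, used as the last tie-breaker

    Public Sub New(ByVal lineText As String, ByVal originalIndex As Integer)
        Me.Line = lineText
        Me.Position = originalIndex
    End Sub
End Class

Public Class IdThenTimeComparer
    Implements IComparer

    Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
        Implements IComparer.Compare
        Dim a As LogEntry = CType(x, LogEntry)
        Dim b As LogEntry = CType(y, LogEntry)
        ' 1) ID: 5 characters at offset 18.
        Dim result As Integer = String.CompareOrdinal(a.Line, 18, b.Line, 18, 5)
        ' 2) Timestamp: the first 17 characters ("yy-MM-dd HH:mm:ss").
        If result = 0 Then result = String.CompareOrdinal(a.Line, 0, b.Line, 0, 17)
        ' 3) Original position, so equal ID and time keep file order.
        If result = 0 Then result = a.Position.CompareTo(b.Position)
        Return result
    End Function
End Class

Module SortLogSketch
    ' Reads one line at a time and skips lines too short to contain an ID
    ' (blank lines included), so no Substring call can run off the end.
    Sub SortLog(ByVal source As TextReader, ByVal target As TextWriter)
        Dim entries As New ArrayList
        Dim line As String = source.ReadLine()
        While Not line Is Nothing
            If line.Length >= 23 Then entries.Add(New LogEntry(line, entries.Count))
            line = source.ReadLine()
        End While
        entries.Sort(New IdThenTimeComparer)
        Dim i As Integer
        For i = 0 To entries.Count - 1
            target.WriteLine(CType(entries(i), LogEntry).Line)
        Next
    End Sub

    Sub Main()
        Dim text As String = _
            "07-01-15 00:00:01 00116: c" & vbCrLf & _
            "07-01-15 00:00:00 00120: d" & vbCrLf & _
            "07-01-15 00:00:00 00116: a" & vbCrLf & _
            "07-01-15 00:00:00 00116: b" & vbCrLf
        ' Writes the lines ending in a, b, c, d in that order.
        SortLog(New StringReader(text), Console.Out)
    End Sub
End Module
```
With real files, a StreamReader and a StreamWriter can be passed in place of the StringReader and Console.Out. This still keeps every line in memory once, so it reduces the duplication rather than removing the memory cost entirely.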