Complex: Text File to Array and Get Min/Max Values
Hi All,
I'd like to ask for suggestions on how best to open and read a text file in the format below and extract the following values:
Max X Value
Min X Value
Max Y Value
Min Y Value
File Format Example is as follows: each file would have 10-100+ entries
[Header]
FunctionPlot=Parametric
X=85
Y=-340
Rotation=90
File=
X1=9211
Y2=2000
Param0=10
Param1=20
[Header]
FunctionPlot=Parametric
X=-122
Y=120
Rotation=180
File=
X1=20000
Y2=10000
Param0=10
Param1=20
I am currently experimenting with using an array to maintain performance and splitting into individual lines, and it's coming along but slowly. Was wondering if there are other code suggestions or ideas on how to best get these values that I may be overlooking...thanks for any insight!
Re: Complex: Text File to Array and Get Min/Max Values
From the looks of it you have to open the whole file and parse its contents to get what you want.
Re: Complex: Text File to Array and Get Min/Max Values
Do the X and Y values have to remain in pairs? Are you looking for the X and Y positions after they have been added to their X1 and Y2 values?
If the values do not have to remain in pairs, then reading in one X and one Y value at a time while keeping track of the MIN and MAX numbers is probably the fastest. This might look like:
Dim xMax As Long
Dim xMin As Long
Dim yMax As Long
Dim yMin As Long
Dim xRead As Long
Dim yRead As Long
Dim haveFirst As Boolean ' False until the first pair has been compared

Sub Main()
    ' Pseudocode: loop until the end of the file is reached
    While (not end of file)
        ReadXY
        CompareXY
    Wend
End Sub

Sub ReadXY()
    ' Read the next X and Y values into xRead and yRead
    ...
End Sub

Sub CompareXY()
    If Not haveFirst Then
        ' Seed min and max from the first pair; starting them at 0 would
        ' give wrong results for all-positive or all-negative data
        xMax = xRead: xMin = xRead
        yMax = yRead: yMin = yRead
        haveFirst = True
    Else
        If xRead > xMax Then xMax = xRead
        If xRead < xMin Then xMin = xRead
        If yRead > yMax Then yMax = yRead
        If yRead < yMin Then yMin = yRead
    End If
End Sub
If the values have to stay in pairs, then you would read each X and Y into an array of size two in the ReadXY() function. The CompareXY() function would then look for the lowest and highest pairs. For example, if xRead was larger than xMax but yRead was not larger than yMax, then the code would do nothing with that pair.
If you are looking for the X and Y min and max after adding them to X1 and Y2, then you should do that math in the read function, or in a separate function, before calling CompareXY().
Re: Complex: Text File to Array and Get Min/Max Values
Hi and thanks very much for the replies...! It looks like that is similar to what I am working through now...parsing the X/Y from the file into an array and manually comparing similar to your function call but man, it seems like such a "brute force" approach, I just didn't know if I was overlooking the obvious here.
Anyway, somewhere I read an intriguing suggestion: if you sort the array, the min and max values logically end up at its two ends. Sounds interesting, but I haven't had a chance to try it out yet! If anyone has any experience with this, please help! That way I could have a sorted XArray and a sorted YArray, then easily get the max/min values from each...looking forward to other perspectives, thanks again!
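A rough sketch of that sorting idea, for anyone who wants to try it. VB6 has no built-in array sort, so the hypothetical SortLongs helper below is a plain insertion sort written for illustration; the values 40 and 7 are made-up sample data alongside the 85 and -122 from the example file. After an ascending sort the minimum is the first element and the maximum is the last.

```vb
Option Explicit

' Hypothetical helper: in-place ascending insertion sort of a Long array.
Private Sub SortLongs(ByRef Values() As Long)
    Dim i As Long
    Dim j As Long
    Dim lngTemp As Long

    For i = LBound(Values) + 1 To UBound(Values)
        lngTemp = Values(i)
        j = i - 1
        ' VB6 does not short-circuit And, so the bounds check and the
        ' comparison are kept as two separate tests.
        Do While j >= LBound(Values)
            If Values(j) <= lngTemp Then Exit Do
            Values(j + 1) = Values(j)
            j = j - 1
        Loop
        Values(j + 1) = lngTemp
    Next i
End Sub

Public Sub DemoSortedMinMax()
    Dim lngX(0 To 3) As Long

    ' Sample data only
    lngX(0) = 85: lngX(1) = -122: lngX(2) = 40: lngX(3) = 7

    SortLongs lngX
    Debug.Print "Min X = " & lngX(LBound(lngX))
    Debug.Print "Max X = " & lngX(UBound(lngX))
End Sub
```

With the sample data this prints a minimum of -122 and a maximum of 85. If min and max are all you need, a single comparison pass does less work than a sort; sorting pays off mainly when you also want the values in order.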
Re: Complex: Text File to Array and Get Min/Max Values
For an ad-hoc data format like this pseudo-INI one you have there is really no getting around doing some manual parsing. What you do beyond that is another issue of course, and there are many options.
For a mere 100 records performance isn't really much of an issue no matter what approach you take. I think the choice really depends on what else you want to do with the data.
An array is the most primitive data structure in all of computing. Arrays are lightweight and very versatile, until you want to do anything very complex. Very quickly you find uses for hash tables and more sophisticated things such as a component that wraps a data structure with prebuilt and pretested logic to perform common functions.
The VB6 Collection is one of these. Windows includes a much more powerful example of this sort of thing though: the ADO Recordset. Since ADO 2.1 (1998 or so) it has been possible to use a client-side cursor Recordset as a kind of "Super Collection."
Constructing one of these is pretty simple:
Code:
Private Function FabricateRS() As ADODB.Recordset
    Set FabricateRS = New ADODB.Recordset
    With FabricateRS
        .CursorLocation = adUseClient
        With .Fields
            .Append "FunctionPlot", adVarWChar, 50
            .Append "X", adInteger
            .Append "Y", adInteger
            .Append "Rotation", adInteger
            .Append "File", adVarWChar, 255
            .Append "X1", adInteger
            .Append "Y2", adInteger
            .Append "Param0", adInteger
            .Append "Param1", adInteger
        End With
        .Open
        !X.Properties!Optimize = True
        !Y.Properties!Optimize = True
    End With
End Function
Optimizing the X and Y fields was done above since you wanted to sort and search on those fields. This causes the Recordset to maintain a high-speed index on those fields.
Loading the data is simple too:
Code:
Private Function FetchDataRS(ByVal FileName As String) As ADODB.Recordset
    Dim intFile As Integer
    Dim strLine As String
    Dim strParts() As String

    Set FetchDataRS = FabricateRS()
    intFile = FreeFile(0)
    Open FileName For Input As #intFile
    With FetchDataRS
        Do Until EOF(intFile)
            Line Input #intFile, strLine
            strLine = Trim$(strLine)
            If Len(strLine) > 0 Then
                If UCase$(strLine) = "[HEADER]" Then
                    .AddNew
                Else
                    strParts = Split(strLine, "=")
                    .Fields(strParts(0)).Value = strParts(1)
                End If
            End If
        Loop
        .MoveFirst
    End With
    Close #intFile
End Function
Then it is simply a matter of using these routines:
Code:
Log "Loading data"
Set rsData = FetchDataRS("data.txt")
With rsData
    Log "Data loaded: " & CStr(.RecordCount) & " records"
    Log
    .Sort = "X DESC"
    Log "Max X Value = " & CStr(!X.Value)
    .Sort = "X ASC"
    Log "Min X Value = " & CStr(!X.Value)
    .Sort = "Y DESC"
    Log "Max Y Value = " & CStr(!Y.Value)
    .Sort = "Y ASC"
    Log "Min Y Value = " & CStr(!Y.Value)
    .Close
End With
Set rsData = Nothing
Log
Log "Done!"
The advantages here only begin with the fact that we've parsed and stored all of the fields. We can also easily use the Find method as needed, and use data binding to display information in many ways (including via MSChart). We can also modify values, delete records, and add whole new records as required. It wouldn't take much to write the data back out to a similar disk file after updating.
Re: Complex: Text File to Array and Get Min/Max Values
dilettante, very interesting...I tried your code and of course it worked great. The only issue is that my file contents (and of course I didn't mention this!) are not always perfectly structured. For example, the first 20 or so lines contain additional header info and need to be skipped; sometimes an X= value is not written out, sometimes a Y= value is not written, and sometimes neither is (when X and/or Y is missing, it is assumed to be 0). That's why I was getting subscript out of range errors when testing the code. I tried modifying the code but was unable to get the results I wanted...is it because, just like with databases, the structure needs to be fully accounted for prior to the data load? If so, this solution may prove a little too rigid for my needs, but it damn sure is interesting and was very fast.
The more I think about it, the more I need a parser/function that just reads through the file and only pulls values where a [Header] section is located of course. Right now, I'm going to experiment with using an .INI function as an alternative to see if that might be practical. Thanks for any more suggestions and ideas anyone can offer!
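One way such a reader might look, as a minimal sketch rather than a tested solution: it ignores everything before the first [Header] line, treats a missing X= or Y= as 0, and skips any line that has no "=" in it. ScanMinMax, TrackPoint, and all parameter names are hypothetical, and the sketch assumes the preamble lines never contain a line reading exactly [Header].

```vb
Option Explicit

' Hypothetical helper: folds one (X, Y) record into the running min/max.
Private Sub TrackPoint(ByVal X As Long, ByVal Y As Long, _
        ByRef XMin As Long, ByRef XMax As Long, _
        ByRef YMin As Long, ByRef YMax As Long, _
        ByRef HaveAny As Boolean)
    If Not HaveAny Then
        ' The first record seeds all four values.
        XMin = X: XMax = X: YMin = Y: YMax = Y
        HaveAny = True
    Else
        If X < XMin Then XMin = X
        If X > XMax Then XMax = X
        If Y < YMin Then YMin = Y
        If Y > YMax Then YMax = Y
    End If
End Sub

' Hypothetical function: returns False when no [Header] section is found.
Public Function ScanMinMax(ByVal FileName As String, _
        ByRef XMin As Long, ByRef XMax As Long, _
        ByRef YMin As Long, ByRef YMax As Long) As Boolean
    Dim intFile As Integer
    Dim strLine As String
    Dim lngPos As Long
    Dim lngX As Long
    Dim lngY As Long
    Dim blnInRecord As Boolean
    Dim blnHaveAny As Boolean

    intFile = FreeFile
    Open FileName For Input As #intFile
    Do Until EOF(intFile)
        Line Input #intFile, strLine
        strLine = Trim$(strLine)
        If UCase$(strLine) = "[HEADER]" Then
            ' Finish the previous record before starting a new one.
            If blnInRecord Then TrackPoint lngX, lngY, XMin, XMax, YMin, YMax, blnHaveAny
            blnInRecord = True
            lngX = 0: lngY = 0   ' a missing X= or Y= counts as 0
        ElseIf blnInRecord Then
            lngPos = InStr(strLine, "=")
            If lngPos > 1 Then   ' lines without a key and "=" are ignored
                Select Case UCase$(Trim$(Left$(strLine, lngPos - 1)))
                    Case "X": lngX = CLng(Val(Mid$(strLine, lngPos + 1)))
                    Case "Y": lngY = CLng(Val(Mid$(strLine, lngPos + 1)))
                End Select
            End If
        End If
    Loop
    Close #intFile

    ' The last record has no following [Header] to trigger it.
    If blnInRecord Then TrackPoint lngX, lngY, XMin, XMax, YMin, YMax, blnHaveAny
    ScanMinMax = blnHaveAny
End Function

Public Sub DemoScan()
    Dim lngXMin As Long, lngXMax As Long
    Dim lngYMin As Long, lngYMax As Long

    If ScanMinMax("data.txt", lngXMin, lngXMax, lngYMin, lngYMax) Then
        Debug.Print "X: " & lngXMin & " to " & lngXMax
        Debug.Print "Y: " & lngYMin & " to " & lngYMax
    Else
        Debug.Print "No [Header] sections found"
    End If
End Sub
```

Because the key is compared as a whole, X1= and Y2= lines are not mistaken for X= and Y=. Nothing is kept in memory beyond the four running values, so this only suits the case where the min/max figures are all that is needed.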
Re: Complex: Text File to Array and Get Min/Max Values
Alright all, this is proving to be a lot more complex than originally anticipated. I have the entire file split into an array by vbCrLf. How can I then load that array into a new array that only has the X= or Y= values? Basically, how can I build another array out of an existing array (using a loop with an InStr check to see if the line is an X= line)?
If I can get another array say that has all of the X= values in there, then I have a function working that will get me both the min and max values...but I just can't feed it the right source data now...GRRR! So close!
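A possible sketch of building the second array from the first. ExtractValues and its parameter names are hypothetical; the idea is to size the result for the worst case once, fill it while looping over the source lines, then trim it with ReDim Preserve at the end. The sample lines in the demo are taken from the example file.

```vb
Option Explicit

' Hypothetical helper: returns the numeric value of every line in Source()
' that starts with Prefix (for example "X="). Found receives the number of
' matches; check it before reading the result.
Public Function ExtractValues(Source() As String, ByVal Prefix As String, _
        ByRef Found As Long) As Long()
    Dim lngResult() As Long
    Dim strLine As String
    Dim i As Long

    Found = 0
    ' Split("") yields an empty array (UBound = -1); nothing to do then.
    If UBound(Source) < LBound(Source) Then Exit Function

    ' Allocate for the worst case, then shrink once at the end.
    ReDim lngResult(0 To UBound(Source) - LBound(Source))
    For i = LBound(Source) To UBound(Source)
        strLine = Trim$(Source(i))
        ' Compare only the start of the line, so "X1=" never matches "X=".
        If StrComp(Left$(strLine, Len(Prefix)), Prefix, vbTextCompare) = 0 Then
            lngResult(Found) = CLng(Val(Mid$(strLine, Len(Prefix) + 1)))
            Found = Found + 1
        End If
    Next i

    If Found > 0 Then ReDim Preserve lngResult(0 To Found - 1)
    ExtractValues = lngResult
End Function

Public Sub DemoExtract()
    Dim asLines() As String
    Dim lngX() As Long
    Dim lngCount As Long

    asLines = Split("[Header]" & vbCrLf & "X=85" & vbCrLf & _
                    "Y=-340" & vbCrLf & "X1=9211", vbCrLf)
    lngX = ExtractValues(asLines, "X=", lngCount)
    If lngCount > 0 Then Debug.Print "First X = " & lngX(0)
End Sub
```

The same call with "Y=" as the prefix gives the Y values, and either result can be fed to a min/max routine that takes an array of Long.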
Re: Complex: Text File to Array and Get Min/Max Values
OK, I'm reverting back to an array and having success so far...it is not ideal, but performance-wise it is pretty good. I know there is a lot of room for improvement, but I wanted to share what I have so far:
Code:
Call FileLoadToArray(asFileContents, sFileName)
For lThisLine = 1 To UBound(asFileContents)
    ' Match "X=" only at the start of the line, since Mid$ below assumes that
    If Left$(asFileContents(lThisLine), 2) = "X=" Then
        xarrayvalues = xarrayvalues & "," & Mid$(asFileContents(lThisLine), 3)
    End If
Next
xarray = Split(xarrayvalues, ",")
' Sentinel start values: assumes every X lies between -50000 and 50000
lngMaxDistance = -50000
'Get the MAX X Value
For lngPosition = LBound(xarray) To UBound(xarray)
    If xarray(lngPosition) <> "" Then
        Debug.Print "XARRAY(lngPosition) is: " & xarray(lngPosition)
        If Val(xarray(lngPosition)) > lngMaxDistance Then
            Debug.Print xarray(lngPosition) & " is > " & lngMaxDistance
            lngMaxDistance = Val(xarray(lngPosition))
        End If
    End If
Next lngPosition
lngMinDistance = 50000
'Get the MIN X Value
For lngPosition = LBound(xarray) To UBound(xarray)
    If xarray(lngPosition) <> "" Then
        Debug.Print "XARRAY MIN lngPosition Read is: " & xarray(lngPosition)
        If Val(xarray(lngPosition)) < lngMinDistance Then
            Debug.Print xarray(lngPosition) & " is < " & lngMinDistance
            lngMinDistance = Val(xarray(lngPosition))
        End If
    End If
Next lngPosition
Debug.Print "**********************************"
Debug.Print "lngMaxDistance is " & lngMaxDistance
Debug.Print "lngMinDistance is " & lngMinDistance
Debug.Print "**********************************"
1 Attachment(s)
Re: Complex: Text File to Array and Get Min/Max Values
Well here is another version that is more aggressive about tolerating "garbage" in the file. It also forces in specific default values for missing items.
If you compare it with the first one it shouldn't be too hard to make sense of. This version is using a Variant containing an array of Variant arrays to pass the "specs" for the file and the resulting Recordset. Variants are usually avoided like The Plague because they have performance problems, but that shouldn't matter much as they're being used here.
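The attached version is not reproduced in this thread, so the following is only a guess at what "a Variant containing an array of Variant arrays" could look like, not the actual attachment. Every name here (BuildSpecs, FabricateFromSpecs, AddDefaultRecord) and the four-item layout of each inner array are assumptions, as are the default values. It needs a project reference to Microsoft ActiveX Data Objects and assumes the default Option Base 0.

```vb
Option Explicit

' Hypothetical spec layout: Array(field name, ADO type, size, default value).
' A size of 0 means "no size argument" (used for the integer fields).
Private Function BuildSpecs() As Variant
    BuildSpecs = Array( _
        Array("FunctionPlot", adVarWChar, 50, ""), _
        Array("X", adInteger, 0, 0), _
        Array("Y", adInteger, 0, 0), _
        Array("Rotation", adInteger, 0, 0))
End Function

' Builds an empty client-side Recordset with one field per spec entry.
Private Function FabricateFromSpecs(ByVal Specs As Variant) As ADODB.Recordset
    Dim rs As ADODB.Recordset
    Dim i As Long

    Set rs = New ADODB.Recordset
    rs.CursorLocation = adUseClient
    For i = LBound(Specs) To UBound(Specs)
        If Specs(i)(2) > 0 Then
            rs.Fields.Append Specs(i)(0), Specs(i)(1), Specs(i)(2)
        Else
            rs.Fields.Append Specs(i)(0), Specs(i)(1)
        End If
    Next i
    rs.Open
    Set FabricateFromSpecs = rs
End Function

' Starts a new record with every field preset to its default, so an
' X= or Y= line that never appears in the file leaves the default in place.
Private Sub AddDefaultRecord(ByVal rs As ADODB.Recordset, ByVal Specs As Variant)
    Dim i As Long

    rs.AddNew
    For i = LBound(Specs) To UBound(Specs)
        rs.Fields(Specs(i)(0)).Value = Specs(i)(3)
    Next i
End Sub
```

The appeal of a layout like this is that the field list and the defaults live in one place, so adapting the loader to a different file only means editing the spec array.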
If you had megabyte files to process, and tons of them, it might be worth the headache of writing and debugging a bunch of code to do more efficient parsing. Then it might even make sense to use arrays to hold the data.
The tradeoff to get seriously better performance is that you end up with a very large amount of intricate logic that can be difficult to modify for another purpose. Often you can end up in this position within a single project when you find that requirements have changed.
Anyway... try feeding one of your "nasty" files to this version and see what happens.
Re: Complex: Text File to Array and Get Min/Max Values
dilettante, thanks man for sticking with this, I'll try this solution over the next few days, I'll be busy enjoying a Michigan lake my friend...thanks for helping your homeslice out... :)
Re: Complex: Text File to Array and Get Min/Max Values
Hope you enjoy it. We've had an odd summer but the last few days have been decent, though not as warm as normal.
Don't spend a lot of time on my solution above if you have the other way working. It's just another way of approaching the problem.