-
Sorting a large text file ..... waves to Hack :)
Based on the other thread with the same name, which is marked as resolved ...
I'm now using this Quicksort routine to sort a text file that's nearly 2 MB in size.
It contains fixed-length records.
It worked before, but now that the file is bigger I am getting "Input past end of file" error messages on the line of code marked below.
Anyone know why ?
VB Code:
' Now sort file into alphabetical order
Open "C:\att_value.txt" For Input As #1
str1 = Split(Input$(LOF(1), 1), vbCrLf)   ' <-- "Input past end of file" is raised on this line
Close #1
QuickSort str1, LBound(str1), UBound(str1)
Open "C:\att_value.txt" For Output As #1
Print #1, Join(str1, vbCrLf)
Close #1
-
Re: Sorting a large text file
It's OK, I found out why.
I had a blank line at the end of the file ! Doh !!!
-
Re: Sorting a large text file
More annoyingly ... I still get the problem ... and it's starting to drive me mad !!!
Why does the Split command give me end of file errors ?!!?!
If I edit the file, press backspace to remove the offending line, save it, and then run my code .. it works fine.
If I then edit the file, go to the end and press carriage return, in effect putting the blank line back, save it, then run my code ... it works ?!?!!??!?!?!?!?!!!
Can someone help .... I'm fast running out of ideas here :(
-
Re: Sorting a large text file ..... waves to Hack :)
It sounds like your sort routine is putting an EOF marker in your file.
Does your sort routine check for BOF and EOF?
-
Re: Sorting a large text file ..... waves to Hack :)
I'm using these 2 routines :
VB Code:
Private Sub QuickSort(C() As String, ByVal First As Long, ByVal Last As Long)
    Dim Low As Long, High As Long
    Dim MidValue As String

    Low = First
    High = Last
    MidValue = C((First + Last) \ 2)

    Do
        While C(Low) < MidValue
            Low = Low + 1
        Wend
        While C(High) > MidValue
            High = High - 1
        Wend
        If Low <= High Then
            Swap C(Low), C(High)
            Low = Low + 1
            High = High - 1
        End If
    Loop While Low <= High

    If First < High Then QuickSort C, First, High
    If Low < Last Then QuickSort C, Low, Last
End Sub

Private Sub Swap(ByRef A As String, ByRef B As String)
    Dim T As String
    T = A
    A = B
    B = T
End Sub
And then sorting the fields thus :
VB Code:
' Now sort file into alphabetical order
Open "C:\att_value.txt" For Input As #1
str1 = Split(Input$(LOF(1), 1), vbCrLf)
Close #1
QuickSort str1, LBound(str1), UBound(str1)
Open "C:\att_value.txt" For Output As #1
Print #1, Join(str1, vbCrLf)
Close #1
-
Re: Sorting a large text file ..... waves to Hack :)
Join(str1, vbCrLf) is putting vbCrLf at the right end of the string. If you don't want that blank line, join to a string variable, delete the last 2 characters (take Left$() of the string with the length of the string - 2) and Print the string variable.
-
Re: Sorting a large text file ..... waves to Hack :)
It's not getting as far as the "Join" statement. It's falling over on the "Split" statement, with "Input past end of file".
-
Re: Sorting a large text file ..... waves to Hack :)
Alternatively, as the offending line is always the last line, is there a quick way of opening a file, and then doing the equivalent of a <CTRL/END> and a <BACKSPACE>, which would in effect delete the line ?
-
Re: Sorting a large text file ..... waves to Hack :)
That error usually occurs when the file has null characters in it. Try loading the file by opening it for Binary and placing it in a buffer - see the file threads in the FAQ for a code sample.
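A minimal sketch of the binary-read suggestion, assuming the whole file fits comfortably in one string (it is about 2 MB here). `ReadWholeFile` is a hypothetical helper name, not the FAQ's code; treat it as one possible way to do it.

```vb
' Hypothetical helper (not from the FAQ): reads an entire file into one string.
' Binary mode returns the bytes as they are stored, so control characters in
' the data do not end the read early the way they can with Input$.
Private Function ReadWholeFile(ByVal Path As String) As String
    Dim ff As Integer
    Dim buf As String

    ff = FreeFile                       ' ask VB for an unused file number
    Open Path For Binary Access Read As #ff
    If LOF(ff) > 0 Then
        buf = Space$(LOF(ff))           ' size the buffer to the file length in bytes
        Get #ff, , buf                  ' fill the buffer in a single read
    End If
    Close #ff

    ReadWholeFile = buf
End Function
```

With that in place, the Split line from the first post would become `str1 = Split(ReadWholeFile("C:\att_value.txt"), vbCrLf)`.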
-
Re: Sorting a large text file ..... waves to Hack :)
Or fix the file in Notepad - backspace out the last, blank, line. Then the program should work from then on. The problem is that, when the program DOES get past the Join line, and saves the sorted file, it outputs that last blank line. From then on, you can't get past the Input function.
-
Re: Sorting a large text file ..... waves to Hack :)
Quote:
Originally Posted by Al42
Join(str1, vbCrLf) is putting vbCrLf at the right end of the string.
The Join() function is not putting vbCrLf at the end of the string. The Print statement is adding vbCrLf to the file after outputting the joined string. To avoid this, use a semicolon at the end of the statement.
VB Code:
Print #1, Join(str1, vbCrLf);
-
Re: Sorting a large text file ..... waves to Hack :)
Bushmobile I think you solved it !
Opening the file in binary before performing the split certainly stops an "end of file" error being thrown up, and the sort works fine.
The only "slight niggle" I am left with is that I now have a blank line at the top of the file rather than at the bottom, no doubt due to the sort.
Not sure if this is going to annoy the import routines that pick this file up, but I can't try them out until later.
Logophobic, I tried your method of suffixing a semicolon to try and prevent the blank line appearing, but that didn't seem to make any difference.
Anyway ... cheers guys. At least I'm not crashing now :)
-
Re: Sorting a large text file ..... waves to Hack :)
Hey TheBionicOrange, Happy New Year :wave:
As the result is in an Array (I think), then can you loop through the Array and only output (Print to file) anything other than a blank line?
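Roughly what that loop could look like, as a sketch only. `WriteNonBlankLines` is a made-up name, and it assumes the array passed in is the `str1` array produced by Split.

```vb
' Hypothetical helper: writes every non-empty element of Lines() to Path,
' one record per line, skipping any blank elements.
Private Sub WriteNonBlankLines(ByVal Path As String, Lines() As String)
    Dim ff As Integer
    Dim i As Long

    ff = FreeFile
    Open Path For Output As #ff
    For i = LBound(Lines) To UBound(Lines)
        If Len(Lines(i)) > 0 Then
            Print #ff, Lines(i)         ' Print appends vbCrLf after each record
        End If
    Next i
    Close #ff
End Sub
```

Note that Print still ends the last record with vbCrLf, so a later Split of the file will again produce one empty final element; the same check skips it on the next run.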
-
Re: Sorting a large text file ..... waves to Hack :)
Hey Bruce :wave: happy new year to you too !
As for your second sentence ... ummm .... eh ? ha ha
If it helps .. I don't actually print any blank lines to my file. There just seemed to be one at the end when I edited it, although strangely it's not shown if I throw the file into MS Word, only Notepad or WordPad ?! I presume Word must strip out the null (or whatever it is) at the end automatically.
-
Re: Sorting a large text file ..... waves to Hack :)
$10 it's a 'spare' Carriage return (during the Join).
To test, try removing the last Array element - ReDim Preserve str1(Ubound(str1) - 1)
-
Re: Sorting a large text file ..... waves to Hack :)
You owe me $10 ... although I would rather have it in pounds :)
The "blank line" is in the file BEFORE I even get to the join statement.
It's in there before I even get as far as the SPLIT statement.
-
Re: Sorting a large text file ..... waves to Hack :)
Why do you think I mentioned dollars - and Aussie Dollars too. That equates to 5 pounds ;)
If the blank line is in the base document, then (noting you're inputting as Binary) can you loop through the Array and remove it prior to the Quicksort?
-
Re: Sorting a large text file ..... waves to Hack :)
Trying that now ..... (takes about 15 minutes to extract).
Watch this space ...
-
Re: Sorting a large text file ..... waves to Hack :)
Bu**er ! Looks like I need to take a hammer to my piggy bank !
You were quite right.
ReDim Preserve str1(Ubound(str1) - 1)
.. did indeed remove the blank line.
Bruce ... thank you very much for the effort. Also to everyone else who contributed to this thread.
You have no idea how annoyed I was at the flaming line yesterday !
P.S. I did try to add to your rating, but apparently I have to "spread it around a bit" first. Personally I think that's a pretty stupid rule !
-
Re: Sorting a large text file ..... waves to Hack :)
No problem. Good to see you back on the board.
However, will that 'extra' vbCrLf always be present? In any case you could check for the blank line (in the last element) before ReDimming the Array.
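Something along these lines would do the check, as a sketch. `DropTrailingBlank` is a hypothetical name, and it assumes `str1` is declared as a dynamic array (`Dim str1() As String`) so that it can be ReDimmed.

```vb
' Hypothetical helper: removes the last element of Lines() only when it is
' empty, so a file without the trailing vbCrLf does not lose a real record.
Private Sub DropTrailingBlank(Lines() As String)
    Dim last As Long

    last = UBound(Lines)
    If last > LBound(Lines) Then            ' keep at least one element
        If Len(Lines(last)) = 0 Then
            ReDim Preserve Lines(LBound(Lines) To last - 1)
        End If
    End If
End Sub
```

Call it as `DropTrailingBlank str1` after the Split and before the QuickSort, so the empty element never sorts to the top of the file.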
-
Re: Sorting a large text file ..... waves to Hack :)
Yes it almost inevitably will be, but putting the check in wouldn't hurt.
Thanks again for your help ... top man :)
Steve.
-
Re: Sorting a large text file ..... waves to Hack :)
I have to ask. Do you want to improve speed at all (is 15 minutes acceptable)?
-
Re: Sorting a large text file ..... waves to Hack :)
No .. 15 minutes in my book is NOT acceptable. Unfortunately, due to the crappy system we have here, that's about as fast as it's going to get.
It's an ALPHA box running VMS. I'm tapping into it via middleware called "Attunity", which is not the most astounding piece of software I have ever had to deal with.
We are in a transitional period. One day (probably when I am retired) we will eventually be on Oracle, or at least something that's not made of wood !
It only takes 15 minutes at this time of day because of the number of users on the system. If I run it at 5pm it takes about 5 minutes. Our Operations Department will be running it on the weekend, when that time will go down to about 2-3 minutes, which is much more acceptable.
-
Re: Sorting a large text file ..... waves to Hack :)
Ah, so you use an abacus too.
Anyhoo, all the best. ;)