-
Nov 18th, 2011, 08:52 PM
#1
Thread Starter
Addicted Member
[RESOLVED] Fastest way to read in a large text file (4GB+)
Hi
I seem to be hitting a limit with the Get statement when reading a text file larger than 2GB in binary mode,
e.g. Get #ff, byte position/number, buffer
When the byte position exceeds roughly 2,147,000,000 I get the error "Bad record number". I presume this is because Get holds the byte number in something like a Long, which has a maximum value of 2,147,483,647.
Unfortunately, I need to be able to read large text files fast. Line Input is not up to the job because of its speed, and byte arrays seem to max out somewhere between 400MB and 500MB. Are there any other options?
-
Nov 19th, 2011, 12:14 AM
#2
Re: Fastest way to read in a large text file (4GB+)
I've been Googling around and can't find any reports of exceeding a limit with Get, although the documentation does state that the record number is a Long.
You could try the ADODB Stream object:
Code:
'
' Assumes a reference to Microsoft ActiveX Data Objects 2.8 Library
' Note: LoadFromFile loads the file contents into the Stream object,
' so this may not be practical for multi-gigabyte files
'
Dim st As ADODB.Stream
Dim strData As String
Set st = New ADODB.Stream
st.Type = adTypeText        'Text data
st.LineSeparator = adCRLF   'Line terminator is CRLF
st.Charset = "ascii"        'Character set is ASCII
st.Open
st.LoadFromFile "c:\MyApp\MyData.txt" ' Whatever your file name is
'
' Read each record
'
Do Until st.EOS
    strData = st.ReadText(adReadLine)
    '
    'etc
    '
Loop
st.Close
Set st = Nothing
-
Nov 19th, 2011, 12:39 AM
#3
Thread Starter
Addicted Member
Re: Fastest way to read in a large text file (4GB+)
Thanks for the reply Doogle. I just started googling and found this:
http://support.microsoft.com/kb/189981/en
I'll have a play around with it first, as it uses pure API calls without any dependencies and might be faster.
There are probably some functions in the FileSystemObject that can get around the 2GB limit as well,
e.g. http://support.microsoft.com/kb/186118
-
Nov 19th, 2011, 04:11 AM
#4
Re: Fastest way to read in a large text file (4GB+)
Have you seen this over in codebank?
-
Nov 19th, 2011, 07:37 AM
#5
Thread Starter
Addicted Member
Re: Fastest way to read in a large text file (4GB+)
Thanks for the link milk, that is how I ended up doing it (via API calls rather than FSO).
I used the CreateFile, WriteFile, ReadFile and CloseHandle APIs to get past the 2GB limit, and tested it with files up to 8GB without problems:
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
I found that I didn't need SetFilePointer: once I had set the byte array buffer size, WriteFile and ReadFile advanced the file position automatically after each call. Also, when writing large files, I specified both FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH in the dwFlagsAndAttributes argument of CreateFile rather than calling FlushFileBuffers, because of the performance hit MS notes in the documentation for FlushFileBuffers:
"Due to disk caching interactions within the system, the FlushFileBuffers function can be inefficient when used after every write to a disk drive device when many writes are being performed separately. If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write."
Last edited by Witis; Nov 19th, 2011 at 08:23 AM.
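For anyone landing here later, a minimal sketch of the sequential read described above might look like the following. This is not the exact code used in this thread: the function name CountBytes, the 1 MB chunk size and the use of FILE_FLAG_SEQUENTIAL_SCAN are assumptions made for illustration. Because ReadFile advances the file position itself, no SetFilePointer call is needed, and the running total is kept in a Currency so it can go past the Long limit. Note that FILE_FLAG_NO_BUFFERING, if you use it, requires read and write sizes that are multiples of the volume sector size, so it is left out here.

```vb
Option Explicit

Private Const GENERIC_READ As Long = &H80000000
Private Const FILE_SHARE_READ As Long = &H1
Private Const OPEN_EXISTING As Long = 3
Private Const FILE_FLAG_SEQUENTIAL_SCAN As Long = &H8000000
Private Const INVALID_HANDLE_VALUE As Long = -1

Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" ( _
    ByVal lpFileName As String, ByVal dwDesiredAccess As Long, _
    ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, _
    ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, _
    ByVal hTemplateFile As Long) As Long
Private Declare Function ReadFile Lib "kernel32" ( _
    ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, _
    lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
Private Declare Function CloseHandle Lib "kernel32" ( _
    ByVal hObject As Long) As Long

' Hypothetical helper: reads the file front to back in fixed-size chunks
' and returns the total number of bytes read.
Public Function CountBytes(ByVal sPath As String) As Currency
    Const CHUNK_SIZE As Long = 1048576      ' 1 MB per read (assumed size)
    Dim hFile As Long
    Dim buf() As Byte
    Dim nRead As Long
    Dim total As Currency                   ' Currency can exceed 2^31 - 1

    hFile = CreateFile(sPath, GENERIC_READ, FILE_SHARE_READ, 0&, _
                       OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0&)
    If hFile = INVALID_HANDLE_VALUE Then
        Err.Raise vbObjectError + 1, , "Could not open " & sPath
    End If

    ReDim buf(0 To CHUNK_SIZE - 1)
    Do
        ' ReadFile returns 0 on failure; nRead = 0 means end of file
        If ReadFile(hFile, buf(0), CHUNK_SIZE, nRead, 0&) = 0 Then Exit Do
        If nRead = 0 Then Exit Do
        ' Process buf(0) to buf(nRead - 1) here
        total = total + nRead
    Loop

    CloseHandle hFile
    CountBytes = total
End Function
```

Calling Debug.Print CountBytes("c:\MyApp\MyData.txt") from the Immediate window should print the file size in bytes if the whole file was read. Lines that straddle two chunks still have to be stitched together by the caller.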