
Thread: [RESOLVED] Fastest way to read in a large text file (4GB+)

  1. #1

    Thread Starter
    Addicted Member Witis
    Join Date
    Jan 2011
    Location
    VB Forums Online Freedom Mode: Operational
    Posts
    213

    [RESOLVED] Fastest way to read in a large text file (4GB+)

    Hi

    I seem to be hitting a limit on the Get statement when reading a text file larger than 2GB in binary mode,

    e.g. Get #ff, byte position/number, buffer

    When the byte position is greater than 2,147,483,647 I get the error "Bad record number". I presume this is because Get holds the byte position in something like a Long, which has a maximum value of 2,147,483,647.

    Unfortunately, I need to read large text files fast. Line Input is not up to the job because it is too slow, and byte arrays seem to max out somewhere between 400MB and 500MB. Are there any other options?
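
    The limit described above can be checked with a little arithmetic. This is only a sketch: the procedure name ShowLongLimit is made up for illustration, and Currency is used simply because it is a 64-bit type available in VB6, not because Get accepts it.

    ```vb
    Option Explicit

    ' Hypothetical demo procedure (not from the thread): shows why a Long
    ' cannot address every byte of a 4GB file.
    Public Sub ShowLongLimit()
        Dim maxPos As Long
        maxPos = &H7FFFFFFF          ' largest Long: 2^31 - 1 = 2147483647
        Debug.Print maxPos

        ' Currency is a 64-bit scaled integer, so it can hold a 4GB offset
        ' that would overflow a Long.
        Dim bigPos As Currency
        bigPos = 4294967296@         ' 2^32 bytes = 4 GiB
        Debug.Print bigPos
    End Sub
    ```

    Any byte position beyond the first value does not fit in a Long, which is consistent with the error appearing just past the 2GB mark.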

  2. #2
    PowerPoster
    Join Date
    Jul 2006
    Location
    Maldon, Essex. UK
    Posts
    6,334

    Re: Fastest way to read in a large text file (4GB+)

    I've been Googling around and can't find any reports of hitting a limit with Get, although the documentation does state that the record number is a Long.

    You could try the ADODB Stream Object
    Code:
    '
    ' Assumes a reference to Microsoft ActiveX Data Objects 2.8 Library
    '
    Dim st As ADODB.Stream
    Dim strData As String
    Set st = New ADODB.Stream
    st.Type = adTypeText        'Text Data
    st.LineSeparator = adCRLF     'Line terminator is CRLF
    st.Charset = "ascii"        'Character set is ASCII
    st.Open
    st.LoadFromFile "c:\MyApp\MyData.txt"    ' Whatever your file name is
    '
    ' Read each record
    '
    Do Until st.EOS
        strData = st.ReadText(adReadLine)
    '
    '
    'etc
    '
    Loop
    st.Close
    Set st = Nothing

  3. #3

    Thread Starter
    Addicted Member Witis
    Join Date
    Jan 2011
    Location
    VB Forums Online Freedom Mode: Operational
    Posts
    213

    Re: Fastest way to read in a large text file (4GB+)

    Thanks for the reply Doogle. I just started googling and found this:
    http://support.microsoft.com/kb/189981/en
    I'll have a play around with it first, as it uses pure API calls without any dependency and might be faster.
    There are probably some functions in the FileSystemObject that might also be able to get around the 2GB limit,
    e.g. http://support.microsoft.com/kb/186118
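
    For comparison, a minimal sketch of the FileSystemObject route mentioned above. The helper name CountLines is made up for illustration, and whether this gets past the 2GB limit (or is fast enough) is untested here and remains an assumption.

    ```vb
    Option Explicit

    ' Hypothetical helper (not from the thread): reads a text file line by
    ' line through a late-bound Scripting.FileSystemObject TextStream.
    Public Function CountLines(ByVal path As String) As Currency
        Const ForReading As Long = 1
        Dim fso As Object
        Dim ts As Object
        Dim strLine As String

        Set fso = CreateObject("Scripting.FileSystemObject")
        Set ts = fso.OpenTextFile(path, ForReading)

        Do Until ts.AtEndOfStream
            strLine = ts.ReadLine            ' process each line here
            CountLines = CountLines + 1      ' Currency avoids Long overflow
        Loop

        ts.Close
    End Function
    ```

    Late binding via CreateObject avoids adding a project reference, at the cost of slower calls than an early-bound reference to Microsoft Scripting Runtime.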

  4. #4
    Cumbrian Milk
    Join Date
    Jan 2007
    Location
    0xDEADBEEF
    Posts
    2,448

    Re: Fastest way to read in a large text file (4GB+)

    Have you seen this over in codebank?

  5. #5

    Thread Starter
    Addicted Member Witis
    Join Date
    Jan 2011
    Location
    VB Forums Online Freedom Mode: Operational
    Posts
    213

    Re: Fastest way to read in a large text file (4GB+)

    Thanks for the link milk, that is how I ended up doing it (via API calls rather than FSO).

    I used the CreateFile, WriteFile, ReadFile and CloseHandle APIs to get past the 2GB limit, and tested with files up to 8GB without problems:
    http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
    http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
    http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx
    http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx

    I found that I didn't need SetFilePointer: once I had set the size of the byte array buffer, WriteFile and ReadFile took care of the file position for me automatically. Also, when writing large files I specified FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH in the dwFlagsAndAttributes argument of CreateFile in preference to using FlushFileBuffers, because of the performance hit noted by MS in the details regarding the FlushFileBuffers function:

    "Due to disk caching interactions within the system, the FlushFileBuffers function can be inefficient when used after every write to a disk drive device when many writes are being performed separately. If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write."
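
    A minimal sketch of the sequential read described above, not the exact code used. The helper name CountBytes and the 1MB chunk size are assumptions. It opens the file with ordinary buffered I/O rather than FILE_FLAG_NO_BUFFERING, because unbuffered I/O additionally requires sector-aligned buffers and transfer sizes.

    ```vb
    Option Explicit

    Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" ( _
        ByVal lpFileName As String, ByVal dwDesiredAccess As Long, _
        ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, _
        ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, _
        ByVal hTemplateFile As Long) As Long
    Private Declare Function ReadFile Lib "kernel32" ( _
        ByVal hFile As Long, lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, _
        lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
    Private Declare Function CloseHandle Lib "kernel32" ( _
        ByVal hObject As Long) As Long

    Private Const GENERIC_READ As Long = &H80000000
    Private Const FILE_SHARE_READ As Long = &H1
    Private Const OPEN_EXISTING As Long = 3
    Private Const FILE_ATTRIBUTE_NORMAL As Long = &H80
    Private Const INVALID_HANDLE_VALUE As Long = -1

    ' Hypothetical helper (not from the thread): reads a file in fixed-size
    ' chunks and returns the total number of bytes read.
    Public Function CountBytes(ByVal path As String) As Currency
        Const CHUNK As Long = 1048576        ' 1MB per read (assumed size)
        Dim hFile As Long
        Dim got As Long
        Dim buf() As Byte

        ReDim buf(0 To CHUNK - 1)
        hFile = CreateFile(path, GENERIC_READ, FILE_SHARE_READ, 0&, _
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0&)
        If hFile = INVALID_HANDLE_VALUE Then
            Err.Raise vbObjectError + 1, , "CreateFile failed"
        End If

        Do
            ' Each call continues from where the previous one stopped,
            ' so no SetFilePointer call is needed for a sequential read.
            If ReadFile(hFile, buf(0), CHUNK, got, 0&) = 0 Then Exit Do
            If got = 0 Then Exit Do          ' end of file
            ' buf(0) to buf(got - 1) hold this chunk; process it here.
            CountBytes = CountBytes + got    ' Currency holds totals over 2GB
        Loop

        CloseHandle hFile
    End Function
    ```

    The running total is kept in a Currency because a Long would overflow once more than 2,147,483,647 bytes have been read. When writing, the two flags quoted above would be combined with Or in the dwFlagsAndAttributes argument.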
    Last edited by Witis; Nov 19th, 2011 at 08:23 AM.
