Thread: Splitting a large file into smaller files

  1. #1

    Thread Starter
    Code Monkey wild_bill's Avatar
    Join Date
    Mar 2005
    Location
    Montana
    Posts
    2,993

    Splitting a large file into smaller files

    We had a process that split large files into smaller files by reading and writing line by line. This fixed the out-of-memory exceptions we were having, but it took hours to complete. I rewrote the method so that it is much faster and still doesn't cause memory exceptions.

    vb Code:
    Private Function SplitFile( _
        ByVal inputFileName As String, ByVal outputFileName As String, ByVal numberOfFiles As Integer) _
        As List(Of String)
        Dim returnList As New List(Of String)
        Try
            'strip only the trailing extension (String.Replace would hit every occurrence)
            Dim outputFileExtension As String = IO.Path.GetExtension(outputFileName)
            outputFileName = outputFileName.Substring(0, outputFileName.Length - outputFileExtension.Length)
            'read the raw stream directly; a StreamReader keeps its own buffer,
            'which goes stale whenever BaseStream.Position is moved
            Using fs As New IO.FileStream(inputFileName, IO.FileMode.Open, IO.FileAccess.Read)
                Dim fileLength As Long = fs.Length
                Dim baseBufferSize As Integer = CInt(fileLength \ numberOfFiles)
                Dim finished As Boolean = False
                Dim fileCount As Integer = 1
                Do Until finished
                    Dim originalPosition As Long = fs.Position
                    'find the first line feed after the base buffer length
                    '(byte 10 is the line feed in ASCII and UTF-8 text)
                    fs.Position += baseBufferSize
                    If fs.Position < fileLength Then
                        Dim b As Integer
                        Do
                            b = fs.ReadByte()
                        Loop Until b = 10 OrElse b = -1
                    End If
                    'the chunk ends just past that line feed, or at the end of the file
                    Dim chunkEnd As Long = Math.Min(fs.Position, fileLength)
                    Dim bufferSize As Integer = CInt(chunkEnd - originalPosition)
                    finished = (chunkEnd >= fileLength)
                    'read the chunk of data into a buffer in memory
                    fs.Position = originalPosition
                    Dim buffer(bufferSize - 1) As Byte
                    Dim bytesRead As Integer = 0
                    Do While bytesRead < bufferSize
                        Dim n As Integer = fs.Read(buffer, bytesRead, bufferSize - bytesRead)
                        If n = 0 Then Exit Do
                        bytesRead += n
                    Loop
                    'write the chunk of data to a file
                    Dim outputPath As String = outputFileName & fileCount.ToString & outputFileExtension
                    returnList.Add(outputPath)
                    My.Computer.FileSystem.WriteAllBytes(outputPath, buffer, False)
                    fileCount += 1
                Loop
            End Using
        Catch ex As Exception
            Console.Write(ex.ToString)
        End Try
        Return returnList
    End Function

  2. #2
    New Member
    Join Date
    Jan 2008
    Posts
    3

    Re: Splitting a large file into smaller files

    Thanks for sharing.
    Can you tell me how I can merge (join) the small files again after splitting?

  3. #3

    Thread Starter
    Code Monkey wild_bill's Avatar
    Join Date
    Mar 2005
    Location
    Montana
    Posts
    2,993

    Re: Splitting a large file into smaller files

    This should work
    vb Code:
    Private Sub MergeFiles(ByVal inputDir As String, ByVal inputMask As String, ByVal outputPath As String)

        'store files in datatable with their created times to sort by later
        Dim files As New DataTable
        files.Columns.Add("filepath", GetType(String))
        files.Columns.Add("creationtime", GetType(Date))
        'find partial files, skipping the output file in case it matches the mask
        Dim fullOutputPath As String = IO.Path.GetFullPath(outputPath)
        For Each f As String In IO.Directory.GetFiles(inputDir, inputMask)
            If Not String.Equals(IO.Path.GetFullPath(f), fullOutputPath, StringComparison.OrdinalIgnoreCase) Then
                files.Rows.Add(New Object() {f, IO.File.GetCreationTime(f)})
            End If
        Next

        'FileMode.Create replaces the output file if it already exists
        Using output As New IO.FileStream(outputPath, IO.FileMode.Create, IO.FileAccess.Write)
            'loop through files in order, and append the raw bytes so the encoding is left untouched
            For Each dr As DataRow In files.Select("", "creationtime")
                Dim contents As Byte() = IO.File.ReadAllBytes(CStr(dr("filepath")))
                output.Write(contents, 0, contents.Length)
            Next
        End Using

    End Sub
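
    Sorting by creation time relies on the parts having been written in order and not copied since. As an alternative, here is a rough sketch that orders the parts by the number at the end of their names, the way SplitFile names them. PartOrdering, PartNumber and CompareParts are hypothetical names made up for this example, and it assumes .NET 2.0 or later.

    ```vb
    Imports System

    Module PartOrdering
        'Reads the digits at the end of the file name, e.g. 12 for "C:\parts\big12.log"
        Function PartNumber(ByVal filePath As String) As Integer
            Dim name As String = IO.Path.GetFileNameWithoutExtension(filePath)
            Dim start As Integer = name.Length
            Do While start > 0 AndAlso Char.IsDigit(name(start - 1))
                start -= 1
            Loop
            If start = name.Length Then Return 0 'no numeric suffix found
            Return Integer.Parse(name.Substring(start))
        End Function

        'Compares two part paths by their numeric suffix, so big2.log sorts before big12.log
        Function CompareParts(ByVal a As String, ByVal b As String) As Integer
            Return PartNumber(a).CompareTo(PartNumber(b))
        End Function
    End Module
    ```

    To use it, something like Array.Sort(parts, New Comparison(Of String)(AddressOf CompareParts)) on the array returned by IO.Directory.GetFiles should put the parts in split order before appending them.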

  4. #4
    New Member
    Join Date
    Jan 2008
    Posts
    3

    Re: Splitting a large file into smaller files

    Thanks for sharing.
    I can't use the two functions, SplitFile and MergeFiles, in VB 7 (VS2003) because VS2003 does not support System.Collections.Generic.
    Can you give me an example of these functions that works in VS2003?

  5. #5

    Thread Starter
    Code Monkey wild_bill's Avatar
    Join Date
    Mar 2005
    Location
    Montana
    Posts
    2,993

    Re: Splitting a large file into smaller files

    Replace List(Of String) with System.Collections.Specialized.StringCollection.

    Also, make sure to replace the My.Computer.FileSystem calls with System.IO calls.
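
    Roughly, the two substitutions could look like the sketch below. Vs2003Helpers, WriteBytes and ExampleList are hypothetical names for illustration only. It sticks to Try/Finally because, as far as I know, the Using statement is not available in VS2003.

    ```vb
    Imports System.Collections.Specialized
    Imports System.IO

    Module Vs2003Helpers
        'Stand-in for My.Computer.FileSystem.WriteAllBytes(filePath, data, False)
        Sub WriteBytes(ByVal filePath As String, ByVal data() As Byte)
            Dim fs As New FileStream(filePath, FileMode.Create, FileAccess.Write)
            Try
                fs.Write(data, 0, data.Length)
            Finally
                'always release the file handle, even if the write fails
                fs.Close()
            End Try
        End Sub

        'StringCollection takes the place of List(Of String) as the return type
        Function ExampleList() As StringCollection
            Dim returnList As New StringCollection
            returnList.Add("part1.txt")
            Return returnList
        End Function
    End Module
    ```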

  6. #6
    Member
    Join Date
    Mar 2008
    Posts
    41

    Re: Splitting a large file into smaller files

    I need some help with this, if nobody minds the necromancy of an old thread. I'm trying to make it work with WinForms in VB 2010.

    Or, if anybody has a method for taking a HUGE (possibly gigabytes) log file and splitting it into 5 megabyte .txt files, that would help too. Thank you to all who read.
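
    One way to get a target size out of the SplitFile function in the first post is to work out how many parts to ask for. A minimal sketch, assuming a 5 MB target; ChunkMath and PartsForSize are hypothetical names made up for this example.

    ```vb
    Imports System

    Module ChunkMath
        'Number of parts to request so that fileLength \ parts is at most chunkBytes
        Function PartsForSize(ByVal fileLength As Long, ByVal chunkBytes As Long) As Integer
            If fileLength <= 0 Then Return 1
            'integer ceiling of fileLength / chunkBytes
            Return CInt((fileLength + chunkBytes - 1) \ chunkBytes)
        End Function
    End Module
    ```

    It would be called along the lines of SplitFile(inputPath, outputPath, PartsForSize(New IO.FileInfo(inputPath).Length, 5L * 1024 * 1024)). Each part can run slightly over the target because SplitFile extends it to the next line feed, and a small extra part may be written for any leftover bytes.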

  7. #7

    Thread Starter
    Code Monkey wild_bill's Avatar
    Join Date
    Mar 2005
    Location
    Montana
    Posts
    2,993

    Re: Splitting a large file into smaller files

    Did you try the first method?

  8. #8
    Member
    Join Date
    Mar 2008
    Posts
    41

    Re: Splitting a large file into smaller files

    Hi wild_bill. I've made considerable progress on this now, thank you. Now I'm playing with a progress meter. The background worker thread is fine and reporting, but it only reports the number of files it has written, not the percentage of the whole job. For example, if I break a 320 MB file into 5 MB chunks, the progress value only reaches 64 when it's completed, and a 100 MB file split into 10 MB chunks only reaches 10. So my progress bar is not much use yet, but I'll keep at it...
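
    A sketch of one way to turn that into a percentage, assuming the worker keeps a running total of the bytes it has written so far. ProgressMath, PercentDone and the parameter names are hypothetical.

    ```vb
    Imports System

    Module ProgressMath
        'Converts bytes written so far into a 0 to 100 value for BackgroundWorker.ReportProgress
        Function PercentDone(ByVal bytesWritten As Long, ByVal totalBytes As Long) As Integer
            If totalBytes <= 0 Then Return 100
            'multiply before dividing so the integer division does not round down to 0
            Return CInt(Math.Min(100L, (bytesWritten * 100L) \ totalBytes))
        End Function
    End Module
    ```

    Inside the split loop the worker would then call something like worker.ReportProgress(PercentDone(bytesWritten, fileLength)) after each part is written. With a 320 MB file and 5 MB chunks, that reports 50 after the 32nd part rather than 32.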

  9. #9
    New Member
    Join Date
    Dec 2023
    Posts
    1

    Re: Splitting a large file into smaller files

    Thanks for the code. I had corruption in the part files, so this is the code I used in the end:
    Code:
        'requires Imports System.IO and Imports System.Collections.Generic
        Private Function SplitFile( _
        ByVal inputFilePath As String, ByVal outputFolderPath As String, ByVal numberOfFiles As Integer) _
        As List(Of String)

            Dim returnList As New List(Of String)

            Try
                Dim inputFileExtension As String = Path.GetExtension(inputFilePath)
                Dim outputFileName As String = Path.GetFileNameWithoutExtension(inputFilePath)

                'read the raw bytes: a StreamReader buffers ahead and counts characters, not bytes
                Using fs As New FileStream(inputFilePath, FileMode.Open, FileAccess.Read)
                    Dim fileLength As Long = fs.Length
                    Dim baseBufferSize As Integer = CInt(fileLength \ numberOfFiles)
                    Dim finished As Boolean = False
                    Dim fileCount As Integer = 1

                    Do Until finished
                        Dim originalPosition As Long = fs.Position
                        fs.Position += baseBufferSize

                        'scan forward to the next line feed (byte 10) or the end of the file
                        If fs.Position < fileLength Then
                            Dim b As Integer
                            Do
                                b = fs.ReadByte()
                            Loop Until b = 10 OrElse b = -1
                        End If

                        'the chunk includes the line feed, so every part ends on a complete line
                        Dim chunkEnd As Long = Math.Min(fs.Position, fileLength)
                        Dim bufferSize As Integer = CInt(chunkEnd - originalPosition)
                        finished = (chunkEnd >= fileLength)

                        fs.Position = originalPosition
                        Dim buffer(bufferSize - 1) As Byte
                        Dim bytesRead As Integer = 0
                        Do While bytesRead < bufferSize
                            Dim n As Integer = fs.Read(buffer, bytesRead, bufferSize - bytesRead)
                            If n = 0 Then Exit Do
                            bytesRead += n
                        Loop

                        'Path.Combine adds the directory separator only when it is needed
                        Dim outputPath As String = Path.Combine(outputFolderPath, outputFileName & "_" & fileCount.ToString() & inputFileExtension)
                        returnList.Add(outputPath)
                        File.WriteAllBytes(outputPath, buffer)

                        fileCount += 1
                    Loop
                End Using

            Catch ex As Exception
                Console.Write(ex.ToString)
            End Try

            Return returnList
        End Function
