-
Jan 8th, 2008, 05:17 PM
#1
Splitting a large file into smaller files
We had a process that split large files into smaller files by reading and writing line by line. This fixed the out of memory exceptions we were having, but it took hours to complete. I rewrote the method to read and write one large chunk per output file, ending each chunk at a line feed, so it is much faster and still doesn't cause memory exceptions.
vb Code:
Private Function SplitFile( _
        ByVal inputFileName As String, ByVal outputFileName As String, ByVal numberOfFiles As Integer) _
        As List(Of String)
    Dim returnList As New List(Of String)
    Try
        'split the output name so the part number can go before the extension
        Dim outputFileExtension As String = IO.Path.GetExtension(outputFileName)
        outputFileName = IO.Path.ChangeExtension(outputFileName, Nothing)
        'use a FileStream rather than a StreamReader: a StreamReader buffers ahead,
        'so its reads go stale once BaseStream.Position is moved
        Using fs As New IO.FileStream(inputFileName, IO.FileMode.Open, IO.FileAccess.Read)
            Dim fileLength As Long = fs.Length
            Dim baseBufferSize As Integer = CInt(fileLength \ numberOfFiles)
            Dim finished As Boolean = False
            Dim fileCount As Integer = 1
            Do Until finished
                Dim originalPosition As Long = fs.Position
                'find the first line feed after the base buffer length
                fs.Position = originalPosition + baseBufferSize
                Do While fs.Position < fileLength
                    If fs.ReadByte() = 10 Then Exit Do
                Loop
                'the chunk ends just after that line feed, or at the end of the file
                Dim endPosition As Long = Math.Min(fs.Position, fileLength)
                Dim bufferSize As Integer = CInt(endPosition - originalPosition)
                finished = (endPosition >= fileLength)
                'read the chunk of data into a buffer in memory
                fs.Position = originalPosition
                Dim buffer(bufferSize - 1) As Byte
                Dim bytesRead As Integer = 0
                Do While bytesRead < bufferSize
                    Dim n As Integer = fs.Read(buffer, bytesRead, bufferSize - bytesRead)
                    If n = 0 Then Exit Do
                    bytesRead += n
                Loop
                'write the chunk of data to a file
                Dim outputPath As String = outputFileName & fileCount.ToString & outputFileExtension
                returnList.Add(outputPath)
                My.Computer.FileSystem.WriteAllBytes(outputPath, buffer, False)
                fileCount += 1
            Loop
        End Using
    Catch ex As Exception
        Console.Write(ex.ToString)
    End Try
    Return returnList
End Function
-
Jan 14th, 2008, 01:10 AM
#2
New Member
Re: Splitting a large file into smaller files
Thanks for sharing.
Can you tell me how I can merge (join) the small files after splitting?
-
Jan 14th, 2008, 10:43 AM
#3
Re: Splitting a large file into smaller files
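This should work. It appends the parts in order of creation time, so it assumes they are still the files SplitFile wrote, in the order it wrote them.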
vb Code:
Private Sub MergeFiles(ByVal inputDir As String, ByVal inputMask As String, ByVal outputPath As String)
    'make sure output file does not exist before searching, so it cannot match the mask
    If IO.File.Exists(outputPath) Then
        IO.File.Delete(outputPath)
    End If
    'store files in datatable with their created times to sort by later
    Dim files As New DataTable
    files.Columns.Add("filepath", GetType(String))
    files.Columns.Add("creationtime", GetType(Date))
    'find partial files
    For Each f As String In IO.Directory.GetFiles(inputDir, inputMask)
        files.Rows.Add(New Object() {f, IO.File.GetCreationTime(f)})
    Next
    'loop through files in order, and append their raw bytes to the output file
    '(bytes rather than text, so the encoding of the original is left untouched)
    Using output As New IO.FileStream(outputPath, IO.FileMode.Create, IO.FileAccess.Write)
        For Each dr As DataRow In files.Select("", "creationtime")
            Dim contents As Byte() = IO.File.ReadAllBytes(CStr(dr("filepath")))
            output.Write(contents, 0, contents.Length)
        Next
    End Using
End Sub
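Creation times can change if the parts get copied or restored from a backup. If that is a concern, one alternative is to order the parts by the number SplitFile puts at the end of each file name. This is only a sketch: PartOrdering, PartNumber, CompareParts and SortParts are names made up for the example, and it assumes every part name ends in digits just before the extension.

```vb
Imports System
Imports System.IO

Public Module PartOrdering
    'Hypothetical helper: returns the trailing digits of the file name
    '(extension excluded) as a number, or -1 if the name does not end in digits.
    Public Function PartNumber(ByVal filePath As String) As Integer
        Dim name As String = Path.GetFileNameWithoutExtension(filePath)
        Dim i As Integer = name.Length
        Do While i > 0 AndAlso name(i - 1) >= "0"c AndAlso name(i - 1) <= "9"c
            i -= 1
        Loop
        If i = name.Length Then Return -1
        Return Integer.Parse(name.Substring(i))
    End Function

    'Compares two paths by part number, so "out2" sorts before "out10".
    Public Function CompareParts(ByVal a As String, ByVal b As String) As Integer
        Return PartNumber(a).CompareTo(PartNumber(b))
    End Function

    'Sorts an array of part paths in place by part number.
    Public Sub SortParts(ByVal files As String())
        Array.Sort(files, New Comparison(Of String)(AddressOf CompareParts))
    End Sub
End Module
```

With this, the result of IO.Directory.GetFiles(inputDir, inputMask) could be passed to SortParts and looped over directly, instead of going through the DataTable.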
-
Jan 14th, 2008, 08:38 PM
#4
New Member
Re: Splitting a large file into smaller files
Thanks for sharing.
I can't use the two functions SplitFile and MergeFiles in VB 7 (VS2003), because VS2003 does not support System.Collections.Generic.
Can you give me an example of these functions that works in VS2003?
-
Jan 15th, 2008, 10:27 AM
#5
Re: Splitting a large file into smaller files
Replace List(Of String) with System.Collections.Specialized.StringCollection.
Also, make sure to rewrite the My.Computer.FileSystem calls with System.IO calls.
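As a rough sketch of what those replacements could look like: the two helpers below use only FileStream, so they avoid the My namespace, File.ReadAllBytes and the Using statement, none of which exist before VS2005. Vs2003Helpers, WriteAllBytesCompat and ReadAllBytesCompat are names invented for this example, and I have not compiled it in VS2003 itself.

```vb
Imports System
Imports System.IO

Public Module Vs2003Helpers
    'Hypothetical stand-in for My.Computer.FileSystem.WriteAllBytes.
    Public Sub WriteAllBytesCompat(ByVal filePath As String, ByVal data As Byte(), ByVal append As Boolean)
        Dim mode As FileMode = FileMode.Create
        If append Then mode = FileMode.Append
        Dim fs As New FileStream(filePath, mode, FileAccess.Write)
        Try
            fs.Write(data, 0, data.Length)
        Finally
            'no Using statement in VB 7.1, so close explicitly
            fs.Close()
        End Try
    End Sub

    'Hypothetical stand-in for reading a whole file into memory as bytes.
    Public Function ReadAllBytesCompat(ByVal filePath As String) As Byte()
        Dim fs As New FileStream(filePath, FileMode.Open, FileAccess.Read)
        Try
            Dim data(CInt(fs.Length) - 1) As Byte
            Dim total As Integer = 0
            'Read may return fewer bytes than requested, so loop until done
            Do While total < data.Length
                Dim n As Integer = fs.Read(data, total, data.Length - total)
                If n = 0 Then Exit Do
                total += n
            Loop
            Return data
        Finally
            fs.Close()
        End Try
    End Function
End Module
```

In SplitFile the return type and returnList then become System.Collections.Specialized.StringCollection; its Add method takes the output path just as the list's Add does.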
-
Feb 10th, 2014, 01:52 PM
#6
Member
Re: Splitting a large file into smaller files
I need some help with this, if nobody minds the necromancy of an old post. I'm trying to make it work with WinForms in VB 2010.
Or does anybody have a method for taking a HUGE (possibly gigabytes) log file and splitting it into 5 megabyte .txt files? Thank you to all who read.
-
Mar 24th, 2014, 07:00 PM
#7
Re: Splitting a large file into smaller files
Did you try the first method?
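If a fixed part size suits you better than a fixed number of files, something along these lines might do. It is only a sketch, not tested on gigabyte files: SizeSplitter and SplitFileBySize are names made up for the example, the parts are named after the input file with a number and .txt appended, and each part may run slightly over maxBytes because it is allowed to finish its current line. It copies byte by byte through the FileStream's own buffer, so memory use stays small whatever the file size.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Public Module SizeSplitter
    'Hypothetical helper: splits inputPath into .txt parts of roughly maxBytes each,
    'ending every part at a line feed. Returns the paths of the parts written.
    Public Function SplitFileBySize(ByVal inputPath As String, ByVal outputDir As String, _
                                    ByVal maxBytes As Long) As List(Of String)
        Dim parts As New List(Of String)
        Dim baseName As String = Path.GetFileNameWithoutExtension(inputPath)
        Using source As New FileStream(inputPath, FileMode.Open, FileAccess.Read, FileShare.Read, 65536)
            Dim target As FileStream = Nothing
            Dim written As Long = 0
            Try
                Dim b As Integer = source.ReadByte()
                Do While b >= 0
                    'open the next part lazily, so no empty trailing file is created
                    If target Is Nothing Then
                        Dim partPath As String = Path.Combine(outputDir, _
                            baseName & "_" & (parts.Count + 1).ToString() & ".txt")
                        parts.Add(partPath)
                        target = New FileStream(partPath, FileMode.Create, FileAccess.Write)
                        written = 0
                    End If
                    target.WriteByte(CByte(b))
                    written += 1
                    'once the size limit is reached, close the part at the next line feed
                    If written >= maxBytes AndAlso b = 10 Then
                        target.Close()
                        target = Nothing
                    End If
                    b = source.ReadByte()
                Loop
            Finally
                If target IsNot Nothing Then target.Close()
            End Try
        End Using
        Return parts
    End Function
End Module
```

For 5 megabyte parts the call would be something like SplitFileBySize(logPath, outputFolder, 5 * 1024 * 1024), where logPath and outputFolder are your own variables.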
-
Mar 25th, 2014, 04:00 AM
#8
Member
Re: Splitting a large file into smaller files
Hi wild_bill. I've made some considerable progress on this now, thank you. Now I'm playing with a progress meter. The background worker thread is fine and reporting, but it's only reporting the number of files it has written, not the percentage of the whole job. For example, if I break an original file of 320 MB into 5 MB chunks, the progress meter only reaches a maximum of 64, and a 100 MB file split into 10 MB chunks only reaches 10 when it's completed. So my progress bar is not much use, but I'll keep on with it...
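One direction I'm looking at is converting the count into a percentage before reporting it. A rough sketch, where ProgressMath and PercentComplete are names I made up, and the total is assumed to be known before the split starts (for example file length divided by chunk size):

```vb
Imports System

Public Module ProgressMath
    'Hypothetical helper: converts work done so far (files or bytes, done >= 0)
    'into a whole percentage between 0 and 100.
    Public Function PercentComplete(ByVal done As Long, ByVal total As Long) As Integer
        'nothing to do counts as finished, and avoids dividing by zero
        If total <= 0 Then Return 100
        Return CInt(Math.Min(100L, (done * 100L) \ total))
    End Function
End Module
```

The worker would then call ReportProgress(PercentComplete(filesWritten, totalFiles)) instead of ReportProgress(filesWritten), so 32 of 64 files reports 50 and the bar reaches 100 at the end whatever the chunk size.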
-
Dec 4th, 2023, 04:24 PM
#9
New Member
Re: Splitting a large file into smaller files
Thanks for the code. I had corruption in the part files, so this is the code I used in the end:
Code:
'Requires Imports System.IO at the top of the file.
Private Function SplitFile(
        ByVal inputFilePath As String, ByVal outputFolderPath As String, ByVal numberOfFiles As Integer) _
        As List(Of String)
    Dim returnList As New List(Of String)
    Try
        Dim inputFileExtension As String = Path.GetExtension(inputFilePath)
        Dim outputFileName As String = Path.GetFileNameWithoutExtension(inputFilePath)
        'Read raw bytes with a FileStream: a StreamReader keeps its own buffer,
        'which goes stale whenever BaseStream.Position is moved.
        Using fs As New FileStream(inputFilePath, FileMode.Open, FileAccess.Read)
            Dim fileLength As Long = fs.Length
            Dim baseBufferSize As Integer = CInt(fileLength \ numberOfFiles)
            Dim finished As Boolean = False
            Dim fileCount As Integer = 1
            Do Until finished
                Dim originalPosition As Long = fs.Position
                'Skip ahead by the base size, then scan to the next line feed.
                fs.Position = originalPosition + baseBufferSize
                Do While fs.Position < fileLength
                    If fs.ReadByte() = 10 Then Exit Do
                Loop
                'The chunk ends just after that line feed, or at the end of the file.
                Dim endPosition As Long = Math.Min(fs.Position, fileLength)
                Dim bufferSize As Integer = CInt(endPosition - originalPosition)
                finished = (endPosition >= fileLength)
                'Go back and read the whole chunk into memory.
                fs.Position = originalPosition
                Dim buffer(bufferSize - 1) As Byte
                Dim bytesRead As Integer = 0
                Do While bytesRead < bufferSize
                    Dim n As Integer = fs.Read(buffer, bytesRead, bufferSize - bytesRead)
                    If n = 0 Then Exit Do
                    bytesRead += n
                Loop
                'Path.Combine adds the directory separator when it is missing.
                Dim outputPath As String = Path.Combine(outputFolderPath,
                    outputFileName & "_" & fileCount.ToString() & inputFileExtension)
                returnList.Add(outputPath)
                File.WriteAllBytes(outputPath, buffer)
                fileCount += 1
            Loop
        End Using
    Catch ex As Exception
        Console.Write(ex.ToString)
    End Try
    Return returnList
End Function