i have a rich text box filled with words, say, 30,000 words for instance all on different lines. i need to be able to press a button and divide this richtext1.text into output of 30 text files with 1,000 words in each of them
i need it to work with any number of words, so say if there were 23.3k words in the text box it would make 24 files, and the fourth would have 300 words instead of 1000.
ive been toying around but unable to figure it out, any help it greatly appreciated!
im just using a rich text box because you can store more words in it than a regular textbox, but i need the contents of the richtextbox divided into .txt documents in increments of 1,000
Private Type StringArrayType
Words() As String
End Type
Private Sub Command1_Click()
Dim strArray() As String
Dim typFiles() As StringArrayType
Dim i As Long
Dim iMax As Long
Dim lngWord As Long
Dim lngFile As Long
Dim lngFileMax As Long
Dim lngLastSize As Long
' Split words into array
strArray = Split(richtext1.Text, vbNewLine)
iMax = UBound(strArray)
' Identify how many files will be needed
lngFileMax = iMax \ 1000
ReDim typFiles(lngFileMax)
lngFile = -1
' How many words in last file?
lngLastSize = (iMax + 1) Mod 1000
If lngLastSize = 0 Then lngLastSize = 999
' Copy words to temporary file arrays
For i = 0 To iMax
If lngWord = 0 Then
lngFile = lngFile + 1
With typFiles(lngFile)
If lngFile = lngFileMax Then ReDim .Words(lngLastSize) Else ReDim .Words(999)
End With
End If
typFiles(lngFile).Words(lngWord) = strArray(i)
If lngWord = 999 Then lngWord = 0 Else lngWord = lngWord + 1
Next
' Re-use strArray to hold joined word lists
ReDim strArray(lngFileMax)
For i = 0 To lngFileMax
With typFiles(i)
strArray(i) = Join(.Words, vbNewLine)
Erase .Words
End With
Next
Erase typFiles
' Create the files
For i = 0 To lngFileMax
SaveStringToFile App.Path & "\Words" & Format(i + 1, "00"), strArray(i)
Next
Erase strArray
End Sub
Public Sub SaveStringToFile(pstrFile As String, pstrText As String)
Dim FileNumber As Long
If Len(Dir(pstrFile)) <> 0 Then Kill pstrFile
FileNumber = FreeFile
Open pstrFile For Output As #FileNumber
Print #FileNumber, pstrText
Close #FileNumber
End Sub
Private Sub Command1_Click()
Dim strText As String
Dim lngFileCount As Long
Dim lngPos As Long
Dim lngCount As Long
Dim lngStart As Long
Dim fNum As Integer
strText = RichTextBox1.Text ' Copy the text to temporary variable
lngStart = 1 ' Begining of string
lngPos = InStr(1, strText, vbNewLine) ' Find first line break
Do Until lngPos = 0 ' Loop until all line breaks have been found
lngCount = lngCount + 1 ' Count the number of line breaks
If lngCount = 1000 Then ' If we have 1000 line breaks...
fNum = FreeFile
' Open a text file, appending lngFileCount for unique filenames
Open App.Path & "\Words" & Format(lngFileCount, "00") & ".txt" For Output As #fNum
' Print the text from lngStart up to the 1000th line break
Print #fNum, Mid$(strText, lngStart, lngPos - lngStart);
Close #fNum
lngFileCount = lngFileCount + 1 ' Increment lngFileCount
lngCount = 0 ' Reset count of line breaks
lngStart = lngPos + 2 ' Start at begining of next line
End If
lngPos = InStr(lngPos + 2, strText, vbNewLine) ' Find next line break
Loop
fNum = FreeFile
' Output the remainder of the string, if there is anything left
If lngStart <= Len(strText) Then
Open App.Path & "\Words" & Format(lngFileCount, "00") & ".txt" For Output As #fNum
Print #fNum, Mid$(strText, lngStart);
Close #fNum
lngFileCount = lngFileCount + 1
End If
MsgBox CStr(lngFileCount) & " files created"
End Sub
I´m trying to use for a different way but I have a problem,
I´m working with text files sizes, more or less 2, 4 gbs, so is impossible to load all the information at one time.
So I need the new files be divide in the number of lines I want but written by memory blocks of 64k (example).
Can any one help me?
Regards,
Last edited by anonimou; Nov 10th, 2009 at 06:52 PM.
I´m trying to use for a different way but I have a problem. I´m working with text files sizes, more or less 2, 4 gbs, so is impossible to load all the information at one time.
So I need the new files be divide in the number of lines I want but written by memory blocks of 64k (example).
Can any one help me?
Regards,
This is not going to be easy by any means. Both FileLen() and LOF() are limited in size by 2^31 characters.
I believe you need to rethink the entire problem and figure out why you really need to do this precise file size splitting. Why do the the first N files from the break out have to be exactly the same size?
The Files don´t need to have exactly the same size.
Example:
I have a file with 4 Gbs of information and I want to split this file in 10 files of 400 mgbs.
But I dón´t have any problem if the last file who will receive what left of the original file
close with 1mgb
Or instead of 10 files the program split in 9 or 11 files.
I olny need the split don´t break any line information and pass what left of that line to the next file.
Here is an attempt. You might find you need to fiddle with it some. I won't claim it is bug-free, but simple tests seem to show it working properly. As far as I know it never "abandons" any text from the end of the input file.
It assumes three things: ANSI input file (not Unicode), lines end with CRLF or LF, and that within a given CHUNK_SIZE of characters a line break will occur. Otherwise it may well split in the middle of a line. It should always split at a "line" boundary otherwise, not within lines.
On small files near DesiredPieces * CHUNK_SIZE in length it will probably produce fewer than DesiredPieces files. CHUNK_SIZE is 65536 unless you alter this. The "pieces" will be numbered like: Split001.txt, Split002.txt, etc. based on a supplied "pattern" for the output files of Split.txt.
It should work for input files of up to a few hundred GB.