-
Dec 5th, 2017, 01:43 PM
#1
Thread Starter
New Member
Replace string in a Huge file more > 4GB
Hi
I need a way to replace a string with another string in a huge file (more than 4GB, and possibly much larger).
If possible, it should work with DoEvents or some other way of avoiding the "Not Responding" problem.
I found some threads in the forum about reading large files, but I can't get them to work correctly.
Thanks
-
Dec 5th, 2017, 02:05 PM
#2
Re: Replace string in a Huge file more > 4GB
Post your code, and I'm sure someone can modify or fix what you've tried.
-
Dec 5th, 2017, 02:43 PM
#3
Re: Replace string in a Huge file more > 4GB
Hi Astarali,
It seems like part of your problem is actually "finding" the string. Here's a binary file search that I often use:
Code:
Option Explicit

Public Function BinaryFileSearch(sFileSpec As String, sSearchString As String, Optional bCaseSensitive As Boolean = True, _
                                 Optional lStartPosition As Long = 1, Optional lFoundPosition As Long, _
                                 Optional lFileHandleToUse As Long = 0) As Boolean
    ' Returns true if sSearchString is found, else false.
    ' sSearchString can be no longer than 128.
    ' This will work even if Word or Excel has the file open.
    ' The lFoundPosition is a return argument.
    ' It returns the 1-based position of the earliest match at or after lStartPosition.
    ' Positions are Longs, so this only works for files under 2GB.
    Dim iFle As Long
    Dim FileData As String
    Dim FilePointer As Long
    Dim FileLength As Long
    Dim sFind As String
    Dim iPos As Long
    '
    If Len(sSearchString) > 128 Then
        Err.Raise 1234
        Exit Function
    End If
    '
    If lFileHandleToUse = 0 Then
        If Not bFileExists(sFileSpec) Then Exit Function
        iFle = FreeFile
        On Error Resume Next
        Open sFileSpec For Binary As iFle
        If Err <> 0 Then
            Close iFle
            On Error GoTo 0
            Exit Function
        End If
        On Error GoTo 0
        '
        ' Nothing to search in an empty file (LOF, not Len: Len of a Long is always 4).
        If LOF(iFle) = 0 Then Close iFle: Exit Function
    Else
        iFle = lFileHandleToUse ' The file MUST be opened BINARY for this to work.
    End If
    '
    If bCaseSensitive Then
        sFind = sSearchString
    Else
        sFind = LCase$(sSearchString)
    End If
    FileData = Space(1024)
    FileLength = LOF(iFle)
    FilePointer = lStartPosition
    Do
        If FilePointer > FileLength Then Exit Do
        Get iFle, FilePointer, FileData
        If Not bCaseSensitive Then FileData = LCase$(FileData)
        iPos = InStr(FileData, sFind)
        If iPos <> 0 Then
            lFoundPosition = FilePointer + iPos - 1
            If lFoundPosition >= lStartPosition Then
                BinaryFileSearch = True
                Exit Do
            End If
        End If
        ' Advance by less than a full buffer so a match straddling two reads is not missed.
        FilePointer = ((FilePointer + 1024) - Len(sFind)) + 1
    Loop
    If lFileHandleToUse = 0 Then Close iFle
End Function

Public Function bFileExists(fle As String) As Boolean
    On Error GoTo FileExistsError
    ' If no error then something existed.
    bFileExists = (GetAttr(fle) And vbDirectory) = 0
    Exit Function
FileExistsError:
    bFileExists = False
    Exit Function
End Function
Now, I'm sure there are better ones out there, as I'm doing a lot of converting binary to Unicode and vice-versa. That's certainly one improvement that could be made. However, I have great confidence that this one works.
Now, after you've gotten your search string's position, it's then just an easy matter of re-opening your file with "Open sFileSpec For Binary", using your string's location and length, and moving things around a bit. If the search string and the replace string are the same size, you don't even have to do that.
Good Luck,
Elroy
EDIT1: However, since your files are so large, you will have to be careful to move things around in "chunks". You don't want to blow up your memory.
Last edited by Elroy; Dec 5th, 2017 at 02:46 PM.
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
-
Dec 5th, 2017, 04:29 PM
#4
Re: Replace string in a Huge file more > 4GB
I would convert the string into a byte array and use InStrB, because we are looking through HUGE data files and a binary search is faster.
Also, if you read in "steps", the string could sit across two reads. That means you need to adjust the reading so each read starts a little before the end of the previous one, to catch a string in between.
Remember to convert the search string to match the data (lower case, upper case, Unicode, etc.).
If the case is unknown, you also need to convert the ASCII data to lower or upper case while searching, and that will slow the search down.
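A minimal sketch of the overlapping-read idea with InStrB, under a few assumptions: FindBytesInFile is a made-up name, the 64KB chunk size is arbitrary, the search is case-sensitive, and it uses native VB file I/O, so it only applies to files under 2GB.

```vb
Option Explicit

' Hypothetical helper: returns the 1-based file position of the first match
' at or after StartPos, or 0 if there is none.
' Assumes Find() holds the exact bytes to look for and the file is under 2GB.
Public Function FindBytesInFile(ByVal FileSpec As String, Find() As Byte, _
                                Optional ByVal StartPos As Long = 1) As Long
    Const CHUNK_SIZE As Long = 65536
    Dim hFile As Integer
    Dim Buf() As Byte
    Dim FindLen As Long, FileSize As Long
    Dim Pos As Long, BytesLeft As Long, Hit As Long

    FindLen = UBound(Find) - LBound(Find) + 1
    If FindLen > CHUNK_SIZE Then Exit Function   ' pattern must fit in one chunk

    hFile = FreeFile
    Open FileSpec For Binary Access Read As #hFile
    FileSize = LOF(hFile)
    Pos = StartPos
    Do While Pos + FindLen - 1 <= FileSize
        BytesLeft = FileSize - Pos + 1
        If BytesLeft > CHUNK_SIZE Then BytesLeft = CHUNK_SIZE
        ReDim Buf(0 To BytesLeft - 1)
        Get #hFile, Pos, Buf
        Hit = InStrB(1, Buf, Find, vbBinaryCompare)
        If Hit <> 0 Then
            FindBytesInFile = Pos + Hit - 1
            Exit Do
        End If
        ' Step back FindLen - 1 bytes so a match lying across two reads is still seen.
        Pos = Pos + BytesLeft - (FindLen - 1)
    Loop
    Close #hFile
End Function
```

The overlap of FindLen - 1 bytes is the smallest one that cannot miss a match lying across two reads, and it still moves the read position forward on every pass.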
-
Dec 5th, 2017, 04:52 PM
#5
Re: Replace string in a Huge file more > 4GB
@Elroy. VB's file functions don't do 4GB files, correct?
FYI to anyone, here's a class created by dilettante for accessing large files
http://www.vbforums.com/showthread.p...File-I-O-Class
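For anyone wondering what getting past the native limit can look like, here is a rough sketch using the Win32 file API directly. It is not the linked class, only an illustration. OpenForRead, FileSize64 and ReadAt are made-up names, and the Currency arguments rely on the common VB6 trick of treating a Currency as a 64-bit integer scaled by 10000.

```vb
Option Explicit

Private Const GENERIC_READ As Long = &H80000000
Private Const FILE_SHARE_READ As Long = &H1
Private Const OPEN_EXISTING As Long = 3
Private Const FILE_BEGIN As Long = 0

Private Declare Function CreateFileW Lib "kernel32" (ByVal lpFileName As Long, _
    ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, _
    ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, _
    ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
Private Declare Function GetFileSizeEx Lib "kernel32" (ByVal hFile As Long, _
    lpFileSize As Currency) As Long
Private Declare Function SetFilePointerEx Lib "kernel32" (ByVal hFile As Long, _
    ByVal liDistanceToMove As Currency, lpNewFilePointer As Currency, _
    ByVal dwMoveMethod As Long) As Long
Private Declare Function ReadFile Lib "kernel32" (ByVal hFile As Long, _
    lpBuffer As Any, ByVal nNumberOfBytesToRead As Long, _
    lpNumberOfBytesRead As Long, ByVal lpOverlapped As Long) As Long
Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long

' Returns a file handle, or -1 (INVALID_HANDLE_VALUE) on failure.
' The caller releases it with CloseHandle.
Public Function OpenForRead(ByVal FileSpec As String) As Long
    OpenForRead = CreateFileW(StrPtr(FileSpec), GENERIC_READ, FILE_SHARE_READ, _
                              0&, OPEN_EXISTING, 0&, 0&)
End Function

' File size in whole bytes, held in a Currency so it can exceed 2GB.
Public Function FileSize64(ByVal hFile As Long) As Currency
    Dim Scaled As Currency
    If GetFileSizeEx(hFile, Scaled) <> 0 Then FileSize64 = Scaled * 10000@
End Function

' Fills Buf from a 0-based byte offset; returns the number of bytes actually read.
Public Function ReadAt(ByVal hFile As Long, ByVal Offset As Currency, Buf() As Byte) As Long
    Dim NewPos As Currency
    Dim BytesRead As Long
    ' Dividing by 10000 turns "whole bytes" into the scaled 64-bit value the API expects.
    If SetFilePointerEx(hFile, Offset / 10000@, NewPos, FILE_BEGIN) = 0 Then Exit Function
    If ReadFile(hFile, Buf(LBound(Buf)), UBound(Buf) - LBound(Buf) + 1, _
                BytesRead, 0&) = 0 Then Exit Function
    ReadAt = BytesRead
End Function
```

Note that offsets here are 0-based, unlike the 1-based positions of the native Get statement.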
-
Dec 5th, 2017, 05:18 PM
#6
Re: Replace string in a Huge file more > 4GB
Originally Posted by baka
if you read in "steps", the string could be between two reads
Hi Baka. My function actually deals with this.
Originally Posted by LaVolpe
@Elroy. VB's file functions don't do 4GB files, correct?
Hi LaVolpe. Yep, this one I actually forgot about. Astarali, LaVolpe has some good file open/read routines for reading larger files. If I were doing this, I'd definitely check into his work.
Take Care,
Elroy
EDIT1: And yeah, I already mentioned that I was doing some unnecessary Unicode conversion.
-
Dec 5th, 2017, 05:51 PM
#7
Re: Replace string in a Huge file more > 4GB
My post wasn't meant as a complaint about yours, Elroy. I just wanted to show the OP how this can be done, including the use of InStrB, as I have worked on this myself a few times. In my own class I use the API that LaVolpe pointed out, together with IsWindowUnicode and CreateFileA/CreateFileW.
Also, searching for strings can be done in multiple ways.
Is there a specific structure? For example, are there zero bytes before and after the string? Including the 00 bytes before and after the string can speed up the search, because you match exactly that string and not part of a longer word. If we search for "hole" we also get hits for "whole" and everything else containing "hole", but with 00-hole-00 we only get "hole" and that's it. Or is it something else?
You need to know. If the structure is unknown, then of course we do a simple string search and that's it. But if it is known, it's better to follow the data structure.
The search can also be done byte by byte when there are different conditions: we look for the first byte and then compare against all the variations.
Example:
If data(1) = sdata(x + 1) Then
    If data(2) = sdata(x + 2) Or data(2) + 32 = sdata(x + 2) Then ' etc...
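To make the byte-by-byte idea concrete, here is a small sketch that folds case for ASCII letters only. MatchesAt is a made-up name, and both arrays are assumed to be already filled.

```vb
Option Explicit

' Hypothetical helper: True if Find() matches Data() starting at index Start,
' ignoring case for the ASCII letters A-Z / a-z only.
Public Function MatchesAt(Data() As Byte, ByVal Start As Long, Find() As Byte) As Boolean
    Dim i As Long
    Dim a As Long, b As Long   ' Longs, so adding 32 cannot overflow a Byte

    If Start < LBound(Data) Then Exit Function
    If Start + (UBound(Find) - LBound(Find)) > UBound(Data) Then Exit Function

    For i = 0 To UBound(Find) - LBound(Find)
        a = Data(Start + i)
        b = Find(LBound(Find) + i)
        ' Fold upper-case ASCII (65 to 90) to lower case by adding 32.
        If a >= 65 And a <= 90 Then a = a + 32
        If b >= 65 And b <= 90 Then b = b + 32
        If a <> b Then Exit Function
    Next i
    MatchesAt = True
End Function
```

Folding both sides avoids having to pre-convert the search bytes, at the cost of two extra comparisons per byte.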
-
Dec 5th, 2017, 06:44 PM
#8
Re: Replace string in a Huge file more > 4GB
Yeah, I didn't really take any offense. I was just pointing out for Astarali that that particular base had been covered.
And yeah, there really is more to consider here than Astarali is suggesting. Is the string we're searching for Unicode? If so, is it UCS-2, or possibly the expanded UTF-16, or possibly some other variant of Unicode? If it's ANSI, do we need to worry about a particular codepage? Do we always want to find a terminating null? These are all unanswered questions.
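As a quick illustration of why the encoding question matters, the same four characters produce different byte patterns depending on how they are converted. ShowPatternBytes is just a throwaway name for the demo.

```vb
Option Explicit

Public Sub ShowPatternBytes()
    Dim Ansi() As Byte
    Dim Wide() As Byte

    ' One byte per character, using the system ANSI code page.
    Ansi = StrConv("hole", vbFromUnicode)
    ' Direct assignment copies VB's internal UTF-16LE form: two bytes per character.
    Wide = "hole"

    Debug.Print UBound(Ansi) + 1   ' 4
    Debug.Print UBound(Wide) + 1   ' 8
End Sub
```

A byte search with the wrong one of these will simply never match, so the file's encoding has to be settled before searching.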
For me, I wrote that thing to search for ASCII strings primarily in Excel files. I just wanted a way to identify certain types of Excel files (created by my code, but with user-given names). It does a good job of that. However, I've never used it as a search-and-replace function.
Best Regards,
Elroy
-
Dec 5th, 2017, 06:55 PM
#9
Re: Replace string in a Huge file more > 4GB
If I remember to check tomorrow, I can post some tidbits to help along. At my job, I developed a search (not replace) routine to find ASCII strings in very large files 4GB+ and it is pretty fast. As for replacing, that's relatively simple but requires double the disk space typically. Once string found, transfer all bytes up to that to another file, write the new string, and then transfer all bytes after the found string. Of course, if the string being replaced is larger than its replacement, one could write the changes to the same file and then truncate the file after shifting all remaining bytes. There's probably gotchas with truncating files, so I'll leave that to others for discussion.
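A rough sketch of the copy-to-another-file approach described above, assuming the match position and length are already known. CopyRange and ReplaceAtToNewFile are made-up names, the destination file is assumed not to exist yet, and it uses native VB file I/O, so as written it only covers files under 2GB. The same three steps would apply with an API-based reader.

```vb
Option Explicit

' Hypothetical helper: appends Count bytes from hSrc (starting at the 1-based
' position SrcPos) to hDst, in 64KB chunks.
Private Sub CopyRange(ByVal hSrc As Integer, ByVal SrcPos As Long, _
                      ByVal Count As Long, ByVal hDst As Integer)
    Const CHUNK_SIZE As Long = 65536
    Dim Buf() As Byte
    Dim n As Long

    Do While Count > 0
        n = Count
        If n > CHUNK_SIZE Then n = CHUNK_SIZE
        ReDim Buf(0 To n - 1)
        Get #hSrc, SrcPos, Buf
        Put #hDst, , Buf          ' no position given: writes right after the last write
        SrcPos = SrcPos + n
        Count = Count - n
        DoEvents                  ' keep the UI responsive during long copies
    Loop
End Sub

' Writes DstSpec as a copy of SrcSpec with the OldLen bytes at FoundPos
' replaced by NewBytes. The two lengths may differ.
Public Sub ReplaceAtToNewFile(ByVal SrcSpec As String, ByVal DstSpec As String, _
                              ByVal FoundPos As Long, ByVal OldLen As Long, _
                              NewBytes() As Byte)
    Dim hSrc As Integer, hDst As Integer
    Dim TailPos As Long

    hSrc = FreeFile
    Open SrcSpec For Binary Access Read As #hSrc
    hDst = FreeFile
    Open DstSpec For Binary Access Write As #hDst

    CopyRange hSrc, 1, FoundPos - 1, hDst                    ' everything before the match
    Put #hDst, , NewBytes                                    ' the replacement
    TailPos = FoundPos + OldLen
    CopyRange hSrc, TailPos, LOF(hSrc) - TailPos + 1, hDst   ' everything after the match

    Close #hDst
    Close #hSrc
End Sub
```

Only one chunk is ever held in memory, and the DoEvents call inside the copy loop is what keeps the form from going "Not Responding".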
-
Dec 5th, 2017, 07:28 PM
#10
Re: Replace string in a Huge file more > 4GB
You should be able to read and write in the same file if the replacement is "exactly" the same length as the original.
So if you search for "wrong", you could replace it with "false", as both are 5 letters, without needing to create a new file.
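A minimal sketch of that in-place overwrite, assuming the position of the match is already known. OverwriteAt is a made-up name, and it uses native VB file I/O, so it only applies to files under 2GB.

```vb
Option Explicit

' Hypothetical helper: overwrites the bytes at the 1-based Position with NewBytes.
' OldBytes is passed only so the lengths can be checked before touching the file.
Public Sub OverwriteAt(ByVal FileSpec As String, ByVal Position As Long, _
                       OldBytes() As Byte, NewBytes() As Byte)
    Dim hFile As Integer

    If UBound(OldBytes) - LBound(OldBytes) <> UBound(NewBytes) - LBound(NewBytes) Then
        Err.Raise 5, , "Replacement must be the same length as the original."
    End If

    hFile = FreeFile
    Open FileSpec For Binary Access Read Write As #hFile
    Put #hFile, Position, NewBytes   ' Binary mode writes the raw bytes, nothing else
    Close #hFile
End Sub
```

Because nothing after the match moves, this needs no second file and no extra disk space.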
-
Dec 5th, 2017, 07:41 PM
#11
Re: Replace string in a Huge file more > 4GB
I didn't see where the character encoding was mentioned. Maybe the files are Unicode, EBCDIC, or who knows what?