
Thread: [RESOLVED] Fast replace in a large string

  1. #1

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    [RESOLVED] Fast replace in a large string

    Hi

    I have some large text data in a string.
    The data comes from a file, the line separator is CRLF.
    It holds several houndred lines.
    Let's assume 10MB data.

    From a certain line I need to get the data, then modify it.
    The modified data can be shorter or longer than the unmodified data.

    Up to now I split the data by CRLF, modify the line, and join it afterwards.
    Easy and reliable.
    And slow.
    And uses more memory than wanted.

    I think there should be a better method without split etc.

    My idea is to find the xth CRLF and the xth+1 CRLF in the whole large string.
    This is also not efficient, as I have to compare every single character to find CRLF.

    In short:
    I look for a better idea.

    Thanks
    Karl

  2. #2
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,910

    Re: Fast replace in a large string

    Why not read the data line by line, modify the line(s) needed, and write it to a new file on the go?
    You can do this with buffered file I/O.
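
    A minimal sketch of that idea with VB6's native file statements. The helper name, its parameters and the 1-based line number are hypothetical, and it assumes an ANSI text file with CRLF line ends.
    Code:
    Option Explicit

    'Hypothetical helper: copies sSrc to sDst, swapping the line numbered lLineNo (1-based).
    Public Sub ReplaceLineInFile(ByVal sSrc As String, ByVal sDst As String, _
                                 ByVal lLineNo As Long, ByVal sNewLine As String)
        Dim fIn As Integer, fOut As Integer
        Dim sLine As String, lCur As Long

        fIn = FreeFile
        Open sSrc For Input As #fIn
        fOut = FreeFile              'request the second handle only after the first is open
        Open sDst For Output As #fOut

        Do Until EOF(fIn)
            Line Input #fIn, sLine   'reads one line without its CRLF
            lCur = lCur + 1
            If lCur = lLineNo Then sLine = sNewLine
            Print #fOut, sLine       'Print # appends the CRLF again
        Loop

        Close #fOut
        Close #fIn
    End Sub
    Note that Print # always writes a trailing CRLF, so the copy ends with a line break even if the source did not.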

  3. #3
    PowerPoster ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    3,048

    Re: Fast replace in a large string

    Quote Originally Posted by Arnoutdv View Post
    Why not read the data line by line, modify the line(s) needed, and write it to a new file on the go?
    You can do this with buffered file I/O.
    +1 to that


    @Karl
    give this a try
    Code:
    Option Explicit
    
    Private Sub Command1_Click()
    Dim sPath As String
    Dim sContent As String
    sPath = "E:\Bible.txt"
    sContent = ReadTextFile(sPath, 0) ' lFormat -2 - System default, -1 - Unicode, 0 - ASCII
    
    sContent = Replace(sContent, "someText", "this is myNewText")
    
    'add more to search and replace
    'sContent = Replace(sContent, vbCrLf, " ")
    WriteTextFile sContent, sPath, 0
    
    End Sub
    
    Function ReadTextFile(sPath, lFormat)
        With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 1, False, lFormat)
            ReadTextFile = ""
            If Not .AtEndOfStream Then ReadTextFile = .ReadAll
            .Close
        End With
    End Function
    
    Sub WriteTextFile(sContent, sPath, lFormat)
        With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 2, True, lFormat)
            .Write sContent
            .Close
        End With
    End Sub
    to hunt a species to extinction is not logical !
    since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.

  4. #4

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: Fast replace in a large string

    The approaches are good and easy.
    And my data originally comes from files.

    But the data has to be changed in memory.
    I don't want the file output.

    I'll do some tests using temporary files.
    Why not.

  5. #5
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,910

    Re: Fast replace in a large string

    Quote Originally Posted by Karl77 View Post
    The approaches are good and easy.
    And my data originally comes from files.

    But the data has to be changed in memory.
    I don't want the file output.

    I'll do some tests using temporary files.
    Why not.
    Up to now I split the data by CRLF, modify the line, and join it afterwards.
    You need the data as a big string in your application?

    Then read the file line by line, change what's needed, and add the line to a StringBuilder class.
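
    VB6 ships no StringBuilder of its own, so this means a self-written or third-party one. A minimal sketch of the underlying idea, assuming a pre-allocated buffer filled with the Mid$ statement; all names here are hypothetical, and it is written as a standard module rather than a class to keep it short.
    Code:
    Option Explicit

    Private mBuf As String   'pre-allocated buffer
    Private mLen As Long     'number of characters actually in use

    Public Sub SbAppend(ByRef sText As String)
        Dim lNeed As Long
        lNeed = mLen + Len(sText)
        If lNeed > Len(mBuf) Then
            'grow by more than is needed so that repeated appends rarely reallocate
            mBuf = mBuf & Space$(lNeed + Len(mBuf) + 1024)
        End If
        'the Mid$ statement overwrites in place, no new string is created
        If Len(sText) > 0 Then Mid$(mBuf, mLen + 1, Len(sText)) = sText
        mLen = lNeed
    End Sub

    Public Function SbToString() As String
        SbToString = Left$(mBuf, mLen)   'cut off the unused tail
    End Function

    Public Sub SbClear()
        mBuf = vbNullString
        mLen = 0
    End Sub
    Usage would be SbAppend sLine followed by SbAppend vbCrLf for every line read, and one SbToString at the end.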

  6. #6
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    706

    Re: Fast replace in a large string

    Use InStr with vbBinaryCompare. It's much faster than without.

  7. #7
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,731

    Re: Fast replace in a large string

    I would go with what qvb6 wrote,
    read the file as binary array
    use instrb to find the start and end position.
    save data(0 to start-1)
    save replaceddata()
    save data(end+1 to endfile)

  8. #8
    PowerPoster
    Join Date
    Sep 2012
    Posts
    2,083

    Re: Fast replace in a large string

    Hi Karl77, if you want the best answer, you'd better post specific code and example data.

    Questioning also requires skill.

    http://www.vbforums.com/showthread.p...han-JavaScript

    http://www.vbforums.com/showthread.p...=1#post5457373
    Last edited by dreammanor; Mar 24th, 2020 at 10:16 AM.

  9. #9
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,259

    Re: Fast replace in a large string

    Quote Originally Posted by baka View Post
    read the file as binary array
    use instrb to find the start and end position.
    You never post code for your suggestion (not even pseudo-code).
    And that makes it really hard, to take them seriously.

    Olaf

  10. #10
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,731

    Re: Fast replace in a large string

    Well, Karl77 is not a newbie; I'm sure he can figure out what to do.
    He will learn even more if he needs to do it himself.
    But I understand what you mean. If I were suggesting a very complex function it would be different, but I'm not; this is very basic and should be known by every vb6 programmer.

  11. #11
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,047

    Re: Fast replace in a large string

    Quote Originally Posted by Karl77 View Post
    It holds several houndred lines.
    This data sounds like a real dogs dinner.


    From a certain line I need to get the data, then modify it.
    The modified data can be shorter or longer than the unmodified data.

    Up to now I split the data by CRLF, modify the line, and join it afterwards.
    Is there just ONE line in the whole file that needs to be changed? The preceding makes it sound that way, but the subsequent discussion is assuming that lots of lines will change. There are likely improvements that can be made if the changes are just to one, or a small number of identifiable lines, out of the whole file.
    My usual boring signature: Nothing

  12. #12
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,601

    Re: Fast replace in a large string

    10 years ago, I'd have suggested you perform these String operations as actual memory operations. That is to say, find the necessary offsets for the text you're interested in changing and use memory operations like RtlMoveMemory to implement concatenations or simply change characters in place instead of having to reallocate entire buffers just to change a few characters. This kind of approach yields great performance boosts in String processing. However, this was easy when we could safely assume all text was 1 byte per character. We live in a Unicode world today and it's just not safe to treat Strings this way anymore.

    However, all is not lost. There are still a few "best practices" you could employ to boost the performance of String processing, even with complicated Unicode Strings. I'd like to see the actual code you're using. Perhaps we could help you refactor it a bit to get better performance.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  13. #13
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,259

    Re: Fast replace in a large string

    Quote Originally Posted by baka View Post
    ...this is very basic and should be known by every vb6 programmer.
    Since you were talking about:
    - "read(ing) the file as binary array" ... (I assume a VB-ByteArray?)
    - and "use instrb to find the start and end position"

    then this is not really common, and I'd like to see a concrete implementation from you
    (since it is a "basic task" in your opinion).

    Olaf

  14. #14
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,601

    Re: Fast replace in a large string

    Quote Originally Posted by Schmidt View Post
    Since you were talking about:
    - "read(ing) the file as binary array" ... (I assume a VB-ByteArray?)
    - and "use instrb to find the start and end position"

    then this is not really common, and I'd like to see a concrete implementation from you
    (since it is a "basic task" in your opinion).

    Olaf
    Actually I don't disagree with him to be honest. Reading a file as a byte array is pretty basic unless you're talking to someone who is very new to programming. I remember writing my own version of the DOS command COPY when I was like 10 years old in QuickBasic. All it did was divide the file into chunks by reading it 32767 bytes at a time into an array which would then be written to another file using Get/Put with the Binary access mode. I think it is safe to assume that any VB6 programmer past the "Hello World" stage would know how to do this.
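
    For reference, a minimal sketch of the reading part only, in one call instead of chunks. The helper name is hypothetical, and it assumes the file fits into memory.
    Code:
    Option Explicit

    'Hypothetical helper: returns the whole file as a Byte array.
    'An empty file leaves the array unallocated, so callers should check LOF or trap error 9.
    Public Function ReadFileBytes(ByVal sPath As String) As Byte()
        Dim f As Integer, b() As Byte
        f = FreeFile
        Open sPath For Binary Access Read As #f
        If LOF(f) > 0 Then
            ReDim b(0 To LOF(f) - 1)
            Get #f, , b          'fills the whole array from the current file position
        End If
        Close #f
        ReadFileBytes = b
    End Function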

  15. #15
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,259

    Re: Fast replace in a large string

    Quote Originally Posted by Niya View Post
    Actually I don't disagree with him to be honest.
    Reading a file as a byte array is pretty basic ...
    Sure, the ByteArray-reading is a must -
    but he also suggested the InstrB function as a vehicle to "find things within those bytes" fast(er).

    I was asking for concrete code (the combination of these two suggested things) -
    e.g. in the context of "linewise looping through the bytes, whilst using InstrB",
    because this is not really a common task for most devs...

    If you think you can write this up for me instead, then I'd appreciate it.

    Olaf

  16. #16
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,601

    Re: Fast replace in a large string

    Quote Originally Posted by Schmidt View Post
    Sure, the ByteArray-reading is a must -
    but he also suggested the InstrB function as a vehicle to "find things within those bytes" fast(er).

    I was asking for concrete code (the combination of these two suggested things) -
    e.g. in the context of "linewise looping through the bytes, whilst using InstrB",
    because this is not really a common task for most devs...
    Fair enough.

    Quote Originally Posted by Schmidt View Post
    If you think you can write this up for me instead, then I'd appreciate it.
    I'm not entirely sure it's a good idea to commit to this method. Reading the file as a byte array and using Instr to search may not be the best approach. I mean OP hasn't provided enough information. In my mind, the most important question is whether OP is dealing strictly with 1 byte per character ANSI Strings. We have to be very careful about treating Strings as byte arrays so casually if different String encodings are involved, even if it's just UTF-8. I'd prefer to hold off on providing a concrete example of anything until OP provides all the details.

    OP alone knows what he is doing and perhaps baka's suggestion is exactly what he needs, we don't know. But if it's not then I'd prefer to see what code he already has so we can try to figure out why it's not performing the way he expects.

  17. #17
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,444

    Re: Fast replace in a large string

    Maybe a usecase for my CodeBank-Entry? --> http://www.vbforums.com/showthread.p...ght=ReplaceAny
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  18. #18
    Hyperactive Member
    Join Date
    Mar 2019
    Posts
    426

    Re: Fast replace in a large string

    Quote Originally Posted by qvb6 View Post
    Use InStr with vbBinaryCompare. It's much faster than without.
    Is that not the default compare mode?

  19. #19

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: Fast replace in a large string

    Quote Originally Posted by Niya View Post
    OP alone knows what he is doing and perhaps baka's suggestion is exactly what he needs, we don't know. But if it's not then I'd prefer to see what code he already has so we can try to figure out why it's not performing the way he expects.
    I didn't provide example data or existing code because I don't want anyone to code for me.
    I was looking for a fresh idea.
    And the discussion enlightened me.

    All the byte array stuff is not the solution.
    The large string is already in memory.

    In the large string, there is a unique describing line (a marker) before the data to be changed.
    This way I know the next line # which holds the data.

    Now I do it differently:
    I detect the position of the marker in the whole string.
    Knowing this position, I go back and forth in the string to find the CRLFs.
    Knowing the positions of CRLF, a bit left$ and right$ isolates the static parts.

    Solved then with no major effort.
    I'll post example code when double-checked and finished.

  20. #20
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    706

    Re: Fast replace in a large string

    Quote Originally Posted by vbwins View Post
    Is that not the default compare mode?
    You are right. I meant to say that vbBinaryCompare is faster than vbTextCompare. vbSpeed shows it to be at least 8 times faster.
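
    The two modes also differ in what they find, not only in speed. A small sketch with a hypothetical demo Sub: binary compare is case-sensitive, text compare is not.
    Code:
    Option Explicit

    'Hypothetical demo: same search, two compare modes.
    Public Sub CompareModesDemo()
        Debug.Print InStr(1, "Hello WORLD", "world", vbBinaryCompare)  'prints 0, case must match
        Debug.Print InStr(1, "Hello WORLD", "world", vbTextCompare)    'prints 7, case is ignored
    End Sub
    Without the fourth argument InStr follows the module's Option Compare setting, which is binary unless Option Compare Text is declared.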

  21. #21
    Fanatic Member
    Join Date
    Feb 2019
    Posts
    706

    Re: Fast replace in a large string

    Quote Originally Posted by Karl77 View Post
    Knowing this position, I go back and forth in the string to find the CRLFs.
    Knowing the positions of CRLF, a bit left$ and right$ isolates the static parts.
    Is that marker at the beginning of a line? If so, you could do something like this:

    pos = InStr(s, vbCrLf & "[Config]")

    I used the above to make my own Unicode INI file parser, rather than relying on the OS. In the case above, I was looking for a [Config] section just after a new line, so there is no reason to search backward. VB6 has InStrRev to search back for the previous line if you need it.
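
    If the marker sits somewhere inside its line, a backward search could look like the sketch below. The helper name is hypothetical, and it assumes lPos is a valid position inside s (1 to Len(s)).
    Code:
    Option Explicit

    'Hypothetical helper: returns the 1-based position where the line holding
    'character lPos starts (1 if it is the first line).
    Public Function LineStart(ByRef s As String, ByVal lPos As Long) As Long
        Dim p As Long
        'InStrRev only reports matches that end at or before its start position
        If lPos > 1 Then p = InStrRev(s, vbCrLf, lPos - 1)
        If p = 0 Then LineStart = 1 Else LineStart = p + 2   'skip the CRLF itself
    End Function
    With s = "a=1" & vbCrLf & "[Config]" & vbCrLf & "x=2", LineStart(s, InStr(s, "=2")) gives the position of the "x".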

  22. #22

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: Fast replace in a large string

    Solved.
    It is really very simple.
    Fast and easy.
    If someone knows a significantly faster way, tell me.


    Code:
    m = "123" & vbCrLf & "blablabla\999: time\blablabla" & vbCrLf & "changeme" & vbCrLf & "\end\"
    
    'find the marker
    Posi = InStr(m, ": time\")
    If Posi = 0 Then
        Exit Function
    End If
    
    'now we have the marker position
    'find the next CRLF
    MLen = Len(m)
    For i = Posi To MLen
        temp = Mid$(m, i, 1)
        If temp = vbLf Then
            Lpos = i
            Exit For
        End If
    Next
    LeftPart = Left$(m, Lpos)
    'and the next one
    For i = Lpos + 1 To MLen
        temp = Mid$(m, i, 1)
        If temp = vbLf Then
            Rpos = i
            Exit For
        End If
    Next
    
    If Rpos = 0 Then 'then we are at the end of the string
        MidPart = Mid$(m, Lpos + 1, MLen)
        Nullpos = InStr(MidPart, Chr(0))
        If Nullpos > 0 Then MidPart = Left$(MidPart, Nullpos - 1)
        RightPart = vbNullString
    Else
        MidPart = Mid$(m, Lpos + 1, Rpos - Lpos - 2)
        RightPart = Mid$(m, Rpos + 1) 'start after the LF at Rpos; NewVal already ends with CRLF
    End If
    
    'update the middle part
    ...
    ...
    NewVal = NewVal & vbCrLf
    
    m = LeftPart & NewVal & RightPart

  23. #23
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,731

    Re: [RESOLVED] Fast replace in a large string

    you can use another InStr to find CrLf, and another to find the 2nd CrLf
    using: InStr(Posi, m, vbLf)
    when you got all 3 positions, you know what to do.

  24. #24

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by baka View Post
    you can use another InStr to find CrLf, and another to find the 2nd CrLf
    using: InStr(Posi, m, vbLf)
    That is a very good idea.
    Thanks for the hint!

  25. #25
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,444

    Re: Fast replace in a large string

    Quote Originally Posted by Karl77 View Post
    Solved.
    It is really very simple.
    Fast and easy.
    If someone knows a significantly faster way, tell me.
    Uhh... besides checking every single character and throwing temp. strings around in memory?
    btw: faster than what?

  26. #26
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,731

    Re: [RESOLVED] Fast replace in a large string

    this is about InStrB just to explain a bit about it.

    If you have a byte array, from a file, or a memory region you want to look into.

    sBuffer() as byte
    is where we have the data

    and
    Find() as byte
    is the specific pattern to look for

    It could be that this is not enough; maybe we know the pattern is "RZ1", then 3 unknown bytes "???", then 3 known "YRA".

    what we do is put this in a loop

    Code:
    lpos = InStrB(1 + lpos, sBuffer, Find, vbBinaryCompare)
    and if lpos > 0
    we can check more, example: If sBuffer(lpos + 2) = Data(0) Then

    This example is from a "working" program that uses ReadProcessMemory to look into the uncompressed data in memory
    and get the dimensions of a Flash/SWF, since there's no property in the Flash component that can do that.

  27. #27
    Hyperactive Member Daniel Duta's Avatar
    Join Date
    Feb 2011
    Location
    Bucharest, Romania
    Posts
    397

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by Karl77 View Post
    Solved.
    It is really very simple.
    Fast and easy.
    If someone knows a significantly faster way, tell me.
    In my opinion, if we talk about a robust solution, Olaf's method for parsing huge files is hard to beat in VB6 in terms of speed. It is based on the ICSVCallback_NewValue function, and a StringBuilder is used as a buffer that is "released" to disk after each 1 MB. I know there are many approaches, but the mechanism from this post http://www.vbforums.com/showthread.p...les&highlight= helped me to parse a file with 22 million rows (740 MB) in 18 seconds. Moreover, it is not third-party dependent as you might think at first sight, due to a regfree capability included in the DirectCOM library.
    "VB code is practically pseudocode" - Tanner Helland
    "When you do things right, people won't be sure you've done anything at all" - Matt Groening
    "If you wait until you are ready, it is almost certainly too late" - Seth Godin
    "Believe nothing you hear, and only one half that you see" - Edgar Allan Poe

  28. #28
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,601

    Re: [RESOLVED] Fast replace in a large string

    I think OP's pretty much nailed it. After he incorporates baka's suggestion of using Instr to find the carriage return/Line feeds, I think it's safe to say his method is adequate. There are ways of optimizing further but if OP is satisfied, it's just over engineering at this point.

  29. #29
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,444

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by Niya View Post
    I think OP's pretty much nailed it. After he incorporates baka's suggestion of using Instr to find the carriage return/Line feeds, I think it's safe to say his method is adequate. There are ways of optimizing further but if OP is satisfied, it's just over engineering at this point.
    I agree to a point, but consider the subject of the thread: "Fast" replace in a large string.
    I don't see, how comparing each character or using InStr/InStrB can be fast, especially since the OP is using temp. string-variables throwing around in memory.
    I've just read Daniel's Link on Olaf's solution, and i don't think that we have to discuss Olaf's skills.

    Yeah, the goal of programming is:
    1) Make it work!
    2) If 1) then make it faster without breaking the working solution
    3) If 1) and 2) then Make it right! (aka Handling of "IT Layer 8"-scenarios)

    At least, that's my philosophy (discussion of the order of my 3 steps notwithstanding).

  30. #30

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by Zvoni View Post
    consider the subject of the thread: "Fast" replace in a large string.
    I don't see, how comparing each character or using InStr/InStrB can be fast, especially since the OP is using temp. string-variables throwing around in memory.
    I didn't measure exactly, but I think the solution is fast enough.
    Instr is used 3 times.
    The comparison of single characters (as in my snippet) is gone.
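
    Roughly like this, as a minimal sketch and not my exact code. The function name and parameters are made up for illustration; it assumes CRLF line ends and that the data line directly follows the marker line.
    Code:
    Option Explicit

    'Hypothetical helper: replaces the line that follows the first line containing sMarker.
    'Returns False (and leaves s untouched) if the marker or the end of its line is missing.
    Public Function ReplaceLineAfterMarker(ByRef s As String, ByRef sMarker As String, _
                                           ByRef sNewLine As String) As Boolean
        Dim lMark As Long, lEol1 As Long, lEol2 As Long

        lMark = InStr(1, s, sMarker, vbBinaryCompare)          '1st InStr: the marker
        If lMark = 0 Then Exit Function
        lEol1 = InStr(lMark, s, vbCrLf, vbBinaryCompare)       '2nd InStr: end of the marker line
        If lEol1 = 0 Then Exit Function                        'marker is on the last line
        lEol2 = InStr(lEol1 + 2, s, vbCrLf, vbBinaryCompare)   '3rd InStr: end of the data line
        If lEol2 = 0 Then lEol2 = Len(s) + 1                   'data line is the last line

        'static left part incl. its CRLF, then the new line, then the rest starting at the next CRLF
        s = Left$(s, lEol1 + 1) & sNewLine & Mid$(s, lEol2)
        ReplaceLineAfterMarker = True
    End Function
    The concatenation still copies the whole string once, but nothing is split, joined or compared character by character.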

    The temp$ could be avoided somehow, perhaps.
    The average data amount is around 5MB (1k to 10MB) per execution.
    The temp$ are cleared after execution.
    No memory wasted.

    My coding approaches are simple:

    1) make it work reliably
    2) keep it simple, readable and understandable
    3) spot the bottlenecks
    4) make it as fast as necessary - not as possible

    In my case I was at 3) and 4).
    Which is solved now.

    I could work further on the performance.
    What for?
    It won't have a major impact on the overall performance of the real app function.

    Imagine this special 'fast replace' task takes 6msec.
    Then I optimize it to 3msec.

    Now let's execute it 20x.
    I won 60msec.

    Very good.
    But I won't notice...

  31. #31

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: [RESOLVED] Fast replace in a large string

    This discussion helped me to get the simple idea.

    ALL comments were helpful.
    Also baka's short hints.
    Which I followed partially.

    So thanks to all of you.

  32. #32
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,601

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by Zvoni View Post
    I agree to a point, but consider the subject of the thread: "Fast" replace in a large string.
    I don't see, how comparing each character or using InStr/InStrB can be fast, especially since the OP is using temp. string-variables throwing around in memory.
    I've just read Daniel's Link on Olaf's solution, and i don't think that we have to discuss Olaf's skills.

    Yeah, the goal of programming is:
    1) Make it work!
    2) If 1) then make it faster without breaking the working solution
    3) If 1) and 2) then Make it right! (aka Handling of "IT Layer 8"-scenarios)

    At least, that's my philosophy (discussion of the order of my 3 steps notwithstanding).
    If he is satisfied with it, I don't see any need to optimize it further. Over engineering and premature optimization are huge time wasters when you need to get something up and running quickly.

  33. #33

    Thread Starter
    Fanatic Member
    Join Date
    Apr 2015
    Posts
    524

    Re: Fast replace in a large string

    Quote Originally Posted by Shaggy Hiker View Post
    This data sounds like a real dogs dinner.
    ROFL
    houndred... stupid me...

  34. #34
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,259

    Re: [RESOLVED] Fast replace in a large string

    Quote Originally Posted by baka View Post
    this is about InStrB just to explain a bit about it.

    If you have a byte array, from a file, or a memory region you want to look into.

    sBuffer() as byte ... is where we have the data

    Find() as byte ... is the specific pattern to look for
    Glad you posted at least something which resembles a real code-example...

    Quote Originally Posted by baka View Post
    ...what we do is put this in a loop

    Code:
    lpos = InStrB(1 + lpos, sBuffer, Find, vbBinaryCompare)
    Because your line above shows exactly the naive approach I've expected.

    It's wrong to write it like that... (and wrong to suggest to others, to use it like that).

    See, when you use it as you suggested - in a loop - then the approach of:
    - passing ByteArrays directly
    - without prior String-conversion into InstrB
    Is a huge performance-hog.

    The code-example below shows, how huge the performance-gap
    to "normal Instr-usage with normal WChar-BStrings" can be.
    (InstrB being about factor 1000 slower than normal Instr).

    Code:
    Option Explicit
    
    Private Declare Function QueryPerformanceFrequency& Lib "kernel32" (x@)
    Private Declare Function QueryPerformanceCounter& Lib "kernel32" (x@)
    
    Private sLogFileInput As String, bLogFileInput() As Byte, T@
    
    Private Sub Form_Load() 'prepare some simulated LogFile-Input in a String
      Dim i As Long, S(1 To 12000) As String, D As Date: D = Now
      For i = 1 To UBound(S)
        S(i) = D + i / 86400 & " ... some longer LogEntry-Line with some leading Timestamp... " & i
      Next
      sLogFileInput = Join(S, vbCrLf) 'the simulated Log-Content as String-Input
      bLogFileInput = StrConv(sLogFileInput, vbFromUnicode) 'and here the same content in a ByteArray
    End Sub
     
    Private Sub Form_Click()
      Print vbLf; "Input-Len: "; Format(Len(sLogFileInput) / 1024 ^ 2, "0.0MB")
      
      DoEvents: T = MsecTimer
      Print "ReadLines-InstrS: "; ReadLinesInstrS(sLogFileInput), Format(MsecTimer - T, "0.00msec")
      
      DoEvents: T = MsecTimer
      Print "ReadLines-InstrB: "; ReadLinesInstrB(bLogFileInput), Format(MsecTimer - T, "0.00msec")
    End Sub
    
    Function ReadLinesInstrS(sInput As String) As Long 'line-wise looping (returning the line-count)
      Dim Pos1 As Long, Pos2 As Long, CurLine As String
      
      Do While Pos2 < Len(sInput)
        Pos2 = InStr(Pos2 + 1, sInput, vbCrLf)
        If Pos2 = 0 Then Pos2 = Len(sInput) + 1
        
    '     cut-out the current line (in case that's needed)
    '     CurLine = Mid$(sInput, Pos1 + 1, Pos2 - Pos1 - 1)
        
        Pos1 = Pos2 + 1
        ReadLinesInstrS = ReadLinesInstrS + 1
      Loop
    End Function
    
    Function ReadLinesInstrB(bInput() As Byte) As Long 'line-wise looping (returning the line-count)
      Dim Find() As Byte: Find = StrConv(vbCrLf, vbFromUnicode)
      Dim Pos1 As Long, Pos2 As Long, CurLine As String
      Do While Pos2 < UBound(bInput) + 1
        Pos2 = InStrB(Pos2 + 1, bInput, Find)
        If Pos2 = 0 Then Pos2 = UBound(bInput) + 2
    
    '     cut-out the current line (in case that's needed)
    '     CurLine = StrConv(MidB$(bInput, Pos1 + 1, Pos2 - Pos1 - 1), vbUnicode)
    
        Pos1 = Pos2 + 1
        ReadLinesInstrB = ReadLinesInstrB + 1
      Loop
    End Function
    
    Function MsecTimer() As Currency 'a Timing-Helper
      Dim c@, frq@
      QueryPerformanceFrequency frq
      If QueryPerformanceCounter(c) Then MsecTimer = CCur(c / frq) * 1000@
    End Function
    FWIW, here's the corrected version of the InstrB-approach
    (now being faster than the "normal Instr").

    Code:
    Function ReadLinesInstrB(bInput() As Byte) As Long 'line-wise looping (returning the line-count)
      Dim sInp As String: sInp = bInput 'place the Bytes in a BString (to avoid implicit conversions)
      Dim sFnd As String: sFnd = StrConv(vbCrLf, vbFromUnicode) 'same here, ANSI-content goes directly into a String
      Dim Pos1 As Long, Pos2 As Long, CurLine As String
    
      Do While Pos2 < LenB(sInp)
        Pos2 = InStrB(Pos2 + 1, sInp, sFnd) 'now InstrB will not have to perform implicit conversions
        If Pos2 = 0 Then Pos2 = LenB(sInp) + 1
        
        'cut-out the current line (in case that's needed)
        'CurLine = StrConv(MidB$(sInp, Pos1 + 1, Pos2 - Pos1 - 1), vbUnicode)
    
        Pos1 = Pos2 + 1
        ReadLinesInstrB = ReadLinesInstrB + 1
      Loop
    End Function

    Olaf
    Last edited by Shaggy Hiker; Mar 27th, 2020 at 08:20 AM. Reason: Removed overly snide comment.

  35. #35
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,047

    Re: [RESOLVED] Fast replace in a large string

    A suggested approach need not include the optimal means of implementing such an approach. That happens all the time. Please leave personal animus out of it.
