[RESOLVED] Memory allocation issues
I've encountered memory allocation issues when trying to open a 50 MB text file in a TextBox (Multiline). I've looked at the project properties and the TextBox properties, but I was unable to find an appropriate option.
For example: I tried to open a 45 MB file in my app, but memory allocation went up to 2,950 MB and then my system crashed with an out of memory error. Is there a way to allocate (dynamically, if possible) more memory (e.g. 600 MB or so)?
Re: Memory allocation issues
You shouldn't need to. Memory allocation is handled by the program negotiating with the OS, and that's all that is usually required. In this case, you seem to be allocating FAR more than the file size. Most likely, there is something about your code that is causing you to make multiple copies of the file until the program crashes. Without seeing code, though, that's just a guess, but why else would you be allocating what appears to amount to roughly 65x the size of your file?
Re: Memory allocation issues
My code is simple: it performs a hexadecimal conversion of the specified file.
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim bytes As Byte() = IO.File.ReadAllBytes(Form1.TextBox1.Text)
    Dim hex As String() = Array.ConvertAll(bytes, Function(b) b.ToString("X2"))
    RichTextBox1.Text = (String.Join("", hex))
End Sub
I've tried to open another file, this one 30 MB. After 8 seconds, memory allocation went up to 3,150 MB.
Re: Memory allocation issues
OK, so on the face of it you are creating three copies, two of which are at least twice the size of the byte array. First you read all the bytes into an array, which can't be smaller than the file. You then convert that byte array to a string array. That doesn't remove the byte array, it's still there, but now you have a second array of strings, which also can't be smaller than the original file. A byte is a byte, but a byte represented as a hex string has to be two characters, and .NET strings are UTF-16, so each character takes two bytes. The character data alone is therefore four times the size of the byte array, before counting the overhead of each individual string object. Finally, you join that string array into one giant string and put it in the RTB. That joined string can't be smaller than the character data in the string array. In fact, you may be creating two different strings in this step: the Join may create a string that is then copied into the RTB, in which case there's a fourth, temporary, intermediate copy.
That's a fair amount of memory being used up. The temporary objects, which are the byte array and the original string array, will not be destroyed, but they should be flagged for destruction once the method completes. While the method is running, you'd be using a HUGE amount of memory, but once that method is finished, it should be possible to recover that memory.
The clear solution is to avoid all the intermediates. I'm not sure what would be efficient for doing that, but ReadAllBytes seems like it would be the wrong way to start out. If you were to read the file byte by byte, you could convert each byte as it comes in. If you did that, and attached each to a string, that would be truly horrible, since strings are immutable, so each byte would require a new string as long as the old one plus the size needed for the next byte. That would be a staggering amount of memory. However, using a StringBuilder, which is essentially a mutable string, should remove that cost.
It's quite likely that others can suggest a more efficient approach than that, too. It's not a problem I have ever tried to solve. I can lay out the cost of the way you are doing it, simple though it is, but what the most efficient alternative is I can't say.
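For what it's worth, here is a minimal sketch of the "convert each byte as it comes in" idea with a StringBuilder. The module name, the FileToHex helper, and the 64 KB buffer size are assumptions made for illustration, not anything from the original code, and it reads in chunks rather than literally one byte at a time. It still builds one big string at the end, so it only removes the byte array and the per-byte string array from the picture.

```vbnet
Imports System.IO
Imports System.Text

Module HexSketch
    ' Hypothetical helper: converts a file to hex text without ReadAllBytes
    ' and without creating one small String per byte.
    Function FileToHex(path As String) As String
        Const HexDigits As String = "0123456789ABCDEF"
        ' 64 KB read buffer (arbitrary size chosen for this sketch).
        Dim buffer(64 * 1024 - 1) As Byte
        Dim sb As New StringBuilder()
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            Dim read As Integer = fs.Read(buffer, 0, buffer.Length)
            While read > 0
                For i As Integer = 0 To read - 1
                    ' High nibble first, then low nibble.
                    sb.Append(HexDigits(buffer(i) >> 4))
                    sb.Append(HexDigits(buffer(i) And &HF))
                Next
                read = fs.Read(buffer, 0, buffer.Length)
            End While
        End Using
        Return sb.ToString()
    End Function
End Module
```

Usage would be something like RichTextBox1.Text = FileToHex(path). Filling the control can still be slow for large files, so this addresses memory only.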
Re: Memory allocation issues
Quote:
Originally Posted by
Shaggy Hiker
That's a fair amount of memory being used up. The temporary objects, which are the byte array and the original string array, will not be destroyed, but they should be flagged for destruction once the method completes. While the method is running, you'd be using a HUGE amount of memory, but once that method is finished, it should be possible to recover that memory.
I was recently reading an article about Strings by someone who seems to know: see here (under the heading Memory Usage - the article title is about C# but that doesn't make any difference because it's really about .Net). Apparently each string carries an overhead of 20 bytes, so a 2 character string occupies 24 bytes. Since the code in post #3 tries to generate a 2 character String for each of the 30M or so bytes in the source file, the runtime is going to have to store 30M*24 bytes or roughly 720 Mbytes of string somewhere before it can create the Hex array. Assuming it needs as much again to create the array, that's 1,440 Mbytes needed. It doesn't explain the 3,150 Mbytes; presumably that's because the Garbage Collector doesn't clear up reserved memory until it needs to. But it's not surprising that the code doesn't work very well.
Out of curiosity, I just did a quick test on a 50 MByte file using a For Each loop and a StringBuilder. It took about 4.5 seconds to build the string. Feeding it into a RichTextBox took about 10 times as long as that, 45 seconds, but there were no signs of memory distress. For what it's worth, here's the testing code I used:
Code:
Using ofd As New OpenFileDialog
    If ofd.ShowDialog = Windows.Forms.DialogResult.OK Then
        Dim sw As Stopwatch = Stopwatch.StartNew
        Dim bytes As Byte() = IO.File.ReadAllBytes(ofd.FileName)
        Dim sb As New System.Text.StringBuilder()
        For Each b As Byte In bytes
            sb.Append(b.ToString("X2"))
            sb.Append("; ")
        Next
        Dim str As String = sb.ToString
        sw.Stop()
        Console.WriteLine("Time to create " & str.Length & " character String: " & sw.ElapsedMilliseconds & " ms.")
        sw.Restart()
        RichTextBox1.Text = str
        sw.Stop()
        Console.WriteLine("Time to fill RichTextBox: " & sw.ElapsedMilliseconds & " ms.")
    End If
End Using
I was lazy enough to use ReadAllBytes. I'm not sure whether reading the file byte by byte would offer any advantage in the above code. The fact that it took 45 seconds to load the RTB suggests that there might be a better way to display data of interest.
BB
Re: Memory allocation issues
On my machine BB's code ran fine for 50 MB files, but not for 100 MB files. I created some code for 100 MB files and it displays, but in other parts of the program (resizing the RTB, for example) I get OOM errors.
Can't think of a reason to show large files in a text box of any kind IMHO. If you are trying to do an editor you could do this in blocks.
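As a rough sketch of the "in blocks" idea: read only the slice of the file that is currently on screen and convert just that. The ReadHexBlock helper, its parameters, and the 4096-byte page size mentioned below are assumptions for illustration only.

```vbnet
Imports System.IO
Imports System.Text

Module BlockViewSketch
    ' Hypothetical helper: returns the hex text for one block of the file,
    ' starting at the given byte offset.
    Function ReadHexBlock(path As String, offset As Long, count As Integer) As String
        Dim buffer(count - 1) As Byte
        Dim total As Integer = 0
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            fs.Seek(offset, SeekOrigin.Begin)
            ' Read may return fewer bytes than requested, so loop until the
            ' block is full or the end of the file is reached.
            Dim n As Integer = fs.Read(buffer, 0, count)
            While n > 0
                total += n
                n = fs.Read(buffer, total, count - total)
            End While
        End Using
        ' Two hex digits plus one space per byte.
        Dim sb As New StringBuilder(total * 3)
        For i As Integer = 0 To total - 1
            sb.Append(buffer(i).ToString("X2")).Append(" "c)
        Next
        Return sb.ToString()
    End Function
End Module
```

A paging UI could then do RichTextBox1.Text = ReadHexBlock(path, pageIndex * 4096L, 4096) whenever the user moves to another page, so the control never holds more than one block of text.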
Re: Memory allocation issues
That's a good point. Since BB showed that the cost of putting the text into the RTB was really high, and since you probably can't reasonably view a 50MB file (how many days do you want to be reading that?), there is probably room for optimization there.
Re: Memory allocation issues
My "test code" wasn't particularly accurate. See the edited code (marked red) in post #5. It was more correct to include sb.ToString in the first timing block. The file I was using for test purposes (actually 46,930,988 bytes) now gave these results:
-- generate string was slightly slower, 4.6 to 4.7 instead of about 4.5 sec.
-- display in RTB was considerably quicker, just over 32 s. instead of 45 sec.!
It just shows... well what?
Incidentally, I tried the code on a 233 Mbyte file and the results were 23 sec. to generate the string and 3 minutes to display it, with no errors, on a so-called Zen ultrabook with Win10, i7, 8 GB. Ah, computers can be so fickle...
EDIT: I just tried a 400 MB file and got an OutOfMemory error on StringBuilder.ToString. Voilà.
BB
Re: Memory allocation issues
@BB - what version of VS? I get this when I close the form. Running VS2017 here.
System.OutOfMemoryException
HResult=0x8007000E
Message=Exception of type 'System.OutOfMemoryException' was thrown.
Source=mscorlib
StackTrace:
at System.String.InternalSubString(Int32 startIndex, Int32 length)
at System.String.Substring(Int32 startIndex, Int32 length)
at System.Windows.Forms.RichTextBox.StreamOut(Int32 flags)
at System.Windows.Forms.RichTextBox.get_Rtf()
at System.Windows.Forms.RichTextBox.OnHandleDestroyed(EventArgs e)
at System.Windows.Forms.Control.WmDestroy(Message& m)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.TextBoxBase.WndProc(Message& m)
at System.Windows.Forms.RichTextBox.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
Re: Memory allocation issues
VS2013. I just tried it in VS2017 and 100 MB worked OK. But 230 MB threw an exception in VS2017 while trying to load the RTB:
Code:
Managed Debugging Assistant 'ContextSwitchDeadlock' : 'The CLR has been unable to transition
from COM context 0x88bd40 to COM context 0x88bc18 for 60 seconds. The thread that owns the
destination context/apartment is most likely either doing a non pumping wait or processing a very
long running operation without pumping Windows messages. This situation generally has a negative
performance impact and may even lead to the application becoming non responsive or memory
usage accumulating continually over time. To avoid this problem, all single threaded apartment
(STA) threads should use pumping wait primitives (such as CoWaitForMultipleHandles) and
routinely pump messages during long running operations.'
Does that mean anything to you? I refrain from pumping wait primitives; just getting out of bed is more than enough for me nowadays.
BB
Re: Memory allocation issues
That just means that something is taking so long that the CLR is getting nervous. You pretty much know what that is, too, because putting the data into the RTB is taking FOREVER in computer time.
Re: Memory allocation issues
I've tried this code on a 200 KB file, but it takes too long to display (3 minutes and still nothing).
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim bytes As Byte() = IO.File.ReadAllBytes("C:\users\acer\desktop\file.txt")
    Dim sb As New System.Text.StringBuilder()
    For Each b As Byte In bytes
        sb.Append(b.ToString("X2"))
        sb.Append(" ")
    Next
    Dim str As String = sb.ToString
    RichTextBox1.Text = str
End Sub
Private Sub Button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button3.Click
    GC.Collect()
End Sub
Also, forcing the garbage collector only freed 10 KB, as I saw in taskmgr. But freeing unused memory AFTER loading the text isn't that important for now; the problem is that it takes too long TO DISPLAY. The problem persists even if I simply want to load a text file without any additional operations. I have the same problem with Notepad. I don't know if these two problems are the same, but most likely yes. How can it be resolved? I thought about resetting Notepad's registry settings to default, but if there's a simpler way, I'm listening...
Re: Memory allocation issues
Have you tried using a Stopwatch to time how long the various parts take: reading the file, building the string, or updating the RichTextBox?
Have you tried just appending directly to the RichTextBox control rather than building a string and assigning it to the text box in one go?
I wouldn't rely on taskmgr to give insights into .Net memory usage either; even if the GC frees up memory, it might not be released back to Windows anyway.
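Something along these lines could do the timing, and it reports the managed heap size from inside the process instead of relying on taskmgr. The Measure helper and its output format are made up for this sketch. GC.GetTotalMemory only reports managed memory, so whatever the RichTextBox allocates natively will not show up in the figure.

```vbnet
Imports System.Diagnostics

Module TimingSketch
    ' Hypothetical helper: runs one step and reports how long it took and
    ' how much the managed heap changed while it ran.
    Function Measure(label As String, work As Action) As String
        Dim before As Long = GC.GetTotalMemory(False)
        Dim sw As Stopwatch = Stopwatch.StartNew()
        work()
        sw.Stop()
        Dim after As Long = GC.GetTotalMemory(False)
        Return String.Format("{0}: {1} ms, managed heap change {2:N0} bytes",
                             label, sw.ElapsedMilliseconds, after - before)
    End Function
End Module
```

Each stage can then be wrapped separately, for example Console.WriteLine(Measure("Fill RTB", Sub() RichTextBox1.Text = str)), and the same for reading the file and building the string.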
Re: Memory allocation issues
Quote:
Originally Posted by
PlausiblyDamp
Have you tried just appending directly to the RichTextBox control rather than building a string and assigning it to the text box in one go?
Now it works as expected! Well, I had always used the ReadAllBytes command instead of ReadAllText. My data is only in text form. No memory issues now; it loads a 10 MB text file pretty fast (5 seconds). Oh, how simple...