Dynamic arrays: what do you think about this scheme?
I have been thinking about the most convenient way to use dynamic arrays so that memory is used efficiently. The following code is an example borrowed from an application I'm working on. Aside from the fact that it doesn't care about the total amount of computer memory, it seems to work very nicely.
However, I'd like to hear the gurus' opinion: is it formally correct or does it have some conceptual flaw that I have overlooked?
Many 'gurus' would tell you that using ReDim is bad. But I'm not one of them. Without studying it too closely, your code looks fine to me.
It's been my experience when reading large files into arrays, that ReDimming an array for each iteration through a read loop is actually more efficient than initialising the array to some arbitrary number (in your case 1000). Weird but true.
The only thing I'd change would be to start the array at zero instead of one, and do this:
VB Code:
Do While Not EOF(ff)
    Line Input #ff, TxtLin
    ' ReDim for each iteration; nl starts at 0, so the arrays are zero-based
    ReDim Preserve x(nl), y(nl)
    ' ... parse TxtLin into x(nl) and y(nl) here ...
    nl = nl + 1
Loop
However, don't take my word for it. I'd do some simple benchmarks comparing each method and see which yields the best performance.
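A benchmark along those lines could be as simple as the sketch below. It times only the array growth, not the file reading, so on its own it says nothing about where the time goes in a real load loop. N, ChunkSize and the procedure name CompareReDimStrategies are made up for illustration, and Timer has limited resolution and resets at midnight, so treat the numbers as rough.

```vb
Option Explicit

' Hypothetical values: N elements in total, grown ChunkSize at a time
Private Const N As Long = 500000
Private Const ChunkSize As Long = 1000

Public Sub CompareReDimStrategies()
    Dim x() As Double
    Dim i As Long
    Dim t As Single

    ' Strategy 1: ReDim Preserve on every iteration
    t = Timer
    For i = 0 To N - 1
        ReDim Preserve x(i)
        x(i) = i
    Next i
    Debug.Print "ReDim every iteration: "; Format$(Timer - t, "0.00"); " s"

    ' Strategy 2: ReDim Preserve only when the current block is full
    Erase x
    t = Timer
    ReDim x(ChunkSize - 1)
    For i = 0 To N - 1
        If i > UBound(x) Then ReDim Preserve x(UBound(x) + ChunkSize)
        x(i) = i
    Next i
    ReDim Preserve x(N - 1)   ' trim any unused elements at the end
    Debug.Print "ReDim in blocks:       "; Format$(Timer - t, "0.00"); " s"
End Sub
```

Run it from the Immediate window with CompareReDimStrategies, preferably several times and also as a compiled EXE, since results in the IDE can differ.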
Cheers
Pete
No trees were harmed in the making of this post, however a large number of electrons were greatly inconvenienced.
Originally posted by pnish:
"It's been my experience when reading large files into arrays, that ReDimming an array for each iteration through a read loop is actually more efficient than initialising the array to some arbitrary number (in your case 1000). Weird but true."
Do you mean efficient in terms of time? Weird, indeed... I'll have to try some benchmarking as you suggest.
Btw, is there some efficiency reason for starting the array at 0? Generally speaking I tend to start arrays at 1 because I like the upper bound to be the same as the number of elements in the array (just a matter of personal taste).
Thanks for your answer + have a good day.
Last edited by krtxmrtz; May 21st, 2003 at 04:21 AM.
Originally posted by krtxmrtz:
"Do you mean efficient in terms of time? Weird, indeed... I'll have to try some benchmarking as you suggest.
Btw, is there some efficiency reason for starting the array at 0? Generally speaking I tend to start arrays at 1 because I like the upper bound to be the same as the number of elements in the array (just a matter of personal taste)."
Yes, efficient time-wise. I initially had a requirement to read the contents of an 11 MB file into an array (about 500,000 records). ReDimming after each read loaded the file in around 90 seconds. Like you, I thought it would be better to initialise the array to some relatively large number and ReDim it as required. I tried it and it actually took longer!? Don't ask me why.
There's no efficiency reason (that I'm aware of) for starting an array at 0. It's the way I've always done it and it's the only way in most other languages. A good reason to get into the habit now is that VB.Net doesn't give you a choice. All arrays are zero-based. I guess it's only important if you're planning to go VB.Net at some time; otherwise, as you say, it's a matter of personal taste.
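If the lower bound might change later, loops written against LBound and UBound keep working either way. A minimal sketch, with SumAll and DemoBounds as hypothetical names that are not part of anyone's posted code:

```vb
Option Explicit

' Sums every element without assuming the array's lower bound
Public Function SumAll(values() As Double) As Double
    Dim i As Long
    Dim total As Double
    For i = LBound(values) To UBound(values)
        total = total + values(i)
    Next i
    SumAll = total
End Function

Public Sub DemoBounds()
    Dim oneBased(1 To 3) As Double
    Dim zeroBased(0 To 2) As Double
    Dim i As Long

    For i = 1 To 3
        oneBased(i) = i          ' holds 1, 2, 3
        zeroBased(i - 1) = i     ' holds 1, 2, 3
    Next i

    ' The same function handles both layouts; each sum is 6
    Debug.Print SumAll(oneBased), SumAll(zeroBased)

    ' Element count without assuming a lower bound
    Debug.Print UBound(oneBased) - LBound(oneBased) + 1
End Sub
```

Code written this way should need fewer changes if the arrays are later re-dimensioned from 1-based to 0-based.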
Cheers & good luck
Pete
Originally posted by pnish:
"A good reason to get into the habit now is that VB.Net doesn't give you a choice. All arrays are zero based."
That's bad news for me. I acquired the habit of starting at 1 simply because that's the way it is in Fortran, which I still use for some very special applications.
So, if I ever leap onto VB.Net I'll have to rewrite all my 1-based array software. It's not going to be as simple as just changing the dimensions; I'll have to re-think a large number of for/next and similar loops... Oh my!
Thanks for this interesting info. Good luck to you too.
You could do it using collections...
I think it may be a bit faster for large numbers of points as the ReDim statement can slow your array method down...
In the class Points, you can change the LoadData function to load x and y co-ordinates from your file...
Using classes is more versatile...personally I am not a big fan of arrays...
Hope this helps...
If you have any questions, then give me a shout...
Originally posted by Wokawidget:
"You could do it using collections...
I think it may be a bit faster for large numbers of points as the ReDim statement can slow your array method down..."
As Woka says, you could accomplish the same thing using collections, but they're definitely not faster than using arrays, especially if you have a large number of data points.
ReDim certainly does slow things down a bit, but it's still a good way to do it when you don't know how many data points you're loading.
In my opinion, collections are more useful when working with objects unless you're working with a small set of data.
Another point in favour of arrays is that it's extremely fast to iterate through them, whereas it's many times slower to do the same with a collection.
As I've said before, don't take my word for it. Do some tests and then decide which is best.
I don't mean to rain on your parade Woka, but I think arrays are a better choice for the type of app krtxmrtz is doing.
Pete
I completely agree with you. They probably are faster...although, having...
VB Code:
ReDim Preserve x(1 To MaxNum), y(1 To MaxNum)
...in the loop will start to slow down considerably if you're loading loads of data.
There is one other benefit of classes, and that's the fact that they can be easily customised and functionality added...using a UDT or an array doesn't give you this option...
Going back to the speed thing...if arrays took 50ms, and classes took 100ms, then I would use classes. The user is NOT going to notice that speed impact...so I would trade it off with the fact that I can change my code far easier using classes....however...if it's part of a routine that has many other things that take 100ms, then these can add up to say 5 seconds...so if you use the fastest methods you can get it down to say 2 seconds...now in this case arrays would be better....
It really is a trade off between performance and scalability...if it's just run once, on its own, I would use classes...
Yeah, you're right. Speed's a relative thing. If it takes twice as long to iterate through a collection as it does to achieve the same with an array, but that time difference translates to a few ms, then what the heck, who's going to notice anyway.
But.... as krtxmrtz says in his initial post, he's loading a 'very large' number of x & y coordinates from a file. I don't know what he means by 'very large' but let's say 1,000,000 records.
I wrote a little app that reads 1,000,000 records from a text file and times how long it takes to load & iterate a collection & an array.
Time to iterate through each (using a simple For loop):
array: 0.43 seconds
collection: I gave up after 20 minutes and it had only got to record 115,000
Maybe 1,000,000 records is a bit over the top, I don't know, but it demonstrates that the larger the data set, the more convincing the argument for arrays becomes. Also, there's no reason that krtxmrtz's code couldn't be wrapped up in a class as it is, but in this case I'm not sure it would provide any real benefit as the code's only purpose is to populate an array.
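One possible reason for a gap that large, offered as an assumption rather than something established by the test above: looking up a Collection item by numeric index inside a For loop tends to get slower as the index grows, whereas For Each walks the items in order. The sketch below times both styles on a much smaller collection; the procedure name IterateCollection and the item count are made up for illustration.

```vb
Option Explicit

Public Sub IterateCollection()
    Dim col As Collection
    Dim item As Variant
    Dim i As Long
    Dim total As Double
    Dim t As Single

    ' Build a small test collection (20,000 is an arbitrary size)
    Set col = New Collection
    For i = 1 To 20000
        col.Add CDbl(i)
    Next i

    ' Style 1: indexed access inside a plain For loop
    total = 0
    t = Timer
    For i = 1 To col.Count
        total = total + col(i)
    Next i
    Debug.Print "For with index: "; Format$(Timer - t, "0.00"); " s"

    ' Style 2: For Each, which needs no indexed lookups
    total = 0
    t = Timer
    For Each item In col
        total = total + item
    Next item
    Debug.Print "For Each:       "; Format$(Timer - t, "0.00"); " s"
End Sub
```

Even if For Each narrows the gap, a plain array of Doubles avoids the Variant overhead of a Collection, so the case for arrays on large data sets still stands.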
Anyway, enough said. Horses for courses.
Cheers
BTW I loved that rampaging badger. I'm glad we've only got devils here in Tasmania
Pete
What he's doing is a pretty standard technique for speeding up memory allocation in C. Grab free memory a bunch at a time, use what you need, then grab a bunch more when you need it. This is efficient since you don't need to do the (relatively) expensive process of acquiring memory as often. However, you never directly acquire memory in VB, so the efficiency gain may not be there.
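In VB terms, the "grab a bunch at a time" idea might look like the sketch below, placed in a standard module. It is a variation on the scheme, not krtxmrtz's code: it doubles the capacity each time instead of adding a fixed block, and the names AppendValue, TrimToUsed, nUsed and the initial capacity of 16 are all hypothetical.

```vb
Option Explicit

Private x() As Double     ' storage; capacity is UBound(x) + 1
Private nUsed As Long     ' number of elements actually in use

Public Sub AppendValue(ByVal v As Double)
    If nUsed = 0 Then
        ' First call: allocate an initial block (size is arbitrary)
        ReDim x(15)
    ElseIf nUsed > UBound(x) Then
        ' Block is full: double the capacity, keeping existing data
        ReDim Preserve x(2 * (UBound(x) + 1) - 1)
    End If
    x(nUsed) = v
    nUsed = nUsed + 1
End Sub

Public Sub TrimToUsed()
    ' Release the unused tail once loading is finished
    If nUsed > 0 Then ReDim Preserve x(nUsed - 1)
End Sub
```

Call AppendValue once per record in the read loop and TrimToUsed after the loop. Whether this beats a ReDim on every record in VB is exactly what the timings in this thread call into question, so it is worth measuring rather than assuming.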
As pnish shows, VB must be doing something mighty weird with memory.