|
-
Jun 21st, 2000, 08:27 AM
#1
Thread Starter
Member
I am loading data from a CSV into a dynamic array that I size before reading the data into it. First thing I do is open the file and count the lines.
Open FileName For Input As #1
Do Until EOF(1)
Line Input #1, SomeVariable
Linecount = linecount + 1
Loop
Close #1
Then I ReDIM my array to the line count
Redim MyArray(linecount,5)
After that I reopen the file and load the array from the file. Although this works for now for a couple of hundred records I have some concerns when the data grows. This is for data import and is only temporary for loading into DB. So is there a way to count the lines without reading all of them? Then I can change the code to process 100 lines at a time until done. That way avoiding some huge array that chokes out of memory or something.
Thanks
-
Jun 21st, 2000, 08:58 AM
#2
Fanatic Member
Your worrying over nothing as:
You dim the variable as local in the procedure so every time a new line text is read in it overwrites the previous and all your keeping hold of the total count value which, if its an integer its only 2 bytes and if its a long then its four bytes
Your Code:- documented
Code:
'open file
Open FileName For Input As #1
'loop until end of file
Do Until EOF(1)
'read first/next textline into Somevariable (the buffer)
Line Input #1, SomeVariable
'increment counter
Linecount = linecount + 1
Loop
'close file
Close #1
So everytime a new line of text is read in, it will overwrite the existing value (as long as your not storing it in an array)
DocZaf
-
Jun 21st, 2000, 09:14 AM
#3
Thread Starter
Member
REDIM The Array
I know about that. It’s just with test files of 10,000 + lines, it draws a little power and time to read the file. I just wanted to know if that was the most efficient method. It seams a little excessive and wasteful.
Thanks for the reply
-
Jun 21st, 2000, 03:23 PM
#4
Hyperactive Member
I'm not sure if it would be faster, but you could alway avoid reading the lines, and simply use the split function. Here is what I mean:
Code:
Private Sub Command1_Click()
Dim fso, sTemp(1)
Dim newVar() As String
Dim myVar As String
Set fso = CreateObject("scripting.FilesystemObject")
Set sTemp(0) = fso.GetFile("c:\windows\desktop\testfile.txt")
Set sTemp(1) = sTemp(0).OpenAsTextStream(1, -2)
myVar = sTemp(1).ReadAll
newVar = Split(myVar, Chr(13) & Chr(10))
For x = 0 To UBound(newVar)
MsgBox newVar(x)
Next
End Sub
However, with 10,000 records, I'm not sure how much faster this would be, but at least you could avoid all the file accesses.
(Using VB6 SP3)
-
Jun 21st, 2000, 05:59 PM
#5
Fanatic Member
True, The filesystem object can be slow. The quickest way I know is this.
Pseudo code---
Open the file in binary mode
GET the file to a byte array (can open several MB per second
MYString = StrConv(ByteArray, VBUnicode) (pretty much instant)
The Split() function will redim the array for you. then just loop from zero to the Ubound()
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 21st, 2000, 06:15 PM
#6
Fanatic Member
I have done som eresearch into this a while ago, and avoid using the split command at all costs. It is extremley slow. It is better to write your own funcion.
The way to go is to read the whole file in one go, and count the number of vbCrLf in the string that is returned.
So do as paul says, or use the inputB function. Then use the instr function in a loop to count the number of new line characters.
Iain, thats with an i by the way!
-
Jun 21st, 2000, 08:14 PM
#7
Fanatic Member
I was going to write that too, but, from what I could gather the line count was just for splitting the function anyway so it would be better just to split than to count and then split.
The Split function makes a poor line counter but it's a good splitter.
If your intention is just to count lines then loop the instr but if that's just to loop your own split function then go straight for the split()
I'm a bit of a fan of just "going straight for the Split" 
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 21st, 2000, 08:22 PM
#8
Fanatic Member
No spliter is a poor spliter. As i said earlier it is very slow. I think, this is just a guess that it uses ReDim preserve evertime it finds a new character to splt on. That is why it is very slow.
Just like VB's InStrRev function. Why reverse the string then use InStr? Why just not start from the end of the string and loop backwards? Just a couple of examples of why you should write your own functions. Microsoft are crap at writing them.
It is fairly easy to write your own spliting function, and if you get stuck we are more than willing ot help out. Basically though, you use the InStr() and Mid$() functions with a couple of pointers.
Iain, thats with an i by the way!
-
Jun 21st, 2000, 08:58 PM
#9
Fanatic Member
Really?
I found the splitter to be a poor line counter like I said but I'd like to see some code to beat the split function. The MS knowedge base has some code (the code?) for split and join as they weren't in VB5 and they don't seem to run any better, you got a good function I'd like to do some tests...
Have you got anything you could post?
Paul
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 21st, 2000, 09:29 PM
#10
Fanatic Member
Paul,
Here ya go. It is only really noticable on very large strings.
Code:
Function mySplit(strToSplt As String, strSplitOn As String) As String()
Dim ip1 As Long, ip2 As Long
Dim strArray() As String
Dim iCount As Long
ip1 = 1: ip2 = 1
Do
ip2 = InStr(ip1, strToSplt, strSplitOn)
If ip2 = 0 Then
ip2 = Len(strToSplt) + 1
End If
If iCount Mod 100 = 0 Then
ReDim Preserve strArray(iCount + 100) As String
End If
strArray(iCount) = Mid$(strToSplt, ip1, ip2 - ip1)
ip1 = ip2 + Len(strSplitOn)
iCount = iCount + 1
Loop Until ip2 >= Len(strToSplt)
ReDim Preserve strArray(iCount - 1)
mySplit = strArray
End Function
Iain, thats with an i by the way!
-
Jun 22nd, 2000, 03:52 PM
#11
Fanatic Member
I have done some tests on the VB Split() command against the above code, and the results are quite impresive.
Also when i upped the charcter even more, the VB Split() function ran out of memory, and mine did not.
Code:
String length = 4,669,056 'a bit excesive i know
Number | VB Split() | My Split()
Of | Time | Time
Splits | |
---------------------------------------------
| |
712080 | 52.89 s | 2.44 s
| |
---------------------------------------------
| |
256032 | 11.64 s | 2.3 s
| |
---------------------------------------------
| |
50400 | 3.04 s | 1.22 s
| |
---------------------------------------------
Results from a 700mhz with 128mb ram
I have also writen a function that will match the InStrRev function in vb6.
The way Microsoft suggest you write it is a complete load of crap. There technique ran out of memory on the string i used above. This way of doing it found a string that i had put at the beginning of the string used above in 1.3 seconds.
Code:
Function myInStrRev(strStringToSearch As String, strFind As String, _
Optional iStart As Long) As Long
Dim ip1 As Long, ip2 As Long
Dim iLenStringToSearch As Long
'get the length of the string
iLenStringToSearch = Len(strStringToSearch)
'if the start is 0 then set the start to the length
'og the string
If iStart = 0 Then
iStart = iLenStringToSearch
End If
ip1 = 1
Do
ip2 = InStr(ip1, strStringToSearch, strFind)
If (ip2 > 0) And (ip2 < iStart) Then
'if ip2 is not zero and it is less than the
'place to start searching then set the function
'to return that position
myInStrRev = ip2
ElseIf ip2 = 0 Then
ip2 = iLenStringToSearch
End If
ip1 = ip2 + 1
Loop Until ip1 >= iStart
End Function
[Edited by Iain17 on 06-23-2000 at 06:41 AM]
Iain, thats with an i by the way!
-
Jun 22nd, 2000, 03:58 PM
#12
Fanatic Member
Impressive
Is the "Number of splits" the number of time the split function was called or the number of pieces the text was split into?
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 22nd, 2000, 04:03 PM
#13
Fanatic Member
The number of pieces the text was split into.
Iain, thats with an i by the way!
-
Jun 23rd, 2000, 12:22 AM
#14
PowerPoster
May be you need to download this sample project ro understant the API function that use to extend the TextBox control.
Sample Project
-
Jun 23rd, 2000, 12:29 AM
#15
Fanatic Member
Originally posted by Iain17
The number of pieces the text was split into.
hmmmm...
I've grabbed a copy of this code, I want to do some tests too. I always found split to be fast but you seem to have an order of magnitude on it.
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 23rd, 2000, 12:37 AM
#16
Fanatic Member
I like it!
I think I'll add an enum though to hint at the array size.
ie
Code:
Public Enum ArraySize
vbSmallArray = 50
vbMedArray = 100
vbLargeArray = 500
End Enum
Function myInStrRev(strStringToSearch As String, strFind As String, Size as ArraySize, Optional iStart As Long) As Long
Then use the enum in the Mod code:
"If iCount Mod Size = 0 Then"
Haven't really tested it to get the sizes and the name is not intuitive enough to be able to pick without knowing the function code, but you see where I'm going...
Nice piece of code
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Jun 23rd, 2000, 03:17 AM
#17
Frenzied Member
I must be missing something, but...
Why can't you just write to the array as you read the file instead of making 2 passes through it?
Use Redim Preserve inside the loop:
Code:
Open FileName For Input As #1
Do Until EOF(1)
Linecount = Linecount + 1
Redim Preserve MyArray(linecount,5)
Line Input #1, MyArray(linecount,5)
Loop
Close #1
Isn't this a good, easy way to do it? What am I missing?
-
Jun 23rd, 2000, 05:30 AM
#18
Thread Starter
Member
Why not Redim in the loop? Or just drop it in the array during the loop. Well that would open another whole topic. I really appreciate all the info and I have a few things to work on now. The over all picture is I am reworking output data from another program that no one knows anything about. Some guy wrote and no one has the source and he is gone. Anyway the data gets dropped over to a DB after the data is sorted and merged with another file. I had to make it viewable in a grid if someone wanted to verify the data before accepting the records. As for the Redim Preserve I believe that only works for the last dimension. Redim Preserve MyArray(5,linecount).
Thanks Again Everyone and Iain17
-
Jun 23rd, 2000, 01:36 PM
#19
Fanatic Member
Re: I must be missing something, but...
Seaweed,
The problem with that is that:
that you're rediming every line every line, which can be very slow if you have to do it a lot.
get binary is very quick to load files too
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Apr 7th, 2002, 11:39 PM
#20
Lively Member
OK... So atm I'm using VBA
I am trying to adapt this code to work in VBA the problem is VBA does not like the following:
Function mySplit(strToSplt As String, strSplitOn As String) As String().
So how do I get a split function to work in VBA??? Office 97
-
Apr 8th, 2002, 05:05 AM
#21
Fanatic Member
I'm not that familiar with Office VBA but in ASP you'd have to drop all the types as everything is a varient.
So try getting rid of all the "as string" and "as long" stuff
Paul Dwyer 
Network Engineer
Aussie In Tokyo
Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
-
Apr 8th, 2002, 05:37 AM
#22
Retired VBF Adm1nistrator
I've not bothered reading the discussion, so forgive me if this has already been suggested, but why not use this approach :
VB Code:
Private Function numLinesInFile(ByVal strFile As String) As Long
Open strFile For Binary As #1
Dim strBuff As String: strBuff = Space(LOF(1))
Get #1, , strBuff
numLinesInFile = UBound(Split(strBuff, vbCrLf)) + 1
Close #1
End Function
Available from my "Various Tutorials" > "File I/O Functions" page
Microsoft MVP : Visual Developer - Visual Basic [2004-2005]
-
Apr 8th, 2002, 06:46 AM
#23
Addicted Member
Another light-hearted post from Guru 
-
Apr 8th, 2002, 06:49 AM
#24
Retired VBF Adm1nistrator
ah.
well Nigorr and paul282 bumped it by posting here so thats why I replied
Microsoft MVP : Visual Developer - Visual Basic [2004-2005]
-
Apr 8th, 2002, 06:57 AM
#25
Frenzied Member
VB Code:
Public Function txtLines(TextFile As String) As Long
Dim SplitLines() As String
Open TextFile For Input As #1
SplitLines = Split(Input(LOF(1), 1), vbNewLine)
txtLines = UBound(SplitLines) + 1
End Function
This help?
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|