Please help me with my hobby (statistical research).
I have a .txt file containing about 900 million of these numbers (maximum value 400), arranged in lines like this:
115 99 243 355 400
223 399 344 378 399
…………………………………..
113 78 344 676 233
I need to find out whether 4 or 5 of the numbers in my given line exist in the .txt file, and at what position.
This is my given line: 348 344 12 343 99
My given series (line) of 5 numbers changes often, and each time I change it I want to know the new position in the file.
My old QBasic code (from 1983) is very slow. I am looking for a new, good algorithm in VB5 (I am told it is better) to get faster code.
Excuse me if my English is not perfect.
Thanks a lot
Peppos
If I understand you correctly then you are just looking for the position of a string of numbers within a text file. You would need to read the text file row by row as below:
Code:
iRowPtr = 0
' open the file
iFile = FreeFile
Open filename For Input As #iFile
' sLine must be set to the numbers you are searching for
sLine = "1234"
While Not EOF(iFile)
    ' read one whole line of the file
    Line Input #iFile, sNumbers
    iRowPtr = iRowPtr + 1
    xValue = InStr(sNumbers, sLine) ' xValue is the column and iRowPtr is the row
    If xValue > 0 Then
        Close #iFile
        Exit Function
    End If
Wend
Close #iFile
The file is often 40 to 60 MB on disk.
At the moment I am using this code in BBC BASIC, but it is slow at searching:
PO = 4
REM each line of the file is held as B1%(K%,0) B1%(K%,1) B1%(K%,2) B1%(K%,3) B1%(K%,4)
REM PO is how many numbers must be found; the file contains 500,000 to 900,000 lines
a%() = 222, 112, 234, 900, 111 : REM my line to be searched for
FOR K% = 1 TO 900000
N% = 0
FOR I% = 0 TO 4
FOR J% = 0 TO 4
IF a%(I%) = B1%(K%,J%) N% += 1
NEXT
NEXT
IF N% >= PO PRINT B1%(K%,0) B1%(K%,1) B1%(K%,2) B1%(K%,3) B1%(K%,4) ;" ";KKK; " POS. ";K%
NEXT K%
If you want a major speed increase then byte arrays are the way to go...
Easiest way with using byte arrays is to load the whole file into an array and search through that. Having a 40-60MB file in RAM is a lot, but for most modern computers it's not asking too much...the algorithm can always be rewritten to work with chunks of the file at a time...
Here is an example. To see how fast it really is, compile it to an EXE first and then run it. When it loads, select your file. Then enter the numbers in the textbox exactly as they appear in the file and click GO.
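For readers without the attachment, here is a minimal sketch of the byte-array idea described above. It is not the attached program. The names LoadFileBytes and ParseNumbers are hypothetical, and the code assumes the file holds only small non-negative numbers separated by spaces and line breaks. It avoids VB6-only features so that it should also run under VB5.

```vb
Option Explicit

' Hypothetical helper: reads the whole file into bData with a single Get
' and returns the file size in bytes (bData is left untouched if the file is empty).
Function LoadFileBytes(ByVal sPath As String, bData() As Byte) As Long
    Dim iFile As Integer
    Dim lSize As Long

    iFile = FreeFile
    Open sPath For Binary Access Read As #iFile
    lSize = LOF(iFile)
    If lSize > 0 Then
        ReDim bData(0 To lSize - 1)
        Get #iFile, , bData          ' one read fills the whole array
    End If
    Close #iFile
    LoadFileBytes = lSize
End Function

' Hypothetical helper: converts the ASCII digits in bData into numbers stored
' in iNums (0-based) and returns how many numbers were found.
' Any non-digit byte (space, CR, LF) ends the current number.
' Assumption: every number fits in an Integer.
Function ParseNumbers(bData() As Byte, ByVal lSize As Long, iNums() As Integer) As Long
    Dim i As Long, lCount As Long, lValue As Long
    Dim bInNumber As Boolean

    If lSize < 1 Then Exit Function
    ' Upper bound on the count: each number needs at least one digit and one separator
    ReDim iNums(0 To lSize \ 2)

    For i = 0 To lSize - 1
        If bData(i) >= 48 And bData(i) <= 57 Then    ' "0" to "9"
            lValue = lValue * 10 + (bData(i) - 48)
            bInNumber = True
        ElseIf bInNumber Then
            iNums(lCount) = lValue
            lCount = lCount + 1
            lValue = 0
            bInNumber = False
        End If
    Next i

    ' Store the last number if the file does not end with a separator
    If bInNumber Then
        iNums(lCount) = lValue
        lCount = lCount + 1
    End If
    ParseNumbers = lCount
End Function
```

With five numbers per line, the numbers of line k (counting from 1) would sit at iNums(5 * (k - 1)) to iNums(5 * (k - 1) + 4). The iNums array is sized generously, so for a 40 to 60 MB file it needs roughly as much memory again as the file itself.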
Thanks a lot for your reply. You are a kind and well-disposed person, God help you always!
I do not know how to turn your code into an .exe, perhaps because I have an old version of VB.
However, I have already put the data in an array (reducing the quantity), but the speed increased only a little.
Can you compile your code for me and send it to me as an .exe? I would be very grateful.
Otherwise, can you recommend a faster computer for me to buy?
But which one? Someone has recommended an HP PC with Vista and 8 GB of RAM. In your experience, is it good for my studies and my research? Can I really get more speed? Or can you suggest something else?
Thanks, thanks
Giuseppe Villamaina
Peppos from Italy (Naples)
Also, are you using VB6? (not Learning Edition). You should be able to go to File menu -> Make EXE to compile the code into an EXE file. You don't need a fast computer to do this...
Hello, thanks.
When I run your .exe the program tells me "Not in use the COMDLG32.OCX".
Now I will try to get this COM..... and I'll write to you as soon as possible.
I probably have the old version, VB5, and for this reason it doesn't work.
I'll try a newer version.
I need to find out whether 4 of the 5 numbers in my given line exist in a line of the .txt file, and at what position.
EXAMPLE:
This is my given line: 348 12 346 99 111
The code must find a line that contains 4 of my given numbers, i.e.
111 346 12 222 348 position n. ??? in the file
In order or not in order. Your code searches for the entire string, which is not what I need. For my research every number is a single value. This search is the same as checking lottery tickets: the order of the numbers is not important; what matters is that the line contains the 4 numbers I am searching for.
To simplify the search, I can also have the file or the array sorted in ascending order
(i.e.: 12 99 111 346 348), and my given line can be sorted too.
Can you still help me?
I would like the code both as open source, so that I can understand it, and as an .EXE.
Thanks and regards
Peppos
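One possible way to do the order-independent "4 of 5" check is a lookup table: mark the five given numbers once in a Boolean array, then test each number of a file line with a single array lookup. This replaces the 25 comparisons per line in the nested loops with 5 lookups. The sketch below is only an illustration under stated assumptions, not code from this thread. FindMatches and MAX_VALUE are hypothetical names, the data is assumed to be already loaded into a two-dimensional Integer array laid out like B1%(K%,J%), and 999 is an assumed upper bound for the values.

```vb
Option Explicit

Const MAX_VALUE As Long = 999   ' assumption: no number in the file exceeds this

' Hypothetical helper: scans iData(1 To nLines, 0 To 4) for lines that share at
' least nNeeded numbers with iGiven(0 To 4), in any order.
' Stores the 1-based positions of those lines in lHits and returns their count.
Function FindMatches(iData() As Integer, ByVal nLines As Long, _
                     iGiven() As Integer, ByVal nNeeded As Long, _
                     lHits() As Long) As Long
    Dim bWanted(0 To MAX_VALUE) As Boolean
    Dim k As Long, j As Long, n As Long, v As Long, nFound As Long

    If nLines < 1 Then Exit Function

    ' Mark the given numbers once
    For j = 0 To 4
        v = iGiven(j)
        If v >= 0 And v <= MAX_VALUE Then bWanted(v) = True
    Next j

    ReDim lHits(1 To nLines)
    For k = 1 To nLines
        n = 0
        For j = 0 To 4
            v = iData(k, j)
            ' Nested Ifs: VB's And does not short-circuit, so check the range first
            If v >= 0 And v <= MAX_VALUE Then
                If bWanted(v) Then n = n + 1
            End If
        Next j
        If n >= nNeeded Then
            nFound = nFound + 1
            lHits(nFound) = k
        End If
    Next k

    ' Trim the result to the hits actually found
    If nFound > 0 Then ReDim Preserve lHits(1 To nFound)
    FindMatches = nFound
End Function
```

Call it with nNeeded = 4 and read the first FindMatches entries of lHits; when the function returns 0, lHits holds no valid positions. Neither the file nor the given line has to be sorted for this to work. One caveat: a number that appears twice in a file line and is also one of the given numbers is counted twice, which is the same behaviour as the nested-loop version.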