-
Aug 20th, 2017, 09:27 AM
#41
Fanatic Member
Re: multiple range checking algorithm vb6
 Originally Posted by Schmidt
Well, then let's see what your recommendations will bring, when I alter it to your wishes...
Ok, done...
... as you wish, master...
Please forgive me, but I've found it this way in posting #10 (from a guy named "flyguille")... 
But Ok - I've adapted it now as commanded...
I hope the new code below is to your satisfaction now - but I have to tell you that the changes you demanded,
make it now run factor 2 slower than before... (about 3.3 µsec per call, compared to the former 1.6)
What now?
Wouldn't it be better if you wrote "the ideal code" for the usage of your approach yourself -
and posted it here (preferably within Code-Tags, if that's not too much to ask...)?
Code:
Option Explicit
Private LookupTable(0 To 255, 0 To 255, 0 To 255, 0 To 31) As Byte
Private Sub Form_Click()
    Dim i As Long, Result As Boolean, T As Single
    T = Timer
    For i = 1 To 100000
        Result = IsInUSA("121.122.123.124")
        If Result Then
            'do something here
        End If
    Next
    Caption = Format$((Timer - T) * 10, "0.0") & " µSec per function-call"
End Sub
Function IsInUSA(RemoteIP As String) As Boolean
    Dim st() As String
    st = Split(RemoteIP, ".")
    IsInUSA = LookupTable(Val(st(0)), Val(st(1)), Val(st(2)), Int(Val(st(3)) / 16) And (Val(st(3)) Mod 16))
End Function
Olaf
OK, remove the Int(). By the way, I didn't know that a single Split() call is slower than seven VB string functions with all their parameter parsing. Weird.
Oh, and why did you choose Mod and / 16? It should be 8, as in 8 bits per byte.
Also, you didn't do the bit inspection; you just read the whole byte as the output!
Let's set the last dimension to: To 31
Private LookupTable(0 To 255, 0 To 255, 0 To 255, 0 To 31) As Byte
and
IsInUSA = LookupTable(Val(st(0)), Val(st(1)), Val(st(2)), Val(st(3)), Val(st(4)) / 8) And (2^(Val(st(4)) Mod 8)
Oh, now I see you didn't inspect the fourth value at all!
Of course it is a bit slower now.
But I would still prefer a lookup table over walking an array with many entries; even with a binary search, that would be slower.
Oh, and thanks for your participation.
Last edited by flyguille; Aug 20th, 2017 at 09:30 AM.
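The bit inspection being discussed (one flag bit per address, eight flags per byte) can be sketched on a small scale as follows. This is only a minimal illustration and not code from either poster: the 32-byte `Bits` array stands in for the last dimension of the big table, and `SetFlag` / `TestFlag` are hypothetical helper names. It uses integer division (`\`) because `/` yields a Double, which VB6 rounds when it is used as an array index.

```vb
Option Explicit

' Hypothetical small-scale stand-in for the last table dimension:
' 32 bytes * 8 bits = 256 flags, one per possible value of the last octet.
Private Bits(0 To 31) As Byte

' Marks octet value d (0..255) as "in range".
Public Sub SetFlag(ByVal d As Long)
    Bits(d \ 8) = Bits(d \ 8) Or CByte(2 ^ (d Mod 8))
End Sub

' Returns True when the flag bit for octet value d is set.
Public Function TestFlag(ByVal d As Long) As Boolean
    TestFlag = (Bits(d \ 8) And CByte(2 ^ (d Mod 8))) <> 0
End Function
```

With this layout the byte index is `d \ 8` and the bit within that byte is `d Mod 8`, so the last dimension needs exactly `0 To 31`.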
-
Aug 20th, 2017, 10:41 AM
#42
Re: multiple range checking algorithm vb6
 Originally Posted by flyguille
Also, you didn't do the bit inspection; you just read the whole byte as the output!
My little performance-test-loop was meant to open your eyes to the
significance of the overhead that preparing the array's Index-Parameters will cause
(in a lookup-table-based approach).
 Originally Posted by flyguille
Of course it is a bit slower now.
Of course - because you now apply even more math, to prepare the Input-Parameters (the final Index-Parameters for the Array-Lookup).
On my machine it now takes (with your changes applied) 4.2 µsec per IPCheck-call.
 Originally Posted by flyguille
But I would still prefer a lookup table over walking an array with many entries; even with a binary search, that would be slower.
No matter how often you repeat it, your statement is still wrong.
Let's do a comparison again... looking at your approach...:
- no really working code-example exists
- it needs 512 MB of contiguous memory (which will choke on many machines, because the memory allocator will not find a large enough block for that)
- the IPLookup-Function takes about 4 µsec to return the Boolean-Result
- and in the end it only tells you whether the IP is a US-IP or not
The approach I've posted in #33 (updated yesterday, including optimizations and performance tests): http://vbRichClient.com/Downloads/IPRangesByCountry.zip
- is a fully working demo
- it needs about 16MB to hold the ~0.5Mio range-infos in a binary-searchable Key/Value-Pair-Class
- the IPLookup-Function takes (on average) 2 µsec to return the String-Result
- which is a 2-Char-Country-String that "covers all countries the world over"
Hmm, when I look at the above, I cannot find a single reason to go with your proposal...
Olaf
-
Aug 20th, 2017, 11:53 AM
#43
Fanatic Member
Re: multiple range checking algorithm vb6
OK, so you say your code is faster? Or is it just that you want to return the country as an extra feature, which the thread creator didn't ask for?
I would like to understand your code... but, ugh, it is largely unreadable. Don't you know that indentation and structure exist?
Code:
Attribute VB_Name = "modIPCheck"
Option Explicit
Public IPRanges As cSortedDictionary
Public Sub InitRangeDictionaryFrom(FileName As String)
    Set IPRanges = New_c.SortedDictionary
    Dim BSrc() As Byte, B() As Byte, C As Long, i As Long, K As Currency
    BSrc = New_c.FSO.ReadByteContent(FileName)
    New_c.Crypt.LZMADeComp BSrc, B
    C = (UBound(B) + 1) \ 3
    For i = 0 To C - 1 Step 2
        K = B(i + i) + 256@ * B(i + i + 1) + 65536@ * B(i + i + 2) + 16777216@ * B(i + i + 3)
        IPRanges.Add K, B(C + C + i) + 256& * B(C + C + i + 1) 'add the Country-Item under Key K
    Next
End Sub
Public Function CheckIPRange(sIP4 As String, Optional ByVal IncludeRangeInfo As Boolean) As String
    Dim Key As Currency, i As Long, Item As Long, IPfrom As String, IPto As String
    Key = IP4toKey(sIP4)
    If IPRanges.Exists(Key) Then
        Item = IPRanges(Key)
    Else
        i = IPRanges.IndexByKey(Key, True) - 1 '<- True allows an "unsharp" search (delivers the index where a non-existing Key would be sorted into "in-between")
        Item = IPRanges.ItemByIndex(i)
    End If
    CheckIPRange = Chr$(Item And &HFF) & Chr$(Item \ &H100)
    If IncludeRangeInfo Then
        If i = 0 Then i = IPRanges.IndexByKey(Key)
        IPfrom = KeytoIP4(IPRanges.KeyByIndex(i))
        If i = IPRanges.Count - 1 Then IPto = "255.255.255.255" Else IPto = KeytoIP4(IPRanges.KeyByIndex(i + 1) - 1)
        CheckIPRange = CheckIPRange & " [" & IPfrom & " to " & IPto & "]"
    End If
End Function
Public Function IP4toKey(sIP4 As String) As Currency
    Dim B() As Byte, i As Long, j As Long
    B = sIP4
    For i = 0 To UBound(B) Step 2
        If B(i) = 46 Then IP4toKey = IP4toKey * 256 + j: j = 0 Else j = j * 10 + B(i) - 48
    Next
    IP4toKey = IP4toKey * 256 + j
End Function
Public Function KeytoIP4(ByVal Key As Currency) As String
    Dim i&, j&: Static B(0 To 3) As Byte, S(0 To 29) As Byte
    New_c.MemCopy VarPtr(B(0)), VarPtr(CCur(Key / 10000@)), 4
    For i = 0 To 3
        Do: S(28 - j) = (B(i) Mod 10) + 48: j = j + 2: B(i) = B(i) \ 10: Loop While B(i)
        S(28 - j) = 46: j = j + 2
    Next
    KeytoIP4 = RightB$(S, j - 2)
End Function
Anyway, I will try to understand it.
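For readers who find the compact KeytoIP4 routine hard to follow, the same conversion (numeric key back to dotted notation) can be written in a slower but more explicit way. This is a hedged sketch and not the author's routine: `KeyToDotted` is a hypothetical name, it uses ordinary string concatenation instead of the byte-buffer trick, and it assumes the key is a whole number between 0 and 4294967295.

```vb
Option Explicit

' Converts a numeric IPv4 key (0 .. 4294967295) into "a.b.c.d".
' Readable variant for illustration only; it is not tuned for speed.
Public Function KeyToDotted(ByVal Key As Currency) As String
    Dim i As Long, Rest As Currency, Octet As Long
    For i = 1 To 4
        Rest = Int(Key / 256)        ' everything above the lowest 8 bits
        Octet = Key - Rest * 256     ' value of the lowest 8 bits (0..255)
        If i = 1 Then
            KeyToDotted = CStr(Octet)
        Else
            KeyToDotted = CStr(Octet) & "." & KeyToDotted
        End If
        Key = Rest
    Next
End Function
```

Each pass peels off the lowest octet and prepends it, so the most significant octet ends up first in the string.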
-
Aug 20th, 2017, 11:59 AM
#44
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
I am beyond thankful for this robust discussion, which gives me very useful information to get this done in the best way possible. Again, I would like to thank all involved; this VB6 group is just amazing!
-
Aug 20th, 2017, 12:28 PM
#45
Hyperactive Member
Re: multiple range checking algorithm vb6
 Originally Posted by flyguille
I would like to understand your code... but, ugh, it is largely unreadable. Don't you know that indentation and structure exist?
What the hell are you talking about? I just saw the code and the indentation is perfect
-
Aug 20th, 2017, 03:45 PM
#46
Fanatic Member
Re: multiple range checking algorithm vb6
 Originally Posted by Carlos Rocha
What the hell are you talking about? I just saw the code and the indentation is perfect 
Well, looking at it again, the indentation seems OK, but I haven't seen the colon since the Microsoft MSX BASIC 2.0 era (1986)...
EDIT: No, correcting myself: I would never write a Do / Loop While all on a single line, with no structure and no indentation.
But, you know, if it works, it is just fine! Just don't present it in a team project if you don't want to be kicked out on the third day.
Last edited by flyguille; Aug 20th, 2017 at 03:50 PM.
-
Aug 20th, 2017, 04:06 PM
#47
Re: multiple range checking algorithm vb6
 Originally Posted by flyguille
OK, so you say your code is faster?
It sure *is* faster than your Lookup-Routine (which I've written according to your "instructions").
If you think you can do better than that, I'd surely want to see some code from *you*.
What we have currently, regarding your approach is only a "paper-tiger", there's no Copy&Pastable
demo - or even better - a zipped Demo-Project from you anywhere.
I for my part would surely like to see, how you fill up that huge Lookup-Array of yours properly.
To make it somewhat easier for you, I've now extracted the 88314 US-RangeRecords from the
DataSet my larger Demo is using, and one can download it here:
http://vbRichClient.com/Downloads/USRangesDetailed.zip
It contains a normal CSV-File (USRangesDetailed.csv, with IPfrom,IPto per Line) -
but also a Raw-CurrencyArray-File, called USRangesDetailed.bin, which one can load this way:
Code:
Private USRanges() As Currency
Sub InitRangeArray(FileName As String)
    ReDim USRanges(0 To FileLen(FileName) \ 8 - 1)
    Dim FNr As Long: FNr = FreeFile
    Open FileName For Binary Access Read As FNr
    Get FNr, , USRanges
    Close FNr
End Sub
After the Init-Routine above has been called, USRanges has a UBound of 88314*2-1 and contains the sorted entries of the
88314 US-RangeRecords as numeric values, with the IPFrom-entries sitting at the even indexes and the IPTo-entries at the odd index positions
(mainly intended for those who want to try their hand at a Binary-Search-based algorithm against that array).
I would surely like to see some code from you which fills the Lookup-Arrays with that detailed set of US-Ranges
(your choice, whether you read the data from the *.csv or the *.bin-file).
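A binary search against an array laid out this way (IPFrom at even indexes, IPTo at odd indexes) could look roughly like the sketch below. It is not the code from the downloadable demo: `IsInRanges` is a hypothetical name, and it assumes the pairs are sorted and do not overlap, and that the array has already been filled.

```vb
Option Explicit

' Returns True when Key lies inside one of the [from, to] pairs in Ranges.
' Assumption: pairs are sorted and non-overlapping, "from" at even and
' "to" at odd indexes, array is zero-based and already dimensioned.
Public Function IsInRanges(Ranges() As Currency, ByVal Key As Currency) As Boolean
    Dim Lo As Long, Hi As Long, M As Long
    Lo = 0
    Hi = (UBound(Ranges) + 1) \ 2 - 1      ' index of the last pair
    Do While Lo <= Hi
        M = (Lo + Hi) \ 2                  ' middle pair
        If Key < Ranges(2 * M) Then
            Hi = M - 1                     ' Key is below this pair
        ElseIf Key > Ranges(2 * M + 1) Then
            Lo = M + 1                     ' Key is above this pair
        Else
            IsInRanges = True              ' from <= Key <= to
            Exit Function
        End If
    Loop
End Function
```

For 88314 pairs the loop runs at most about 17 times, since each pass halves the remaining search interval.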
 Originally Posted by flyguille
...or is it just that you want to return the country as an extra feature, which the thread creator didn't ask for?
I've mentioned that because my code is faster and takes less memory *despite* offering full country-resolution...
If I reduce the Size of my DataSet to "US-Ranges only", then I'd expect results far better than 1 µsec per IPCheck-call.
 Originally Posted by flyguille
I would like to understand your code.... but, uhhhhhhhhhhgggggr, largely unreadable.
...
Anyway I will give it a try to understand it.
Nothing is stopping you from reformatting it to your liking, it's not that much code after all... <shrug>
As for understanding it though, ...from how you post and interact with other developers (ignoring proof delivered by code) -
I'd think you are still quite young and inexperienced (comparably), so don't expect "immediate results" just by looking at it... 
Olaf
-
Aug 20th, 2017, 04:20 PM
#48
Fanatic Member
Re: multiple range checking algorithm vb6
Nope, but I have better things to do... Calm down, it is just an exchange of ideas!
EDIT: Having better things to do is why I didn't bother creating a zip with a VB project inside, or timing it.
I don't know about you, but I am 41 years old, with commercial software published, selling it and supporting it, with sales and marketing people, oh, and a graphic designer hired just a month ago...
Anyway, I have been a programmer since I was 10, making video games in Z80 and MSX BASIC since then. I'm a Spanish guy, so my English is not perfect.
But all your assumptions about me being young and inexperienced were wrong. Maybe tomorrow, as it is a holiday, I can pick up your code and reformat it to make it more readable, and then try to understand why you claim it is fast even though it has loops and For/Next to do the IP check; because I see those loops, I don't understand how your code can be faster.
But sure, if it returns a country flag, or even better a currency flag, then it is better and more featureful. I am just not sure about faster.
Last edited by flyguille; Aug 20th, 2017 at 04:32 PM.
-
Aug 20th, 2017, 05:56 PM
#49
Re: multiple range checking algorithm vb6
 Originally Posted by flyguille
But all your assumptions about me being young and inexperienced were wrong...
Nope, not really (and I wrote "comparably", if you care to look that up again).
I'm developing in C/C++ and VB6 for quite a bit longer than you (I'm in my fifties, but many Forum-Users here top that easily),
so you are (comparably) a "youngster" still. 
 Originally Posted by flyguille
Maybe tomorrow, as it is a holiday, I can pick up your code and reformat it to make it more readable, and then try to understand why you claim it is fast...
I don't "claim" it to be fast - it *is* fast and can be tested by anybody who cares to download, compile and run the Zip I've posted.
I can only encourage you, to post your own approach in a fully working code-example, so that *other* developers can look,
test and compare for themselves - currently it is only you who "claims" something (no real demo anywhere, just talking).
But as it seems currently, you will chicken out, not posting anything (which is a common pattern with challenged - ermm... "youngsters").
 Originally Posted by flyguille
... even though it has loops and For/Next to do the IP check; because I see those loops, I don't understand how your code can be faster.
That sentence alone shows, how inexperienced you are with performance-related things in High-Level-Languages.
Here it is often the longer routines which are faster than the "conveniently written High-Level-OneLiners".
 Originally Posted by flyguille
I am just not sure about faster.
Well, I am - because I *timed* it already (and I usually try very hard, not to give wrong info in a Public Forum).
Olaf
-
Aug 21st, 2017, 08:42 AM
#50
Fanatic Member
Re: multiple range checking algorithm vb6
The only reason I can think of why my method is, according to you, slower than yours is that the lookup table is 512 MB and, when tested with randomly generated IPs, maybe (MAYBE) it is being penalized because the lookup table is not in the CPU's L1/L2 cache. Otherwise, how could a simple array lookup be slower than looped functions in a binary search?
You said that my approach takes 3 µs per IP check?
That is a lot for modern CPUs: about 333K runs per second (333,000). Hmm, there is a penalty somewhere.
I will give yours and mine a try... By the way, don't forget it is just an exchange of ideas, nothing personal. So take it easy!
-
Aug 22nd, 2017, 01:52 AM
#51
Re: multiple range checking algorithm vb6
 Originally Posted by flyguille
...the only reason I can think of why my method is, according to you, slower than yours is that the lookup table is 512 MB,
No, although such a huge "en-bloc" memory-allocation is generally a bad idea,
it is not the array-access which is slow.
 Originally Posted by flyguille
Otherwise, how could a simple array lookup be slower than looped functions in a binary search?
As pointed out to you several times already - it is not the final array-access which is slow -
it is all the operations which come *before* the array-access (preparing the correct Indexes) which make it slow.
 Originally Posted by flyguille
You said that my approach takes 3 µs per IP check?
Yes - why didn't you test this already?
I've posted code for that.
To write it out more clearly again:
When you have a 4-dimensional LookupArray you will have to provide it with 4 Indexes in the end (to perform the array-read)
(let's call these Indexes a,b,c,d - and assume that they are explicit Integer-Variables at Function-level)...
So, before you do your final (and fast) Lookup as: FunctionResult = LookupArr(a, b, c, d)
you have to calculate the values of a, b, c and d (starting with everything you do on the Input-Param, since entering the Function).
That's where the 3µsec are caused in your case.
- Split (as a super-high-level-function) is of course "slower" than
- a few Instr/Mid$ calls (which are themselves higher-level-functions of the vbRuntime),
- which again are slower than a direct looping over a ByteArray (which avoids any calls into the vbRuntime)
That this is "surprising" to you baffles me a bit...
(given your background in ASM, one would think that you know how compilers work -
and how they interact with a given runtime-library which provides certain language-features).
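The third option in the list above, looping directly over a Byte array to prepare the four indexes, can be sketched as follows. This is an illustrative variant of the byte-loop idea already used in IP4toKey, not code from the thread: `ParseOctets` is a hypothetical name, and the routine assumes a well-formed dotted IPv4 string (digits and exactly three dots) and does no validation.

```vb
Option Explicit

' Fills Oct(0 To 3) with the four octets of a dotted IPv4 string.
' Assumption: sIP4 is well-formed; no error checking is done.
Public Sub ParseOctets(sIP4 As String, Oct() As Long)
    Dim B() As Byte, i As Long, n As Long
    ReDim Oct(0 To 3)
    B = sIP4                       ' UTF-16 bytes; the character codes sit at even indexes
    For i = 0 To UBound(B) Step 2
        If B(i) = 46 Then          ' 46 = "." starts the next octet
            n = n + 1
        Else
            Oct(n) = Oct(n) * 10 + B(i) - 48   ' 48 = "0"
        End If
    Next
End Sub
```

The loop touches each character once and does only integer arithmetic, which is the point being made about avoiding Split and Val.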
 Originally Posted by flyguille
...btw, don't forget it is just an idea exchange, nothing personal. So, take it easy!.
Well, if you'd put yourself into my shoes for a moment, you'd see that my "getting impatient" is explainable
(given the fact, that I'm basically repeating myself for a few postings already).
As for testing (and learning) - here's another example I've put together,
which does a direct Binary-Search on a Currency-Array for only the US-Range
http://vbRichClient.com/Downloads/IP...USDetailed.zip (not requiring a vbRichClient5-reference in your project).
Due to the reduced range of records (only ~88000 for the US), it is of course faster than the "all countries version" I've posted earlier -
now needing only about 0.12 µsec per IPCheck-FunctionCall (when native compiled).
All of this "performance-discussion" is (whilst hopefully being educational) of course largely academic for the case of the OP.
Any approach which delivers the result after 100 µsec or so, would be entirely sufficient I'd say.
Olaf
-
Aug 22nd, 2017, 08:39 AM
#52
Fanatic Member
Re: multiple range checking algorithm vb6
I understand you; I just didn't know Split() and Val() were so slow, even with in-stack parameter passing.
Just asking about VB6:
Is there a penalty for calling VBA functions compared with calling a function in a module or a class module of the VB project? I saw that you are using a class as well. My understanding is that both are __stdcall, unless things like NEAR / FAR calls still exist in VB6 / modern OSes... but IIRC they no longer exist in 64-bit/32-bit non-paged memory mapping, right?
Again, I think that, if tested with random IP checks, 512 MB doesn't fit in the L1 or L2 cache, so the CPU must read from the DIMMs every time, which is a penalty. As for loading time, you are right that allocating 512 MB is a bit overkill. Those two things are the cons.
Now, I may use your method, as it comes in handy, but right now I have no time to check.
Again, I haven't tested it because I have a huge inbox pile right now and I don't have time; I just chat around this VB forum to rest a bit. I have an ETA of 2 months.
Last edited by flyguille; Aug 22nd, 2017 at 08:45 AM.
-
Sep 18th, 2017, 03:38 PM
#53
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
DEXWERX, can you tell me where you got this data?
Is this all USA ranges?
Thanks!
DUH... I later found the credits for where you got the IPs in your zip... sorry
Last edited by axisdj; Sep 18th, 2017 at 04:42 PM.
-
Sep 18th, 2017, 04:41 PM
#54
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
OK, after initially getting the solution going with the 7000 ranges (using the true/false memory-mapped array) I was originally working with, I have realized that its accuracy is not going to be suitable: too many IPs are in the USA but not in those ranges. SO...
@Schmidt, I looked thoroughly at your sample, and I think you have a miscalculation. Each lookup takes about 1 second on my system... way too slow for looking up 400+ IPs per second. I'm not sure what I am doing wrong, but the Timer call in VB6 is in seconds and this is what I get:
59625.39 ''lookup started
59626.57 ''' lookup done
1.179688 '' difference in time seconds
So here is what I thought.
I found a list of about 19,000 entries with the ranges converted to decimal, like this:
2466840576 2466906111 US United States
If I had those ranges in a hiIp() and a lowIP() array, I could traverse the arrays up or down to check whether an IP is in range. I will test that for speed tonight, but what I am wondering is how I could shorten the search by knowing where to start.
One way I thought of would be to use the 'Like' operator on the first four digits of the ranges and add the results to a collection, then check the ranges there.
Any other thoughts?
{added later}
Just found the 'Filter' function, which returns an array; maybe I can filter on whether the first four digits match, then range-check those results.
Last edited by axisdj; Sep 18th, 2017 at 05:22 PM.
Reason: later idea
-
Sep 19th, 2017, 04:53 AM
#55
Re: multiple range checking algorithm vb6
 Originally Posted by axisdj
@Schmidt, I looked thoroughly at your sample, and I think you have a miscalculation. Each lookup takes about 1 second on my system... way too slow for looking up 400+ IPs per second. I'm not sure what I am doing wrong, but the Timer call in VB6 is in seconds and this is what I get:
59625.39 ''lookup started
59626.57 ''' lookup done
1.179688 '' difference in time seconds
It seems you didn't look at the example as thoroughly as you should have,
since the timing was measuring a loop over a total of 1Mio IP-Lookup-calls...
Here's the code-snippet again (which even had a comment, where the amount of calls was stated):
Code:
For i = 1 To 1000000 '1Mio calls
    Result = CheckIPRange(sIP4, Idx)
    If Result Then
    End If
Next
And I assume, that you ran this in the IDE, because if you compile it natively,
you should get below 1sec for the total amount of time which the 1Mio calls need in total.
(FYI, 1Mio calls, when you measure them in seconds - will translate to µsec per single-call.)
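The conversion from a total loop time to a per-call time is simple arithmetic, spelled out here as a tiny helper. `MicroSecPerCall` is a hypothetical name used only for illustration; the inputs are the elapsed seconds reported by Timer and the number of calls in the loop.

```vb
Option Explicit

' Converts the total elapsed time of a timing loop (in seconds)
' into the average time per call in microseconds.
Public Function MicroSecPerCall(ByVal ElapsedSeconds As Double, ByVal Calls As Long) As Double
    MicroSecPerCall = ElapsedSeconds * 1000000# / Calls
End Function
```

With 1,000,000 calls the factor cancels out, so the 1.179688 seconds measured above correspond to roughly 1.18 µsec per call; with the 100,000 calls of the earlier test loop the factor is 10, which matches the `(Timer - T) * 10` used there.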
Olaf
-
Sep 19th, 2017, 07:44 AM
#56
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
It was a long day yesterday; I did not realize that loop was a stress test and somehow thought it was searching the entries for a match.
Thank you, Olaf, for the clarification, and sorry for the misunderstanding.
I think I will implement your approach minus the LZMA compression; my clients won't mind a 2.5MB reference file.
Thank you
-
Sep 19th, 2017, 08:35 AM
#57
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
So I've re-formatted the above CSV in a Binary-Format, which takes this "gaplessness" into account,
storing a single range-record in only 6 Bytes, which reduces the Raw-Binary-File to about 2.5MB
(432616 Records * 6 Bytes-per-Record = 2,595,696).
Can you give me more detail on how this is done? I have tested the old compressed file, and because it uses data from early August it is out of date, with some false positives.
What I would need is an understanding of how to get the original file each month and then format it to fit the sample you sent. The extra compression is not really necessary for me.
I think having more detail on how you sorted the data would help, as would some clarification on how you set and retrieve keys. Does each possible IP address have an entry in the sorted IP dictionary?
[later]
OK, I think I get it... dear lord, you are beyond genius. You convert the IP to a Currency (because a Long can't hold large enough numbers), then you look in the populated dictionary and call IndexByKey(key,true), which returns a hypothetical index (WOW). Then I assume you call ItemByIndex, which uses that hypothetical index to get the closest actual item, which in turn gives the country code, which is stored in binary and then decoded. (OMG) You go deep...
Any clarification on the above would be great.
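The lookup idea described above (store only the start of each gapless range, then find the nearest start at or below the searched key) can be illustrated with plain arrays. This sketch does not use or reproduce the cSortedDictionary internals: `FloorIndex` is a hypothetical name, and the zero-based, sorted `Starts` array is an assumption made for the example.

```vb
Option Explicit

' Returns the index of the last entry in Starts that is <= Key,
' or -1 when Key is smaller than every entry.
' Assumption: Starts is zero-based and sorted ascending.
Public Function FloorIndex(Starts() As Currency, ByVal Key As Currency) As Long
    Dim Lo As Long, Hi As Long, M As Long
    Lo = LBound(Starts)
    Hi = UBound(Starts)
    FloorIndex = -1
    Do While Lo <= Hi
        M = (Lo + Hi) \ 2
        If Starts(M) <= Key Then
            FloorIndex = M       ' candidate; look for a later one
            Lo = M + 1
        Else
            Hi = M - 1
        End If
    Loop
End Function
```

The returned index would then select the matching entry in a parallel array of country codes, so not every IP address needs an entry of its own, only every range start.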
I would like to understand how the database needs to be represented, because I may need to swap and frequently update the reference database. I am finding that many are not consistent with each other. A way to import a standard start,end,country CSV file may be best.
Thanks
Last edited by axisdj; Sep 19th, 2017 at 11:17 AM.
Reason: more details.
-
Sep 25th, 2017, 10:47 AM
#58
Thread Starter
Hyperactive Member
Re: multiple range checking algorithm vb6
Ok,
just wanted to let Olaf know I figured it out mostly.
Thanks again for the guidance.
WP
-
Sep 25th, 2017, 05:15 PM
#59
Re: multiple range checking algorithm vb6
 Originally Posted by axisdj
just wanted to let Olaf know I figured it out mostly.
Thanks again for the guidance.
Sorry for posting that late (shifted it "down my stack", because I didn't find the CSV-parsing-code anymore,
which produced the Binary for the LZMA-Blob, guess I've deleted it "for good" whilst "cleaning up" the original example).
Here again a fresh version (into a Class, named e.g. cUpdateFromCSV )
Code:
Option Explicit
Implements ICSVCallback
Private CSV As cCSV, IPs() As Byte, Countries() As Byte
Public Sub ConvertToBinaryFile(CSVFileName As String, BinFileName As String)
    ReDim IPs(4 * 10 ^ 6) 'Space for ~1Mio records (currently used will be only about half as many)
    ReDim Countries(2 * 10 ^ 6) 'same here (though the country-info will be only 2bytes per record)
    Set CSV = New_c.CSV
    CSV.ParseFile CSVFileName, Me
    ReDim Preserve IPs(4 * CSV.RowsParsed - 1)
    ReDim Preserve Countries(2 * CSV.RowsParsed - 1)
    With New_c.FSO.CreateFileStream(BinFileName)
        .WriteFromByteArr IPs 'write the IP-Bytes first into the stream (for better compressability,...
        .WriteFromByteArr Countries '... they are followed by all the country-bytes in a separate block)
    End With
End Sub
Private Function ICSVCallback_NewValue(ByVal RowNr As Long, ByVal ColNr As Long, B() As Byte, ByVal BValStartPos As Long, ByVal BValLen As Long) As Long
    Select Case ColNr
        Case 0 'Start-IP
            Static S As String, SArr() As String
            S = CSV.GetStringValue(B, BValStartPos, BValLen)
            If InStr(S, ":") Then ICSVCallback_NewValue = 1: Exit Function 'stop parsing, when we reach the IPV6-section
            SArr = Split(S, ".")
            IPs(4 * RowNr + 3) = SArr(0): IPs(4 * RowNr + 2) = SArr(1): IPs(4 * RowNr + 1) = SArr(2): IPs(4 * RowNr + 0) = SArr(3)
        Case 2 'Country-Code (we can simply copy the two Bytes over directly from the Input-Stream, no string-conversion needed)
            Countries(2 * RowNr + 0) = B(BValStartPos)
            Countries(2 * RowNr + 1) = B(BValStartPos + 1)
    End Select
End Function
Usage then (after downloading the monthly updated CSV-Files from https://db-ip.com/db/download/country):
Code:
With New cUpdateFromCSV
    .ConvertToBinaryFile App.Path & "\dbip-country-2017-09.csv", App.Path & "\IPRangesByCountry.bin"
End With
The initial reading routine could be adapted to support both formats (LZMA-compressed and uncompressed), chosen by the file extension, like this:
Code:
Public Sub InitRangeDictionaryFrom(FileName As String)
    Set IPRanges = New_c.SortedDictionary
    Dim BSrc() As Byte, B() As Byte, C As Long, i As Long, K As Currency
    If LCase$(Right$(FileName, 4)) = "lzma" Then 'decompress the ByteContent from LZMA first
        BSrc = New_c.FSO.ReadByteContent(FileName)
        New_c.Crypt.LZMADeComp BSrc, B
    Else 'we assume uncompressed, raw binary format and read that into B directly
        B = New_c.FSO.ReadByteContent(FileName)
    End If
    C = (UBound(B) + 1) \ 3
    For i = 0 To C - 1 Step 2
        K = B(i + i) + 256& * B(i + i + 1) + 65536 * B(i + i + 2) + 16777216@ * B(i + i + 3)
        IPRanges.Add K, B(C + C + i) + 256& * B(C + C + i + 1) 'add the Country-Item under Key K
    Next
End Sub
HTH (and yes, all your other assumptions were quite correct).
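To make the binary layout produced by the converter easier to picture, here is a hedged sketch that reads a single record from the raw byte buffer without any vbRichClient calls. `ReadRecord` is a hypothetical helper, not part of the posted code; the layout it assumes is the one the converter above writes: all 4-byte start-IPs first (lowest byte first), followed by all 2-byte country codes.

```vb
Option Explicit

' Reads record number Rec (0-based) from a raw buffer laid out as
' Count * 4 IP bytes (lowest byte first), then Count * 2 country bytes.
' Assumption: B is zero-based and holds whole 6-byte records only.
Public Sub ReadRecord(B() As Byte, ByVal Rec As Long, StartIP As Currency, Country As String)
    Dim Count As Long, p As Long
    Count = (UBound(B) + 1) \ 6              ' 6 bytes per record in total
    p = 4 * Rec                              ' position inside the IP block
    StartIP = B(p) + 256& * B(p + 1) + 65536 * B(p + 2) + 16777216@ * B(p + 3)
    p = 4 * Count + 2 * Rec                  ' position inside the country block
    Country = Chr$(B(p)) & Chr$(B(p + 1))
End Sub
```

For a file of 432616 records this means the IP block is 1,730,464 bytes and the country block 865,232 bytes, which adds up to the 2,595,696 bytes mentioned earlier in the thread.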
Olaf