VB6: Best way to find all permutations of a given string in a file - Contest
Aloha,
This is a spin-off from another thread where this all began. What we want to do here is take the file given here and search it for every permutation of the string asdf (this file will be updated occasionally). For this test, all 8000 known entries in the 5 MB file are in lower case. That does not mean there are no others: the file was created by a random method, and random means random, so it may contain more. The file format is true binary, meaning every byte value from 0-255 can appear in it. The permutations are represented in ASCII. NO UNICODE or any other encoding is used in the file.
I have also included in the zip file a file that contains the index of every permutation that is known in the file and what the permutation is. This file is supplied so that you can verify your work. Remember there may be more permutations in the file than the known ones... and the unknown ones may span a 64k byte boundary.
Once all the entries are in I will create a different type of file that will contain some variations of the permutations and test the entries on that file just to keep everything honest. I will see about getting a prize of some sort for the winner.
The winners will be determined by several methods:
1 - Speed
2 - Understandability (ease of verification)
3 - Readability (how the code was written and commented)
4 - Method used
5 - Originality
Notes:
File must be read and processed in 64 KB chunks to allow a file of any size to be scanned.
Permutation Variations:
Variable length.
Fixed length.
No repeating characters.
Repeating characters.
Your program need not be able to handle all of the given variations but it is a list that you could work from.
I would like to have several categories of winners, because speed does not always determine the needed solution (some methods won't be able to generate the exact log file but will still be correct). But you still need to generate some type of log file to verify your claim.
I would like to have this contest (challenge) run for about three months to give everyone a chance to work the problem. It would be nice if your code was self-verifying (some methods won't allow this). This means that it prints out a report of where it found each permutation, which permutation it was and a count of each permutation, in the format of File Offset, Permutation in sorted order. It would be really nice to have the program check itself against the provided list of known permutations. I have also supplied the elements of the index file in CSV format for this purpose.
Code should be sent ready to execute (please check your submission and make sure that all forms, modules and classes load from the same folder as the project). Code should have timing output using the GetTickCount API so that all timings are consistent and MUST use Option Explicit in all forms/modules/classes. All output should be in the format of the file FindStrSearch included in the zip file. Screen output is also appreciated.
The test file for download is too big for VBForums so you will need to download it from here Permutations File (Updated 4/18/2007).
This file is a simple test with all known permutations totally within each 64k boundary.
After 2 months are up I will deliver a more complex, final file that has more challenges. All code MUST be VB6 code but can use Windows APIs. No external code or calls other than VB6 and Windows APIs will be accepted. This means no custom DLLs written in other languages. You can use any original references and components supplied by Microsoft to support VB6.
You have until July 14, 2007.
Code well. If you have some suggestions please let me know...
DO NOT POST YOUR ENTRIES!!!!
WE WANT ALL ENTRIES TO BE ORIGINAL WORK.
All entrants must let me know if they want to enter prior to submitting project...
You will need to email your entries to contest001 at randem dot com.
Source code only.
All code submissions that qualify will be compiled into a project and released on the *********** website as well as in the CodeBank Forum of VBForums for others to download, view and use. No rights to any code will be retained by the submitter. Submitters will be given complete credit for their submission in all postings.
Note: Please use your VBForums name in the submission. I have no idea of who you are without the cross reference.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Let me get this straight. We have to create a program that searches the given text file for every "asdf" found, then have it write a file (or print a page, or display in textbox?) containing the offset of every "asdf" found?
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I guess these "contests" should be called "challenges", because each time I tried to set up a contest in the last two years I was told there will be no contests on these forums (until some more or less official contest comes along, which still hasn't happened). That's why I started a challenge on this forum instead of a contest, to avoid being told the same thing all over again.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I read over the explanation a couple times and I'm still not 100% sure what we're supposed to do. Maybe because I'm tired.
Are we supposed to scan the binary file and find all permutations of asdf?
What should the function return? The number of permutations found?
Is it supposed to write a log of each permutation (and its index?) found?
What if the user wants to find permutations of something other than asdf?
Re: VB6: Best way to find all permutations of a given string in a file - Contest
You're just writing a program (not a function) that counts the number of permutations. The one I wrote counts them, orders them, and saves them with the permutation and the location. It also found 8136 permutations in the list.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
You are supposed to scan the binary file for any permutation of "asdf". Permutations of "asdf" are the strings that were planted in the file for you to find.
A permutation of "asdf" is ALL four characters, in any order, in consecutive positions.
The program is supposed to be able to self-verify, meaning that it will match against the supplied list to verify that it found all the "KNOWN" permutations from the list. There may be "UNKNOWN" permutations in the file also. If your program cannot pass this basic test, it will not handle the next, more complex test file.
Write a log of all permutations and the offset into the 64k byte boundary of where it was found along with the count of the times each permutation was found in the file.
If your code is coded correctly it will not matter what the permutation is or the size of the permutation that it needs to find (within reason).
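The definition above can be sketched as a sliding window over the bytes. This is only a minimal illustration, not a contest entry and not the method behind the posted results: FindPermutations, Hits and Demo are made-up names, the pattern is assumed to be plain ASCII, and Data() is assumed to be an already dimensioned byte array.

```vb
Option Explicit

' Hypothetical helper: returns how many windows of Data() are a permutation
' of Pattern. The 0-based start offsets are returned in Hits(), which the
' caller must declare as a dynamic array (Dim Hits() As Long).
Public Function FindPermutations(Data() As Byte, ByVal Pattern As String, Hits() As Long) As Long
    Dim Need(0 To 255) As Long   ' pattern count minus window count, per byte value
    Dim PatLen As Long, Missing As Long
    Dim i As Long, b As Long, Count As Long

    PatLen = Len(Pattern)
    If PatLen = 0 Then Exit Function
    For i = 1 To PatLen
        b = Asc(Mid$(Pattern, i, 1))
        Need(b) = Need(b) + 1
    Next i
    Missing = PatLen             ' pattern bytes the window still lacks
    ReDim Hits(0 To 255)

    For i = LBound(Data) To UBound(Data)
        ' Byte entering the window on the right.
        b = Data(i)
        If Need(b) > 0 Then Missing = Missing - 1
        Need(b) = Need(b) - 1
        ' Byte leaving the window on the left, once the window is full.
        If i - LBound(Data) >= PatLen Then
            b = Data(i - PatLen)
            Need(b) = Need(b) + 1
            If Need(b) > 0 Then Missing = Missing + 1
        End If
        ' Nothing missing means the last PatLen bytes are a permutation.
        If Missing = 0 Then
            If Count > UBound(Hits) Then ReDim Preserve Hits(0 To Count * 2)
            Hits(Count) = i - PatLen + 1 - LBound(Data)
            Count = Count + 1
        End If
    Next i
    FindPermutations = Count
End Function

Public Sub Demo()
    Dim Buf() As Byte, Hits() As Long
    Buf = StrConv("asdfasdf", vbFromUnicode)
    ' Windows: asdf, sdfa, dfas, fasd, asdf
    Debug.Print FindPermutations(Buf, "asdf", Hits)   ' prints 5
End Sub
```

Because the counts are built from the pattern itself, the same routine works for other patterns and lengths, and for patterns with repeated characters. Overlapping windows are each reported as a separate hit.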
Re: VB6: Best way to find all permutations of a given string in a file - Contest
The file will be a true binary formatted file; it will have characters 0-255. No other encoding will be present. No Unicode or any other format will be present. The characters "asdf" will be the ASCII representation of these characters.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
They ABSOLUTELY should be counted. As I stated the KNOWN permutations are totally within each 64k boundary. The UNKNOWN permutations can be anywhere, even spanning 64k byte boundaries.
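One way to catch permutations that straddle a 64k boundary is to carry the window state from one chunk into the next. The sketch below is an assumption about how that could be done, not the method behind the posted results: CountInFile is a made-up name, the pattern is assumed to be plain ASCII, and the file is assumed to exist and be smaller than 2 GB.

```vb
Option Explicit

Private Const CHUNK_SIZE As Long = 65536   ' 64 KB, as the rules require

' Hypothetical driver: counts every permutation of Pattern in FileName,
' reading 64 KB at a time. The last PatLen bytes are kept in a small ring
' buffer, so a match can start in one chunk and end in the next.
Public Function CountInFile(ByVal FileName As String, ByVal Pattern As String) As Long
    Dim Need(0 To 255) As Long   ' pattern count minus window count, per byte value
    Dim Ring() As Byte           ' the last PatLen bytes seen
    Dim Chunk() As Byte
    Dim PatLen As Long, Missing As Long, Total As Long
    Dim Remaining As Long, Got As Long
    Dim Pos As Long              ' 0-based file offset of the current byte
    Dim Slot As Long, i As Long, b As Long
    Dim f As Integer

    PatLen = Len(Pattern)
    If PatLen = 0 Then Exit Function
    ReDim Ring(0 To PatLen - 1)
    For i = 1 To PatLen
        b = Asc(Mid$(Pattern, i, 1))
        Need(b) = Need(b) + 1
    Next i
    Missing = PatLen

    f = FreeFile
    Open FileName For Binary Access Read As #f
    Remaining = LOF(f)
    Do While Remaining > 0
        Got = CHUNK_SIZE
        If Got > Remaining Then Got = Remaining
        ReDim Chunk(0 To Got - 1)
        Get #f, , Chunk
        For i = 0 To Got - 1
            b = Chunk(i)
            Slot = Pos Mod PatLen
            ' Byte entering the window.
            If Need(b) > 0 Then Missing = Missing - 1
            Need(b) = Need(b) - 1
            ' Ring(Slot) still holds the byte PatLen positions back,
            ' even if that byte came from the previous chunk.
            If Pos >= PatLen Then
                Need(Ring(Slot)) = Need(Ring(Slot)) + 1
                If Need(Ring(Slot)) > 0 Then Missing = Missing + 1
            End If
            Ring(Slot) = b
            ' A match here starts at file offset Pos - PatLen + 1.
            If Missing = 0 Then Total = Total + 1
            Pos = Pos + 1
        Next i
        Remaining = Remaining - Got
    Loop
    Close #f
    CountInFile = Total
End Function
```

Nothing is reset between chunks, so the chunk size only affects how much is read at once, not which matches are found.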
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Are overlapping finds also counted? I.e. if there is asdfa in the file, will this count as two permutations? And does asdfasdf count as five or only two?
And no, I haven't had time to check the files and probably won't have that time in the nearest future.
Please post some of the code you need help with (it makes it easier to help you)
If your problem has been solved then please mark the thread [RESOLVED].
Don't forget to Rate this post
"Pinky, you give a whole new meaning to the phrase, 'counter-intelligence'."-The Brain-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Why are you looking for permutations in files? Isn't that 2 contests: the 1st for the fastest file searching and the 2nd for the fastest way to calculate permutations? You really just want the fastest permutations, e.g. calculate the permutations per second from an application.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Originally Posted by learning c
Why are you looking for permutations in files? Isn't that 2 contests: the 1st for the fastest file searching and the 2nd for the fastest way to calculate permutations? You really just want the fastest permutations, e.g. calculate the permutations per second from an application.
The contest is the contest. Why is not the question. It's not for the fastest way to calculate permutations, nor is it for the fastest way to load a file. Those are personal options.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quick Question - What sort of times are you looking for? Average I mean. I have some code going for around 7mins atm. Just wondering how severe I need to be to get closer to the speedy times.
Edit:
Doh! Spotted at the bottom of your text file - 672ms (bloody wow!) since mine is 420000ms... I have a bit of work to do
Last edited by Ecniv; May 25th, 2007 at 04:56 AM.
Reason: Spotting at the bottom of the text file
Feeling like a fly on the inside of a closed window (Thunk!)
If I post a lot, it is because I am bored at work! ;D Or stuck...
* Anything I post can be only my opinion. Advice etc is up to you to pursue...
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Another question:
I've used this challenge as a first project in .NET Express.
What files do I need to send to you?
Also: The search file results text you posted has 8070 combinations - I've matched that here, but I could've sworn the VBA code I wrote came back with more... Is there a definitive amount to find or do we go by your count?
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Guys, don't compare your times to his time. It is on a different computer (possibly a multi-core workstation). Compare them to your own previous times.
Also, I don't see how anyone could time disk access when you could just read the entire file into a string buffer (no matter what the contents), which is presumably what he did if he did an InStr search.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Well, I wouldn't say... I'm running at 120 ms and this is a single core laptop. However I don't output anything.
randem: Do we have to output which permutation was found? I can output the position, but due to optimization I don't know the combination itself unless I make things more complicated. Couldn't we just output the positions?
Also, how about harddisk caching? The file is small enough to remain in cache for a short while, which means that the first time you run a code you get ~60 ms slower results and the next run coming shortly after is faster as the data comes from cache.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Ecniv,
You need to send everything that makes the project run so that I can see exactly what you did.
Merri,
The found permutation would be helpful, but if you have found a way to definitely identify a permutation without knowing which one it is, that could be good and I would like to see it.
As Lord Orwell states, do not compare your times with mine, for ultimately the times will change when I run all the code on a standardized computer. That is where the timing counts. Your timing should cover the permutation search only. How you read the file is totally up to the individual and is not included in the timing.
The count shown in the example I posted is the minimum number of permutations in the file. It was given as a baseline: if you do not find at least that many permutations, your code has flaws and should be reworked.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Originally Posted by randem
Code should have timing output using the GetTickCount API so that all timings are consistent.
So, I've decided to fiddle with this after all. I would like to suggest using the QueryPerformanceCounter API function for more accurate timing, because GetTickCount is only accurate to about 16 milliseconds.
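To see the granularity for yourself, a rough sketch like the following measures how far GetTickCount jumps between two consecutive changes. TickStep is a made-up name, and the result depends on the system timer, so treat the number as indicative only.

```vb
Option Explicit

Private Declare Function GetTickCount Lib "kernel32" () As Long

' Hypothetical helper: returns the size, in milliseconds, of one
' GetTickCount step as observed on this machine.
Public Function TickStep() As Long
    Dim t0 As Long, t1 As Long

    ' Spin until the counter changes, so the measurement starts on a tick edge.
    t0 = GetTickCount
    Do
        t1 = GetTickCount
    Loop While t1 = t0

    ' Spin again until the next change; the difference is one step.
    t0 = t1
    Do
        t1 = GetTickCount
    Loop While t1 = t0

    TickStep = t1 - t0
End Function
```

Calling Debug.Print TickStep from the Immediate window shows the step. Any search that finishes in a few hundred milliseconds is measured only to within that step.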
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I just wrote a prog that does what's required in about 1/2 hour (I mean it took me 1/2 hour to write :-P) and takes a little under 2 seconds to find all permutations (in IDE, *not* compiled yet)...and Randem, your "counts" of each value in the file (at the end of FindStrIndex.txt) are wrong...what's the difference between FindStrIndex and FindStrSearch?
Edit: My code doesn't save the results in sorted order...it does per permutation...shouldn't take much longer to make it sort the data before writing it though :-P
Edit 2: Got it down to 1.68 seconds (1688 ticks, compiled) with 141 ticks (0.15 secs) to sort and save afterwards...although my output seems to be different to the output you gave as an example...and that's with me running other programs, I then set program priority to realtime and got 1547 with +125 for saving and it'd probably be better if I closed all the other progs running :-P
I love helping noobs with their VB problems (probably because, as an amateur programmer, I am only slightly better at VB than them :-)) but if you SERIOUSLY want to get help for free from a community such as VBForums, you have to first have a grounding (basic knowledge) in VB6, otherwise you're way too much work to help...You've got to give a little if you want to get help from us, in other words!
And we DON'T do your homework. If your tutor doesn't teach you enough to help you make the project without his or her help, FIND A BETTER TUTOR or try reading books on programming! We are happy to help with minor things regarding the project, but you have to understand the rest of it if you want our help to be useful.
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Oops, I noticed I had my computer in power saving mode, which meant I was running at 800 MHz... changes things a little, it wasn't hard disk cache after all, just the processor jumping up to 1800 MHz when running the code quickly enough.
randem: I can figure out the actual permutation, but the way I store the data while processing is suboptimal for converting the information into what is currently required in the output file. On the other hand, if we know the positions and we have the file, we don't really need that information: it is useful for your validation, but it doesn't help us at this point.
So it is entirely unnecessary to output that data: only positions matter. This is why I'd suggest a simple file where only the start position of each permutation is written, separated by CRLF. No special formatting, and I'd suggest the order also be free if file output is included in the timing (it'd be unfair to include sorting).
I have a new question as well: should we write the text file while looking for permutations, or should we write it once timing is finished? At the moment I have included the file output in my time. Which is... equal to the time it takes to solve about 25000 very easy sudokus using my sudoku solver. Anyone interested can have a merry time finding out how fast it is.
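A positions-only file like that could be written with something as small as this. It is only a sketch of the suggested format: SavePositions is a made-up name, and Hits() is assumed to hold Count start offsets.

```vb
Option Explicit

' Hypothetical helper: writes Count start positions from Hits(),
' one per line, in whatever order they were found.
Public Sub SavePositions(ByVal FileName As String, Hits() As Long, ByVal Count As Long)
    Dim f As Integer, i As Long

    f = FreeFile
    Open FileName For Output As #f
    For i = 0 To Count - 1
        ' Print # ends each line with CRLF. CStr avoids the leading space
        ' that Print # puts in front of positive numbers.
        Print #f, CStr(Hits(i))
    Next i
    Close #f
End Sub
```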
I can tell my time under IDE as it is pretty poor: roughly 1300 ms.
Oh, and GetTickCount would be far too inaccurate for my code...
Code:
'clsTime.cls
Option Explicit

' High-resolution timer built on QueryPerformanceCounter.
' Currency is used as a 64-bit container; its implied scaling by 10000
' cancels out when the tick difference is divided by the frequency.
Private m_Freq As Double
Private m_Start As Currency
Private m_Stop As Currency

Private Declare Function QueryPerformanceFrequency Lib "kernel32" (lpFrequency As Currency) As Long
Private Declare Function QueryPerformanceCounter Lib "kernel32" (lpPerformanceCount As Currency) As Long

' Estimates the seconds left, given progress CurrentValue out of EndVal.
Public Function GetRemainingSeconds(ByVal CurrentValue As Long, ByVal EndVal As Long) As Currency
    Dim dblCurrent As Double
    If CurrentValue < 1 Then CurrentValue = 1
    If EndVal > CurrentValue Then
        QueryPerformanceCounter m_Stop
        dblCurrent = (CDbl(m_Stop - m_Start) / m_Freq)
        GetRemainingSeconds = CCur((dblCurrent / CurrentValue * EndVal) - dblCurrent)
    End If
End Function

' Returns the seconds elapsed since Start was called.
Public Function GetTime() As Double
    QueryPerformanceCounter m_Stop
    GetTime = CDbl(m_Stop - m_Start) / m_Freq
End Function

' Records the starting counter value.
Public Sub Start()
    QueryPerformanceCounter m_Start
End Sub

' Reads the counter frequency once, when the object is created.
Private Sub Class_Initialize()
    Dim curFreq As Currency
    QueryPerformanceFrequency curFreq
    m_Freq = CDbl(curFreq)
End Sub
Code:
Dim Q As clsTime
' initialize class
Set Q = New clsTime
' start timing
Q.Start
' return time in milliseconds
MsgBox Fix(Q.GetTime * 1000)
' once done...
Set Q = Nothing
Re: VB6: Best way to find all permutations of a given string in a file - Contest
It seems everyone is only considering speed. As I stated in the original post speed is only one area. Speed means NOTHING if your output is INCORRECT!!! Fast and wrong NEVER beats slow and correct!!!
The only timing is finding the permutations. Anything else you do outside of that is up to you. I repeat SPEED IS ONLY ONE PART OF THE TOTAL CONTEST!
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Well, when it all comes down to it I would rather have something that takes two weeks and is 100% correct all the time than something that takes only a few seconds and is only 50% correct! Sometimes 99% is not good enough.
One can take something that is 100% correct and speed it up. Taking something that operates fast and wrong... Making it faster is of no use!
The output file is your validation... Anyone can make a claim!