-
1 Attachment(s)
VB6: Best way to find all permutations of a given string in a file - Contest
Aloha,
This is a spin-off from another thread where this all began. What we want to do here is take the file given here and search it for every permutation of the string asdf (this file will be updated occasionally). For this test, all 8000 known entries in the 5 MB file are in lower case. That does not mean there are no more: the file was created by a random method, and random means random, so it is possible that there are more. The file format is true binary, meaning that all byte values from 0-255 occur in the file. The permutations are represented in ASCII. No Unicode or any other encoding is represented in the file.
I have also included in the zip file a file that contains the index for every permutation that is known in the file and what the permutation is. This file is supplied so that you can verify your work. Remember there may be more permutations in the file than the known ones... and the unknown ones may span the 64k byte boundary.
Once all the entries are in I will create a different type of file that will contain some variations of the permutations and test the entries on that file just to keep everything honest. I will see about getting a prize of some sort for the winner.
The winners will be determined by several methods:
1 - Speed
2 - Understandability (ease of verification)
3 - Readability (how the code was written and commented)
4 - Method used
5 - Originality
Notes:
File must be read and processed in 64kb chunks to allow any size file to be scanned.
Permutation Variations:
Variable length.
Fixed length.
No repeating characters.
Repeating characters.
Your program need not be able to handle all of the given variations but it is a list that you could work from.
I would like to have several categories of winners because speed does not always determine the needed solution (some methods won't be able to generate the exact log file but will still be correct). But you still need to generate a type of log file to verify your claim.
I would like to have this contest (challenge) run for about three months to give everyone a chance to work the problem. It would be nice if your code was self-verifying (some methods won't allow this). This means that it prints out a report of where it found each permutation, which permutation it was, and a count of each permutation, in the format File Offset, Permutation, in sorted order. It would be really nice to have the program check against the provided list to verify itself for the known permutations. I have also supplied the elements in the index file in CSV format for this purpose.
Code should be sent ready to execute (please check your submission and make sure that all forms, modules and classes load from the same folder as the project). Code should have timing output using the GetTickCount API so that all timings are consistent and MUST use Option Explicit in all forms/modules/classes. All output should be in the format of the file FindStrSearch included in the zip file. Screen output is also appreciated.
The test file for download is too big for VBForums so you will need to download it from here Permutations File (Updated 4/18/2007).
This file is a simple test with all known permutations totally within each 64k boundary.
After 2 months are up I will deliver a more complex and final file that has more challenges. All code MUST be VB6 code but can use Windows API's. No external code or calls other than VB6 and Windows API's will be accepted. This means no custom dll's written in other languages. You can use any original references and components supplied to support VB6 from Microsoft.
You have until July 14, 2007
Code well. If you have some suggestions please let me know...
DO NOT POST YOUR ENTRIES!!!!
WE WANT ALL ENTRIES TO BE ORIGINAL WORK.
All entrants must let me know if they want to enter prior to submitting project...
You will need to email your entries to contest001 at randem dot com.
Source code only.
All code submissions that qualify will be compiled into a project and released on the *********** website as well as in the CodeBank Forum of VBForums for others to download, view and use. No rights to any code will be retained by the submitter. Submitters will be given complete credit for their submission in all postings.
Note: Please use your VBForums name in the submission. I have no idea of who you are without the cross reference.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Does our solution need to be in InStr() form? As in, send it the following four parameters in order?
Starting position
Text to search
Text to find (in any order)
Options (Either a Case Sensitive flag or vbText/vbBinary/vbDatabase)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
No, the solution can and should be in any workable form that works well and is easy to use to accomplish the task.
I hope to see many variations on how this can be done.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
He said... "Do not post your entries"
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Ellis Dee,
Do you have something to submit?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Let me get this straight. We have to create a program that searches the given text file for every "asdf" found, then have it write a file (or print a page, or display in textbox?) containing the offset of every "asdf" found?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
That is correct... But I never said the given file was text.... It's Binary.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by randem
That is correct... But I never said the given file was text.... It's Binary.
When you say "permutations" of "asdf," does that include "sdfa," "fads" etc ?
Never mind. I read the index and got the answer.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Shouldn't this go in the Contests Forum?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I guess these "contests" should be called "challenges", because each time I tried to pull up a contest in the last two years I was told there will be no contests on these forums (until some other somewhat official contest will come up that still isn't coming). That's why I started off a challenge on this forum instead of a contest, to avoid getting told the same thing all over again.
I guess contests are now ok again...
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I read over the explanation a couple times and I'm still not 100% sure what we're supposed to do. Maybe because I'm tired.
Are we supposed to scan the binary file and find all permutations of asdf?
What should the function return? The number of permutations found?
It's supposed to write a log of each permutation (and its index?) found?
What if the user wants to find permutations of something other than asdf?
:confused:
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
You're just writing a program (not a function) that counts the number of permutations. The one I wrote counts them, orders them, and saves them with the permutation and the location. It also found 8136 permutations in the list.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
You are supposed to scan the binary formatted file for any permutation of "asdf". Permutations of "asdf" are the strings that are planted in the file to find.
A permutation of "asdf" is ALL four characters in any order sequentially.
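A minimal sketch of that definition, assuming the data is already in a zero-based Byte array. The name IsPermutationAt and its parameters are hypothetical and not taken from any contest entry. It counts byte values rather than generating all 24 orderings, so it also copes with patterns that repeat a character. The comparison is on raw bytes, so it is case sensitive.

```vb
Option Explicit

' Hypothetical helper (not from any contest entry): returns True when the
' bytes of abData starting at zero-based index lStart are exactly the bytes
' of abPattern, in any order. Occurrences are counted, so repeated
' characters in the pattern are handled too.
Public Function IsPermutationAt(abData() As Byte, ByVal lStart As Long, _
                                abPattern() As Byte) As Boolean
    Dim alCount(0 To 255) As Long
    Dim lLen As Long
    Dim i As Long

    lLen = UBound(abPattern) - LBound(abPattern) + 1
    If lStart < LBound(abData) Or lStart + lLen - 1 > UBound(abData) Then Exit Function

    ' Tally how many of each byte value the pattern needs.
    For i = LBound(abPattern) To UBound(abPattern)
        alCount(abPattern(i)) = alCount(abPattern(i)) + 1
    Next i

    ' Spend the tally; a byte that is not needed (or is needed fewer times) fails.
    For i = lStart To lStart + lLen - 1
        alCount(abData(i)) = alCount(abData(i)) - 1
        If alCount(abData(i)) < 0 Then Exit Function
    Next i

    IsPermutationAt = True
End Function

Public Sub DemoIsPermutationAt()
    Dim abData() As Byte
    Dim abPat() As Byte

    abData = StrConv("xxfdsa", vbFromUnicode)   ' ANSI bytes, zero-based
    abPat = StrConv("asdf", vbFromUnicode)
    Debug.Print IsPermutationAt(abData, 2, abPat)   ' tests the window "fdsa"
End Sub
```

Because the window and the pattern have the same length, a tally that never goes negative means every count ends at zero, which is why no second pass is needed.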
The program is supposed to be able to self-verify, meaning that it will match against the supplied list to verify that it found all the "KNOWN" permutations from the list. There may be "UNKNOWN" permutations in the file also. If your program cannot pass this basic test, it will not handle the next, more complex test file.
Write a log of all permutations and the offset into the 64k byte boundary of where it was found along with the count of the times each permutation was found in the file.
If your code is coded correctly it will not matter what the permutation is or the size of the permutation that it needs to find (within reason).
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
randem, In the code I sent you, I left a ReDim in there that doesn't need to be there... sorry.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by randem
That is correct... But I never said the given file was text.... It's Binary.
And the distinction is? ;)
Can we presume that "asdf" will be ascii encoded?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
The file will be a true binary formatted file; it will have characters 0-255. No other encoding will be present. No Unicode or any other format will be present. The characters "asdf" will be the ASCII representation of these characters.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by randem
Ellis Dee,
Do you have something to submit?
Yes, but what I sent you isn't what you're looking for. I'll submit a real solution in the next couple weeks.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
You ask for each permutation within each 64k block.
Should permutations found on a boundary (ie partially in two blocks) not be counted?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
They ABSOLUTELY should be counted. As I stated the KNOWN permutations are totally within each 64k boundary. The UNKNOWN permutations can be anywhere, even spanning 64k byte boundaries.
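One way to catch matches that straddle a 64k boundary while still reading in chunks is to carry the last (pattern length - 1) bytes of each chunk over to the front of the next one. The sketch below is only an illustration under that assumption; ScanFileInChunks and ProcessBuffer are hypothetical names, and ProcessBuffer is a placeholder where the actual search would go.

```vb
Option Explicit

Private Const CHUNK_SIZE As Long = 65536    ' 64 KB, as the contest notes require

' Hypothetical driver: reads sPath in 64 KB chunks and passes each chunk to
' ProcessBuffer, prefixed with the last (lPatLen - 1) bytes of the previous
' chunk so that a match straddling a chunk boundary is still seen whole.
Public Sub ScanFileInChunks(ByVal sPath As String, ByVal lPatLen As Long)
    Dim iFile As Integer
    Dim lFileLen As Long
    Dim lPos As Long            ' zero-based file offset of the next unread byte
    Dim lRead As Long
    Dim lCarry As Long          ' number of bytes carried over from the last chunk
    Dim abBuf() As Byte
    Dim abChunk() As Byte
    Dim abTail() As Byte
    Dim i As Long

    iFile = FreeFile
    Open sPath For Binary Access Read As #iFile
    lFileLen = LOF(iFile)

    Do While lPos < lFileLen
        lRead = lFileLen - lPos
        If lRead > CHUNK_SIZE Then lRead = CHUNK_SIZE

        ReDim abChunk(0 To lRead - 1)
        Get #iFile, lPos + 1, abChunk        ' Get counts bytes from 1, not 0

        ' Build the working buffer: carried tail first, then the new chunk.
        ReDim abBuf(0 To lCarry + lRead - 1)
        For i = 0 To lCarry - 1
            abBuf(i) = abTail(i)
        Next i
        For i = 0 To lRead - 1
            abBuf(lCarry + i) = abChunk(i)
        Next i

        ' abBuf(0) corresponds to file offset lPos - lCarry.
        ProcessBuffer abBuf, lPos - lCarry

        ' Remember the last (lPatLen - 1) bytes for the next round.
        lCarry = lPatLen - 1
        If lCarry > UBound(abBuf) + 1 Then lCarry = UBound(abBuf) + 1
        If lCarry > 0 Then
            ReDim abTail(0 To lCarry - 1)
            For i = 0 To lCarry - 1
                abTail(i) = abBuf(UBound(abBuf) + 1 - lCarry + i)
            Next i
        End If

        lPos = lPos + lRead
    Loop

    Close #iFile
End Sub

' Placeholder for the actual search; here it only reports what it was given.
Private Sub ProcessBuffer(abBuf() As Byte, ByVal lBaseOffset As Long)
    Debug.Print "Buffer of"; UBound(abBuf) + 1; "bytes starting at offset"; lBaseOffset
End Sub
```

Since the carried tail is one byte shorter than the pattern, no complete match fits inside it, so nothing found in the previous chunk is reported twice. The byte-by-byte copy keeps the sketch simple; a real entry would likely want something faster.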
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Are overlapping finds also counted? I.e. if there is asdfa in the file, will this count as two permutations? Or does asdfasdf count as five or only two?
And no, I haven't had time to check the files and probably won't have that time in the nearest future.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
asdfa is two permutations.
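Counting overlapping matches like this amounts to testing every window of four bytes. A rough sketch of a sliding-window counter follows; CountPermutations is a hypothetical name and the rolling tally is just one possible technique, not a description of anyone's entry.

```vb
Option Explicit

' Hypothetical helper: counts every window of abData that is a permutation
' of abPattern, including windows that overlap each other.
Public Function CountPermutations(abData() As Byte, abPattern() As Byte) As Long
    Dim alNeed(0 To 255) As Long    ' how many of each byte the window still needs
    Dim lLen As Long
    Dim lMatched As Long            ' window bytes that satisfy a need
    Dim i As Long
    Dim b As Byte

    lLen = UBound(abPattern) - LBound(abPattern) + 1
    For i = LBound(abPattern) To UBound(abPattern)
        alNeed(abPattern(i)) = alNeed(abPattern(i)) + 1
    Next i

    For i = LBound(abData) To UBound(abData)
        ' Byte entering the window on the right.
        b = abData(i)
        alNeed(b) = alNeed(b) - 1
        If alNeed(b) >= 0 Then lMatched = lMatched + 1

        ' Byte leaving the window on the left, once the window is full.
        If i - LBound(abData) >= lLen Then
            b = abData(i - lLen)
            alNeed(b) = alNeed(b) + 1
            If alNeed(b) > 0 Then lMatched = lMatched - 1
        End If

        If lMatched = lLen Then CountPermutations = CountPermutations + 1
    Next i
End Function

Public Sub DemoOverlaps()
    Dim abPat() As Byte
    Dim abData() As Byte

    abPat = StrConv("asdf", vbFromUnicode)

    abData = StrConv("asdfa", vbFromUnicode)
    Debug.Print CountPermutations(abData, abPat)    ' 2: asdf, sdfa

    abData = StrConv("asdfasdf", vbFromUnicode)
    Debug.Print CountPermutations(abData, abPat)    ' 5: asdf, sdfa, dfas, fasd, asdf
End Sub
```

Each byte is touched once when it enters the window and once when it leaves, so the cost per byte does not grow with the pattern length.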
-
1 Attachment(s)
Re: VB6: Best way to find all permutations of a given string in a file - Contest
In case anyone is wondering, the output from your submission should look something like this file. It would help in the verification process.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by randem
No external code or calls other than VB6 and API's will be accepted.
Quick question...Does this mean we can't use something like, the FileSystemObject??
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
That is a VB call... The statement refers to custom C subroutines and dll and things of that nature. Any native VB or OS supported calls can be used.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
why are you looking for permutations in files :ehh: isn't that 2 contests, 1st for fastest file searching and the 2nd for fastest way to calculate permutations? you really just want fastest permutations eg calculate the permutations per second from an application.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by learning c
why are you looking for permutations in files :ehh: isn't that 2 contests, 1st for fastest file searching and the 2nd for fastest way to calculate permutations? you really just want fastest permutations eg calculate the permutations per second from an application.
The contest is the contest. Why is not the question. It's not for the fastest way to calculate permutations, nor is it for the fastest way to load a file. Those are personal options.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quick Question - What sort of times are you looking for? Average I mean. I have some code going for around 7mins atm. Just wondering how severe I need to be to get closer to the speedy times.
Edit:
Doh! Spotted at the bottom of your text file - 672ms (bloody wow!) since mine is 420000ms... I have a bit of work to do :)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
That is what your submission is all about...
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Another question:
I've used this challenge as a first project in .net express.
What files do I need to send to you?
Also: The search file results text you posted has 8070 combinations - I've matched that here, but I could've sworn in the VBA code I wrote it came back with more... Is there a definitive amount to find or do we go by your count?
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Guys, don't compare your times to his time. It is on a different (and possibly multi-core workstation) computer. Compare them to your own previous times.
Also I don't see how anyone could time disk access when you could just read the entire file into a string buffer (no matter what the contents), which is presumably what he did if he did an InStr search.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Well, I wouldn't say... I'm running at 120 ms and this is a single core laptop. However I don't output anything.
randem: Do we have to know what is the found permutation for output? I can output the position, but due to optimization I don't know the combination itself unless I make things more complicated. Couldn't we just output the positions?
Also, how about harddisk caching? The file is small enough to remain in cache for a short while, which means that the first time you run a code you get ~60 ms slower results and the next run coming shortly after is faster as the data comes from cache.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Ecniv,
You need to send everything that makes the project run so that I can see exactly what you did.
Merri,
The found permutation would be helpful, but if you found a way to definitely identify a permutation without knowing which one it is, that could be good and I would like to see it.
As Lord Orwell states, do not compare your times with mine, for ultimately the times will change when I run all the code on a standardized computer. That is where all the timing counts. Your timing should cover the permutation search. How you read the file is totally up to the individual and is not included in the timing.
The count shown in the example I posted is the least number of permutations that are in the file. It was given as an example: if you do not find at least that many permutations, your code has flaws and should be reworked.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Quote:
Originally Posted by randem
Code should have timing output using the GetTickCount API so that all timings are consistent.
So, I've decided to fiddle with this after all. I would like to suggest using the QueryPerformanceCounter API function for more accurate timing, because GetTickCount is only accurate to about 16 milliseconds.
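For reference, the GetTickCount timing the contest rules ask for can be sketched roughly as below. TimeSearch is a hypothetical name and the search itself is left as a placeholder comment. The result is in whole milliseconds, and the tick counter only advances in coarse steps, which is the limitation raised here.

```vb
Option Explicit

Private Declare Function GetTickCount Lib "kernel32" () As Long

' Hypothetical wrapper: times whatever runs between the two GetTickCount calls.
Public Sub TimeSearch()
    Dim lStart As Long
    Dim lElapsed As Long

    lStart = GetTickCount
    ' ... run the permutation search here ...
    lElapsed = GetTickCount - lStart

    Debug.Print "Search took"; lElapsed; "ms (GetTickCount resolution is coarse)"
End Sub
```

For runs that finish in a few hundred milliseconds, a step of roughly 16 ms is a noticeable share of the total, so averaging several runs is one way to reduce the noise.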
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I just wrote a prog that does what's required in about 1/2 hour (I mean it took me 1/2 hour to write :-P) and takes a little under 2 seconds to find all permutations (in IDE, *not* compiled yet)...and Randem, your "counts" of each value in the file (at the end of FindStrIndex.txt) are wrong...what's the difference between FindStrIndex and FindStrSearch?
Edit: My code doesn't save the results in sorted order...it does per permutation...shouldn't take much longer to make it sort the data before writing it though :-P
Edit 2: Got it down to 1.68 seconds (1688 ticks, compiled) with 141 ticks (0.15 secs) to sort and save afterwards...although my output seems to be different to the output you gave as an example...and that's with me running other programs, I then set program priority to realtime and got 1547 with +125 for saving and it'd probably be better if I closed all the other progs running :-P
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Oops, I noticed I had my computer in power saving mode, which meant I was running at 800 MHz... changes things a little, it wasn't hard disk cache after all, just the processor jumping up to 1800 MHz when running the code quickly enough.
randem: I can figure out the actual permutation, but the way I store the data while processing is not well suited to converting the information into what is currently required in the output file. On the other hand, if we know the positions and we have the file, we don't really need that information: it is useful for your validation, but it doesn't help us at this point.
So outputting that data is entirely unnecessary: only positions matter. This is why I'd suggest a simple file where only the start position of each permutation is output, separated by CRLF. No special formatting, and I'd suggest the order also be free if file output is included within timing (it'd be unfair to include sorting).
I have a new question as well: should we output the text file while looking for permutations or should we output it once timing is finished? At the moment I have included the file output in my time. Which is... equal to amount of solving about 25000 very easy sudokus using my sudoku solver. Anyone interested can have merry time finding out how fast it is.
I can tell my time under IDE as it is pretty poor: roughly 1300 ms.
Oh, and GetTickCount would be far too inaccurate for my code...
Code:
'clsTime.cls
Option Explicit

' Currency is used as a 64-bit container for the counter values. Its fixed
' scaling cancels out because elapsed counts are divided by the frequency.
Private m_Freq As Double
Private m_Start As Currency
Private m_Stop As Currency

Private Declare Function QueryPerformanceFrequency Lib "kernel32" (lpFrequency As Currency) As Long
Private Declare Function QueryPerformanceCounter Lib "kernel32" (lpPerformanceCount As Currency) As Long

' Estimates the seconds remaining, given progress CurrentValue out of EndVal.
Public Function GetRemainingSeconds(ByVal CurrentValue As Long, ByVal EndVal As Long) As Currency
    Dim dblCurrent As Double
    If CurrentValue < 1 Then CurrentValue = 1
    If EndVal > CurrentValue Then
        QueryPerformanceCounter m_Stop
        dblCurrent = (CDbl(m_Stop - m_Start) / m_Freq)
        GetRemainingSeconds = CCur((dblCurrent / CurrentValue * EndVal) - dblCurrent)
    End If
End Function

' Returns the seconds elapsed since Start was called.
Public Function GetTime() As Double
    QueryPerformanceCounter m_Stop
    GetTime = CDbl(m_Stop - m_Start) / m_Freq
End Function

' Records the starting counter value.
Public Sub Start()
    QueryPerformanceCounter m_Start
End Sub

' Reads the counter frequency once, when the class is created.
Private Sub Class_Initialize()
    Dim curFreq As Currency
    QueryPerformanceFrequency curFreq
    m_Freq = CDbl(curFreq)
End Sub
Code:
Dim Q As clsTime
' initialize class
Set Q = New clsTime
' start timing
Q.Start
' return time in milliseconds
MsgBox Fix(Q.GetTime * 1000)
' once done...
Set Q = Nothing
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
It seems everyone is only considering speed. As I stated in the original post speed is only one area. Speed means NOTHING if your output is INCORRECT!!! Fast and wrong NEVER beats slow and correct!!!
The only timing is finding the permutations. Anything else you do outside of that is up to you. I repeat SPEED IS ONLY ONE PART OF THE TOTAL CONTEST!
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
However, correct results are nothing if your program takes 2 weeks to output them :-P
Speed is important, but it takes a backseat to accuracy :-)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Well, when it all comes down to it I would rather have something that takes two weeks and is 100% correct all the time than something that takes only a few seconds and is only 50% correct! Sometimes 99% is not good enough.
One can take something that is 100% correct and speed it up. Taking something that operates fast and wrong... Making it faster is of no use!
The output file is your validation... Anyone can make a claim!
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
smUX,
If you say the counts are wrong, where is the output file to prove this claim? This is what I am after: proof, not just claims. Speed is secondary.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Also when you submit please use your VBForums name in the submission. I have no idea of who you are without the cross reference.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
<assuming you were commenting my post>
But positions are perfectly enough for validation! My results are correct and my method is 100% solid and working; I wasn't speaking about incorrect results at any point. I don't see how you get more correctness by stating, although not literally, that outputting the found permutation alongside position is an absolute requirement.
I get the correct amount of results.
I get the correct positions.
Why would I need to output more than that?
</assuming you were commenting my post>
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
The question was asked what is the difference between FindStrIndex and FindStrSearch:
Well, the differences are:
FindStrIndex is the output of the file used to create the permutations in the created file.
FindStrSearch is the output from the program used to find the permutation in the created file.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Merri,
The original statement in the original post reads:
I would like to have several category of winners because speed does not always determine the needed solution (some methods won't be able to generate the exact log file but will still be correct).
Meaning if you generate just the offsets that can be verified. That too can be deemed as correct.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Your results (that you provided in the file FindStrSearch.txt) are 100% wrong...rename the file to output.txt and test it with my permutation checker I wrote into my code...all it does is grab the index given in the file and checks for the 4 digits at the end at that exact position...if it doesn't find, it's a fail...mine are 100% :-P
BTW, make sure that the first permutation is the first line, so delete everything before it...otherwise it won't show you :-P
(Edit: Ignore the "100% wrong" bit if you haven't read further down...they're right, but Randem is using 0 as the first byte position, and I was using 1...now both match :-))
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Okay, to help people with this a little, I've split my "checker" from the main program and can give that out for people to use themselves...it's a simple enough program, and it expects the filename for output to be output.txt and to have the format:
Code:
00000154,fsad
00000169,safd
00000263,fasd
00000839,asdf
00001416,dfsa
00004139,dsaf
00005068,sdfa
00005719,dfas
00006915,asdf
00008984,fads
00009248,fasd
00009852,fdsa
00010081,dsfa
00011051,fdsa
00011640,dasf
00012397,dsfa
00013533,afsd
00013623,sadf
00014368,fsad
00015131,safd
00015764,fasd
00017022,fsad
00017473,asfd
00018037,dafs
00018041,sdfa
Currently it crashes if there's no null line (1 vbcrlf) after the last one because that's where it looks to see where the end is...I wrote it for me, so sue me :-P
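A checker along these lines might look roughly like the sketch below. It is not the attached program. It assumes every line has the "offset,permutation" shape shown above, that the offset is zero-based, and that the file contains no header lines. CountMismatches is a hypothetical name. Lines that do not split into exactly two parts, including a trailing empty line, are skipped rather than treated as the end marker.

```vb
Option Explicit

' Hypothetical checker, assuming each line of sLogPath looks like
' "00000154,fsad" and that the number is a zero-based file offset.
' Returns the number of lines whose text is NOT found at the stated offset.
Public Function CountMismatches(ByVal sDataPath As String, _
                                ByVal sLogPath As String) As Long
    Dim iData As Integer
    Dim iLog As Integer
    Dim sLine As String
    Dim asPart() As String
    Dim sFound As String

    iData = FreeFile
    Open sDataPath For Binary Access Read As #iData
    iLog = FreeFile
    Open sLogPath For Input As #iLog

    Do While Not EOF(iLog)
        Line Input #iLog, sLine
        asPart = Split(sLine, ",")
        If UBound(asPart) = 1 Then
            ' Get fills the string with as many bytes as it already holds.
            sFound = Space$(Len(asPart(1)))
            Get #iData, CLng(asPart(0)) + 1, sFound     ' +1: Get is one-based
            If sFound <> asPart(1) Then CountMismatches = CountMismatches + 1
        End If
    Loop

    Close #iLog
    Close #iData
End Function
```

A result of 0 means every listed permutation was found where the log says it is. It says nothing about permutations the log missed.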
The file is available at http://download.yousendit.com/F543F2913B9BA07F and it is in a RAR and is available in both compiled and source so you can see for yourself how it works. I have also attached RAR/ZIP version to this post
Randem, I am sure you'll enjoy being able to check the file results with an actual checker that works :-P
(Edit: Deleted the checker...they've been reposted with a minor change to allow for an offset value...randem apparently wants the first byte to be considered 0, <sarcastic>thanks for telling me</sarcastic> :-P)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
smUX: switch your output so that the first character is at position zero. VB's string functions are base one.
-
2 Attachment(s)
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Eh? Nothing in the original post suggested to use any base value...first byte = 1 as far as I am concerned :-P
I see what you are saying though, Randem's results would be right if I took that into account (Edit: Confirmed...I allowed for the offset and Randem's results were 100% correct)
So I've uploaded a modified checker which deals with this offset
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
I've now submitted my code as I didn't see anything I could improve in that version without writing an entirely new code, so I guess this'll be it for me now.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Same here...I can't see any way at all of improving the speed of my code...I have got it down to 1.5 seconds compiled with other stuff running (+0.125 seconds to sort and save the output file) and there *REALLY* isn't any easy way of improving on the code IMO, very little that can actually be done to boost speed :-)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Oh, there is a lot you could do ^_^
I wonder if the code I submitted is the shortest code that has been submitted; what I did was fairly simple.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
BTW: Offsets always start at zero, Positions start at one.
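In VB6 terms the conversion is a difference of one, as in this small sketch (the sample string and the Sub name are made up for illustration): InStr, Mid$ and Get all count from one, while an offset counts from zero.

```vb
Option Explicit

Public Sub DemoOffsetVersusPosition()
    Dim sData As String
    Dim lPosition As Long   ' one-based, as returned by InStr and used by Mid$ and Get
    Dim lOffset As Long     ' zero-based distance from the start of the data

    sData = "xxasdf"
    lPosition = InStr(1, sData, "asdf", vbBinaryCompare)    ' 3
    lOffset = lPosition - 1                                 ' 2

    Debug.Print "Position:"; lPosition; " Offset:"; lOffset
End Sub
```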
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
To help you guys out some, I will make a few comments in the following posts about code that could help you, but I will not use names or references.
1 - It would probably help speed things up if variables were declared with explicit types rather than all being Variants.
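A common way Variants creep in by accident is a Dim line with several names and only one As clause. The sketch below (names are illustrative only) shows the difference using TypeName.

```vb
Option Explicit

Public Sub DemoDeclarations()
    ' Only lTyped is a Long here; vUntyped has no As clause, so it is a Variant.
    Dim vUntyped, lTyped As Long

    ' Giving each variable its own type avoids accidental Variants.
    Dim lOffset As Long
    Dim lCount As Long

    Debug.Print TypeName(vUntyped)   ' Empty (an uninitialised Variant)
    Debug.Print TypeName(lTyped)     ' Long
    Debug.Print TypeName(lOffset)    ' Long
    Debug.Print TypeName(lCount)     ' Long
End Sub
```

Option Explicit forces every variable to be declared, but it does not force a type, so it will not flag vUntyped.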
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
It probably wasn't too clear when I mentioned the 64kb boundaries: the file is to be read and processed in 64kb chunks so that the program can handle any size file (I will change that, my bad). With the final file you may not be able to read the whole file into memory.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
That effectively forces us to include the file loading in the timing :)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
All things being equal yes. But everyone will have exactly the same issue.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Yeah, although it doesn't matter much: it only cost me 5 ms or so; I can throw a guess that depending on how others have made their programs, they'll get more timeloss.
Fixing my program wasn't really a problem at all thanks to the way I solved the original problem :)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Same here, I think I can recode my prog to work within a 64k boundary easy enough...however, a boundary like that *will* slow things down :-P
Edit: Interestingly enough, the boundary *GREATLY* sped things up although at the moment the code refuses to work properly. I am getting the right number of permutations but the actual offset values don't seem to be right for some reason and I've no idea why...but as soon as I get the code sorted, it looks good...takes under 0.7 seconds to do the calculations, which is less than half the time it took before :-)
Edit again: Well, it's working 100% now (had to mess about with offset data for each successive block I load in...eventually got it right :-)) and I have written it so the blocksize is variable (you can choose the blocksize to load in each time) and this actually *greatly* improves speed. Four 16k blocks are actually faster than one 64k block, and I've seen speeds of between 500-600ms overall (without closing programs and without using realtime priority)...using realtime priority I got it below 300ms :-)
(all times in this second edit are *without* saving...add 125ms for saving)
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Code:
The following errors occurred when this message was submitted:
1. smUX has chosen not to receive private messages or may not be allowed to receive private messages. Therefore you may not send your message to him/her.
Because I can't contact in other ways, I put it here.
Edit
Then when coming back I notice he is banned.
-
Re: VB6: Best way to find all permutations of a given string in a file - Contest
Yeah, I got that too. Does the word "Banned" under his name have anything to do with this??? Geesh! What happened there???