|
-
Apr 30th, 2005, 04:50 PM
#1
Thread Starter
Hyperactive Member
Unscrambling Scrambled Words
Basically, my program is a way you can unscramble scrambled words. For example, words that are in the dictionary, "paphy" could be "happy". This is more of an "idea" optimization and not code... I do have the code, and if you need, I can let you see, but I just think it's not the code that's slowing it down, but rather how I do it.
By the way, it's really slow on words like 5 characters let's say... It takes like a minute seconds. But, for larger words, it takes like either a LONG time like five minutes or freezes and goes "Not Responding". I'd like to get ALL words to 30 seconds or less.
So, what I do is this:
- You type the scrambled word in the textbox, "paphy".
- It sees how long "paphy" is, which is 5 characters.
- It goes through a text list of a dictionary and removes ALL words that are NOT 5 characters.
- Then, it goes through EACH word of the "stripped down" dictionary list and checks for EACH word if EACH charcter of "paphy" matches with one of the words. It does this for EACH word on the list...
I think it's mainly the last step which causes it to be slow, but I can't think of any way how I could change it. Now obviously it's not complete because using my "algorithim" there could be another word that has an "a p h y" in it, but only one "p", for example. This is basically an ALPHA or Pre-BETA version for my program, so that's why I havn't completed it yet.
Can anyone think of a better way to do this? And I don't think this really matters, because it's just an idea, but this is being done in VB6... that might be a little bit why it is slow. Then again, I'm sure VB6 can handle it. I will be porting this to C/C++ soon, so that should help.
Thanks in advance and sorry for typing so much!
Last edited by alacritous; Apr 30th, 2005 at 04:55 PM.
-
Apr 30th, 2005, 04:58 PM
#2
Re: Unscrambling Scrambled Words
-
Apr 30th, 2005, 05:41 PM
#3
Thread Starter
Hyperactive Member
Re: Unscrambling Scrambled Words
I don't really understand... How would I implement that into working with words?
And even then, in general, I don't really get it... I'll keep re-reading it, but maybe your words can help me understand it better.
I also just though of doing multi tasks at once... I forgot who here set up a whole thing in VB to do it, but I'll search for it if I decide to try that. Multi threading I believe it is.
-
Apr 30th, 2005, 06:07 PM
#4
Re: Unscrambling Scrambled Words
There was VB code in there in case you missed it. It's on the bottom.
-
Apr 30th, 2005, 07:09 PM
#5
Re: Unscrambling Scrambled Words
I would put all the words into a database and let the database engine do the work of finding suitable words - it's what it was designed for, and will probably be more efficient than anything you could do yourself (depending on the database you can use).
You wouldn't need to remove innapropriate words, as you wouldn't even see them. The SQL to get the data would be something like this:
Code:
Select [word]
From [dictionary]
Where len([word]) = 5
AND [word] like '%p%p%'
AND [word] like '%a%'
AND [word] like '%h%'
AND [word] like '%y%'
you picked a good example word, as it highlights the issue of repeat letters - hence the first like clause having two P's in it.
No matter what method you use, I dont think multi-threading would help much (unless you have multiple processors) as you still need to do the same amount work. ps: WokaWidget has some great examples on this site for multi-threading if you want to try it out
-
Apr 30th, 2005, 07:10 PM
#6
Re: Unscrambling Scrambled Words
This guy gives you a dictionary, but no source code.
Perhaps you could get a few ideas?
http://software.magneticpole.com/words/
-
Apr 30th, 2005, 07:21 PM
#7
Thread Starter
Hyperactive Member
Re: Unscrambling Scrambled Words
Forget this...
http://www.ssynth.co.uk/~gay/anagram.html
Way better than mine... Owell at least I know mine works, just slow, but with my own algorithim! heh
On to another idea... aragh
-
May 2nd, 2005, 04:43 PM
#8
Thread Starter
Hyperactive Member
Re: Unscrambling Scrambled Words
I changed my idea and now I'm going to make it for Handhelds (Palm Pilots)... You guys probably don't care, and the thread is kind of dead... but yeah I thought I'd mention that.
-
May 3rd, 2005, 01:39 PM
#9
Re: Unscrambling Scrambled Words
Cool, my idea of a Database is probably still valid, it is for PocketPC's as they have cut down versions of Access(pre-installed) and SQL Server.
-
Oct 23rd, 2005, 08:26 AM
#10
Fanatic Member
Re: Unscrambling Scrambled Words
You could use DoEvents to prevent it from freezing up
-
Oct 23rd, 2005, 02:44 PM
#11
Re: Unscrambling Scrambled Words
For the original idea you were describing, I would do something like this:
- sort characters into bytewise order (happy -> ahppy, paphy -> ahppy...)
- go through the dictionary list: first check the length and then sort the characters and compare
- if the comparison is valid, you're all done and happy; otherwise go on until you've gone through the whole dictionary
To make it faster, you could precalculate the results for all dictionary items. Then you could do just the comparison part, which is very fast if you know what to use to do the comparison. In string mode the fastest would probably be:
VB Code:
If LenB(strKeyword) = LenB(strDictionaryItem) Then
If InStr(1, strKeyWord, strDictionaryItem, vbBinaryCompare) = 1 Then
' a match!
End If
End If
Well, atleast the length check is important, I'm not sure if InStr can beat the normal strKeyword = strDictionaryItem.
And using a database would be very fast as databases are optimized for stuff like this. With two tables you can do it like this:
TABLE1: ID, Keyword
ID is autoincreasing number, keyword is the presorted string (for example, ahppy)
TABLE2: ID, SortID, Word
ID is autoincreasing number, SortID is the ID pointer to matching TABLE1 item, word is the actual word (ie. happy)
Then you just look for a matching item in TABLE1, get the ID and get results from the TABLE2. Since database stuff is optimized for querying, it would be really fast. This database structure would also save memory compared to one table solution.
-
Oct 24th, 2005, 09:58 AM
#12
Re: Unscrambling Scrambled Words
One technique is to assign an irrational number to each letter of the alphabet, then multiply all the letters in the word together. Then just search the dictionary for words with an identical number. Its a bit like hashing really but you have to pick the right numbers. I did this for a short dictionary about 5 years ago and the performance was excellent. If you sort your dictionary in numerical order than you'll barely have any overhead at all. It certainly won't give you any noticible delay.
I don't live here any more.
-
Oct 24th, 2005, 10:05 AM
#13
Re: Unscrambling Scrambled Words
For example
A = 0.1577
B = 0.9754
C = 0.8546
D = 0.9748
E = 0.3485
F = 0.6475
...
...
Therefore "FACE" = 0.0304114
Search your dictionary for words with the same number (store the number instead of calculating it on the fly). And you should get the following results:
FACE
CAFE
Because they both have the number 0.0304114 attached to them.
I don't live here any more.
-
Oct 29th, 2005, 12:05 AM
#14
Fanatic Member
Re: Unscrambling Scrambled Words
Another way of doing it, with some memory overhead, is keep your dictionary, but for every word you also attach to it: the letters of the word sorted.
I.e. for FACE you link to ACEF, same as CAFE => ACEF. Naturally, two anagrams will have the same linked word.
Then, you can sort your *whole* dictionary by these sorted words, and perform the binary search on these.
sql_lall 
-
Oct 29th, 2005, 02:04 AM
#15
Re: Unscrambling Scrambled Words
Already suggested by me only three posts above.
-
Oct 29th, 2005, 04:47 AM
#16
-
Oct 29th, 2005, 05:47 AM
#17
Re: Unscrambling Scrambled Words
Actually, exactly! 
I don't understand O(n) and O(log n) as mathematical theories aren't my thing. But as far as I can see, you suggested the same thing only adding the sorting of the dictionary so I don't see why to replicate the same idea all over again introducing it as if it were a completely new suggestion (by "Another way of doing it"): the only new thing was the fact about sorting. Which is a rather basic method to speed this kind of thing up and as an idea comes quite automatically to about anyone (or then I've done too much optimization stuff already so it has become too much of a selfexplanary thing).
Of course then we can go even faster and more optimized: sort primarily by length and then alphabetically by string keyword. And then store a precalculated index for certain length and then for a certain starting letter. That would speed things much more as you could get the starting search point closer to the dictionary results. If this is what you were after with your message, it really didn't shine out of it.
Continuing on topic for more... speedwise, I don't know how this would compare to wossname's "hashing" style. It doesn't sound completely reliable to me, but that is as I don't understand how it really works in practise without trying to do it first; or without reading much more about irrational numbers and seeing more practical samples. Sometimes it would be great if English were my native language so I could figure out complex explanations faster and without relying to a dictionary and Google searches for the words I can't find from the dictionary.
-
Oct 29th, 2005, 05:56 AM
#18
Re: Unscrambling Scrambled Words
This and a PDA would make you own at Scrabble.
Bill
-
Oct 29th, 2005, 08:05 AM
#19
Fanatic Member
Re: Unscrambling Scrambled Words
O(n) means that if you have n words, you'll have some constant factor * n things you have to check. O(log n) means that if you have n words, you'll have some constant factor * log(n) things to check. If you have a dictionary with 1,000,000 words, O(n) means about 1 million checks, whereas O(log n) is around 20 
O(log n) is possible if you apply the algorithm of *Binary Search* to scan through a sorted list. So long as it is sorted (i.e. don't even need to take length into account, just use any consistent sort), you can binary search for an O(log n) retrieval.
Yes, there are other optimisations - one good one would be a weighted binary search. I.e. once the letters are sorted, more will start with 'a' than with 'b', and more with 'b' than with 'c' etc..., so you can speed up your binary search a tiny bit, but it'll only be the difference between 20 operations and about 16 or 17, so possibly not worth it. The key is not looking through every entry in the dictionary.
wossname's hash is simply a numerical method of the above procedure. Instead of using identifying strings, you use floating-point numbers. This'll be slightly faster because:
a) You don't have to sort the letters of each word, and
b) You're dealing with a single number, not a (longer) string
Again, once you get the real number hashes, you sort them and use binary search for retrieval.
The downsides (there are always some) are:
a) Floating point numbers in computers are limited by precision.
b) It's not a perfect hash (I don't think, not with computer constraints on the numbers) although the collisions will be extremely infrequent, if at all.
c) You can't use a weighted binary search as easily.
sql_lall 
-
Nov 10th, 2005, 05:15 AM
#20
Re: Unscrambling Scrambled Words
You don't need to rely on weighted binary search anyway, say you've got 100,000 words, the most comparisons you'll ever make is seventeen.
Weighted binary search will not be of any benefit.
I don't live here any more.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|