Feb 22nd, 2010, 01:11 PM
#6
Thread Starter
Fanatic Member
Re: String metrics performance
Thanks for the answer, Shaggy.
I know that string comparisons are among the slowest operations, but I think it should be possible to do what I want in a matter of minutes. I don't mind waiting 15 minutes if it saves me hours in the end.
To answer your first point: I first load the data from a simple SELECT DISTINCT query into a ListBox. Then, for each entry in the list box, I normalize the company name (I strip accents, punctuation, etc.) and add it to an ArrayList, which I think is fast enough. What really takes a long time is the next step, which is the comparison using the algorithm.
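The normalization step described above could look roughly like the sketch below. It is written in Python rather than .NET purely for illustration, and the function name `normalize_name`, the choice to uppercase, and the choice to replace punctuation with spaces are all assumptions, not details from the post.

```python
import string
import unicodedata


def normalize_name(name: str) -> str:
    """Strip accents and punctuation from a company name (illustrative only)."""
    # Decompose accented characters (e.g. "é" becomes "e" + combining accent),
    # then drop the combining marks so only the base letters remain.
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: punctuation becomes a space so "S.A." does not fuse into "SA".
    no_punct = "".join(" " if ch in string.punctuation else ch for ch in no_accents)
    # Collapse runs of whitespace and uppercase for case-insensitive comparison.
    return " ".join(no_punct.split()).upper()


if __name__ == "__main__":
    print(normalize_name("Société Générale, S.A."))
```

Doing this once per name, before any comparison, keeps the expensive pairwise step working on already-cleaned strings.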
For now, what I did to minimize the number of operations is to never compare the same two strings twice, so the first one is compared with the 24,999 others, the second one with the 24,998 others, and so on. So I basically cut the number of operations in half.
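Comparing each unordered pair only once gives n(n-1)/2 comparisons, which for 25,000 names is still about 312 million. The sketch below shows that loop in Python for illustration. The post does not say which string metric is used, so `difflib.SequenceMatcher` stands in for it here, and the names `pair_count`, `similar_pairs`, and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def similar_pairs(names, threshold=0.85):
    """Compare every unordered pair once and keep those above the threshold."""
    matches = []
    # combinations() yields (a, b) but never (b, a) or (a, a),
    # which is the "first with 24,999 others, second with 24,998" pattern.
    for a, b in combinations(names, 2):
        # Stand-in metric (assumption): ratio() returns a value in [0, 1].
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            matches.append((a, b, score))
    return matches


if __name__ == "__main__":
    print(pair_count(25000))
    print(similar_pairs(["ACME CORP", "ACME CORP.", "GLOBEX"]))
```

Halving the work helps, but the count still grows quadratically, so reducing which pairs are compared at all matters more than speeding up each comparison.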
What I thought I could do next is sort the strings alphabetically and compare each string only with the other strings that start with the same character. I might lose a couple of matches, but as these are all company names it shouldn't be too bad, since most of the time people don't make a typo on the first character.
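The first-character idea is a form of what is often called blocking: only names in the same group are compared. A minimal Python sketch follows, for illustration only; the names `block_by_first_char` and `candidate_pairs` are hypothetical, and it assumes the names have already been normalized.

```python
from collections import defaultdict
from itertools import combinations


def block_by_first_char(names):
    """Group names by their first character, in alphabetical order."""
    blocks = defaultdict(list)
    for name in sorted(names):
        if name:  # skip empty strings, which have no first character
            blocks[name[0]].append(name)
    return blocks


def candidate_pairs(names):
    """Yield only the pairs that share a first character."""
    for block in block_by_first_char(names).values():
        # Pairs are formed inside each block, never across blocks.
        yield from combinations(block, 2)


if __name__ == "__main__":
    names = ["ACME", "ACNE", "BETA", "BETTA", "GAMMA"]
    print(list(candidate_pairs(names)))
```

With five names the full pairwise pass would make 10 comparisons, while the blocked version makes 2. The trade-off is the one already noted: a typo in the first character puts two names in different blocks, so that match is missed.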
Thanks again.
Alex