Thanks for the answer, Shaggy.

I know that string comparisons are among the slowest operations, but I think it is possible to do what I want in a matter of minutes. I don't mind waiting 15 minutes if it saves me hours in the end.

To answer your first point: I first load the data from a simple SELECT DISTINCT query into a listbox. Then, for each entry in the listbox, I normalize the company name (I strip accents, punctuation, etc.) and add it to an ArrayList, which I think is fast enough. What really takes a long time is the next step, the comparison using the algorithm.
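
A minimal sketch of that normalization step, in Python purely for illustration (it is not my actual code). The name normalize_name is hypothetical, and the lowercasing and whitespace collapsing are assumptions standing in for the "etc.":

```python
import string
import unicodedata

def normalize_name(name):
    """Strip accents and punctuation from a company name (hypothetical helper)."""
    # Decompose accented characters (e.g. "é" becomes "e" + combining accent),
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace ASCII punctuation with spaces so "S.A." does not fuse into "SA".
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    no_punct = no_accents.translate(table)
    # Assumption: lowercase and collapse runs of whitespace.
    return " ".join(no_punct.lower().split())

normalized = [normalize_name(n) for n in ["Société Générale, S.A.", "ACME  Corp."]]
```

This is one pass over the list, so it stays cheap compared with the pairwise comparison that follows.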

For now, to minimize the number of operations, I never compare the same two strings twice: the first one is compared with the 24999 others, the second one with the remaining 24998, and so on. So I basically cut the number of operations in half.
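
To put a number on it, here is a small Python sketch of the "each pair only once" loop. The names count_pairs and find_matches are hypothetical, and is_match is a placeholder for whatever comparison algorithm is used; nothing here assumes a particular one:

```python
from itertools import combinations

def count_pairs(n):
    """Number of unordered pairs among n strings: (n-1) + (n-2) + ... + 1."""
    return n * (n - 1) // 2

def find_matches(names, is_match):
    """Compare every unordered pair exactly once.

    is_match is a placeholder callable (a, b) -> bool for the real algorithm.
    """
    return [(a, b) for a, b in combinations(names, 2) if is_match(a, b)]

total = count_pairs(25000)  # 312487500 comparisons
```

So even with the halving, 25000 names still mean roughly 312 million comparisons, which is why that step dominates the running time.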

What I thought I could do next is sort the strings alphabetically and compare each string only with the other strings that start with the same character. I might lose a couple of matches, but since these are all company names it shouldn't be too bad: most of the time, people don't make typos on the first character.
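
A rough Python sketch of that first-character idea, again only for illustration. The names group_by_first_char and candidate_pairs are hypothetical, and it assumes the names are already normalized. It groups with a dictionary instead of sorting, which gives the same candidate pairs:

```python
from collections import defaultdict
from itertools import combinations

def group_by_first_char(names):
    """Bucket names by their first character (empty names are skipped)."""
    groups = defaultdict(list)
    for name in names:
        if name:
            groups[name[0]].append(name)
    return groups

def candidate_pairs(names):
    """Yield each unordered pair once, but only within the same bucket."""
    for _, group in sorted(group_by_first_char(names).items()):
        yield from combinations(group, 2)

pairs = list(candidate_pairs(["acme", "acne", "bolt", "apex"]))
```

Here "bolt" is never compared with the three "a" names, so four strings produce 3 candidate pairs instead of 6. The saving on the real data depends on how evenly the names spread across first characters.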

Thanks again.