-
Sep 4th, 2009, 08:15 AM
#1
Thread Starter
Member
Checking for similar duplicates
Hi,
I have a web site where users enter company names to use in the rest of the app. However, recently I've noticed that similar duplicates are appearing: someone will enter "EastTec Solicitors", another will enter "EastTec Solicitors Ltd", and someone else will enter "EastTec Solictors" (missed the second i out in Solicitors), when there should be only one entry, "EastTec Solicitors". What is the best way of checking the database for entries similar to what they have entered? How would you go about checking for spelling mistakes as well, like the Solicitors one?
Cheers.
-
Sep 4th, 2009, 08:50 AM
#2
Re: Checking for similar duplicates
Don't use a textbox.
Seriously, this is an issue that comes up all the time in DB programming. If you have a text field where somebody CAN add abbreviations, typos, alternate spellings, etc., then they will. You might be able to identify the bulk of the errant strings using RegEx or Contains (Contains EastTec might work in your example), but you won't get them all. Therefore, the only certain way to handle this is to design the program in such a way that the user has to select from a list. If there is no such list, then your problem becomes vastly more difficult, as you pretty much have to accept the fact that there will be duplicates.
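To make the Contains/RegEx idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the helper names `normalize` and `contains_match` are hypothetical, and the suffix list (Ltd, Limited, LLP, PLC) is just a guess at what you might want to ignore. It strips those suffixes and punctuation, then checks whether one name contains the other.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
SUFFIXES = re.compile(r"\b(ltd|limited|llp|plc)\b\.?", re.IGNORECASE)

def normalize(name: str) -> str:
    """Lower-case, drop legal suffixes and punctuation, collapse whitespace."""
    name = SUFFIXES.sub(" ", name.lower())
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    return " ".join(name.split())

def contains_match(new_name: str, existing: list[str]) -> list[str]:
    """Return existing names where one normalized name contains the other."""
    key = normalize(new_name)
    hits = []
    for candidate in existing:
        other = normalize(candidate)
        if key and other and (key in other or other in key):
            hits.append(candidate)
    return hits
```

With "EastTec Solicitors" already stored, this flags "EastTec Solicitors Ltd" but not "EastTec Solictors", which is the kind of miss to expect from a pure substring check.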
My usual boring signature: Nothing
-
Sep 4th, 2009, 09:29 AM
#3
Thread Starter
Member
Re: Checking for similar duplicates
I've got to use a textbox, as certain users can add to the list. They should be checking for themselves anyway, but you know what users are like...
-
Sep 4th, 2009, 11:01 AM
#4
Re: Checking for similar duplicates
You can try my fuzzy logic class. This is what it was designed for.
http://www.vbforums.com/showthread.php?t=540094
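If you just want to see the general idea first, here is a rough Python sketch of one common fuzzy-matching technique, Levenshtein edit distance. This is not the class from the link above, and the function names `edit_distance` and `similarity` are made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]

def similarity(a: str, b: str) -> float:
    """Scale the distance to a 0.0 .. 1.0 score (1.0 = identical), ignoring case."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

"Solicitors" and "Solictors" are one edit apart, so the two company names score well above 0.9. Where to set the threshold for "probably the same company" is a judgment call you would have to tune against your own data.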
-
Sep 4th, 2009, 02:37 PM
#5
Re: Checking for similar duplicates
I'd advise you to put a listbox/combobox with all the company names filled in, so the user can select from that list. Add an additional item, "Others...", as the last entry in the list, and if the user selects it, show a textbox where they can enter whatever they want. You may need a bit of JavaScript for this, but I assure you it's worth the effort.
-
Sep 4th, 2009, 02:54 PM
#6
Re: Checking for similar duplicates
I would like to emphasize what Pradeep suggested.
I've had a similar situation to what you are describing, where people have to pick locations. Some locations may have been used before, while others would be totally new. As you might imagine, no situation is worse for spelling than locations, as people abbreviate directions (S, S., South, etc.), names, and suffixes (rd, Rd, Road, etc.). Since I was working with streams, I could have S. Fork, South Fork, S Fk, SFk, S. Fk, and MANY MANY others.
Validating this would be a total and utter nightmare. Therefore, I gave the user a list of all the known locations, but gave them an option to add a new location if it wasn't already on the list. In my case, if they gave me a new location, I was alerted to the fact and could check it. Since this only happened a couple of times a year, and since I had to add a bunch of other information for any new location, notification made perfect sense. In your case, having you approve/alter/reject every typed-in item might be unreasonable, but the key point is still the same: to the greatest extent possible, don't let users type in text for fields that might be searchable! The only fields I let people type into without some kind of oversight mechanism are comment fields. Other than that, I go out of my way to guide their entry.
-
Sep 5th, 2009, 10:47 AM
#7
Thread Starter
Member
Re: Checking for similar duplicates
Cheers for your help. I already do what Pradeep says, i.e. there is a combobox with all the company names listed and an option to add another if the one they want is not there. The users still add duplicates, though; I think they just can't be bothered to look through the list. I might give that fuzzy logic class from wild_bill a go. I just want to check whether similar entries are already there and show them to the user, so they'll see that the company is already on the list.
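Something along these lines is what I have in mind, sketched in Python with the standard library's `difflib.get_close_matches`. The function name `similar_companies` and the 0.8 cutoff are my own placeholders, not anything from the fuzzy logic class.

```python
from difflib import get_close_matches

def similar_companies(new_name, existing, cutoff=0.8, limit=5):
    """Return up to `limit` existing names that closely resemble `new_name`."""
    # Compare case-insensitively, but return the names as they are stored.
    by_key = {}
    for name in existing:
        by_key.setdefault(name.strip().lower(), name)
    keys = get_close_matches(
        new_name.strip().lower(), list(by_key), n=limit, cutoff=cutoff
    )
    return [by_key[key] for key in keys]
```

The idea would be to call this when the user picks the "add another" option: if it returns anything, show those names and ask "did you mean one of these?" before saving the new entry.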
-
Sep 5th, 2009, 06:35 PM
#8
Re: Checking for similar duplicates
There is an old saying: "Against stupidity, the gods themselves contend in vain."
Laziness could be substituted for stupidity.