Postal Address Comparison
I’m looking for ideas on analyzing addresses to determine if they are the same so that I can combine groups of individuals with the same address together and assign them a single “household id”. As you can imagine with all of the variations of address data there are several possibilities for each one, such as Drive, Dr, etc which make this interesting.
I was thinking that breaking the address into parts and then comparing them might be the way to go. Such as the street number, street name, city, state, and zip then omit parts of the address such as drive, blvd, rd, etc. Basically I’d be splitting the address string on spaces. There are still problems with this though because of possible formatting differences such as multiple word street names and prefixes like North, N, etc.
If this sounds like a good method what would be a reliable way to accomplish this?
Thanks,
Brad
Re: Postal Address Comparison
If you have the ability I would recommend correctly formatting your data on insert to ensure consistency. If a user entered Dr for example you could prompt the user to confirm that Dr is short for Drive.
This will allow you to more rapidly query your data.
Good Luck!
Re: Postal Address Comparison
Unfortunately the data is input using an application which we can’t edit/update. :( There are currently about 30,000 addresses in the database so going back and standardizing them would be an absolute last resort.
Thanks for the input.
Re: Postal Address Comparison
Just start writing All the cases out in if statements.. heres a sample...
VB Code:
Dim Address As String
Address = ("1234 E. My Dr.") & " " 'space added to make end have a space, so we
'can search for strings of " Dr " or " Dr. " and not pull up any streets that start with "Dr"
If InStr(Address, " Dr. ") Then
Address = Strings.Replace(Address, " Dr. ", " Drive ")
MsgBox(Address)
End If
If InStr(Address, " Dr ") Then
Address = Strings.Replace(Address, " Dr ", " Drive ")
MsgBox(Address)
End If
If InStr(Address, " E ") Then
Address = Strings.Replace(Address, " E ", " East ")
MsgBox(Address)
End If
If InStr(Address, " E. ") Then
Address = Strings.Replace(Address, " E. ", " East ")
MsgBox(Address)
End If
Whether " E. " or " E " is entered, or whether " Dr " or " Dr. " is present, the address will come out as "1234 East My Drive" ... now its just a matter of writing out the rest of the cases...
msgboxes were just used so you can paste the code, and see the changes as it happened when you ran the example...