Results 1 to 12 of 12

Thread: Compare Database

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Sep 2002
    Location
    Edmonton, Alberta Canada
    Posts
    192

    Compare Database

    Not sure if this is the right place, but I'm pretty stuck. I have 2 pretty large list ... like 1 million plus records & 10 Million plus.

    One of them has a bit older data but I want to compare it to the new one and confirm the data.

    The new one contains the 10 Million records.

    Would importing into a both into a DB and using a VB compare string take forever?

    Is there an efficient way?

  2. #2
    Software Carpenter dee-u's Avatar
    Join Date
    Feb 2005
    Location
    Pinas
    Posts
    11,127

    Re: Compare Database

    What comparison do you need? ID, Name?
    Regards,


    As a gesture of gratitude please consider rating helpful posts. c",)

    Some stuffs: Mouse Hotkey | Compress file using SQL Server! | WPF - Rounded Combobox | WPF - Notify Icon and Balloon | NetVerser - a WPF chatting system

  3. #3
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974

    Re: Compare Database

    And very importantly - what database system are they currently stored in? Are they on the same server?

    Using VB to compare is likely to be extremely slow in comparison to the other options which are available.

  4. #4

    Thread Starter
    Addicted Member
    Join Date
    Sep 2002
    Location
    Edmonton, Alberta Canada
    Posts
    192

    Re: Compare Database

    I'm not using a specific database right now. I am aware that access is probably not powerfull enough for this so I was thinking mySQL.

    Same Server?.... No just on my PC. Are you saying this will not be powerfull enough?

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Sep 2002
    Location
    Edmonton, Alberta Canada
    Posts
    192

    Re: Compare Database

    Comparison is full address: #, Street, City/Town, Zip

  6. #6
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974

    Re: Compare Database

    How is the data currently stored?

  7. #7

    Thread Starter
    Addicted Member
    Join Date
    Sep 2002
    Location
    Edmonton, Alberta Canada
    Posts
    192

    Re: Compare Database

    Currently it is just in a csv file. I was going to import into Access or MySQL.

  8. #8
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974

    Re: Compare Database

    Ok, I haven't used it before but from other peoples comments I would recommend importing into MySQL, and use SQL statements to do a comparison, eg:

    Matching addresses:
    Code:
    SELECT table1.[#], table1.Street, table1.[City/Town], table1.Zip
    FROM table1
    INNER JOIN table2
    WHERE table1.[#] = table2.[#]
    AND table1.Street = table2.Street
    AND table1.[City/Town] = table1.[City/Town]
    AND table1.Zip = table2.Zip
    Non-matching addresses:
    Code:
    SELECT table1.[#], table1.Street, table1.[City/Town], table1.Zip
    FROM table1
    WHERE Not Exists(
        SELECT Street 
        FROM table2
        WHERE table1.[#] = table2.[#]
        AND table1.Street = table2.Street
        AND table1.[City/Town] = table1.[City/Town]
        AND table1.Zip = table2.Zip )
    Note that the field names of "#" and "City/Town" probably wont be valid (or will cause issues), so I would recommend changing them.

  9. #9
    Fanatic Member kaffenils's Avatar
    Join Date
    Apr 2004
    Location
    Norway
    Posts
    946

    Re: Compare Database

    SQL Server has a function called BINARY_CHECKSUM. It calculates a checksum for a row, and can be used to detect changes.

    I don't know if mySQL has a similar function, but if there is you could use it to create a checksum for each row in the two tables and compare the checksum value for each matching primary key in the two tables.
    So instead of matching each column in the query (sloooooooooooow), you only match an integer value (muuuuuuch faster ).

  10. #10

    Thread Starter
    Addicted Member
    Join Date
    Sep 2002
    Location
    Edmonton, Alberta Canada
    Posts
    192

    Re: Compare Database

    Ok - I think this is beyond me. Is there anyone or anywhere I can get a quote from on doing the following:

    List 1: Approx. 1.3 Million Addresses
    List 2: Approx. 10 - 15 Million Addresses

    Note: List 1 is an outdated marketing list. List 2 is current address information.

    I need to scrub List 1 against List 2 to verify that the addresses (Suite, Street No., Street Name, City, State, Zip) are still correct.

    I would like the results to show which addresses have been confirmed on list 2.

    Thanks for any guidance!

  11. #11
    Fanatic Member
    Join Date
    Jan 2005
    Location
    In front of this pc.
    Posts
    580

    Re: Compare Database

    This kinda points you in another direction but it sounds to me like something for which Perl was designed to handle and it would probably run in about the same amount of time that it would take to import list 2 into a db.
    What bug? That's not a bug. It's an undocumented feature!

  12. #12
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: Compare Database

    With the additional information you have given...

    Put the 10 million rows (new address file) into a SQL database. You need all kinds of indexes on key columns - like city and street.

    The reason you put these into a database like SQL is so that you can find them quickly. It doesn't much matter if it's MS SQL, mySQL or probably even ACCESS (although that would certainly be my last choice).

    Now the matching part - where you take the 1.3 million addresses from the "old" marketing list. There is no reason for them to ever go into SQL. You want to process them in VB or some other language - grab a row from that CSV text file and use ADO to find the matching row in the DB. You might have to make more then one hit on the DB to find a match - being more or less specific as you go through it.

    I see no real reason to load the 1.2 million addresses into a DB...

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width