Results 1 to 11 of 11

Thread: Fastest way to compare two datatables?

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Jan 2013
    Posts
    144

    Fastest way to compare two datatables?

    I have two datatables. Each datatable contains 125 columns and about 30,000 rows. I currently am using the following method to compare them for differences:

    01. DT1TableAdapter.Fill(DataSet1.DT1)
    02. DT2TableAdapter.Fill(DataSet1.DT2)
    03. BindingSource1.MoveFirst
    04. BindingSource2.MoveFirst
    05. Compare current row in BindingSource1 with current row in BindingSource2
    06. If there are differences add the current two rows to a temporary datatable.
    07. BindingSource2.MoveNext
    08. Goto step 5 until end of BindingSource2 rows.
    09. BindingSource1.MoveNext
    10. Goto step 5 until end of BindingSource1 rows.
    11. Print a report contains the different rows. (The report has the temporary datatable as a datasource.)
    12. End

    I have a fast PC and this method takes about 17 min to finish! Is that OK? if not, what is the fastest method to achieve the same goal?

    Thanks

  2. #2
    PowerPoster SJWhiteley's Avatar
    Join Date
    Feb 2009
    Location
    South of the Mason-Dixon Line
    Posts
    2,256

    Re: Fastest way to compare two datatables?

    Not knowing what the data is then there's no other way.

    However, data tables are not the best objects to perform a comparison on (and, again, not knowing your comparison rules, it's difficult to come up with a faster way). You probably want to convert to fundamental objects - integers, etc. - and perform a comparison that way. Or, rather, don't fill a table adapter, or even a dataset (but it depends where your data comes from).
    "Ok, my response to that is pending a Google search" - Bucky Katt.
    "There are two types of people in the world: Those who can extrapolate from incomplete data sets." - Unk.
    "Before you can 'think outside the box' you need to understand where the box is."

  3. #3
    PowerPoster
    Join Date
    Mar 2002
    Location
    UK
    Posts
    4,780

    Re: Fastest way to compare two datatables?

    Maybe you should show us your actual code. It could be that you are looping through every column in the row for failure, and not stopping that row early when you find the first failure. Maybe you are testing every row in 1 table against every row in the other table instead of row 1 vs 1 and then 2 vs 2 etc.

    Also, why are you wrapping the datatable in a tableadapter, you can just use the datatable in your comparisons. I wrote a compare app to find duplicates and differences between 2 datatables, however my data was only checking a set number of columns to find a unique value. I could use sorting and add a key to better filter the data for speed. This is a data-aware scenario that SJW describes.

  4. #4
    Frenzied Member
    Join Date
    May 2014
    Location
    Central Europe
    Posts
    1,372

    Re: Fastest way to compare two datatables?

    it could also be possible that your requirement can completely be done on the database without loading all the data into datatables at all. but who knows... give us some more info.

  5. #5
    Frenzied Member Bulldog's Avatar
    Join Date
    Jun 2005
    Location
    South UK
    Posts
    1,950

    Re: Fastest way to compare two datatables?

    I've used the CompareDataTables in this post and it seems fast, but then again I only have small'ish tables (32x20). http://www.vbforums.com/showthread.p...rn-differences


    • If my post helped you, please Rate it
    • If your problem is solved please also mark the thread resolved

    I use VS2015 (unless otherwise stated).
    _________________________________________________________________________________
    B.Sc(Hons), AUS.P, C.Eng, MIET, MIEEE, MBCS / MCSE+Sec, MCSA+Sec, MCP, A+, Net+, Sec+, MCIWD, CIWP, CIWA
    I wrote my very first program in 1979, using machine code on a mechanical Olivetti teletype connected to an 8-bit, 78 instruction, 1MHz, Motorola 6800 multi-user system with 2k of memory. Using Windows, I dont think my situation has improved.

  6. #6
    PowerPoster i00's Avatar
    Join Date
    Mar 2002
    Location
    1/2 way accross the galaxy.. and then some
    Posts
    2,388

    Re: Fastest way to compare two datatables?

    If the column names vary a lot you could compare these first without loading any data ... and then compare the data for columns that are only in both - eliminating some of the data you need to load and compare?

    ... would reduce memory too!

    Kris

  7. #7

    Thread Starter
    Addicted Member
    Join Date
    Jan 2013
    Posts
    144

    Re: Fastest way to compare two datatables?

    Here's my code:
    Code:
    Dim myCON As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\myDB.mdb;Jet OLEDB:Database Password=xxxxxxxxxx"
    
    DT1TableAdapter.Connection.ConnectionString = myCON
    DT1TableAdapter.Connection.Open()
    DT2TableAdapter.Connection.ConnectionString = myCON
    DT2TableAdapter.Connection.Open()
    
    DT1TableAdapter.Fill(DataSet1.DT1)
    DT2TableAdapter.Fill(DataSet1.DT2)
    
    DT1BindingSource.MoveFirst()
    DT2BindingSource.MoveFirst()
    
    For i = 0 To DT1BindingSource.Count – 1
    
        C_ID1 = DT1BindingSource.Current("ID")                
        C_AC1 = DT1BindingSource.Current("AC")
    
        DT2BindingSource.Filter = "ID=" & "'" & C_ID1 & "'"
    
        C_AC2 = DT2BindingSource.Current("AC")
    
        If C_AC1 <> C_AC2 Then
           Dim DRV As DataRow = DirectCast(DT1BindingSource.Item(i), DataRowView).Row
           DataSet1.TempDT.Rows.Add(DRV.ItemArray)
        End If
        DT1BindingSource.MoveNext()
    Next
    End If
    … load the report (TempDT as datasource)
    Please note that I use 'AC' column as a comparison factor because if it is different then all other columns are different as this column represents the total of all other columns' calculations. Also, please note that I need to print all the columns in the report and that's why I load them all with the TableAdapter.Fill

    Thanks
    Last edited by cPubis; Apr 10th, 2015 at 08:17 AM.

  8. #8
    PowerPoster
    Join Date
    Feb 2012
    Location
    West Virginia
    Posts
    14,205

    Re: Fastest way to compare two datatables?

    Sounds like you would be better off using a query that filter out those where the AC value matches. That should be a lot faster than a nested loop approach

  9. #9

    Thread Starter
    Addicted Member
    Join Date
    Jan 2013
    Posts
    144

    Re: Fastest way to compare two datatables?

    Quote Originally Posted by DataMiser View Post
    Sounds like you would be better off using a query that filter out those where the AC value matches. That should be a lot faster than a nested loop approach
    Sorry, I forgot to mention that DT1 and DT2 are in-memory datatables. They are a result of complex join queries of three access tables each. So if you suggest using datareaders for the query to create the temp datatable (which is required for the report) then I'm sorry to tell you that it wont work because a datareader cannot be used to fill an in-memory datatables with data it reads from another in-memory datatable! In this comparison I'm tied by too many factors I must take care of otherwise I wont reach my goal which is printing a report with the differences. I mean, if I achieved much better performance but couldn't be able to print the report then it's a fail! Thanks.

  10. #10
    Still learning kebo's Avatar
    Join Date
    Apr 2004
    Location
    Gardnerville,nv
    Posts
    3,757

    Re: Fastest way to compare two datatables?

    If you have the tables in memory, the you can filter them using the DataView Class
    Process control doesn't give you good quality, it gives you consistent quality.
    Good quality comes from consistently doing the right things.

    Vague general questions have vague general answers.
    A $100 donation is required for me to help you if you PM me asking for help. Instructions for donating to one of our local charities will be provided.

    ______________________________
    Last edited by kebo : Now. Reason: superfluous typo's

  11. #11
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: Fastest way to compare two datatables?

    I really do not understand why you are using a BindingSource to iterate through/search the DataTables. It may not matter much, but I would suspect that that is a slower way to access the information than directly accessing the table rows themselves.

    Here is an example of what I mean.

    Code:
    Sub test()
       Dim t1 As New DemoTable
       Dim t2 As New DemoTable
       Dim diffs As DataTable = t1.Clone() ' set table structure
       t2.Rows(1)("AC") = 989
       t2.Rows(3)("AC") = 900
    
       For Each row As DataRow In t1.Rows
          Dim matches As DataRow() = t2.Select(String.Concat("[ID]='", row("ID").ToString, "'"))
          If matches.Length > 0 AndAlso _
             row.Field(Of Int32)("AC") <> matches(0).Field(Of Int32)("AC") Then
             diffs.Rows.Add(matches(0).ItemArray) 
          End If
       Next
    End Sub
    
    Friend Class DemoTable
       Inherits DataTable
       Public Sub New()
          With Me.Columns
             .Add("ID", GetType(String))
             .Add("AC", GetType(Int32))
          End With
          With Me.Rows
             .Add("fred", 23)
             .Add("bill", 55)
             .Add("joe", 312)
             .Add("sue", 1100)
          End With
       End Sub
    End Class

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width