[RESOLVED] VB6 / Access - Removing Duplicate Records-VBForums
Results 1 to 6 of 6

Thread: [RESOLVED] VB6 / Access - Removing Duplicate Records

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Jul 2006
    Location
    Maldon, Essex. UK
    Posts
    6,334

    Resolved [RESOLVED] VB6 / Access - Removing Duplicate Records

    Ok, I'm no Database Expert.... I assume I'm in the correct forum.....

    I have a Table with about 41,000 records (and growing). The Table has about 40 Columns. It is automatically updated (by a third party) via SMS messages. Occasionally Duplicate Messages are received, where a 'Duplicate Message' is defined by one where all columns except 4 are identical to another record.

    My task is to remove the duplicate records and somehow inform the caller how many duplicates were deleted.

    The approach I have taken is to create a copy of the original recordset, sort the copy into a suitable order (by Columns "PreviousMessageDate" and "SerialNo"), and then check each record aginst the next record and delete it if it matches. Although slow, this seems to work. The problem I have is in trying to report the number of duplicates deleted per 'SerialNo'. I have tried Appending a new Field ("Duplicates") in the Recordset, but in order to do that I have to Close it, and having Closed it I can't seem to re-Open it and access the original records.

    Here's the code
    Code:
    Option Explicit
    
    Private db As ADODB.Connection
    Private rs As ADODB.Recordset
    
    Private Sub Form_Load()
    Dim strSQL As String
    Dim rs1 As ADODB.Recordset
    Set db = New ADODB.Connection
    db.CursorLocation = adUseClient
    db.Open "PROVIDER=Microsoft.Jet.OLEDB.4.0;Data Source=\\HOME-VISTA\Public\PAYG.mdb;"
    Set rs = New ADODB.Recordset
    strSQL = "SELECT * FROM RemoteAuditT"
    rs.Open strSQL, db, adOpenStatic, adLockOptimistic
    If rs.RecordCount <> 0 Then
        Set rs1 = New ADODB.Recordset
        Set rs1 = RemoveDups(rs)
    End If
    '
    ' Rest of the Application Code goes here....
    '
    End Sub
    
    Private Function RemoveDups(rs As ADODB.Recordset) As ADODB.Recordset
    Dim rsTemp As ADODB.Recordset
    Dim intI As Integer
    Dim intRec As Integer
    Dim intDup As Integer
    Dim strCurField As String
    Dim strNewField As String
    Set rsTemp = New ADODB.Recordset
    Set rsTemp = CreateCopy(rs)
    rsTemp.Open
    rsTemp.MoveFirst        ' This is where I get an
                            ' either BOF or EOF is true..etc Message
    rsTemp.Sort = "PreviousMessageDate,SerialNo"
    strCurField = CopyFields(rsTemp)
    rsTemp.MoveNext
    Do
        strNewField = CopyFields(rsTemp)
        If strNewField = strCurField Then
            rsTemp.Delete adAffectCurrent
            intDup = intDup + 1
            rsTemp.MovePrevious
            rsTemp.Fields("Duplicates").Value = intDup
        Else
            strCurField = strNewField
            intDup = 0
        End If
        rsTemp.MoveNext
    Loop Until rsTemp.EOF
    End Function
    
    Private Function CopyFields(rs As ADODB.Recordset) As String
    '
    ' Create a String holding the values for the required fields
    ' of the current record
    '
    Dim intI As Integer
    Dim strFields As String
    For intI = 0 To rs.Fields.Count - 1
        If rs.Fields(intI).Name <> "RMA" And _
            rs.Fields(intI).Name <> "ServerTimeAuditReceived" And _
            rs.Fields(intI).Name <> "TXSignalStrength" And _
            rs.Fields(intI).Name <> "RXSignalStrength" Then
            strFields = strFields & rs.Fields(intI).Value
        End If
    Next intI
    CopyFields = strFields
    End Function
    
    Private Function CreateCopy(rs As ADODB.Recordset) As ADODB.Recordset
    '
    ' Create a Copy of the RecordSet
    ' and add a new Field to the copy
    '
    Dim pb As New PropertyBag
    Dim rsTemp As ADODB.Recordset
    pb.WriteProperty "Copy", rs
    Set rsTemp = pb.ReadProperty("Copy")
    rsTemp.Close
    rsTemp.Fields.Append "Duplicates", adInteger, 2, adFldMayBeNull
    Set CreateCopy = rsTemp
    Set rsTemp = Nothing
    End Function
    I originally used the .Clone Method to create the Copy,but ran into the same issues regarding reporting the Duplicates. The 'Clone' method had the advantage of actually deleting the records from the Table/original Recordset.

    The actual data in the Table is not updated or modified by the target application.

    I suppose I could just return a Comma separated string: SerialNo1,Dups,SerialNo2,Dups etc and let the caller Split and process it, but I'm rather hoping there's a technique / ADODB 'feature' that is somewhat 'tidier'.

    Any suggestions as to where I'm going wrong or a better method of doing what's required would be gratefully received.
    Last edited by Doogle; Jan 15th, 2011 at 07:37 AM. Reason: Added some more Info

  2. #2
    PowerPoster RhinoBull's Avatar
    Join Date
    Mar 2004
    Location
    New Amsterdam
    Posts
    24,135

    Re: VB6 / Access - Removing Duplicate Records

    There are several ways to manage dupplicate records...
    you may:
    - loop through all records and try to find a match and based on your findings insert or refuse
    - check for dups with simple sql using Having count(some key field) > 1 and insert distinct record into some "stage" table (this can be done with one statement)
    - then, you can delete all dups from main table and finally insert all from the satging (you will of course need to purge stage table at the end)
    - etc, etc, etc...


    However, why not definingg unque constraint on you table to disallow dups in the first plae?

  3. #3
    PowerPoster RhinoBull's Avatar
    Join Date
    Mar 2004
    Location
    New Amsterdam
    Posts
    24,135

    Re: VB6 / Access - Removing Duplicate Records

    Btw, if you can define unique costraint then each rejected record can be recorded into some dups table.
    Once it's setup you will be able to easily "inform the caller how many duplicates were rejected" (name of your caller can be stored in the dups table).

  4. #4

    Thread Starter
    PowerPoster
    Join Date
    Jul 2006
    Location
    Maldon, Essex. UK
    Posts
    6,334

    Re: VB6 / Access - Removing Duplicate Records

    Quote Originally Posted by RhinoBull View Post
    However, why not definingg unque constraint on you table to disallow dups in the first plae?
    Aye, there's the rub. As I said, the updating is via a third party and until they can get their act together, muggins (ie. me)has to cope with it !!

  5. #5

  6. #6

    Thread Starter
    PowerPoster
    Join Date
    Jul 2006
    Location
    Maldon, Essex. UK
    Posts
    6,334

    Re: VB6 / Access - Removing Duplicate Records

    I forgot to say thanks for the advice.

    Basically, there are a few hundred machines dotted about the country which at a certain time of day, report usage, status and other statistics to a Table in a Central Database via SMS Messaging. This update is effectively just a dump of whatever is received into the Table - no validation is performed. The only thing we can rely on is that each record has a unique ascending key (RMA). Not a particularly clever design but it's the one we have to work with.

    We do not have the ability to change the design of the Table in the Central Database. It is under the control of the people who make the hardware that populates it. The fact that duplicates can be generated may indicate a fault in a particular machine, or some sort of 'manual intervention', which we need to be awae of.

    We read this table, identifying new records (by 'remembering' the last used RMA), add those into our Table and produce statistics and Invoices.

    Perhaps we should add a 'Duplicates' column to our Table and in the reading process check for duplicates, count, report and remove them. I was rather hoping we didn't need to do that, but on reflection, and on the advice you've given I think it's going to be the most managable solution.

    The application has been running for about a year and the duplicates issue arose in December 2010 - pity no one told us of the possibility when we started, or rather it's a pity we didn't think to ask, we could have catered for them in our original design - ah the benefits of 20/20 hindsight and thorough analysis and design
    Last edited by Doogle; Jan 16th, 2011 at 03:24 AM.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width

Survey posted by VBForums.