Results 1 to 6 of 6

Thread: [RESOLVED] VB6 / Access - Removing Duplicate Records

Threaded View

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Jul 2006
    Location
    Maldon, Essex. UK
    Posts
    6,334

    Resolved [RESOLVED] VB6 / Access - Removing Duplicate Records

    Ok, I'm no Database Expert.... I assume I'm in the correct forum.....

    I have a Table with about 41,000 records (and growing). The Table has about 40 Columns. It is automatically updated (by a third party) via SMS messages. Occasionally Duplicate Messages are received, where a 'Duplicate Message' is defined by one where all columns except 4 are identical to another record.

    My task is to remove the duplicate records and somehow inform the caller how many duplicates were deleted.

    The approach I have taken is to create a copy of the original recordset, sort the copy into a suitable order (by Columns "PreviousMessageDate" and "SerialNo"), and then check each record aginst the next record and delete it if it matches. Although slow, this seems to work. The problem I have is in trying to report the number of duplicates deleted per 'SerialNo'. I have tried Appending a new Field ("Duplicates") in the Recordset, but in order to do that I have to Close it, and having Closed it I can't seem to re-Open it and access the original records.

    Here's the code
    Code:
    Option Explicit
    
    Private db As ADODB.Connection
    Private rs As ADODB.Recordset
    
    Private Sub Form_Load()
    Dim strSQL As String
    Dim rs1 As ADODB.Recordset
    Set db = New ADODB.Connection
    db.CursorLocation = adUseClient
    db.Open "PROVIDER=Microsoft.Jet.OLEDB.4.0;Data Source=\\HOME-VISTA\Public\PAYG.mdb;"
    Set rs = New ADODB.Recordset
    strSQL = "SELECT * FROM RemoteAuditT"
    rs.Open strSQL, db, adOpenStatic, adLockOptimistic
    If rs.RecordCount <> 0 Then
        Set rs1 = New ADODB.Recordset
        Set rs1 = RemoveDups(rs)
    End If
    '
    ' Rest of the Application Code goes here....
    '
    End Sub
    
    Private Function RemoveDups(rs As ADODB.Recordset) As ADODB.Recordset
    Dim rsTemp As ADODB.Recordset
    Dim intI As Integer
    Dim intRec As Integer
    Dim intDup As Integer
    Dim strCurField As String
    Dim strNewField As String
    Set rsTemp = New ADODB.Recordset
    Set rsTemp = CreateCopy(rs)
    rsTemp.Open
    rsTemp.MoveFirst        ' This is where I get an
                            ' either BOF or EOF is true..etc Message
    rsTemp.Sort = "PreviousMessageDate,SerialNo"
    strCurField = CopyFields(rsTemp)
    rsTemp.MoveNext
    Do
        strNewField = CopyFields(rsTemp)
        If strNewField = strCurField Then
            rsTemp.Delete adAffectCurrent
            intDup = intDup + 1
            rsTemp.MovePrevious
            rsTemp.Fields("Duplicates").Value = intDup
        Else
            strCurField = strNewField
            intDup = 0
        End If
        rsTemp.MoveNext
    Loop Until rsTemp.EOF
    End Function
    
    Private Function CopyFields(rs As ADODB.Recordset) As String
    '
    ' Create a String holding the values for the required fields
    ' of the current record
    '
    Dim intI As Integer
    Dim strFields As String
    For intI = 0 To rs.Fields.Count - 1
        If rs.Fields(intI).Name <> "RMA" And _
            rs.Fields(intI).Name <> "ServerTimeAuditReceived" And _
            rs.Fields(intI).Name <> "TXSignalStrength" And _
            rs.Fields(intI).Name <> "RXSignalStrength" Then
            strFields = strFields & rs.Fields(intI).Value
        End If
    Next intI
    CopyFields = strFields
    End Function
    
    Private Function CreateCopy(rs As ADODB.Recordset) As ADODB.Recordset
    '
    ' Create a Copy of the RecordSet
    ' and add a new Field to the copy
    '
    Dim pb As New PropertyBag
    Dim rsTemp As ADODB.Recordset
    pb.WriteProperty "Copy", rs
    Set rsTemp = pb.ReadProperty("Copy")
    rsTemp.Close
    rsTemp.Fields.Append "Duplicates", adInteger, 2, adFldMayBeNull
    Set CreateCopy = rsTemp
    Set rsTemp = Nothing
    End Function
    I originally used the .Clone Method to create the Copy,but ran into the same issues regarding reporting the Duplicates. The 'Clone' method had the advantage of actually deleting the records from the Table/original Recordset.

    The actual data in the Table is not updated or modified by the target application.

    I suppose I could just return a Comma separated string: SerialNo1,Dups,SerialNo2,Dups etc and let the caller Split and process it, but I'm rather hoping there's a technique / ADODB 'feature' that is somewhat 'tidier'.

    Any suggestions as to where I'm going wrong or a better method of doing what's required would be gratefully received.
    Last edited by Doogle; Jan 15th, 2011 at 07:37 AM. Reason: Added some more Info

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width