PDA

Click to See Complete Forum and Search --> : Removing duplicates from an array


plenderj
Apr 2nd, 2002, 05:21 AM
How to remove duplicated items in an array.
This is (in my opinion) a very fast way of doing it.

I don't know of any faster methods.
If you need to remove duplicated items from an array of a different type, then just adjust the code accordingly.

Note: You will need to add a reference to "Microsoft Scripting Runtime" as the code uses its Dictionary object.
To do this, select Project from the toolbar, then select "References", and then select "Microsoft Scripting Runtime"


Option Explicit

Private Sub removeDuplicates(ByRef arrName() As Long)
Dim i As Long, tempArr() As Long: ReDim tempArr(UBound(arrName))
Dim d As New Dictionary, n As Long
For i = 0 To UBound(arrName)
If Not d.Exists(arrName(i)) Then
d.Add arrName(i), arrName(i)
tempArr(n) = arrName(i): n = n + 1
End If
Next
ReDim Preserve tempArr(n)
arrName = tempArr
End Sub

Private Sub Form_Load()
Dim x() As Long, i As Long: ReDim x(99)

'' this loop will fill the array with values :
'' 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3...
''
For i = 0 To 99
x(i) = i Mod 4
Next

'' now remove the duplicated items
''
removeDuplicates x

'' now display whats left
For i = 0 To UBound(x) - 1
Debug.Print i & ":" & x(i)
Next
End Sub

plenderj
Oct 21st, 2004, 08:42 AM
I still believe this to be the fastest method of removing duplicated items available in Classic VB.

plenderj
Oct 21st, 2004, 09:45 AM
* 21-October-2004 - Moved to CodeBank *

alawra
Oct 31st, 2004, 07:09 AM
how to use it in text please

plenderj
Oct 31st, 2004, 12:27 PM
Just modify the code above to filter out Strings instead of Longs.

Merri
Dec 30th, 2004, 11:36 PM
I still believe this to be the fastest method of removing duplicated items available in Classic VB.

You don't need to believe anymore: it isn't :)


This one is faster with strings by 10 - 20%

Public Sub strArrRemoveDuplicate(ByRef StringArray() As String)
Dim LowBound As Long, UpBound As Long
Dim TempArray() As String, Cur As Long
Dim A As Long, B As Long

'check for empty array
If (Not StringArray) = True Then Exit Sub

'we need these often
LowBound = LBound(StringArray)
UpBound = UBound(StringArray)

'reserve check buffer
ReDim TempArray(LowBound To UpBound)

'set first item
Cur = LowBound
TempArray(Cur) = StringArray(LowBound)



'loop through all items
For A = LowBound + 1 To UpBound
'make a comparison against all items
For B = LowBound To Cur
'if is a duplicate, exit array
If LenB(TempArray(B)) = LenB(StringArray(A)) Then
If InStrB(1, StringArray(A), TempArray(B), vbBinaryCompare) = 1 Then Exit For
End If
Next B
'check if the loop was exited: add new item to check buffer if not
If B > Cur Then Cur = B: TempArray(Cur) = StringArray(A)
Next A

'fix size
ReDim Preserve TempArray(LowBound To Cur)
'copy
StringArray = TempArray
End Sub



This works for Byte, Integer and Long datatypes and is four to five times faster:

Public Sub bArrRemoveDuplicate(ByRef ByteArray() As Byte)
Dim LowBound As Long, UpBound As Long
Dim TempArray() As Byte, TempByte As Byte, Cur As Long
Dim A As Long, B As Long

'check for empty array
If (Not ByteArray) = True Then Exit Sub

'we need these often
LowBound = LBound(ByteArray)
UpBound = UBound(ByteArray)

'reserve check buffer
ReDim TempArray(LowBound To UpBound)

'set first item
Cur = LowBound
TempArray(Cur) = ByteArray(LowBound)

'loop through all items
For A = LowBound + 1 To UpBound
TempByte = ByteArray(A)
'make a comparison against all items
For B = LowBound To Cur
'if is a duplicate, exit array
If (TempArray(B) Xor TempByte) = 0 Then Exit For
Next B
'check if the loop was exited: add new item to check buffer if not
If B > Cur Then Cur = B: TempArray(Cur) = ByteArray(A)
Next A

'fix size
ReDim Preserve TempArray(LowBound To Cur)
'copy
ByteArray = TempArray
End Sub


To convert it to use Long for example, just use VB's inbuilt replace from the edit menu and make it convert Byte to Long. And rename the function, of course :)


What is the best thing with these functions: you don't need to add any extra reference to your project!

plenderj
Jan 10th, 2005, 03:41 AM
Can you post the code you used to compare the times, because in the brief test I did my code still worked faster...

Maven
Jan 14th, 2005, 07:08 PM
http://vbforums.com/attachment.php?attachmentid=33047

plenderj
Jan 24th, 2005, 05:54 AM
Ah I take it you're comparing my code to your ASM code?

Maven
Jan 24th, 2005, 12:10 PM
Ah I take it you're comparing my code to your ASM code?

Na, he was comparing my asm code, you're code, with his code.

oli12345
Nov 10th, 2009, 08:44 AM
How about this compared to your methods? How are you comparing the speeds?

Function removeDuplicates(ByVal initialArray As String()) As String()
Dim i As Integer = 0
Dim j As Integer = 0
Dim newArray(0) As String

For i = 0 To UBound(initialArray)
For j = 0 To UBound(initialArray)
If Not initialArray(i) = "" Then
If Not j = i Then
If initialArray(i) = initialArray(j) Then
initialArray(j) = ""
End If
End If
End If
Next
Next

j = 0
For i = 0 To UBound(initialArray)
If Not initialArray(i) = "" Then
ReDim Preserve newArray(j)
newArray(j) = initialArray(i)
j = j + 1
End If
Next

Return newArray

End Function