Here are the benchmark results (from the attached project below), which compare the performance of the aforementioned APIs against the general-purpose CopyMemory (a.k.a. RtlMoveMemory) API. Function names that end in 1 use CopyMemory, while those that end in 2 employ one of the alternative APIs. All times are in seconds.
Judging from the above table, most of the alternative APIs appear to perform the same task slightly faster than CopyMemory does. The only exception is the CopyBytes (a.k.a. __vbaCopyBytes) API. That shouldn't be surprising, though, since apart from their parameter order, CopyBytes and CopyMemory basically work the same way. Their only difference, it seems, is in how they handle overlapping blocks of memory. CopyBytes doesn't appear to take overlapping blocks into account when copying and thus corrupts data when an overlapping block is moved forward (that is, toward a higher address). CopyMemory, on the other hand, was designed to handle this case correctly. That is most likely the reason CopyMemory is slightly slower than the other APIs. Overlapping blocks aren't usually encountered in typical memory copying, so for those interested in optimizing for speed, it would probably be better to call one or more of the relevant APIs shown above instead of the all-around CopyMemory API.