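Quite apart from that, it is one call no matter the number of elements to shift. The call overhead is therefore fixed. The copy itself still takes time proportional to the number of bytes moved, but it runs as a single optimized block move, whereas a one-by-one element moving loop pays VB's per-element cost on every iteration and scales linearly with a much larger constant.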
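
To make the comparison concrete, here is a rough VB6 sketch of both approaches for an array of Longs. The Declare line is the usual way RtlMoveMemory gets declared in VB (commonly aliased as CopyMemory); the two helper names, ShiftUpLoop and ShiftUpApi, are made up for illustration. It assumes plain numeric data: do not move Strings, objects, or Variants this way.

```vb
Option Explicit

' Usual VB6 declaration for RtlMoveMemory, aliased as CopyMemory.
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Hypothetical helper: shift arr(pos) .. arr(UBound - 1) up one slot,
' one element at a time. The last element is overwritten.
Private Sub ShiftUpLoop(arr() As Long, ByVal pos As Long)
    Dim i As Long
    For i = UBound(arr) To pos + 1 Step -1
        arr(i) = arr(i - 1)
    Next i
End Sub

' Hypothetical helper: the same shift done with a single API call.
' RtlMoveMemory copes with overlapping source and destination.
Private Sub ShiftUpApi(arr() As Long, ByVal pos As Long)
    Dim n As Long
    n = UBound(arr) - pos               ' number of elements to move
    If n > 0 Then
        CopyMemory arr(pos + 1), arr(pos), n * 4&   ' 4 bytes per Long
    End If
End Sub
```

After either call, arr(pos) still holds its old value and can be overwritten with the new element. For very small shifts the loop may well be as fast, since the API call has its own fixed cost.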

An API call will always carry a lot of overhead compared to straight assembly though. If VB had something like a native MemMove statement, it could skip that call overhead and would probably be noticeably faster than using RtlMoveMemory.