PDA

Click to See Complete Forum and Search --> : xor bytes


Maven
Dec 6th, 2004, 07:37 PM
I was playing around a day or two ago with an xor encryption..... I had a small need to a faster way of doing it, so I went to an asm dll.

I didn't paste everything here, only whats needed. Can you make it better?



SAFEARRAYBOUND struct
cElements DWORD ? ; Number of Elements
lLbound DWORD ? ; Lower Boundary
SAFEARRAYBOUND ends

OLE_SAFEARRAY struct
cDims WORD ? ; Number of dimensions
fFeatures WORD ? ; Bitfield indicating attributes
cbElements DWORD ? ; size of an element of the array
cLocks DWORD ? ; lock counter 0=Locked
pvData DWORD ? ; Pointer to data
rgsabound SAFEARRAYBOUND <> ; Contains info for dimensions
OLE_SAFEARRAY ends




xorbytes proc msg:DWORD, key:DWORD
push ebx
push esi
push edi
push ebp

mov eax, msg
mov ebx, key
movd MM0, esp

mov edx, [eax]
mov ecx, [ebx]

mov ebp, (OLE_SAFEARRAY ptr [edx]).rgsabound.cElements
mov ebx, (OLE_SAFEARRAY ptr [ecx]).rgsabound.cElements
mov edi, (OLE_SAFEARRAY ptr [edx]).pvData
mov esi, (OLE_SAFEARRAY ptr [ecx]).pvData

cmp ebp, 0
je done

cmp ebx, 0
je done

xor ecx, ecx

mov esp, esi
add esp, ebx

jmp For_Loop

SetKeyBack:
sub esi, ebx
jmp lblCon

align 16
For_Loop:
cmp esi, esp
je SetKeyBack

lblCon:

mov al, [esi]
xor byte ptr [edi], al

add edi, 1
add esi, 1
add ecx, 1
cmp ecx, ebp
jne For_Loop

done:
movd esp, MM0
pop ebp
pop edi
pop esi
pop ebx
ret
xorbytes endp

Maven
Dec 6th, 2004, 07:41 PM
this is the basic algorithm that I converted:

For i = 1 To Len(Message)
plainchar = Asc(Mid(Message, i, 1))
pk = Asc(Mid(key, (i Mod Len(key) + 1), 1))
buffer = buffer & Chr(plainchar Xor pk)
Next i

The only difference is that I got rid of the need for "MOD" (divisons are very epensive and I always try like hell to get rid of them) and buffer as the one in asm just xors the message that is passed to it.

nkad
Dec 21st, 2004, 11:53 PM
Thats good stuff. I wish I knew ASM. I bought a book but didn't get to far.

Just looking at that code makes you appreciate the greatness of high level languages.

Maven
Dec 22nd, 2004, 01:49 AM
Thats good stuff. I wish I knew ASM. I bought a book but didn't get to far.

Just looking at that code makes you appreciate the greatness of high level languages.

It makes you thankful for all that code it saves you from typing. I must say though, I miss the power of asm in high level languages.

tailz
Dec 22nd, 2004, 03:39 AM
The first comparison in the for loop (cmp esi, esp), can you find an alternative to that?

As you know, comparisons are expensive and to do one for every byte of the message/key may be slowing it down.... still, you must have a massive improvement over that hideous piece of VB code (that Im sure could also be improved)

Maven
Dec 22nd, 2004, 01:15 PM
The first comparison in the for loop (cmp esi, esp), can you find an alternative to that?

As you know, comparisons are expensive and to do one for every byte of the message/key may be slowing it down.... still, you must have a massive improvement over that hideous piece of VB code (that Im sure could also be improved)

Compare instructions aren't too bad on my processor (p4):
Clocks : 0.334644

Anyway it's a lot cheaper then multiply wich is like 18 clock cycles and divide which is like 29 clock cycles.

Maven
Dec 25th, 2004, 12:31 PM
Here is the latest version.


xorbytes proc msg:DWORD, key:DWORD
.data

align 4
lblPtr dd L0, L1, L2, L3
keyPtr dd 0
keysize dd 0
counter dd 0

.code

push ebx
push esi
push edi
push ebp

mov ebx, msg ; Message Array
mov edx, key ; Key Array

mov eax, [ebx] ; reference Message Array
mov ecx, [edx] ; reference Key Array

mov ebx, (OLE_SAFEARRAY ptr [ecx]).rgsabound.cElements
mov esi, (OLE_SAFEARRAY ptr [ecx]).pvData
mov ebp, (OLE_SAFEARRAY ptr [eax]).rgsabound.cElements
mov edi, (OLE_SAFEARRAY ptr [eax]).pvData

; -------------------------------------------------------------------------
; EBX = Len of Key
; ESI = Pointer to key data.
; EBP = Len of msg
; EDI = Pointer to msg data.
; -------------------------------------------------------------------------
; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

mov keyPtr, esi
mov keysize, ebx

cmp ebx, 4
jb singlebyte ; If keysize < 4 then load by single bytes only.


xor edx, edx ; Clear edx for unsigned divison.
mov eax, ebp ; Copy msgsize into eax
div ebx ; Divide by keysize.

push edx ; Save remainder

mov ebp, eax ; ebp = number of times to loop through message


mov ecx, ebx
shr ecx, 2 ; Shift right 2 places (keeps it 4 byte alinged)
mov eax, ecx
shl eax, 2
sub ebx, eax ; Remainder bytes (Odd bytes)

mov counter, ecx ; Save counter.

; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

Loop1:
jmp dword ptr [lblPtr + ebx * 4] ; Jump Table, will jump instantly to the correct label.

L1: ; 1 odd Byte.
mov al, byte ptr [esi]
xor byte ptr [edi], al
add esi, 1
add edi, 1
jmp L0

L2: ; 2 odd bytes
mov ax, word ptr [esi]
xor word ptr [edi], ax
add esi, 2
add edi, 2
jmp L0

L3: ;3 odd bytes
mov ax, word ptr [esi]
mov dl, byte ptr [esi+2]
xor word ptr [edi], ax
xor byte ptr [edi+2], dl
add esi, 3
add edi, 3

L0: ; Aligned by 4
mov eax, DWORD PTR [esi]
xor DWORD PTR [edi], eax
add esi, 4
add edi, 4
sub ecx, 1
jnz L0

mov ecx, counter ; Count used for L0 loop.
mov esi, keyPtr ; Move to start pos of the key.

sub ebp, 1
jnz Loop1
; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

pop ebp ; Number of bytes left in msg to xor.

cmp ebp, 0 ; If none then goto done.
je done

mov ebx, keysize ; Move keysize into ebx.

singlebyte:
xor ecx, ecx ; Clear ecx (used for counter).
sub ebx, 1
; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

looper:

mov al, byte ptr [esi+ecx] ; Load key byte.
xor byte ptr [edi], al ; xor msg byte with key byte and save.
add edi, 1 ; Point to next byte in msg.

cmp ecx, ebx ; if counter = keysize then
jne S1

xor ecx, ecx

jmp S2

S1: ;else
add ecx, 1 ; add 1 to counter.

S2:
sub ebp, 1
jnz looper

; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

done:
xor eax, eax ; Clear Return
pop ebp ; Restore Registers
pop edi
pop esi
pop ebx

ret ; Exit Function
xorbytes endp

Merri
Dec 25th, 2004, 08:48 PM
I had to optimize the VB version:


'for comparison:
' For i = 1 To Len(Message)
' plainchar = Asc(Mid(Message, i, 1))
' pk = Asc(Mid(key, (i Mod Len(key) + 1), 1))
' buffer = buffer & Chr(plainchar Xor pk)
' Next i

Public Function XorEncode(ByVal strMessage As String, ByVal strKey As String) As String
Dim bArrMessage() As Byte, bArrKey() As Byte, bArrOutput() As Byte
Dim msgSize As Long, keySize As Long
Dim I As Long

'convert strings to byte arrays
bArrMessage = strMessage
bArrKey = strKey

msgSize = UBound(bArrMessage) 'upper bound
keySize = LenB(strKey) 'actual size (due to the way used with Mod)

'prepare output buffer
ReDim bArrOutput(msgSize)

'without Unicode string support (step by two)
For I = 0 To msgSize Step 2
'simple Xor based on given key
bArrOutput(I) = bArrMessage(I) Xor bArrKey(I Mod keySize)
Next I

'convert byte array to string
XorEncode = bArrOutput
End Function


I don't know how well this can be compared to the ASM version, but shouldn't be too bad (considering it has to do string conversions, taking that functionality out and working with pure byte arrays would make it superb).


To make this faster:
- use SafeArray trick
- with a small trick, you can get rid of extra math required to use Mod, and so result in less overall math

Merri
Dec 25th, 2004, 09:48 PM
And here is the one with the later trick:


Public Function XorEncode2(ByVal strMessage As String, ByVal strKey As String) As String
Dim bArrMessage() As Byte, bArrKey() As Byte, bArrOutput() As Byte
Dim msgSize As Long, keySize As Long
Dim I As Long, A As Long, B As Byte

'convert strings to byte arrays
bArrMessage = strMessage
bArrKey = strKey

msgSize = UBound(bArrMessage) 'upper bound
keySize = LenB(strKey) 'actual size (due to the way used with Mod)

'prepare output buffer
ReDim bArrOutput(msgSize)

'without Unicode string support (step by two)
For A = 0 To keySize - 1 Step 2
B = bArrKey(A)
For I = A To msgSize Step keySize
bArrOutput(I) = bArrMessage(I) Xor B
Next I
Next A

'convert byte array to string
XorEncode2 = bArrOutput
End Function



Comparison on my machine (compiled, all advanced optimizations) with 100 000 iterations:
- XorEncode: 320 ms
- XorEncode2: 275 ms
- just the string conversion stuff, reserving of memory and function call: 260 ms


Note I really had a long pause between coding it, I did it because I had nothing better to do at the time :D


Another note for those who might come and claim using For...Next loops are slower than Do loops: in this case and use, For loops are slightly faster. No major speed difference can be gained using Do loops instead.

Edit Simplified it a bit while the speed remained the same.
Edit Slight optimization, increase in overall speed by 10ms.

Maven
Dec 26th, 2004, 01:32 AM
And here is the one with the later trick:


Public Function XorEncode2(ByVal strMessage As String, ByVal strKey As String) As String
Dim bArrMessage() As Byte, bArrKey() As Byte, bArrOutput() As Byte
Dim msgSize As Long, keySize As Long
Dim I As Long, A As Long, B As Byte

'convert strings to byte arrays
bArrMessage = strMessage
bArrKey = strKey

msgSize = UBound(bArrMessage) 'upper bound
keySize = LenB(strKey) 'actual size (due to the way used with Mod)

'prepare output buffer
ReDim bArrOutput(msgSize)

'without Unicode string support (step by two)
For A = 0 To keySize - 1 Step 2
B = bArrKey(A)
For I = A To msgSize Step keySize
bArrOutput(I) = bArrMessage(I) Xor B
Next I
Next A

'convert byte array to string
XorEncode2 = bArrOutput
End Function



Comparison on my machine (compiled, all advanced optimizations) with 100 000 iterations:
- XorEncode: 320 ms
- XorEncode2: 275 ms
- just the string conversion stuff, reserving of memory and function call: 260 ms


Note I really had a long pause between coding it, I did it because I had nothing better to do at the time :D


Another note for those who might come and claim using For...Next loops are slower than Do loops: in this case and use, For loops are slightly faster. No major speed difference can be gained using Do loops instead.

Edit Simplified it a bit while the speed remained the same.
Edit Slight optimization, increase in overall speed by 10ms.


right now I'm doing 1,000,001 iterations at:

90-130

Reason for such a big difference is because I don't have the algorithm testing for miss-aligned data. So its running 90 on aligned bytes and 130 on unalinged. =/ You get kicked in the balls if you don't align your data when loading 4 bytes at a time.

I also need to get it running in Parallel. Get another nice little speed boast doing that. Definitely need to get rid of that 2nd loop.

Anyway we'll be some xoring mofo's if I ever get that done. Setting up the loop to do all that is the trouble.

Maven
Dec 27th, 2004, 06:48 AM
I'm growing board with this so this will be the final version. I've attached the source and exe's.


xorbytes proc msg:DWORD, key:DWORD
.data
align 4
byte_4_counter dd 0

.code

push ebx
push esi
push edi
push ebp

mov ebx, msg ; Message Array
mov edx, key ; Key Array

mov eax, [ebx] ; reference Message Array
mov ecx, [edx] ; reference Key Array

mov ebx, (OLE_SAFEARRAY ptr [ecx]).rgsabound.cElements
mov esi, (OLE_SAFEARRAY ptr [ecx]).pvData
mov ebp, (OLE_SAFEARRAY ptr [eax]).rgsabound.cElements
mov edi, (OLE_SAFEARRAY ptr [eax]).pvData

; -------------------------------------------------------------------------
; EBX = Len of Key
; ESI = Pointer to key data.
; EBP = Len of msg
; EDI = Pointer to msg data.
; -------------------------------------------------------------------------
; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

xor edx, edx ; Clear Edx
mov eax, ebp ; Set eax to size of msg
div ebx ; Divide by keysize

sub ebp, edx ; Minus odd bytes
push edx ; Save Remainder (Odd bytes).

mov eax, ebx ; Move eax len of key
shr eax, 2 ; Divide by 4
mov byte_4_counter, eax ; Save how many times to load by 4
shl eax, 2 ; shift it back
mov ecx, ebx
sub ecx, eax ; Subtract
xor edx, edx

; -------------------------------------------------------------------------
; Ecx now has the odd byte count for our key
; -------------------------------------------------------------------------


cmp ecx, 0 ; If no odd bytes
je byte_4

cmp ecx, 2 ; If 2 odd bytes
je byte_2

cmp ecx, 3 ; if 3 odd bytes
je byte_3



; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

; -------------------------------------------------------------------------
; XOR ONE BYTE AT A TIME.
; -------------------------------------------------------------------------

byte_1:
xor ecx, ecx ; Clear counter
movzx eax, byte ptr [esi] ; Load the key char into al
L1:
xor byte ptr [edi+ecx], al ; Xor msg by key char
add ecx, ebx ; add keysize to msg
cmp ecx, ebp
jb L1
mov edx, 1
jmp byte_4


; -------------------------------------------------------------------------
; XOR TWO BYTES AT A TIME.
; -------------------------------------------------------------------------

byte_2:
xor ecx, ecx ; Clear Counter
mov dx, word ptr [esi]

L2:
xor word ptr [edi+ecx], dx ; Xor msg by key charecters


add ecx, ebx ; add keysize to counter
cmp ecx, ebp ; if ecx >= msg then goto byte_4
jb L2
mov edx, 2
jmp byte_4

; -------------------------------------------------------------------------
; XOR THREE BYTES AT A TIME.
; -------------------------------------------------------------------------

byte_3:
xor ecx, ecx ; Clear Counter
mov ax, word ptr [esi] ; Load 2 bytes into ax
mov dl, byte ptr [esi+2] ; load 1 byte into dl
L3:
xor word ptr [edi+ecx], ax ; xor 3 bytes
xor byte ptr [edi+ecx+2], dl
add ecx, ebx
cmp ecx, ebp ; if ecx >= msg then goto byte_4
jb L3
mov edx, 3
; -------------------------------------------------------------------------
; XOR FOUR BYTES AT A TIME.
; -------------------------------------------------------------------------
byte_4:
mov ecx, byte_4_counter ; Get number of times to load by 4.
cmp ecx, 0 ; If 0 then jump to byte_8
je PassTwo

L4:
mov ecx, edx ; Clear Ecx
mov eax, dword ptr [esi+edx] ; load 4 bytes

L5:
xor dword ptr [edi+ecx], eax ; xor 4 bytes of msg.
add ecx, ebx ; add keysize to ecx
cmp ecx, ebp
jb L5

add edx, 4

sub byte_4_counter, 1 ; For each load of 4 bytes, loop.
jnz L4
; ллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллллл

PassTwo:

pop edx ; Get odd bytes in message
cmp edx, 0 ; If none then we are finished.
je done



;comment * -----------------------------------------------------

oddbytes:
mov al, byte ptr [esi] ; Load key byte.
add esi, 1
xor byte ptr [edi+ebp], al ; xor msg byte with key byte and save.
add ebp, 1 ; Point to next byte in msg.
sub edx, 1
jnz oddbytes

;------------------------------------------------------- *

done:
xor eax, eax
pop ebp
pop edi
pop esi
pop ebx
ret
xorbytes endp

Merri
Dec 27th, 2004, 08:40 AM
Just for comparison between VB6 and ASM performance (though it is still possible to optimize the VB6 code but I'm too lazy to do it).

I made them as comparable as possible, working just about the same way, removed a sub call from the faster VB one (because calling a sub in VB costs too much time in benchmarking).

VB one is about two times slower.

Maven
Dec 27th, 2004, 12:14 PM
Just for comparison between VB6 and ASM performance (though it is still possible to optimize the VB6 code but I'm too lazy to do it).

I made them as comparable as possible, working just about the same way, removed a sub call from the faster VB one (because calling a sub in VB costs too much time in benchmarking).

VB one is about two times slower.

You have to convert mine to asci first.

IE:

tmpMsg = StrConv(MAINMSG, vbFromUnicode)
tmpKey = StrConv(MAINKEYWORD, vbFromUnicode)



You'll notice that as you increase the message size, the difference between the vb code and the asm code gets larger.


Just leave everything like it is and set the msg to that quote and you'll see the difference.

Merri
Dec 27th, 2004, 12:49 PM
Uh, but that's not optimal at all! Doing the extra StrConv just uses extra processing power for nothing. You can actually run it in the Unicode insanely fast (as you can see from the test). No need to use slow StrConv beforehand.

The thing is, the pure VB code is still pretty fast, considering how much extra coding you had to do just to make it do that simple thing. Also, there is no need for an extra DLL (which most people prefer to avoid while making their application). And the one side being, you have to be sure you're making something compatible with all processors while coding in ASM.

I'd understand this if you were making it for something that processes several megabytes of data and you'd need to do it fast. Since you are doing only simple XOR, which pretty much everyone with some knowledge can crack, I see no real point in your project except the possibility to learn ASM.

My general point being here is that you can code fast enough code with VB. I mostly wanted to do the optimized VB code after seeing you were comparing ASM to the String-mode VB6 code with Asc and Mid, which with just one call is slower than calling the ASM code million times... :)

Maven
Dec 27th, 2004, 03:04 PM
Uh, but that's not optimal at all! Doing the extra StrConv just uses extra processing power for nothing. You can actually run it in the Unicode insanely fast (as you can see from the test). No need to use slow StrConv beforehand.

The thing is, the pure VB code is still pretty fast, considering how much extra coding you had to do just to make it do that simple thing. Also, there is no need for an extra DLL (which most people prefer to avoid while making their application). And the one side being, you have to be sure you're making something compatible with all processors while coding in ASM.

I'd understand this if you were making it for something that processes several megabytes of data and you'd need to do it fast. Since you are doing only simple XOR, which pretty much everyone with some knowledge can crack, I see no real point in your project except the possibility to learn ASM.

My general point being here is that you can code fast enough code with VB. I mostly wanted to do the optimized VB code after seeing you were comparing ASM to the String-mode VB6 code with Asc and Mid, which with just one call is slower than calling the ASM code million times... :)

I wanted to work with it in ascii for a reason actually. When using it with other ciphers you can't have the unicde in there or you would open your message up to a freq attack. Eventually I'd like to create a set of high speed crypto functions. But that I'm afraid is a long ways off. I can program in asm but I can't completely bend it to my will as of yet. There is a whole bunch of stuff you have to know in order to really optimizing the living daylights out of code. You have to worry about latency, cpu cache, memory and code alignment, register dependencies, and various other things. Anyway I've seen people who are great at it take a good c algorithm and just simply smoke it. Eventually I wanna be able to do that lol.

Well it's not really all that much code honestly, you have to remember that ASM is a one to one with machine language. As far as dlls goes, I never did have a problem using them. As long as you don't modify the interface to the dll, you don't even have to re-compile your program.

You have processor directives that keep up with what processor your code will run on. .386 for example, will run on any 32 bit processor. It's when you start using special instructions that you have to test for processors. Examples would be: MMX, SSE, SSE2, SSE3, and 3DNow, etc.

I completely dissagree about visual basic being fast enough. Ask any user in the world, it's never fast enough.

Merri
Dec 27th, 2004, 04:21 PM
Actually, I just asked on IRC and everyone said something similar to "yes, Visual Basic is fast enough" :) Of course it depends on what you're doing, but on most needs, VB really is enough. And VB6 isn't running all that slow compared to C/C++ as the compiler of VB6 is a hacked C/C++ compiler... :)

Maven
Dec 28th, 2004, 02:42 AM
Actually, I just asked on IRC and everyone said something similar to "yes, Visual Basic is fast enough" :) Of course it depends on what you're doing, but on most needs, VB really is enough. And VB6 isn't running all that slow compared to C/C++ as the compiler of VB6 is a hacked C/C++ compiler... :)

What irc channel do you hang out in?

I think many people get too hung up in a language. Languages solve different needs and one should always gard against being too conservative in their choices of languages.

Visual basic and C++ works quite a bit differently. In C++ you are responsable for a lot more then what you are in visual basic. That and a very long history of optimization by many people other then microsoft are what make C and C++ so much faster then visual basic. The same is going to be true for the .net enviroment according to Microsoft.

Speed isn't a virtue you look for in visual basic. What you do look for is how easy it is to use. When you're programming in visual basic, you really don't have any worries. Just about every single aspect of visual basic is point and click. I think some visual basic programmer would be shocked to find out how much visual basic actually does for you. That is the best thing about visual basic though and I think microsoft did a great job at it. So if I was going to say visual basic solves a need, I would say it solves the need to do some programming work quickly and easily.

If you want to do a major application though, visual basic isn't really enough.