|
-
Jan 14th, 2004, 04:03 PM
#1
Square root??? *Gave up*
Do anyone know what method that are used to performe Square root in C++? Is it Newton Raphsons? If so I have an idea. Maybe it could be made faster if you wrote it in ASM, and you didn't need 100% correct answer. In DX you have to use Sqr a couple of times even if you try not to. But it is using really long time. But if you only need lets say the 4 first decimales right mayeb it could be done faster. What do you think?
Never used ASM code in VC++, but I have heard it should work. So if anyone think it can be done faster I will try it out, but I am not really good in ASM so I guess I will use a lot of time to optimize it, so I am not going to try if it's not worth it.
Last edited by NoteMe; Jan 18th, 2004 at 08:12 PM.
-
Jan 14th, 2004, 05:02 PM
#2
transcendental analytic
Yeah, newton raphson it is, dunno if it would get any faster with ASM
ok note I fixed you something, although I haven't tested it yet:
Code:
float root(float N){
float x,i;
do{
x=i;
i=0.5*( x + N/x );
}while(abs(i-x) > 0.00001);
return i;
}
Last edited by kedaman; Jan 14th, 2004 at 10:19 PM.
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 14th, 2004, 05:15 PM
#3
What is "i" in that "code"....and why don't you think it will be faster then the Sqr in C++. If Sqr is using Newton Raphsons it has to do it in a loop or recursivt, and the "loop" will either way do it a lot more times then the aproximat way with only about 4 decimales...
-
Jan 14th, 2004, 05:19 PM
#4
Frenzied Member
Libraries implement different ways to get square roots. Some use integer arithmetic because it's faster. They don't use Newton Raphson necessarily.
Paul Hsieh's page has the state of the art. Before you re-invent a not-too-round wheel, check out the various algorithms - C and asm:
http://www.azillionmonkeys.com/qed/sqroot.html
-
Jan 14th, 2004, 05:20 PM
#5
thanks Jim...I will have a look at it. Hope you saved me a lot of time here..
-
Jan 15th, 2004, 11:32 AM
#6
Originally posted by kedaman
Yeah, newton raphson it is, dunno if it would get any faster with ASM
ok note I fixed you something, although I haven't tested it yet:
Code:
float root(float N){
float x,i;
do{
x=i;
i=0.5*( x + N/x );
}while(abs(i-x) > 0.00001);
return i;
}
OK..thanks I will have a go at it. And tell you what I think about it...
-
Jan 15th, 2004, 11:33 AM
#7
BTW "i" has to start somewhere doesn't it?
-
Jan 15th, 2004, 11:35 AM
#8
I guess I could try to find a fast formula that can calculate an "i" that is close to the answer...
-
Jan 15th, 2004, 11:37 AM
#9
transcendental analytic
yeah you could make a quick guess. I don't know which is better, linear approximation or logaritmic, theyre both fast to implement though..
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 15th, 2004, 11:40 AM
#10
Will try it out, and see what is working best...guess after all the CPU type has something to say what method is fastest in the end...
-
Jan 17th, 2004, 12:12 AM
#11
Addicted Member
Square root???
If its speed you need download the quake III source or any recent game source they have fast square root functions in there math file for finding vector magnitudes which are very important in 3d games.
-
Jan 17th, 2004, 02:27 AM
#12
OK...thanks....seen any links to the source lately?
-
Jan 17th, 2004, 02:34 AM
#13
I can only find Quace 2 source code.....Will have a look at it at see if there is something there too..
-
Jan 17th, 2004, 03:13 AM
#14
I don't have more time to look at it right now...will do it after work. But the only places I have found where he is using Sqr is places like here:
Code:
double sqrt(double x);
vec_t VectorLength(vec3_t v)
{
int i;
float length;
length = 0;
for (i=0 ; i< 3 ; i++)
length += v[i]*v[i];
length = sqrt (length); // FIXME
return length;
}
And he is writing fimfe after nearly all the places. But is sqrt(double) a C++ lib thingy or is it something he has made?
-
Jan 17th, 2004, 02:11 PM
#15
Addicted Member
square root
Quake 3 has the following code but I cant figure out what the hell it does and didnt get even close to a correct square root.
This website explains what the quake 3 code for square root does. It gets close to the inverse square root of a number using newtons method so the more iterations the more accurate it is. Writing your own square root will be really hard and I doubt you will get much faster than this.
http://www.magic-software.com/Docume...nverseSqrt.pdf
Code:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the ****?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
Last edited by abcdefg; Jan 17th, 2004 at 02:37 PM.
-
Jan 17th, 2004, 03:06 PM
#16
Without reading more I can't see what they are using i for at all...bu this is "1/Sqr(x)".....not sure if that was what I wanted....but maybe if I understand this code maybe I can use it to calcultate"Sqr(x), but it doesn't look good.
-
Jan 17th, 2004, 04:35 PM
#17
transcendental analytic
excuse me, but when does the vector length need to be calculated beyond cases where you can do with its square? for 3dgames that is.
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 17th, 2004, 07:45 PM
#18
Addicted Member
square root
the adobe file link explains why the i is in the code. The line with the i has a lot of math behind it and is pretty complicated to come up with.
You could always use the regular square root it only uses about 70 clock cycles. Just dont call it a ton of times.
-
Jan 18th, 2004, 03:24 PM
#19
Originally posted by kedaman
excuse me, but when does the vector length need to be calculated beyond cases where you can do with its square? for 3dgames that is.
Where you talking to me or the Quake 3 example?
-
Jan 18th, 2004, 03:28 PM
#20
Re: square root
Originally posted by abcdefg
You could always use the regular square root it only uses about 70 clock cycles. Just dont call it a ton of times.
As I stated in my first post, there is a lot you can do to try to avoid using Sqr, but some times you just have to use it, and it would be great to have a working example of a Sqr formula that wasn't that correct. Working with QuerreryPercornace* in C++ now so I can try to test the diffrent approches...never used it inn C++ before...
-
Jan 18th, 2004, 04:26 PM
#21
transcendental analytic
yeah i was talking to you note and the quake3 example at the same time, but most of the time you, i want to know.
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 18th, 2004, 04:49 PM
#22
I have never talked about vector lenghts at all...I said that I can't remember right now where I have to use it...but in Chapter 5 in my DX book it said that I had to use it for that purpose. If you want me to look through the whole chapter for you too see what he ment, I can do that....but I am not going to check if I don't have to before I get to that stage in the game...
-
Jan 18th, 2004, 04:56 PM
#23
Originally posted by kedaman
Yeah, newton raphson it is, dunno if it would get any faster with ASM
ok note I fixed you something, although I haven't tested it yet:
Code:
float root(float N){
float x,i;
do{
x=i;
i=0.5*( x + N/x );
}while(abs(i-x) > 0.00001);
return i;
}
OK...I am testing this now....havn't found any info at the forum about QuerreryPerformance API, so I am trying to do this with GetTickCount. But I have more wuestions here.
N is the value you want to find the square root of.
i is after a while the square root. But what is x? The last iteration of the formula? So that should start as N?
-
Jan 18th, 2004, 06:32 PM
#24
Just forget it Keda...I think it will be really slow anyway. I have to learn more ASM to get this up and going.
I tested a bit and trancelated the one that I used in VB:
Code:
inline float Sqr2(float N){
float last = (1/2) + (N / 2);
float* pLast = &last;
float now = ((*pLast) / 2) + (N / (2 * (*pLast)));
float* pNow = &now;
float* temp;
while(((*pLast)-(*pNow)) > 0.0001){
temp = pLast;
pLast = pNow;
pNow = temp;
*pNow = ((*pLast) / 2) + (N / (2 * (*pLast)));
}
return *pNow;
}
And it looks like it is 30 times slover then the one in math.h
Then I looked at the one in ASM:
inline float Sqr3(float num) {
_asm fld num
_asm fsqrt
_asm fst num
return num;
}
And it looks like even that one is 10 times slower...
-
Jan 18th, 2004, 08:11 PM
#25
This is the give up post of this thread....tried a lot of things now...and I think it will take me years to do it better then the buildt in function.... ....
Here is the code....DON'T USE IT...
Code:
//ØØ sin Newton teite opptimalisering
inline float Sqr2(float N){
float last = (1/2) + (N / 2);
float* pLast = &last;
float now = ((*pLast) / 2) + (N / (2 * (*pLast)));
float* pNow = &now;
float* temp;
while(((*pLast)-(*pNow)) > 0.0001){
temp = pLast;
pLast = pNow;
pNow = temp;
*pNow = ((*pLast) / 2) + (N / (2 * (*pLast)));
}
return *pNow;
}
//Stian sin ASM teite opptimalisering
inline float Sqr3(float num) {
_asm fld num
_asm fsqrt
_asm fst num
return num;
}
inline float Sqr4(float y){
float c;
float b;
float a = 0.5*(y+1.0); /* Initial guess */
if( a<1.0 ) b = 2.0; /* Initial reciprocal of guess */
else b = 0.5;
for(c=0;c<14;c++) {
a = 0.5*(a+y*b);
//t = b*(2.0-a*b);
//b = (t>0) ? t : b * 0.5;
//printf("(a,b)=(%19.16f,%19.16f)\n",a,b);
}
return a;
}
double __stdcall Sqr5(double x) {
__asm {
; I'm assuming that the argument is aligned to a 64-bit boundary.
mov eax,0BFCDD6A1h ; 1u Constant from James Van Buskirk
mov edx,[esp+8] ; 1v Potential pms.
sub eax,edx ; 2u
push 03FC00000h ; 2v Constant 1.5, aligns stack
shr eax,1 ; 3u
sub edx,000100000h ; 3v =.5*x, biased exponent must > 1
mov [esp+12],edx ; 4u
push eax ; 4v
; The lower 32-bits of the estimate come from uninitialized stack.
fld QWORD PTR [esp-4] ; 5 Potential pms
fmul st,st ; 6-8
fld QWORD PTR [esp-4] ; 7
fxch st(1) ; 7x
fmul QWORD PTR [esp+12] ; 9-11 Potential pms
fld DWORD PTR [esp+4] ; 10
add esp,4 ; 12 Faster on Pro/PII
fsub st,st(1) ; 12-14
fmul st(1),st ; 15-17
fmul st(1),st ; 18-20
fld DWORD PTR [esp] ; 19
fxch st(1) ; 19
fmulp st(2),st ; 20-22
fsub st,st(1) ; 21-23
fmul st(1),st ; 24-26
fmul st(1),st ; 27-29
fld DWORD PTR [esp] ; 28
fxch st(1) ; 28
fmulp st(2),st ; 29-31
fsub st,st(1) ; 30-32
fmul st(1),st ; 33-35
pop eax ; 34
fmul st(1),st ; 36-38
fld DWORD PTR [esp] ; 37
fxch st(1) ; 37
fmulp st(2),st ; 38-40
fsubrp st(1),st ; 39-41
fmulp st(1),st ; 42-44
}
return 0;
}
inline float Sqr6(float x) {
float y;
_asm {
mov ecx, x
and ecx, 0x7f800000
fld x
mov eax, ecx
add ecx, 0x3f800000
or ecx, 0x00800000
shr ecx, 1
mov y, eax
and ecx, 0x3f800000
fld y
fadd
fstp y
and [y], 0x007fffff
or [y], ecx
}
return y;
}
inline float SqrQ( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the ****?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
static short sqrttab[0x100];
void build_table()
{
unsigned short i;
float f;
unsigned int *fi = (unsigned int *)&f;
for(i=0; i<= 0x7f; i++) {
*fi = 0;
*fi = (i << 16) | (127 << 23);
f = sqrt(f );
sqrttab[i] = (*fi & 0x7fffff) >> 16;
*fi = 0;
*fi = (i << 16) | (128 << 23);
f = sqrt(f );
sqrttab[i+0x80] = (*fi & 0x7fffff) >> 16;
}
}
inline float Sqr7(float n) {
unsigned int *num = (unsigned int *)&n;
short e;
if (n == 0) return 0 ;
e = (*num >> 23) - 127;
*num &= 0x7fffff;
if (e & 0x01) *num |= 0x800000;
e >>= 1;
*num = ((sqrttab[*num >> 16]) << 16) | ((e + 127) << 23);
return (n);
}
Results (in ms):
Sqrt test tid: 344
Sqrtf test tid: 625
Sqrtl test tid: 609
Sqr2 test tid: 3922
Sqr3 test tid: 1015
Sqr4 test tid: 2016
Sqr6 test tid: 531
SqrQ test tid: 641
Sqr7 test tid: 594
-
Jan 18th, 2004, 08:52 PM
#26
transcendental analytic
lol note.. I wonder what you're doing all this for, can't you tell me?
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 19th, 2004, 04:43 AM
#27
First of all it would be nice to say I did it. And second of all, it would be nice to have in the new game we are making....
-
Jan 19th, 2004, 05:58 AM
#28
Originally posted by kedaman
excuse me, but when does the vector length need to be calculated beyond cases where you can do with its square? for 3dgames that is.
Interceptor missile distance: 134.543
103.953
78.395
47.392
20.933
Impact!
All the buzzt
 CornedBee
"Writing specifications is like writing a novel. Writing code is like writing poetry."
- Anonymous, published by Raymond Chen
Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.
-
Jan 20th, 2004, 01:18 AM
#29
transcendental analytic
Interceptor missile distance: 134.543
103.953
78.395
47.392
20.933
Impact!
dx*dx+dy*dy< impactdistsquared
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 20th, 2004, 01:20 AM
#30
transcendental analytic
Originally posted by NoteMe
First of all it would be nice to say I did it. And second of all, it would be nice to have in the new game we are making....
but third, it would be foolish if it could be done without it
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 20th, 2004, 03:58 AM
#31
Originally posted by kedaman
but third, it would be foolish if it could be done without it
Again I have to refer to my first post....read it you non first post reader at all.....
-
Jan 20th, 2004, 08:27 AM
#32
transcendental analytic
I did, so what are you going to use it for, you just said you wanted to make a faster sqrt function but i don't think you'll need to make one at all unless you tell me why
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 20th, 2004, 09:05 AM
#33
You make me go through that book once more tonight....ok....but waith untill my GF has left...can't do it when she is waithing...can't even understand why I am on the forum...
-
Jan 20th, 2004, 09:15 AM
#34
Originally posted by kedaman
dx*dx+dy*dy< impactdistsquared
You can't show this on the threat warner.
All the buzzt
 CornedBee
"Writing specifications is like writing a novel. Writing code is like writing poetry."
- Anonymous, published by Raymond Chen
Don't PM me with your problems, I scan most of the forums daily. If you do PM me, I will not answer your question.
-
Jan 20th, 2004, 02:35 PM
#35
Originally posted by CornedBee
You can't show this on the threat warner.
Maybe I don't have to read through the whole book tonight after all...
-
Jan 20th, 2004, 10:15 PM
#36
transcendental analytic
damn note, are you going to implement that just to annoy me
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 20th, 2004, 10:18 PM
#37
transcendental analytic
Ok, even you do implement that, its not going to use conciderable amount of cpu like collisiondetection even with the help of bsp trees , but you don't need to use sqrt unless you have to display it on the screen in particular
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Jan 21st, 2004, 06:32 PM
#38
If you are using the Sqrt() as part of the Pythagorean Theorum to figure distances, I made up a simple function that uses nothing but addition and bit shifts (and two If statements ). It is not particularly accurate, with an average error of only about 6% (acceptable in this case) except for the perverse case where the delta x, delta y, and delta z are all 1. Would that help?
-
Jan 21st, 2004, 08:56 PM
#39
Addicted Member
Ok, even you do implement that, its not going to use conciderable amount of cpu like collisiondetection even with the help of bsp trees , but you don't need to use sqrt unless you have to display it on the screen in particular
What do bsp trees have to do with square root?
-
Jan 21st, 2004, 09:15 PM
#40
transcendental analytic
Use  
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|