DEMO.DESIGN
Frequently Asked Questions
 
| demo party ex- | infused bytes e-mag | ib/news | | | win koi lat

p (2)
p SQRT [Andrew Zabolotny] p p Sqrt: 1) H: a) A - p X0 b) p X(n+1) = (X(n) + A/X(n))/2 X(n+1)-X(n)<=1. demonstration of this we`ll leave as a exercice to the reader :-) p: a) BSR p . p 0-31. b) 32 32 p pp jump` pp p . c) pp p p , p `p p` p. Hpp pp p 15 ( ) : - p p p 2^15..2^16-1. - `p` p p (Sqrt(2^15) + Sqrt(2^16-1))/2 = (181 + 256) / 2 = 218. - p p p p (2^15 / 2^16-1) pp pp p. Hpp: mov ebx,Number mov ecx,218 ; p @@1: mov eax,ebx cdq div ecx ; A/X(n) add eax,ecx ; X(n) + A/X(n) shr eax,1 ; (X(n) + A/X(n))/2 mov ecx,eax mov eax,ebx cdq div ecx add eax,ecx shr eax,1 [...] AX p p. AX EAX? Sqrt(0..2^32 - 1) = 0..2^16-1. p - p (.. A=1 A=2) p `` mov ax,1. H A p 0 - bsr ZF p. .. pp pp : iSqrt proc ; H - EAX = bsr ebx,eax jz @@noBits shl bx,1 jmp @@jumpTable[bx] @@jumpTable: dw offset @@bit0,offset @@bit1, offset @@bit2, ..., offset @@bit31 ; p : @@bit0: @@bit1: mov ax,1 @@noBits: ret ; EAX=4..7 @@bit2: mov ax,2 ret ; EAX=8..15 @@bit3: mov ax,3 ret ; EAX=16..31 @@bit4: mov ecx,5 cdq div ecx add eax,ecx shr eax,1 ret ; EAX=32..63 @@bit5: mov ecx,(5+8)/2 cdq div ecx add eax,ecx shr eax,1 ret ; EAX=64..128 @@bit6: mov ecx,(8+11)/2 cdq div ecx add eax,ecx shr eax,1 ret [...] ; EAX=32768..65535; ( :-) @@bit15: mov ecx,(181+256)/2 mov ebx,eax cdq div ecx add eax,ecx shr eax,1 mov ecx,eax mov eax,ebx cdq div ecx add eax,ecx shr eax,1 ret [ ] 2) p. / p 16 p. : function iSqrt(Val : Longint) : Word; var root,bitSqr : Longint; begin bitSqr := $40000000; root := 0; While bitSqr <> 0 do begin if Val >= bitSqr + root then begin Dec(Val, bitSqr + root); root := (root shr 1) or bitSqr; end else root := root shr 1; bitSqr := bitSqr shr 2; end; iSqrt := Root; end; p p: Function iSqrt(x : Longint) : Word; assembler; asm db $66;mov cx,x.Word db $66;xor ax,ax db $66;mov bx,$0000; dw $4000 @@LP1: db $66;mov dx,cx {edx = val} db $66;sub dx,bx {val - bitsqr} jc @@LP2 db $66;sub dx,ax {val - root} jc @@LP2 db $66;mov cx,dx {val >= (root+bitsqr) -> accept subs} db $66;shr ax,1 {root >> 1} db $66;or ax,bx {root | bitsqr} db $66;shr bx,2 {bitsqr>>2} jnz @@LP1 jmp @@LocEx @@LP2: db $66;shr ax,1 {val < (root+bitsqr) -> dont change val} db $66;shr bx,2 {bitsqr>>2} jnz @@LP1 @@LocEx: end; 3) H - p . H :-) - p p pp p . p - iSqrt(0..2^16-2^18). 1 3 - p 32 , p p . H p. p p p ( p p, , 4, ). p p , pp p . -[Alex-Victorov]--------------------------------------------------------------- p - D.D.UUE pp Sqrt ( AZ ).. p , ... ;) p - 0.05%, p 70% ... TESTSQRT (486SL/25) p: Timing Sqrt 3304.0 square root per second ...... iSqrt 46170.0 ....newtonSqrt 59047.0 >... FastSqrt 95354.5 !!! - precomputation 2 .... === Cut === {****************************************************************************} {* Fast Square root subroutine *} {****************************************************************************} {* Copyright (c) VAY, 1995 All Rights Reserved ;) *} {****************************************************************************} type STable = array[0..1023] of Word; var SqrtTable: ^STable; i: Word; function FastSqrt( S: LongInt ): Word; assembler; asm db 66h; xor si, si { xor esi, esi } les si, dword ptr SqrtTable db 66h; mov bx, word ptr S { mov ebx, S } mov dx, 11 db 66h, 0Fh, 0BDh, 0CBh { bsr ecx, ebx } sub cx, 9; jle @less shr cx, 1 adc cx, 0 sub dx, cx shl cx, 1 db 66h; shr bx, cl { shr ebx, cl } @less: db 26h, 67h, 8Bh, 04h, 5Eh { mov ax, es:[esi+ebx*2] } mov cx, dx shr ax, cl end; begin New( SqrtTable ); for i:=0 to 1023 do SqrtTable^[i]:=Round(Sqrt(i)*2048); end.
p (3)|p p (1)
- [24] Usenet echoes (21:200/1) --------------------- COMP.GRAPHICS.ALGORITHMS - Msg : 49 of 73 From : 4188-21922 2:5030/144.99 28 Mar 94 14:33:00 To : All Subj : Re: looking for reference -------------------------------------------------------------------------------- hbaker@netcom.com (Henry G. Baker @ nil) asks: >I remember seeing a very clever iterative algorithm for computing >Euclidean distances on the plane by Bentley??? in about 1985. It >converges cubically in the number of iterations -- i.e., faster than >Newton's method for the square root. Does anyone know this reference? >It's driving me crazy trying to find it. Thanks in advance. The paper in question is Cleve Moler and Donald Morrison, ``Replacing Square Roots by Pythagorean Sums,'' IBM Journal of Research and Development, Vol. 27, Number 6, pp. 577-581, Nov. 1983 Here is a C function that implements their algorithm. double hypot(double p, double q){ double r, s; if(p<0) p = -p; if(q<0) q = -q; if(p<q){ r=p; p=q; q=r; } if(p==0) return 0; for(;;){ r=q/p; r*=r; s=r+4; if(s==4) return p; r/=s; p+=2*r*p; q*=r; } } This routine produces a result good to 6.5 digits after 2 iterations, to 20 digits after 3 iterations, and to 62 digits after 4 iterations. In normal use you would probably unwind the loop as many times as you need and omit the test altogether. Moler and Morrison's paper describes the generalization of this algorithm to n dimensions, and exhibits a related square root algorithm which also has cubic convergence. [Dubrulle 83] gives a geometric justification of the algorithm and presents a set of generalizations that have arbitrarily large order of convergence. [Dubrulle 83] Augustin A. Dubrulle, ``A Class of Numerical Methods for the Computation of Pythagorean Sums,'' IBM Journal of Research and Development, Vol. 27, Number 6, pp. 582-589, Nov. 1983
p (4)|p p (2)
- [27] Usenet echoes (21:200/1) --------------------- COMP.GRAPHICS.ALGORITHMS - Msg : 16 of 21 From : virteam@nettle.canberra.edu.au 2:5030/144.99 08 Apr 94 02:49:00 To : All Subj : El fasto square root -------------------------------------------------------------------------------- Greetings All, Well, there has been a bit of talk about finding fast square roots, so I thought I might as well also put my seven cents in. The fastest way to perform a function is to precalculate it. Now, this may seem a little over the top, having look-up tables for all values of x= sqrt (A), but depending on the problem it may not actually be as cumbersome as it seems. Making an assuption here that we wish to find D = sqrt (X^2 + Y^2) that is to say, the distance away from a point. With this model, it can be seen that four multiplications, one addition and a square root operator are needed to find the answer. However, It should be remembered that do not really want the square root of x^2 plus y^2, but rather the hypotenuse. As seen in basic trig: Y| |-----* point | /| | d/ | | / | | / | |/ | --------- X o We know X and Y, therefore, we can work out the angle the hyptonuse makes with the plane, and therefore, work out the value of d. What this actually equates to is: d = X / ( cos (arctan(y/x) ) will also give the correct answer. "Screams of dread!" you might say, "Thats much harder than the 'ol square root equation." Not really. If you precalculate a table of 1/ cos (arctan( theta ) ) for all of theta (where theta = y/x), then the equation becomes d = X * TABLE (Y/X) which means that square root can be calculated using one multiplication, one division and one table lookup!!!! (Compare that one to 4 multiplications, one addition and the multiple iterations of square root) Much faster. The only caveat here is that you will probably need a very large table, depending on what accuracy you wish. This table can be reduced by of course ensuring that X and Y are postive values, and maybe stipulating that X >= Y, but this adds in more calculations. Increased accuracy can be gained by linearly interpolating table results, which gives surprisingly good answers with only a few more basic calculations. But hey, in its raw form, d = X * TABLE (Y / X) is a fast way of calculating the distances between two points with a good level of accuracy... Oh, I almost forgot, this system works with 3D points also. Just perform it twice: (where TABLE is of 1/(cos(arctan(theta)))) h = X * TABLE (Y/X) d = h * TABLE (Z/h) which gives you the distance to a point in 3d space with two multiplication, two divisions and two table lookups... I hope that this info is of use to someone. Catcha, Regards, Michael Kunstelj.
p (5)|p p (3)
>Is Newton's method the best method to compute a cubic root? Depends on what you mean by "best". Newton's method is a quadratically converging algorithm, but there are higher- order techniques like, say, Halley's method. The iteration 2x^3 + 4r x' = ----------- x 4x^3 + 2r is cubically converging to the cube root of r , for example (it's generalization to nth roots is due to Lambert in 1770). And there's no end to higher-order iterative methods such as this (cf. I. Kiss. A generalization of Newton's approximation procedure [German]. _Z. angew. Math. Mech._ 34, 68--69, 1954.) but of course the price you pay is more computation per iteration.
p (6)|p p (4)
- [27] Usenet echoes (21:200/1) --------------------- COMP.GRAPHICS.ALGORITHMS - Msg : 7 of 21 From : steinarm@ifi.uio.no 2:5030/144.99 04 Apr 94 18:09:00 To : All Subj : Re: Fastest long integer square root algorithm? -------------------------------------------------------------------------------- A very simple way to find the square root without any divisions is: int sqr(int i) { /* We don't want input that needs to be of type 'long' :) */ int c = 0, d = 0; while ((i -= c) > 0) { c += 2; d++; } return d; } It'll be very slow if the input is a large number, though (sqr(1000000) = 1000 iterations). If input is a 16-bit word, I think it'll be best at average on a 68000 (although I've never tested it). Only 5 instructions of 680x0 code would be needed, and I guess it should be something like this: moveq #-1,d0 ; d1 = input, d0 = result. .loop addq.l #1,d0 sub.l d0,d1 sub.l d0,d1 bhi.s .loop -$- Steinar
p (7)|p p (5)
- Usenet echoes (21:200/1) -------------------------- COMP.GRAPHICS.ALGORITHMS - Msg : 34 of 34 From : joakim@alpha.hut.fi 2:5030/144.99 15 Apr 94 15:13:18 To : All 17 Apr 94 01:45:38 Subj : A Simple 1500-line 32-bit SQRT for C -------------------------------------------------------------------------------- The following C code should generate a reasonably fast 32 bit integer to 32 bit integer algorithm for those processors that can handle floating arithmetic quickly but fail to do the same for the sqrt function. The code has been tested for some random numbers and error was always equal to or less than 1 (however, we do not guarantee this!). The resulting code is some 20-40 % slower than the sqrt() function (in HP 9000/755). Once the floating arithmetics are replaced with some clever fixed point arithmetics, this kind of a sqrt algorithm should be of the fastest kind on the low end processors. The algorithm is based on piecewise linear interpolation, as can be easily seen by just looking at the code: ------------------CUT--------------- // Program to construct an efficient sqrt program for a computer with an // arithmetic coprocessor without sqrt ?!? // copyleft (l) Jyrki Alakuijala, Joakim Laitinen, 1994 #include <stdlib.h> #include <math.h> #define NESTING() {for (int i = n; i--;) printf (" ");} void recursion (double a, double b, int n) { if (sqrt ((a + b) / 2.0) - (sqrt (a) + sqrt (b)) / 2.0 > 0.5) { NESTING(); printf("if (val < %.0lf) {\n", (a + b) / 2.0 + 0.5); recursion (a, b - (b - a) / 2.0, n + 1); NESTING(); printf ("} else {\n"); recursion (a + (b - a) / 2.0, b, n + 1); NESTING(); printf ("}\n"); } else { NESTING(); printf (" return (%.3lf + (val - %.3lf) * %20.20lf);\n", sqrt(a) + 0.75, a, (sqrt(b) - sqrt(a)) / (b - a)); } } main () { printf ("unsigned long SlowSqrt (register unsigned long val) {\n"); recursion (0.0, 4294967295.0, 1); printf ("}\n"); } ----------------END CUTTING-----------
p (8)|p p (6)
- Usenet echoes (21:200/1) -------------------------- COMP.GRAPHICS.ALGORITHMS - Msg : 28 of 28 From : cgouldie@mail.utexas.edu 2:5030/315 10 Dec 95 15:16:14 To : All 13 Dec 95 04:25:02 Subj : Re: Help: square root -------------------------------------------------------------------------------- X-RealName: Chris Gouldie In article <4afu4q$3lp@oak.zilker.net>, jc@zilker.net (Jonathan Clark) wrote: >In article <4a5drm$6l9@nntp.pinc.com>, >Doug MacDonald <dmacdonald@dataflux.bc.ca> wrote: >> >>I am in need of a fast integer square root algorithm. If anyone >>can provide such, or point me in the right direction, I'd >>appreciate it. Here's one from the Dos32 package that uses Newton's method. Should be faster than binary search, except that there's a div in there, so you might wanna test it. ; Writen by Adam Seychell 386 model flat .CODE comment $ returns EAX with the SQUARE ROOT of EAX. Uses newtons method: Xn+1 = Xn - f(Xn) / f'(Xn) f'(x) is the derivitive of f(x) For square routes: f(x) = x*x - ( Root value i.e EAX) so f'(x) = 2x For cube routes: f(x) = x*x*x - ( Root value ) so f'(x) = 3x +$ Public SQRT SQRT Proc PASCAL USES EDI ECX EBP EDX EBX mov edi,eax ;----- Get Estimated Root Value ( ebp = rough value of the sqrt of EAX ) --- bsr ecx,edi ; Get bit number of highest NZ bit shr cl,1 ; Half this highest bit number mov ebp,edi ; Rotate Root Value by shr ebp,cl ; the bit number to get a estimate root. ;----- Main Loop that makes EBP converge to the square root of EDI ------ calc_sqrt_Loop: push EBP ; save x so can check if x has converged mov EBX,EBP shl EBX,1 ; ebx = f'(x) = 2x mov EAX,EBP mul EBP sub EAX,EDI ; eax = f(x) = X*X - Root_value sbb EDX,0 ; use 64bit subtract idiv EBX ; eax = f(Xn) / f'(Xn) sub EBP,EAX ; ebp = Xn+1 = Xn - f(Xn) / f'(Xn) pop EAX ; look at old value cmp EBP,EAX ; loop only if new value is different jne calc_sqrt_Loop ;-------------------------- End of Loop -------------------------------- mov eax,ebp ; EBP now contians the square root of EDI ret ; SQRT Endp End .
p (9)|p p (7)
- Demo/intro making and discussion (2:5030/84) ------------------ DEMO.DESIGN - Msg : 1125 of 1126 From : Alexander Matchugovsky 2:5020/996.21 01 Jul 97 01:25:00 To : Alex Markov 03 Jul 97 23:58:07 Subj : 3d Tracing ------------------------------------------------------------------------------- GoodNight, Alex! 29 Jun 97 07:38, Alex Markov ===wrote to===. Oleg Homenko: AM>>> : AM>>> , ? Aa! a pa a x ca oepe 4K-intro a K0010 ( DemoParty) o, o ao pa aop ce .op, a a a K e copoeccopa :( Aop pa ( ooo oe, ee a ..) oaa Tarh'. O epee a Intel'oc ASM coo, o a cax ee 300 paoae cpee copoeccopoo (a ooe-o poopoao ee eaeoo op - o ec e oe - e xe :( ao a eox cax - caoe oo). Caec oo ea ac. Bxo: bx (16-papoe aee) Bxo: dx (ooe-o oe 8-pap) xor di,di mov cx,di mov dx,di inc di mov ax,di inc di @1: add cx,ax add ax,di inc dx cmp cx,bx jle @1 dec dx Bye. Yours Manwe.

8 p(/) |p p (8)

FAQ - .

design/collection/some content by Frog,
DEMO DESIGN FAQ (C) Realm Of Illusion 1994-2000,
: "DEMO.DESIGN FAQ, http://www.enlight.ru/demo/faq".