# Any trick to accelerate vector normalization?

• 09-04-2007, 07:07 PM
wycwang
Any trick to accelerate vector normalization?
Hi all,
Does anyone know how to accelerate vector normalization?
A small loss of precision is acceptable.

P.S. Right now I use an inverse square root function followed by multiplications.
• 09-05-2007, 02:47 AM
Nils Pipenbrinck
What's your target CPU? Do you need vector normalization in fixed or floating point?

Also, could you please post your existing normalization code, so we have a reference to start with. If possible include the assembly compiler output.

Nils
• 09-05-2007, 04:03 AM
wycwang
My target CPU is the ARM1156T2-S.
I need vector normalization in 15.16 fixed point.
I precomputed an inverse sqrt table and use one Newton-Raphson iteration.

#define DGvoid void
#define DG_F int
#define DGint32 int
#define DGfixed32 int
#define DGfixed64 long long
#define DGX_FRAC_BITS 16 /* assumed: 16 fractional bits, matching the 15.16 format */
#define DGX_ONE 0x00010000
#define DGX_ZERO 0x0

#define fMul32x32(a,b) ( (DGfixed64)(a) * (b) )
#define sar_64_32(a) (DGfixed32)( (a) >> 16 )
#define fMul(a,b) ((DGfixed32)((((DGfixed64)(a))*(b))>>DGX_FRAC_BITS ))

DGvoid DG_MATH_Normalize2B(DG_F *vec)
{
    DG_F length;
    DG_F a, b, c, d;

    length = sar_64_32( fMul32x32(vec[0], vec[0]) +
                        fMul32x32(vec[1], vec[1]) +
                        fMul32x32(vec[2], vec[2]) );

    //length = EGL_InvSqrt( 0xfffe );
    length = DG_MATH_InvSqrt( length );

    vec[0] = fMul(vec[0],length);
    vec[1] = fMul(vec[1],length);
    vec[2] = fMul(vec[2],length);
}

DG_F
DG_MATH_InvSqrt(DG_F a)
{
    DG_F x;
    DGint32 i, exp;
    if ( a == DGX_ZERO ) return 0x7fffffff;
    if ( a == DGX_ONE ) return a;

    __asm
    {
        CLZ exp, a;
    }

    if ((exp&1)==0)
        x = DG_context->g_pTInvSqrt_EVEN[(a>>(24-exp))&0x7f]; //28:8, 27:16, 26:32, 25:64, 24:128
    else // &7 &f &1f &3f &7f
        x=DG_context->g_pTInvSqrt_ODD[(a>>(24-exp))&0x7f];

    exp -= 16;
    if (exp <= 0)
        x >>= -exp>>1;
    else
        x <<= (exp>>1)+(exp&1);

    x = fMul((x>>1),(DGX_ONE*3 - fMul(fMul(a,x),x)));

    return x;
}
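
For readers who want something they can compile without the lookup tables, here is a minimal, self-contained sketch of the same idea: a 16.16 inverse square root refined with the Newton-Raphson step x = x/2 * (3 - a*x*x), followed by the three multiplications. It is not the code above. The names `fixed16`, `fx_mul`, `fx_msb`, `fx_inv_sqrt` and `fx_normalize3` are made up for illustration, the initial guess is a plain power of two (no table, no CLZ instruction), and it runs four iterations because that guess is crude. A table-based guess like the one in the post is what makes a single iteration sufficient.

```c
#include <stdint.h>

typedef int32_t fixed16;          /* hypothetical 16.16 fixed-point type */
#define FX_ONE 0x00010000

/* 16.16 multiply through a 64-bit intermediate. */
static fixed16 fx_mul(fixed16 a, fixed16 b)
{
    return (fixed16)(((int64_t)a * b) >> 16);
}

/* Index of the highest set bit (0..30) of a positive value.
   On ARM this is what 31 - CLZ(a) gives; a loop keeps the sketch portable. */
static int fx_msb(fixed16 a)
{
    int n = 0;
    while (a >>= 1)
        n++;
    return n;
}

/* Approximate 1/sqrt(a) in 16.16. Returns INT32_MAX for a <= 0. */
fixed16 fx_inv_sqrt(fixed16 a)
{
    if (a <= 0)
        return INT32_MAX;

    /* In raw integers the result is 2^24 / sqrt(a). With the top bit of a
       at position n, start from 2^(24 - n/2); odd n needs a 1/sqrt(2) factor. */
    int n = fx_msb(a);
    fixed16 x = (fixed16)1 << (24 - (n >> 1));
    if (n & 1)
        x = fx_mul(x, 46341);     /* 46341 is about 65536 / sqrt(2) */

    /* Newton-Raphson: x = x/2 * (3 - a*x*x). Four rounds is an assumption
       chosen for this crude starting point, not a tuned value. */
    for (int i = 0; i < 4; i++)
        x = fx_mul(x >> 1, 3 * FX_ONE - fx_mul(fx_mul(a, x), x));

    return x;
}

/* Normalize a 3-component 16.16 vector in place.
   Assumes the squared length fits in 16.16 (no overflow handling). */
void fx_normalize3(fixed16 v[3])
{
    int64_t sq = (int64_t)v[0] * v[0]
               + (int64_t)v[1] * v[1]
               + (int64_t)v[2] * v[2];
    fixed16 inv = fx_inv_sqrt((fixed16)(sq >> 16));

    v[0] = fx_mul(v[0], inv);
    v[1] = fx_mul(v[1], inv);
    v[2] = fx_mul(v[2], inv);
}
```

The cost is dominated by the multiplies inside the loop, so every iteration saved by a better starting guess is a direct speedup.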
• 09-05-2007, 05:38 AM
Nils Pipenbrinck
Hi wycwang,

could you please post the assembly output of the DG_MATH_Normalize2B function? If you're using GCC (most likely) you can get it if you compile with the options -O3 -S somefile.c

That generates a somefile.s which should contain the assembly code.

The code looks fine. I'm almost sure the compiler just needs to be hinted into generating good code for it.

Nils
• 09-05-2007, 07:09 PM
wycwang
Hi Nils,
Of course this code is fine.
I already compile with -O3,
and I also have a hand-written, optimized assembly version.

But I think I need a faster normalization method:
a different algorithm, not just code optimization.
• 09-06-2007, 04:07 AM
Nils Pipenbrinck
There aren't any more ways to get the performance up without losing even more precision.

You could use this method of distance approximation:

http://www.oroboro.com/rafael/docserv.p ... e/distance

This could be just as slow, since you need a divide afterwards. The distance approximation itself should compile to nice and fast code on any ARM.
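
To give an idea of what such an approximation looks like, here is a rough sketch of the "max plus half of min" family of length estimates. The coefficients are the simplest common choice and are not necessarily the ones used in the linked article; the helper names `fx_abs`, `approx_len2` and `approx_len3`, and the nesting used to extend the 2D estimate to 3D, are assumptions made for illustration. In 2D this estimate can be off by roughly 12%, and nesting it for 3D can make the error larger.

```c
#include <stdint.h>

/* |x| widened to 64 bits so that INT32_MIN is handled safely. */
static int64_t fx_abs(int32_t x)
{
    return x < 0 ? -(int64_t)x : (int64_t)x;
}

/* 2D length estimate for non-negative inputs: max + min/2.
   Only a compare, a shift and an add; no multiply, no sqrt. */
static int64_t approx_len2(int64_t a, int64_t b)
{
    int64_t hi = a > b ? a : b;
    int64_t lo = a > b ? b : a;
    return hi + (lo >> 1);
}

/* 3D estimate built by nesting the 2D one (an illustrative choice). */
int64_t approx_len3(const int32_t v[3])
{
    return approx_len2(approx_len2(fx_abs(v[0]), fx_abs(v[1])),
                       fx_abs(v[2]));
}
```

The result is a length, not a reciprocal length, so normalizing with it still costs one reciprocal (or three divides) per vector.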
• 09-06-2007, 05:40 AM
wycwang
Hi Nils,
This method seems to need a division (or a reciprocal).
Because my reciprocal is also implemented with Newton-Raphson,
I don't think this method would be faster.
• 09-16-2007, 12:34 AM
wycwang
I think cube map normalization is a nice idea, but how to compute the access index is another issue...
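
To make the "access index" concrete, here is a hedged sketch of one way a software cube-map lookup could pick a face and a texel from an unnormalized direction. The face numbering, the `CubeIndex` struct and the `cube_index` function are invented for this example and do not follow any particular API's layout; the table of pre-normalized vectors it would index into is assumed to exist elsewhere.

```c
#include <stdint.h>

/* Hypothetical result type: face 0..5 and texel coordinates 0..size-1. */
typedef struct {
    int face;
    int u;
    int v;
} CubeIndex;

static int64_t abs64(int64_t x)
{
    return x < 0 ? -x : x;
}

/* Map a direction d (any fixed-point scale) to a cube-map texel.
   Face = 2*axis + (1 if the major component is negative).
   A zero vector has no direction and returns face 0, texel (0, 0). */
CubeIndex cube_index(const int32_t d[3], int size)
{
    CubeIndex r = { 0, 0, 0 };

    /* Major axis: the component with the largest magnitude. */
    int axis = 0;
    if (abs64(d[1]) > abs64(d[axis])) axis = 1;
    if (abs64(d[2]) > abs64(d[axis])) axis = 2;

    int64_t ma = abs64(d[axis]);
    if (ma == 0)
        return r;

    r.face = 2 * axis + (d[axis] < 0 ? 1 : 0);

    /* The two remaining components, each in [-ma, ma], are mapped
       to [0, size) by dividing by the major component. */
    int64_t a = d[(axis + 1) % 3];
    int64_t b = d[(axis + 2) % 3];
    r.u = (int)(((a + ma) * size) / (2 * ma));
    r.v = (int)(((b + ma) * size) / (2 * ma));
    if (r.u >= size) r.u = size - 1;
    if (r.v >= size) r.v = size - 1;
    return r;
}
```

Note that the index computation itself needs two divisions by the major component (or one reciprocal and two multiplies), which is exactly the cost the lookup was supposed to avoid.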
• 09-16-2007, 02:19 AM
Xmas
Quote:

Originally Posted by wycwang
I think cube map normalization is a nice idea, but how to compute the access index is another issue...

What access index?