Returns a vector in the same direction as p
but with a length of 1.
fast_normalize is computed as:
    p * half_rsqrt(p.x^2 + p.y^2 + ...)
The result shall be within 8192 ulps error from the infinitely precise result of:
    if (all(p == 0.0f))
        result = p;
    else
        result = p / sqrt(p.x^2 + p.y^2 + ...);
with the following exceptions:
If the sum of squares is greater than FLT_MAX, then the floating-point values in the result vector are undefined.
If the sum of squares is less than FLT_MIN, then the implementation may return p.
If the device is in 'denorms are flushed to zero' mode, individual operand elements with magnitude less than sqrt(FLT_MIN) may be flushed to zero before proceeding with the calculation.
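The first two exceptions depend only on the sum of squares, so they can be checked up front. The following plain C sketch classifies an input accordingly; the enum and function names are hypothetical, and the sum is accumulated in double (an assumption of this sketch) so that the comparison against FLT_MAX does not itself overflow.

```c
#include <float.h>

/* Hypothetical labels for the cases described above. */
typedef enum {
    NORM_IN_RANGE,      /* the 8192 ulp bound applies */
    NORM_UNDEFINED,     /* sum of squares > FLT_MAX */
    NORM_MAY_RETURN_P   /* sum of squares < FLT_MIN */
} norm_case;

/* Classify a vector of n float components by its sum of squares. */
static norm_case classify_sum_of_squares(const float *p, int n)
{
    double sum = 0.0; /* double: squares of floats cannot overflow here */
    for (int i = 0; i < n; ++i)
        sum += (double)p[i] * (double)p[i];

    if (sum > FLT_MAX)
        return NORM_UNDEFINED;
    if (sum < FLT_MIN)
        return NORM_MAY_RETURN_P;
    return NORM_IN_RANGE;
}
```

A zero vector falls into the NORM_MAY_RETURN_P case. The flush-to-zero exception is device-dependent and is not modelled here.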
Built-in geometric functions operate component-wise. The description is per-component. floatn is float, float2, float3, or float4 and doublen is double, double2, double3, or double4. The built-in geometric functions are implemented using the round to nearest even rounding mode.
The geometric functions can be implemented using contractions such as mad or fma.