## Quake's fast inverse square root

The inverse square root function (1/sqrt(x)) is used heavily during a game engine's draw loop to normalize vectors into "unit" vectors with a length of 1. Normalized vectors are important because when you take the dot product of two of them (Ax*Bx + Ay*By + Az*Bz), the result is the cosine of the angle between the two vectors.

So imagine you have a vector that describes the way a surface is facing (its surface normal) and a second vector that describes the direction of sunlight. If the sunlight vector is parallel to your surface normal, which is to say the surface is facing the sun, the dot product of the two normalized vectors is the cosine of 0, which is 1. If the surface is 90 degrees from the sun, the dot product is the cosine of 90, which is 0. Anything in between and you get a value between 0 and 1, which essentially allows you to describe how brightly you should light that surface.

Now, you run this calculation on every triangle visible in a 3D game, and you do all that 30 or more times a second, and you have a basic point-source lighting effect! Recall, however, that you need that inverse square root function to calculate unit vectors for each of those triangles. The square root operation is slow. Do it thousands of times per draw loop, and you have yourself a slow game.

But if you don't mind throwing away a marginal bit of accuracy, there's a faster way.

There's an inverse square root function that a lot of game engines use that is rumored to have originated in Quake III by Nvidia programmer Gary Tarolli. It looks like this:

```float InvSqrt(float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x; // get bits for floating value
i = 0x5f3759df - (i>>1); // gives initial guess y0
x = *(float*)&i; // convert bits back to float
x = x*(1.5f-xhalf*x*x); // Newton step, repeating increases accuracy
return x;
}
```

Hold on, what was that?!?!

So way back when, Newton came up with a clever way of approximating the inverse square root. First, you divide the original number x by two. Lets call that "xhalf". Then, you make a reasonably close guess at the inverse square root. Let's call that g. Then, you take those two variables and perform this calculation (which you'll recognize as the last step in the InvSqrt function):

g = g*(1.5 - xhalf * g *g)

If you do that over and over, substituting in the updated g on each iteration, g will quickly hone in on the inverse square root of x! In fact, if you pick a decent g to start with, a single iteration or two will get you very close to the correct value.

The question is, how do you find that initial guesstimate for the first g? The game engine coders use trick with how floating point numbers are represented in binary, with the exponent and mantissa broken up similar to scientific notation. In a 32-bit float, the left-most bit is a sign bit and is 0 for positive numbers. That's followed by 8 bits of exponent (biased by 127 to represent negative and positive exponents), and the final 23 bits represent the mantissa.

Well, to do an inverse square root, you basically need to multiply the exponent by -1/2.
When you shift those bits to the right (a very fast operation), the effect on the exponent is to divide it by 2. You still need to subtract the exponent from 0 to change it's sign, though, and what do you do about the mantissa which was also affected in the bit shift operation?

This is where the magic 0x5f3759df number comes in. It's absolutely bonkers, but by subtracting the bit shift result from 0x5f3759df, the mantissa is reset to near to it's original state and the exponent is subtracted from 0 (taking into account it's bias of 127).

The result is very close to the inverse square root. Close enough for a single pass through Newton's equation to end up with a value precise enough for practical purposes.

Posted by Jason Striegel | Dec 4, 2008 07:47 PM

### Recent Entries

 Posted by: on December 5, 2008 at 8:15 AM

that is some cool math there

 Posted by: Jim Shima on January 6, 2009 at 9:53 AM

DSP wrangler

This trick has been used in the signal processing community for some time. In fact, the inverse square root via Newton's method is an opcode on the ADI Sharc processors (to do fast sqrt operations). Newton's method for any function x^n requires a division per iteration, and for x^(-n) requires only multiplications. Since DSPs can do multiplications very fast, computing x^(-1/2) is much faster than computed x^(1/2). And to compute a fast sqrt, you can take the output of the inverse sqrt and multiply by the original x to get sqrt(x).

Also as stated, for the initial guess in floating pt when you shift the IEEE float number to the right by one that gives you a very accurate starting guess (you must remove the exp bias first). This effectively divides the exp by two as well as the mantissa. If you go through the entire analysis, you will find that the starting guess is, at worst, only 6% in error to the true sqrt. Not bad for just one shift! That is why doing only one Newton iteration is good enough for most applications.

 Posted by: DJSolin on February 11, 2009 at 2:33 AM

DJSolin

And I had always asked my calc profs what in the hell did we do math anyway, "It's not like I am going to use it in programming!!" I used to say!! ;) I guess the joke is on me... I LOVE MATH!!

#### Bloggers

Welcome to the Hacks Blog!

Brian Jepson

Jason Striegel

Phillip Torrone

### Recent Posts

www.flickr.com
 More photos in Hacks