In the previous post I introduced an as-far-as-I-know novel method for performing progressive least squares optimisation with spherical basis functions. Here, I’ll go into more detail about how it works, and also derive my original, approximate method from the corrected version.

Many thanks to Peter-Pike Sloan for providing the first part of this derivation.

We’ll be dealing with spherical integrals for the sake of this post, but everything is equally applicable to hemispheres by restricting the integration domain. For example:

$$\int_S f(s)$$

will be used as shorthand for ‘the integral over the sphere of the function $f(s)$’, where $s$ is a direction vector. All integrals will be taken with respect to $s$.

$f(s)$ is taken to mean the value of the function we’re trying to fit in direction $s$; this value will usually be obtained using Monte Carlo sampling.
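Since $f(s)$ will usually come from Monte Carlo sampling, it’s worth being concrete about what that looks like. Here’s a minimal Python sketch (the helper names are my own, not from any particular renderer) that estimates a spherical integral by uniformly sampling directions:

```python
import math, random

def random_direction(rng):
    # Uniformly sample a direction on the unit sphere: pick z uniformly
    # in [-1, 1] and an azimuth uniformly in [0, 2*pi).
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def mc_spherical_integral(f, sample_count, seed=0):
    # Monte Carlo estimate of the integral of f over the sphere: the
    # average sampled value times the sphere's surface area (4*pi).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(sample_count):
        total += f(random_direction(rng))
    return 4.0 * math.pi * total / sample_count
```

For example, `mc_spherical_integral(lambda s: 1.0, 1000)` returns exactly $4\pi$, the surface area of the unit sphere.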

We’ll also assume fixed basis functions parameterised only by their direction, such that $B_i(s)$ is the value of the ith basis function in direction $s$. The fit is evaluated by multiplying each basis function by a per-basis amplitude $b_i$ and summing, such that the result $R$ in direction $s$ is given by:

$$R(s) = \sum_i b_i B_i(s)$$
In the case of spherical Gaussians, $b_i = \mu_i$, and $B_i(s) = e^{\lambda_i (s \cdot \vec{p}_i - 1)}$, or the value of the ith lobe evaluated in direction $s$.
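As a concrete example, here’s a small Python sketch of the spherical Gaussian basis and the reconstruction $R(s)$ (the function names are illustrative, not from any particular codebase):

```python
import math

def sg_basis(axis, sharpness, s):
    # B_i(s) = exp(lambda_i * (s . p_i - 1)): equals 1 along the lobe
    # axis, and falls off as s moves away from it.
    d = s[0] * axis[0] + s[1] * axis[1] + s[2] * axis[2]
    return math.exp(sharpness * (d - 1.0))

def reconstruct(amplitudes, axes, sharpnesses, s):
    # R(s) = sum_i b_i * B_i(s)
    return sum(b * sg_basis(p, lam, s)
               for b, p, lam in zip(amplitudes, axes, sharpnesses))
```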

Our goal is to minimise the squared difference between $R(s)$ and $f(s)$ so that the fit matches the original function as closely as possible. Mathematically, that can be expressed as:

$$E = \int_S \left( \sum_k b_k B_k(s) - f(s) \right)^2$$
To minimise, we differentiate $E$ with respect to each unknown $b_i$ and then set the derivative to 0.

Let $g(s) = \sum_k b_k B_k(s) - f(s)$, so that $E = \int_S g(s)^2$. Therefore, $\frac{d}{d b_k} g(s) = B_k(s)$ for each $b_k$, and, applying the chain rule,

$$\frac{dE}{d b_i} = \int_S 2 \, g(s) \, B_i(s)$$

Therefore, by setting $\frac{dE}{d b_i} = 0$,

$$\sum_k b_k \int_S B_k(s) B_i(s) = \int_S B_i(s) f(s)$$
At this step, we now have a method for producing a Monte Carlo estimate of the raw moments $\int_S B_i(s) \cdot f(s)$: as each sample comes in, multiply it by each basis function and add it to the estimate for each lobe. This is in fact what was done for the naïve projection used in The Order: 1886. To reconstruct the lobe amplitudes $b_i$, we need to multiply by the inverse of the matrix of basis products $\int_S B_i(s) \cdot B_k(s)$:

$$\vec{b} = \left[ \int_S B_i(s) B_k(s) \right]^{-1} \left[ \int_S B_i(s) f(s) \right]$$
This is a perfectly valid method of performing least squares without storing all of the samples at every step, although it can be noisier than if all samples were used to perform the fit. However, it does require solving a potentially large linear system to reconstruct the $b_i$ amplitudes, which is unsuitable for progressive rendering.
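To make the batch approach concrete, here’s a Python sketch (all names are my own) that accumulates the per-sample moments and basis products as samples come in, then solves the resulting normal equations once at the end with Gaussian elimination:

```python
import math, random

def random_direction(rng):
    # Uniformly sample a direction on the unit sphere.
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def solve_linear(A, m):
    # Solve A x = m via Gaussian elimination with partial pivoting.
    n = len(m)
    A = [row[:] for row in A]
    x = m[:]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        x[col], x[piv] = x[piv], x[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            x[r] -= factor * x[col]
    for r in range(n - 1, -1, -1):
        x[r] = (x[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def batch_fit(f, bases, sample_count, seed=0):
    # Accumulate the raw moments B_i(s) * f(s) and the basis products
    # B_i(s) * B_k(s) per sample; reconstruct the amplitudes at the end
    # by solving the accumulated normal equations.
    rng = random.Random(seed)
    n = len(bases)
    moments = [0.0] * n
    products = [[0.0] * n for _ in range(n)]
    for _ in range(sample_count):
        s = random_direction(rng)
        vals = [B(s) for B in bases]
        fs = f(s)
        for i in range(n):
            moments[i] += vals[i] * fs
            for k in range(n):
                products[i][k] += vals[i] * vals[k]
    return solve_linear(products, moments)
```

A useful property: if $f$ lies exactly in the span of the basis functions, the recovered amplitudes are exact (up to floating point) regardless of where the samples land, provided the accumulated matrix is invertible.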

In the ‘running average’ algorithm, we want to reconstruct the $b_i$ amplitudes as every sample comes in so that the results can be displayed at every iteration. There are therefore a few more steps we need to perform.

Let’s take the above equation and look at it for a single sample, keeping the integral for the $b_i$ term intact:

$$B_i(s) f(s) = b_i \int_S B_i(s)^2 + B_i(s) \sum_{k \ne i} b_k B_k(s)$$
We can rearrange this to solve for a single amplitude $b_i$:

$$b_i = \frac{B_i(s) \left( f(s) - \sum_{k \ne i} b_k B_k(s) \right)}{\int_S B_i(s)^2}$$
This is effectively the update performed at each step of the ‘running average’ algorithm. Each per-sample $b_i$ estimate is accumulated and averaged to give a Monte Carlo estimate of the true value of $b_i$ for the function.
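Here’s what a single step might look like in Python. This is a sketch with my own naming, and I’ve folded the uniform-sampling constant into the denominator: `mean_b_sq[i]` holds the sphere-average $\frac{1}{4\pi}\int_S B_i(s)^2$ rather than the raw integral, which keeps it consistent with the plain per-sample averages used for the numerator.

```python
def running_average_step(amplitudes, bases, mean_b_sq, s, f_s, sample_index):
    # One step of the running-average solve. For each lobe, form the
    # per-sample estimate
    #   b_i = B_i(s) * (f(s) - sum_{k != i} b_k B_k(s)) / mean(B_i^2)
    # and fold it into the running mean of all previous estimates.
    vals = [B(s) for B in bases]
    estimate = sum(b * v for b, v in zip(amplitudes, vals))
    for i in range(len(amplitudes)):
        other_lobes = estimate - amplitudes[i] * vals[i]
        new_value = vals[i] * (f_s - other_lobes) / mean_b_sq[i]
        amplitudes[i] += (new_value - amplitudes[i]) / (sample_index + 1)
```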

It’s worth noting that there’s an inherent inaccuracy here, since each successive $b_i$ estimate is based on the previous estimates; an early high-variance $f(s)$ estimate will propagate through the rest of the solve. If the average values for $b_i$ were known and exact then this method would also be exact; however, that would defeat the purpose! In practice, though, this inaccuracy tends to have a very small impact.

One effective way to combat this issue is to gradually increase the sample weights over time. Anecdotally, using an exponential weighting of $w = 1 - e^{-sampleIndex / sampleCount}$ provides a quality boost with very noisy input compared to $w = 1$, with only a very slight reduction in quality when $f(s)$ represents the true function value. If the total sample count is unknown, as in progressive rendering, $w = 1 - e^{-sampleIndex / 1024}$ seems to work reasonably well.
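The weighting schedule is simple enough to state directly; this hypothetical helper (my own naming) covers both the known-sample-count and progressive cases:

```python
import math

def sample_weight(sample_index, sample_count=None, falloff=1024.0):
    # Down-weight early, high-variance samples. With a known total count,
    # use w = 1 - exp(-sampleIndex / sampleCount); for progressive
    # rendering with an unknown total, fall back to a fixed constant
    # (1024 here).
    if sample_count is not None:
        return 1.0 - math.exp(-sample_index / sample_count)
    return 1.0 - math.exp(-sample_index / falloff)
```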

In this equation, the integral in the denominator can be estimated using Monte Carlo integration in the same way that $b_i$ is. In fact, it turns out that computing both of them in lockstep improves the accuracy of the algorithm, since any sampling bias in the numerator will be partially balanced out by the same bias in the denominator. However, the estimate of the integral may be wildly inaccurate at small sample counts; therefore, to balance that out, I recommend clamping the estimator for the integral to be at least the true (analytically computed) integral. Alternatively, it’s possible to always use the precomputed true integral in the denominator and only estimate the numerator, although this results in slightly increased error.
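Putting the pieces together, here’s a sketch of the full progressive solve in Python (again with my own naming, and with the $\frac{1}{4\pi}$ uniform-sampling constant folded into both numerator and denominator). The denominator is a running mean of $B_i(s)^2$, updated in lockstep with the amplitude estimates and clamped from below by its analytic value:

```python
import math, random

def random_direction(rng):
    # Uniformly sample a direction on the unit sphere.
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def sg(axis, sharpness, s):
    # Spherical Gaussian basis: exp(lambda * (s . p - 1)).
    d = s[0] * axis[0] + s[1] * axis[1] + s[2] * axis[2]
    return math.exp(sharpness * (d - 1.0))

def progressive_fit(f, axes, sharpnesses, sample_count, seed=0):
    n = len(axes)
    rng = random.Random(seed)
    amplitudes = [0.0] * n
    mc_mean_b_sq = [0.0] * n
    # Analytic sphere-average of B_i^2:
    # (1 / (4*pi)) * (pi / lambda) * (1 - e^(-4*lambda)).
    true_mean_b_sq = [(1.0 - math.exp(-4.0 * lam)) / (4.0 * lam)
                      for lam in sharpnesses]
    for j in range(sample_count):
        s = random_direction(rng)
        vals = [sg(p, lam, s) for p, lam in zip(axes, sharpnesses)]
        estimate = sum(b * v for b, v in zip(amplitudes, vals))
        f_s = f(s)
        for i in range(n):
            # Update the denominator estimate in lockstep with the
            # numerator so their sampling biases partially cancel...
            mc_mean_b_sq[i] += (vals[i] * vals[i] - mc_mean_b_sq[i]) / (j + 1)
            # ...but never let it drop below the analytic value.
            denominator = max(mc_mean_b_sq[i], true_mean_b_sq[i])
            other_lobes = estimate - amplitudes[i] * vals[i]
            new_value = vals[i] * (f_s - other_lobes) / denominator
            amplitudes[i] += (new_value - amplitudes[i]) / (j + 1)
    return amplitudes
```

Fitting a single lobe to a scaled copy of itself recovers the scale to within the Monte Carlo noise, which makes a convenient sanity check.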

My original algorithm was created by experimentation. I thought it would be worth going through why it worked and the approximations it made. Note that none of this is necessary to understand the corrected equation – it’s purely for curiosity and interest!

Effectively, at each step, it solved the following equation to produce a per-sample estimate, which was then accumulated into a running average weighted by $B_i(s)$:

$$\hat{b}_i = b_i + f(s) - \sum_k b_k B_k(s)$$
If we rearrange that to get it into a form vaguely resembling our proper solution above, the converged weighted average is:

$$b_i = \frac{\int_S B_i(s) \left( b_i + f(s) - \sum_k b_k B_k(s) \right)}{\int_S B_i(s)}$$
For a spherical Gaussian basis function, $\int_S B_i(s) \approx 2 \int_S B_i(s)^2$, since $\int_S B_i(s) = \frac{2 \pi}{\lambda_i} (1 - e^{-2 \lambda_i})$ and $\int_S B_i(s)^2 = \frac{\pi}{\lambda_i} (1 - e^{-4 \lambda_i})$; the ratio between the two is $\frac{2}{1 + e^{-2 \lambda_i}}$, which rapidly approaches 2 as the sharpness $\lambda_i$ increases. Therefore,

$$b_i \approx \frac{\int_S B_i(s) \left( b_i + f(s) - \sum_k b_k B_k(s) \right)}{2 \int_S B_i(s)^2}$$
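These closed-form spherical Gaussian integrals are easy to check numerically; here’s a small sketch:

```python
import math

def sg_integral(sharpness):
    # Integral over the sphere of exp(lambda * (s . p - 1)):
    # (2 * pi / lambda) * (1 - e^(-2 * lambda)).
    return 2.0 * math.pi / sharpness * (1.0 - math.exp(-2.0 * sharpness))

def sg_squared_integral(sharpness):
    # B^2 is itself a spherical Gaussian with sharpness 2 * lambda, so
    # its integral is (pi / lambda) * (1 - e^(-4 * lambda)).
    return math.pi / sharpness * (1.0 - math.exp(-4.0 * sharpness))
```

The ratio `sg_integral(lam) / sg_squared_integral(lam)` simplifies algebraically to $\frac{2}{1 + e^{-2\lambda}}$, and approaches 2 rapidly as the sharpness increases.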
This is very close to our ‘correct’ equation above. In fact, the two become equal when, for a given sample,

$$b_i + f(s) - \sum_k b_k B_k(s) = 2 \left( f(s) - \sum_{k \ne i} b_k B_k(s) \right)$$
We can rearrange that a little further:

$$b_i \left( 1 - 2 B_i(s) \right) = f(s) - \sum_k b_k B_k(s)$$
Since we assume that, as the fit converges, $f(s) \approx \sum_{k} b_k B_k(s)$, we’re left with:

$$b_i \left( 1 - 2 B_i(s) \right) \approx 0$$
In other words, using the original algorithm for a given sample, the error is mostly determined by how close $B_i(s)$ is to $\frac{1}{2}$. Since the influence of samples with higher basis weights $B_i(s)$ is greater anyway, this turned out to be a reasonable approximation. However, given the option, I’d still recommend using the corrected algorithm!
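For the curious, here’s my reading of the original per-sample update as code (a Python sketch with illustrative names, not the exact original implementation): each sample nudges every lobe toward $b_i + f(s) - \sum_k b_k B_k(s)$, in a running average weighted by $B_i(s)$.

```python
def original_step(amplitudes, bases, total_weights, s, f_s):
    # The approximate update: add the current reconstruction error to
    # each amplitude, weighted by that lobe's basis value at the sample.
    vals = [B(s) for B in bases]
    estimate = sum(b * v for b, v in zip(amplitudes, vals))
    for i in range(len(amplitudes)):
        if vals[i] > 0.0:
            total_weights[i] += vals[i]
            amplitudes[i] += (f_s - estimate) * (vals[i] / total_weights[i])
```

Note that the in-place update is algebraically the same as a weighted running average toward `amplitudes[i] + (f_s - estimate)`, which is the per-sample estimate analysed above.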