GPU Gems 2: Part I - Geometric Complexity

Today's games are visually more interesting and complex than ever before. Geometric complexity—how many objects are visible and how detailed each looks—is one of the dimensions in which games are making leaps and bounds.

Advances in technology are partly responsible for this progress: CPUs, memory, and buses have all become faster, but GPUs in particular are undergoing significant change and are becoming more powerful at a rate faster than Moore's Law.

These GPU changes include first implementing vertex and pixel processing as fixed-function units, then generalizing those units to be fully programmable. GPUs also have gained more units to process pixels and vertices in parallel: the GeForce 6800 Ultra, for example, incorporates 6 vertex shader units and 16 pixel pipelines.

Despite these performance advances, rendering complex scenes is still more difficult than simply dumping all geometry onto the GPU and forgetting about it. The simple approach tends to fail either because the generated GPU workload turns out to be excessive, or because the associated CPU overhead is prohibitive. This part of the book discusses the challenges today's games face in rendering complex geometric scenes.

Chapter 1, "Toward Photorealism in Virtual Botany" by David Whatley of Simutronics Corporation, provides a holistic view on how to render nature scenes. It explains the multitude of different techniques, from scene management and rendering various plant layers to post-processing effects, that Simutronics' Hero's Journey employs to generate complex and stunning visuals.

Rendering terrain is a good example of why simply dumping all available data to the GPU cannot work: the horizon represents a near-infinite amount of vertex data and thus workload. Arul Asirvatham and Hugues Hoppe of Microsoft Research use vertex texture fetches for a new highly efficient terrain-rendering algorithm. Their technique avoids overloading the GPU even as it shifts most work onto the GPU and away from the CPU, which too often is the bottleneck in modern games. Chapter 2, "Terrain Rendering Using GPU-Based Geometry Clipmaps," provides all the implementation details.

As already mentioned, another way to increase geometric complexity is to increase the number of visible objects in a scene. The straightforward solution of drawing each object independently of the others, however, quickly bogs down even a high-end system. It is much easier to efficiently draw ten objects of one million triangles each than it is to draw one million objects of ten triangles each. Francesco Carucci of Lionhead Studios faces this very problem while developing Black & White 2, the sequel to Lionhead's critically acclaimed Black & White. Chapter 3, "Inside Geometry Instancing," describes his solution: a framework of instancing techniques that applies to legacy GPUs as well as to GPUs supporting DirectX 9's instancing API. Jon Olick of 2015 provides further optimizations to the instancing technique that prove beneficial for 2015's title Men of Valor: Vietnam. Jon describes his findings in Chapter 4, "Segment Buffering."
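The ten-versus-one-million comparison above can be made concrete with a back-of-the-envelope cost model. The following is a minimal sketch, not a measurement: the class `CostModel`, the function `frame_time`, and both rate constants (25 microseconds of CPU overhead per draw call, 100 million triangles per second of GPU throughput) are hypothetical values chosen only for illustration. It also assumes that CPU and GPU work overlap perfectly, so the slower of the two determines the frame time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical per-frame cost parameters (illustrative values only)."""
    cpu_seconds_per_draw: float      # assumed CPU overhead of issuing one draw call
    gpu_triangles_per_second: float  # assumed sustained GPU triangle throughput


def frame_time(model: CostModel, num_objects: int, triangles_per_object: int) -> float:
    """Estimate frame time when each object is drawn with its own draw call."""
    cpu_time = num_objects * model.cpu_seconds_per_draw
    gpu_time = num_objects * triangles_per_object / model.gpu_triangles_per_second
    # Assumption: CPU and GPU run in parallel, so the slower side sets the pace.
    return max(cpu_time, gpu_time)


# Assumed numbers, not taken from the text or from any benchmark.
MODEL = CostModel(cpu_seconds_per_draw=25e-6, gpu_triangles_per_second=100e6)

# Both scenes contain the same ten million triangles in total.
few_big = frame_time(MODEL, num_objects=10, triangles_per_object=1_000_000)
many_small = frame_time(MODEL, num_objects=1_000_000, triangles_per_object=10)

if __name__ == "__main__":
    print(f"10 objects x 1,000,000 triangles: {few_big:.4f} s")
    print(f"1,000,000 objects x 10 triangles: {many_small:.4f} s")
```

Under these assumptions the GPU workload is identical in both scenes, and the whole difference comes from the per-draw CPU overhead. That overhead is the cost that instancing techniques aim to amortize over many objects.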

Also, as games incorporate more and more data—more complex scenes of more complex meshes rendered in multiple, disparate passes supporting the gamut of differing functionality from legacy to current high-end GPUs—managing this glut of data efficiently becomes paramount. Oliver Hoeller and Kurt Pelzer of Piranha Bytes are currently working on Piranha Bytes' Gothic III engine. They share their solutions in Chapter 5, "Optimizing Resource Management with Multistreaming."

The best way to render large amounts of geometry is to avoid rendering the occluded parts. Michael Wimmer and Jirí Bittner of the Vienna University of Technology explore how best to apply that idea in Chapter 6, "Hardware Occlusion Queries Made Useful." Occlusion queries are a GPU feature that provides high-latency feedback on whether an object is visible after it is rendered. Unlike earlier occlusion-query culling techniques, Michael and Jirí's algorithm is pixel-perfect. That is, it introduces no rendering artifacts, generates a near-optimal set of visible objects to render, does not put unnecessary load on the GPU, and has minimal CPU overhead.

Similarly, increasing geometric detail only where it is visible, and simplifying it where it is not, is a good way to avoid excessive GPU loads. View-dependent and adaptive subdivision schemes are an appealing solution that the offline-rendering world already employs to render its highly detailed models to subpixel geometric accuracy. Subdivision surfaces have not yet found a place in today's real-time applications, partly because they are not directly supported in graphics hardware. Rendering subdivision surfaces thus seems out of reach for real-time applications. Not so, says Michael Bunnell of NVIDIA Corporation. In Chapter 7, Michael shows how his implementation of "Adaptive Tessellation of Subdivision Surfaces with Displacement Mapping" is already feasible on modern GPUs and results in movie-quality geometric detail at real-time rates.

Finally, faking geometric complexity with methods that are cheaper than actually rendering geometry allows for higher apparent complexity at faster speeds. Replacing geometry with textures that merely depict it used to be an acceptable trade-off, and in the case of grates and wire-mesh fences, often still is. Normal mapping is a more sophisticated fake that properly accounts for lighting information. Parallax mapping is the latest craze that attempts to also account for intra-object occlusions. William Donnelly of the University of Waterloo one-ups parallax mapping: he describes "Per-Pixel Displacement Mapping with Distance Functions" in Chapter 8. Displacement mapping provides correct intra-object occlusion information, yet minimally increases computation cost. His technique gives excellent results while taking full advantage of the latest programmable pixel-shading hardware. Even better, it is practical for applications today.
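This introduction does not spell out how a distance function helps locate a displaced surface, so the following is only a generic illustration of sphere tracing, one common way to intersect a ray with a surface described by a distance function. It is not taken from Chapter 8. The names `sphere_distance` and `sphere_trace`, the analytic sphere used as the surface, and all tolerances are assumptions made for this sketch; a real per-pixel implementation would run in a pixel shader and would typically look distances up in a precomputed texture.

```python
import math
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sphere_distance(p: Vec3, center: Vec3 = (0.0, 0.0, 5.0), radius: float = 1.0) -> float:
    """Hypothetical stand-in surface: signed distance from p to a sphere."""
    return math.dist(p, center) - radius


def sphere_trace(
    origin: Vec3,
    direction: Vec3,  # assumed to be unit length
    distance_fn: Callable[[Vec3], float],
    max_steps: int = 64,
    epsilon: float = 1e-4,
    max_t: float = 100.0,
) -> Optional[float]:
    """Return the ray parameter t of the first hit, or None if nothing is hit."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = distance_fn(p)
        if d < epsilon:
            return t  # close enough to the surface to count as a hit
        # No surface lies within distance d of p, so stepping by d cannot overshoot.
        t += d
        if t > max_t:
            break
    return None


hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_distance)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_distance)
```

Because each step is bounded by the distance to the nearest surface, the ray never tunnels through thin features. That is why a ray-based search of this kind can resolve intra-object occlusion that a single texture-coordinate offset cannot.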

Matthias Wloka, NVIDIA Corporation