GPU Gems: Part V - Performance and Practicalities

Part V: Performance and Practicalities

As GPUs become more complex, incorporating the GPU efficiently into your application can become challenging. This part of the book offers several perspectives on shader management and integration, as well as an overview of the graphics performance characteristics that shape integration decisions.

In Chapter 28, "Graphics Pipeline Performance," Cem Cebenoyan gives an overview of the modern graphics pipeline, including the programmable pipelines that give rise to many of the techniques discussed in this book. He describes a process for finding bottlenecks in the GPU pipeline and offers potential remedies for several of them.

Dean Sekulic of Croteam discusses the powerful but often-misused occlusion query feature in Chapter 29, "Efficient Occlusion Culling." Occlusion queries allow the GPU to return the number of pixels that an object would cover on screen. If the object covers no pixels because it fails the z or stencil tests, it can be skipped. But because the CPU and the GPU run decoupled from each other, an occlusion query can't be treated like a blocking function call whose result is available immediately; waiting for the result would forfeit most or all of the performance benefit. Instead, Dean discusses several methods of ensuring that the results of the GPU occlusion query can be applied quickly and efficiently.
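To make the point about decoupling concrete, here is a minimal sketch in Python that simulates the idea rather than calling a real graphics API. Everything in it is an assumption made for illustration: `FakeGpu`, `render_frames`, and the one-frame latency are hypothetical stand-ins, and this is not necessarily one of the methods Chapter 29 describes. The loop never waits for a query; it draws based on the most recent result that has already arrived and treats objects with no result yet as visible.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGpu:
    """Hypothetical stand-in for a GPU: a query result becomes readable
    only `latency` frames after the query is issued."""
    latency: int = 1
    frame: int = 0
    _pending: dict = field(default_factory=dict)  # name -> (ready_frame, pixels)

    def has_pending(self, name):
        return name in self._pending

    def issue_query(self, name, visible_pixels):
        # Record the pixel count the query will report once it is ready.
        self._pending[name] = (self.frame + self.latency, visible_pixels)

    def try_get_result(self, name):
        """Non-blocking read: the pixel count, or None if not ready yet."""
        entry = self._pending.get(name)
        if entry is None or entry[0] > self.frame:
            return None
        del self._pending[name]
        return entry[1]

    def end_frame(self):
        self.frame += 1


def render_frames(gpu, scenes):
    """scenes: one dict per frame, mapping object name -> pixels that would
    pass the z/stencil tests. Returns the list of objects drawn each frame."""
    last_known = {}  # name -> most recent pixel count received
    drawn_log = []
    for scene in scenes:
        drawn = []
        for name, pixels in scene.items():
            result = gpu.try_get_result(name)  # never stalls the CPU
            if result is not None:
                last_known[name] = result
            # Unknown visibility is treated conservatively as "visible".
            if last_known.get(name, 1) > 0:
                drawn.append(name)
            # Keep at most one outstanding query per object.
            if not gpu.has_pending(name):
                gpu.issue_query(name, pixels)
        drawn_log.append(drawn)
        gpu.end_frame()
    return drawn_log
```

In this sketch an object that becomes visible again is drawn one frame late, because the decision relies on the previous frame's result. That delay is the price this particular approach pays for never stalling.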

In Chapter 30, "The Design of FX Composer," Christopher Maughan discusses a powerful shader-authoring tool. FX Composer 1.0 provides a full IDE for shader authors, as well as an artist-tweakable GUI to adjust shader attributes. Chris describes design aspects of the tool, offering insight into cutting-edge shader integration.

Chapter 31, "Using FX Composer," also by Christopher Maughan, delves into the details of FX Composer usage, including shader authoring, setting up simple scenes, and applying shaders to objects. This chapter provides a good introduction to both shader authoring and tool usage.

In Chapter 32, "An Introduction to Shader Interfaces," Matt Pharr describes shader objects, which can simplify the integration of shaders into applications via the concept of shader interfaces. When shader fragments are specified as objects with well-defined interfaces, they can be combined automatically and efficiently at runtime, improving both flexibility and performance.
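The general idea of fragments behind a common interface can be illustrated with a small language-neutral analogy in Python. This is not shader code and does not reproduce the chapter's syntax; the names `Light`, `AmbientLight`, `PointLight`, and `shade`, and the falloff formula, are all hypothetical and chosen only for illustration. The point is that `shade` is written once against the interface, and the set of fragments it combines is chosen at runtime.

```python
from typing import Protocol, Sequence


class Light(Protocol):
    """Hypothetical interface: any light fragment must provide this method."""
    def illuminate(self, distance: float) -> float: ...


class AmbientLight:
    def __init__(self, intensity: float):
        self.intensity = intensity

    def illuminate(self, distance: float) -> float:
        # Ambient light ignores distance.
        return self.intensity


class PointLight:
    def __init__(self, intensity: float):
        self.intensity = intensity

    def illuminate(self, distance: float) -> float:
        # Simple illustrative falloff, not a physically exact model.
        return self.intensity / (1.0 + distance * distance)


def shade(albedo: float, lights: Sequence[Light], distance: float) -> float:
    """Combine whichever light fragments are supplied at runtime."""
    return albedo * sum(light.illuminate(distance) for light in lights)
```

For example, `shade(0.5, [AmbientLight(0.2), PointLight(2.0)], 1.0)` sums the two contributions without `shade` knowing which concrete light types it was given.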

In Chapter 33, "Converting Production RenderMan Shaders to Real Time," Stephen Marshall of Sony Pictures Imageworks tells how RenderMan-style offline shaders can be modified and leveraged in a GPU-aware production pipeline. Offline shaders are written with CPU advantages and limitations in mind; only by rethinking shaders in terms of modern GPUs can the maximum speed benefits be realized.

Cinema 4D is another modern, shader-capable authoring tool. Jörn Loviscach, in Chapter 34, "Integrating Hardware Shading into Cinema 4D," discusses how he integrated GPU shaders to emulate the existing CPU shading pipeline as closely as possible. Jörn offers a compelling example of how to seamlessly add GPU capability to a more traditional, existing workflow.

Although GPUs get more flexible and powerful each year, it will likely be quite a while before all content-creation rendering tasks can be handled on the graphics card. In Chapter 35, "Leveraging High-Quality Software Rendering Effects in Real-Time Applications," Alexandre Jean Claude and Marc Stevens discuss how they leveraged the GPU shader horsepower while still retaining the flexibility of a mature, existing software rendering and modeling pipeline.

Finally, John O'Rorke's chapter on shader integration, Chapter 36, "Integrating Shaders into Applications," focuses on the DirectX .fx file format and how it can be used. John demonstrates how to use .fx file features such as semantics and annotations, which enable simpler shader integration. He concludes with several ideas for customizing and extending .fx files, including shader inheritance.

Sim Dietrich, NVIDIA