Modern VR Rendering Pipeline Architecture
The modern VR rendering pipeline involves multiple steps, broadly categorized as rendering and post-processing. After rendering, the HMD compositor processes the rendered views to better suit the characteristics of the display and to compensate for pose latency before the textures are displayed. Figure 1 illustrates the process flow. This post discusses Cross-Process Synchronization, an NVIDIA API designed to improve overall VR performance.

Figure 2 outlines a typical scheduling model, in which frames rendered by the VR game process are composited by the VR compositor process before they are flipped to the display.

Cross-Process Synchronization Challenges
Since the VR game process and the VR compositor process execute dependent tasks on the GPU, some VR compositor models might want to execute these tasks in a synchronized manner. Achieving such synchronization requires explicit handling because, by default, the GPU scheduler does not guarantee synchronized GPU execution between two different processes.
For example, if a VR compositor model wants to execute frame rendering on the GPU only after composition of the previous frame has completed, it must use some explicit mechanism to do so.
Several approaches can achieve such synchronization, but each has drawbacks. For example, developers can use events to signal work completion on the GPU and wait on these events before queuing additional GPU work. Figure 3 illustrates this approach: the VR game process waits for composition of frame N to complete before it starts queuing the rendering of the next frame, N+1.

As the chart above shows, idle bubbles are introduced on the GPU, which can reduce processing efficiency. This serialization in turn leaves limited time for the frame to be ready on the GPU before the next composition cycle starts.
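To make the cost of this event-based approach concrete, here is a minimal toy model of the timeline. It is only a sketch under stated assumptions: the function name `cpu_wait_timeline`, its parameters, and the millisecond values are hypothetical and chosen for illustration, and the idle bubble is assumed to equal the CPU time the game needs to queue the next frame after composition finishes.

```python
def cpu_wait_timeline(frames, cpu_submit_ms, render_ms, compose_ms):
    """Toy model of CPU-side event waiting (illustrative, not a measurement).

    Assumption: the game may only start queuing frame N+1 once composition
    of frame N has finished on the GPU, so the GPU sits idle for the whole
    CPU submission time of every frame.
    Returns (total_ms, gpu_idle_ms).
    """
    total = 0.0
    gpu_idle = 0.0
    for _ in range(frames):
        gpu_idle += cpu_submit_ms  # GPU has nothing queued while the CPU submits
        total += cpu_submit_ms     # CPU queues the render work
        total += render_ms         # GPU renders the frame
        total += compose_ms        # GPU composites the frame
    return total, gpu_idle


# Hypothetical numbers: 2 ms to queue, 8 ms to render, 2 ms to composite.
print(cpu_wait_timeline(3, 2.0, 8.0, 2.0))  # (36.0, 6.0)
```

In this model every frame pays the submission time as GPU idle time, so three frames take 36 ms with 6 ms of idle bubbles. Real timelines depend on the driver, the compositor model, and the workload.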
Cross-Process Synchronization API
Traditional synchronization techniques serialize CPU and GPU task execution, which makes them inefficient. To mitigate these issues, NVIDIA now provides a Cross-Process Synchronization API, which enables flexible GPU work synchronization between two processes. Processes participating in Cross-Process Synchronization can queue GPU work ahead of time. This also increases the chances that a frame is ready for every composition cycle in scenarios like the one in Figure 3.
When these APIs are used, the NVIDIA driver guarantees correct synchronization on the GPU timeline without any extra overhead. Figure 4 shows an example in which every frame gets enough time on the GPU to finish rendering before the composition cycle starts.
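The effect of queuing GPU work ahead of time can be sketched with a small toy model. This is not the Cross-Process Synchronization API itself, whose calls are not shown in this post; the function name `queued_ahead_timeline`, its parameters, and the millisecond values are hypothetical. The model assumes the ordering between render and composition is enforced on the GPU timeline, so the CPU never waits for the GPU before queuing the next frame.

```python
def queued_ahead_timeline(frames, cpu_submit_ms, render_ms, compose_ms):
    """Toy model of GPU-side synchronization (illustrative, not a measurement).

    Assumption: the CPU queues each frame without waiting on the GPU, and
    the GPU starts a frame as soon as it is both queued and the previous
    frame's composition has finished.
    Returns (total_ms, gpu_idle_ms).
    """
    cpu_time = 0.0  # when the CPU finishes queuing the current frame
    gpu_time = 0.0  # when the GPU finishes its current work
    gpu_idle = 0.0
    for _ in range(frames):
        cpu_time += cpu_submit_ms        # CPU queues ahead, no GPU wait
        start = max(cpu_time, gpu_time)  # GPU needs queued work and a free slot
        gpu_idle += start - gpu_time     # idle only if the work is not queued yet
        gpu_time = start + render_ms + compose_ms
    return gpu_time, gpu_idle


# Hypothetical numbers: 2 ms to queue, 8 ms to render, 2 ms to composite.
print(queued_ahead_timeline(3, 2.0, 8.0, 2.0))  # (32.0, 2.0)
```

With these illustrative numbers, only the first frame waits on the CPU; later frames are already queued when the GPU becomes free, so three frames finish in 32 ms with 2 ms of GPU idle time. The model ignores limits on how far ahead a real application can queue.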

Integrating these APIs into existing VR compositor models has shown significant FPS improvements and thus an improved VR experience. With certain settings on the HTC Vive Pro, multiple VR titles could see up to a 15% FPS improvement on the NVIDIA GeForce GTX 1070, as shown in Figure 5.

Cross-Process Synchronization can significantly improve VR application performance on a variety of VR headsets. It is available to VR headset developers through our VRWorks HMD Developer SDKs. If you are interested in the SDK, please submit your request.