
Real-Time GPU-Accelerated Gaussian Splatting with the NVIDIA DesignWorks Sample vk_gaussian_splatting

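Gaussian splatting is a novel approach to rendering complex 3D scenes that represents them as a collection of anisotropic Gaussians in 3D space. The technique enables real-time rendering of photorealistic scenes learned from small sets of images, which makes it well suited to applications in gaming, virtual reality, and real-time professional visualization.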
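
To make "anisotropic Gaussian" concrete, here is a minimal C++ sketch of how one splat's shape can be described. It assumes the parameterization commonly used in the 3D Gaussian splatting literature, where a per-axis scale and a unit quaternion are combined into a covariance matrix Σ = R S Sᵀ Rᵀ. The names `SplatShape`, `rotationMatrix`, and `covariance` are hypothetical and are not taken from vk_gaussian_splatting.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<float, 3>, 3>;

// Hypothetical splat shape: per-axis scale and a unit quaternion (w, x, y, z).
struct SplatShape {
    std::array<float, 3> scale;
    std::array<float, 4> rotation;  // expected to be normalized
};

// Builds the rotation matrix R that corresponds to a unit quaternion.
Mat3 rotationMatrix(const std::array<float, 4>& q) {
    const float w = q[0], x = q[1], y = q[2], z = q[3];
    assert(std::fabs(w * w + x * x + y * y + z * z - 1.0f) < 1e-4f);
    Mat3 r{};
    r[0] = {1.0f - 2.0f * (y * y + z * z), 2.0f * (x * y - w * z), 2.0f * (x * z + w * y)};
    r[1] = {2.0f * (x * y + w * z), 1.0f - 2.0f * (x * x + z * z), 2.0f * (y * z - w * x)};
    r[2] = {2.0f * (x * z - w * y), 2.0f * (y * z + w * x), 1.0f - 2.0f * (x * x + y * y)};
    return r;
}

// Computes sigma = R * S * S^T * R^T, where S = diag(scale).
// Element-wise: sigma[i][j] = sum over k of R[i][k] * scale[k]^2 * R[j][k].
Mat3 covariance(const SplatShape& s) {
    const Mat3 r = rotationMatrix(s.rotation);
    Mat3 sigma{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            for (int k = 0; k < 3; ++k) {
                sigma[i][j] += r[i][k] * s.scale[k] * s.scale[k] * r[j][k];
            }
        }
    }
    return sigma;
}
```

With the identity rotation, the covariance is simply a diagonal matrix of squared scales. Unequal scales stretch the Gaussian along some axes more than others, which is what makes it anisotropic.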

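vk_gaussian_splatting is a new Vulkan-based sample that demonstrates real-time Gaussian splatting, a state-of-the-art volume rendering technique that enables efficient representation of radiance fields. It is the newest addition to the NVIDIA DesignWorks Samples.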

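The NVIDIA DevTech team sees this new sample project as a testbed for exploring and comparing approaches to real-time visualization of 3D Gaussian splatting. By evaluating a range of techniques and optimizations, the team aims to provide useful insights into the performance, quality, and implementation trade-offs of using the Vulkan API.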

The initial implementation is based on rasterization and demonstrates two approaches to rendering splats: one that uses mesh shaders and one that uses vertex shaders.

A diagram comparing Synchronous GPU sorting and Asynchronous CPU sorting in Gaussian Splatting Rasterization. The left side shows the GPU timeline for synchronous sorting, where 'Dist & Cull' and 'Radix Sort' steps are performed before 'Mesh' and 'Fragment' processing for each frame. The right side illustrates asynchronous CPU sorting, where a separate sorting thread computes 'Dist & Sort' without culling, swaps indices, and then copies them to VRAM before the GPU processes 'Mesh' and 'Fragment' stages.
Figure 1. Comparison of the sorting methods, shown for the mesh shader pipeline

Because Gaussian splats must be sorted in a consistent back-to-front order for correct alpha compositing, two alternative sorting methods are provided:

  • A GPU-based radix sort implemented in a compute pipeline
  • A CPU-based asynchronous sorting strategy that uses the multithreaded sort function from the C++ STL
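
The following C++ sketch illustrates why the order matters and what a CPU-side depth sort might look like. It is not the sample's code: `viewDepth`, `sortBackToFront`, `Layer`, and `compositeBackToFront` are hypothetical names, and the sort is a single-threaded `std::stable_sort` for portability. Switching to the C++17 parallel overload of `std::sort` with `std::execution::par` is one possible way to make it multithreaded, though that is an assumption about how it could be done rather than a description of the sample.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using Vec3 = std::array<float, 3>;

// Distance of a splat center from the camera, measured along the view direction.
float viewDepth(const Vec3& center, const Vec3& cameraPos, const Vec3& viewDir) {
    return (center[0] - cameraPos[0]) * viewDir[0] +
           (center[1] - cameraPos[1]) * viewDir[1] +
           (center[2] - cameraPos[2]) * viewDir[2];
}

// Returns splat indices ordered back to front (farthest splat first).
std::vector<std::uint32_t> sortBackToFront(const std::vector<Vec3>& centers,
                                           const Vec3& cameraPos,
                                           const Vec3& viewDir) {
    std::vector<float> depth(centers.size());
    for (std::size_t i = 0; i < centers.size(); ++i) {
        depth[i] = viewDepth(centers[i], cameraPos, viewDir);
    }
    std::vector<std::uint32_t> indices(centers.size());
    std::iota(indices.begin(), indices.end(), 0u);
    // Sort indices rather than splat data, so only a small index buffer changes.
    std::stable_sort(indices.begin(), indices.end(),
                     [&depth](std::uint32_t a, std::uint32_t b) {
                         return depth[a] > depth[b];
                     });
    return indices;
}

// A single-channel layer with non-premultiplied color, for illustration only.
struct Layer {
    float color;
    float alpha;
};

// Applies the "over" operator in the given order: each layer is blended
// on top of everything composited so far.
float compositeBackToFront(const std::vector<Layer>& layers,
                           const std::vector<std::uint32_t>& order,
                           float background) {
    float result = background;
    for (std::uint32_t i : order) {
        assert(i < layers.size());
        result = layers[i].color * layers[i].alpha + result * (1.0f - layers[i].alpha);
    }
    return result;
}
```

For three splats at depths 1, 5, and 3 along the view direction, `sortBackToFront` returns the indices 1, 2, 0. Compositing the same layers in a different order generally produces a different color, which is why the indices have to be re-sorted when the camera moves.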
A screenshot of the vk_gaussian_splatting sample application displaying a rendered 3D gaussian splatting model of a bicycle near a park bench with trees and a path in the background. The user interface includes various settings and statistics panels. On the right, options for data storage, rendering, and sorting methods are visible, with settings for V-Sync, frustum culling, splat scale, and mesh shaders. At the bottom, memory statistics and a profiler panel show GPU and CPU usage, including frame time, sorting, and rendering performance. The application runs at 510 FPS with 1.961 ms frame time.
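Figure 2. The vk_gaussian_splatting user interface provides several profiling feedback elements, such as RAM and VRAM memory usage and performance timers that measure the different stages of the pipeline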

The sample lets you explore and experiment with several aspects of this rendering technique, including:

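  • Multiple visualization modes for inspecting different aspects of the Gaussian splats (spherical harmonics, splats, point density, and more)
  • A complete benchmarking system with support for real-time profiling
  • Detailed information on RAM and VRAM memory consumption, to help you understand the flow of data being rendered
  • GPU timings for each stage of the different techniques, to help you understand the workload and potential bottlenecks
  • Graphical reports generated from all of these numbers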
A bar chart titled 'Pipeline Performance Comparison - SH storage formats in float 32, float 16 and uint 8' showing performance benchmarks from a Vulkan Gaussian Splatting sample. The chart compares processing times in microseconds across different test scenes, with stacked bars representing three pipeline stages: GPU Distribution (black), GPU Sort (dark green), and Rendering (light green). Various scenes are listed along the x-axis including bicycle, bonnet, counter, dining room, flowers, garden, kitchen, playroom, room, stump, train, treehill, and truck - each with their splat count and format details in parentheses. Most scenes show total processing times between 500-1500 microseconds, with rendering typically being the most time-consuming stage. The garden scene shows the highest total processing time at nearly 3000 microseconds. The chart demonstrates that smaller spherical harmonics (SH) storage formats (uint8 vs float16 vs float32) consistently result in faster rendering performance across all test scenes.
Figure 3. Example report comparing rendering performance across the full dataset for different data storage formats
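
As a rough way to reason about the storage formats compared in Figure 3, the C++ sketch below computes the per-splat size of the spherical harmonics (SH) data at each precision and shows one possible 8-bit quantization. It rests on two assumptions that do not come from the sample: that each splat stores degree-3 SH (16 coefficients per color channel, as in the common 3D Gaussian splatting formulation), and that uint8 values map linearly onto a fixed range. The names `shBytesPerSplat`, `quantizeToUint8`, and `dequantizeFromUint8` are hypothetical, and the sample's actual encoding may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Assumption: degree-3 SH, 16 coefficients for each of the 3 color channels.
constexpr std::size_t kShCoefficientsPerSplat = 16 * 3;

// Bytes of SH data per splat for a given coefficient size
// (4 for float32, 2 for float16, 1 for uint8).
constexpr std::size_t shBytesPerSplat(std::size_t bytesPerCoefficient) {
    return kShCoefficientsPerSplat * bytesPerCoefficient;
}

// Hypothetical linear quantization of a coefficient in [lo, hi] to 8 bits.
// Values outside the range are clamped.
std::uint8_t quantizeToUint8(float value, float lo, float hi) {
    assert(hi > lo);
    const float t = std::max(0.0f, std::min(1.0f, (value - lo) / (hi - lo)));
    return static_cast<std::uint8_t>(std::lround(t * 255.0f));
}

// Inverse mapping, as it would be applied when the coefficient is read back.
float dequantizeFromUint8(std::uint8_t q, float lo, float hi) {
    return lo + (hi - lo) * (static_cast<float>(q) / 255.0f);
}
```

Under these assumptions the SH data takes 192 bytes per splat in float32, 96 bytes in float16, and 48 bytes in uint8, so the smaller formats move less data per splat at the cost of some precision.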

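This sample gives developers who want to experiment with Gaussian splatting rendering techniques and Vulkan-based optimizations a starting point.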

To start exploring real-time rendering of neural radiance fields, check out the nvpro-samples/vk_gaussian_splatting GitHub repository.

 
