Last Updated: 08 / 29 / 2002

Developer Newsletter: Issue #5

In this month's issue:


NVIDIA Cg Toolkit Beta 2 Released

We've released Beta 2 of our Cg Toolkit, which includes the following key updates (among others):

  • Support for new profiles:
      • arbvp1 [ARB_vertex_program]
      • vp30 [NV30 Vertex programs]
      • fp30 [NV30 Fragment programs]
  • NV30 sample shaders
  • Cg Effects Explained Document
  • Support for binding semantics
  • Improved runtime functionality
  • Documentation updated
The complete download, along with piece-by-piece downloads, is available here.


"CineFX" Platform Announced

During SIGGRAPH, we announced the NVIDIA "CineFX" Platform, which enables real-time cinematic-quality rendering for the first time ever. In short, the "CineFX" Platform is the combination of Cg with our next-generation NV30 hardware. "CineFX" will offer the following key features:

  • Vertex programs of up to 65,536 instructions (with data-dependent branching)
  • Pixel programs of up to 1024 instructions (with arbitrary use of texture lookups and arithmetic operations)
  • 128-bit IEEE floating-point color precision throughout the graphics pipeline
  • Cg, a high level language for accessing this massive programmability
You can learn more about the "CineFX" Platform here.


Developer Site Updated

We've updated our Developer site with clearer navigation, along with a new NVSDK Repository, which organizes more than 250 of our documents, tools, and demos by topic, for easier access. You can also list only specific subsets of document types, if you prefer. Of course, you can still use the site's Search feature to find documents, but it's no longer your only choice! You can find the NVSDK Repository here.

We have also added a News Archive area, which contains past newsletters and headlines in case you missed any of them. You can find the News Archive here.

Cg Compiler Open-Sourced

In case you missed the announcement during SIGGRAPH, we've open-sourced our Cg compiler. This release provides the source code for the Cg compiler (cgc.exe) with a "generic" profile, which does some minimal semantic checks and prints out a tree representation of the code. It can be built either with the included Microsoft Visual C++ 6.0 projects and workspace, or with the included Makefile. The release also includes examples, along with a detailed description of the release contents in the Readme.txt file. You can download the source code here.


NVParse Updated

NVParse is an OpenGL tool that simplifies the programming of vertex and pixel computations on NVIDIA GPUs. It is a library that can be used in conjunction with native OpenGL calls to:

  • Simplify the process of configuring Texture Shaders
  • Simplify the process of configuring Register Combiners
  • Load Vertex Programs (with improved error reporting)
In addition, NVParse provides support for standard Microsoft DirectX 8.1 Vertex Shaders and Pixel Shaders. Therefore, you can use NVParse in conjunction with Cg to implement "pixel shaders" with OpenGL -- simply compile your Cg code using the dx8ps profile and feed the resulting output into NVParse, which will set up the corresponding OpenGL texture shader and register combiner state.

You can get the latest version of NVParse here.


Upcoming Events

Come visit us at several upcoming events in Europe:

ECTS: ECTS is where games mean business. It's the only place in Europe where the entire interactive entertainment industry comes together for three days to do business. Be sure to drop by NVIDIA stand #1240 to see the hottest technology, the best developer tools, and the coolest games from NVIDIA's software partners. If you would like to schedule time with either NVIDIA's PR or Developer Relations staff, please send mail to ects@nvidia.com.

GDCE: If you're traveling to London for the Game Developers Conference Europe, August 26th through 28th, leave time to meet with NVIDIA's Developer Relations Group. Send mail to gdce@nvidia.com for more information.

Games Convention: NVIDIA will also be at the Games Convention in Leipzig, Germany (August 29 - September 1): http://www.gc-germany.de.

Stop by and visit us!


SIGGRAPH Wrap-Up

As usual, a lot happened during SIGGRAPH 2002. Courses, Cg workshops, events, press releases, and parties -- all of it is recapped on our SIGGRAPH 2002 website. Details about the "CineFX" platform and the design of the Cg Language are available in the slides for the "NVIDIA Programmable Graphics Technology" course.


Coding Tip

When rendering using the hardware transform-and-lighting (TnL) pipeline or vertex-shaders, the GPU intermittently caches transformed and lit vertices. Storing these post-transform and lighting (post-TnL) vertices avoids recomputing the same values whenever a vertex is shared between multiple triangles and thus saves time. The post-TnL cache increases rendering performance by up to 2x.

Because the post-TnL cache is limited in size, taking maximum advantage of it requires rearranging triangle rendering-order. The easiest way to rearrange triangle-order for the post-TnL cache is to use the NVTriStrip library (.LIBs and source are available for free: here).

If you are interested in exploring the performance characteristics of the post-TnL cache yourself, here are the details: The post-TnL cache is a strict First-In-First-Out buffer, and varies in size from effectively 10 (actual 16) vertices on GeForce 256, GeForce 2, and GeForce 4 MX chipsets to effectively 18 (actual 24) on GeForce 3 and GeForce 4 Ti chipsets. Non-indexed draw-calls cannot take advantage of the cache, as it is then impossible for the GPU to know which vertices are shared.
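
One way to explore these characteristics without hardware is to model the cache in software. The sketch below is a simplified model, not NVIDIA code: it treats the post-TnL cache as a strict FIFO and counts how many vertices an indexed draw-call would need to transform. The function name count_transforms is hypothetical, and the cache sizes passed in are the effective sizes quoted above.

```python
from collections import deque


def count_transforms(indices, cache_size):
    """Count vertices that must be transformed and lit for an indexed draw-call.

    The post-TnL cache is modelled as a strict FIFO holding `cache_size`
    vertex indices. This is an illustrative simplification of the hardware.
    """
    fifo = deque(maxlen=cache_size)  # the oldest entry drops out automatically
    misses = 0
    for idx in indices:
        if idx not in fifo:
            # Cache miss: the vertex is transformed and enters the FIFO.
            misses += 1
            fifo.append(idx)
        # Cache hit: a strict FIFO does not move the entry on reuse.
    return misses


# Two triangles sharing an edge, submitted as an indexed triangle list.
quad = [0, 1, 2, 2, 1, 3]
print(count_transforms(quad, cache_size=10))  # effective size on GeForce 256
print(count_transforms(quad, cache_size=18))  # effective size on GeForce 3
```

Dividing the returned count by the number of triangles gives a rough cost per triangle for a given index ordering.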

The following example explores how these restrictions translate into optimally rendering a 100x100 vertex mesh. The mesh needs to be submitted in a single draw-call to optimize batch-size. The draw-call must use an indexed primitive-type (see above), either strips or lists -- the performance difference between strips and lists is negligible when taking advantage of the post-TnL cache. For illustration purposes, our example uses strips.

Rendering the mesh as 99 strips, one along each row of triangles, and stitching them together into a single strip with degenerate triangles takes only marginal advantage of the post-TnL cache. Only two vertices in each triangle hit the post-TnL cache (i.e., one vertex is transformed and lit per triangle).
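
As a rough check of that estimate, the sketch below builds the index list for the row-by-row strip and runs it through a simple FIFO model of the cache. The helper names (count_transforms, row_strips) are hypothetical, the row-major vertex numbering is an assumption, and the FIFO model is a simplification of the real hardware.

```python
from collections import deque


def count_transforms(indices, cache_size):
    """Count cache misses, modelling the post-TnL cache as a strict FIFO."""
    fifo = deque(maxlen=cache_size)
    misses = 0
    for idx in indices:
        if idx not in fifo:
            misses += 1
            fifo.append(idx)
    return misses


def row_strips(width, height):
    """Index list for a width x height vertex grid drawn as one long strip.

    Each row of triangles is one strip spanning the full mesh width.
    Consecutive strips are joined by repeating the last index of one strip
    and the first index of the next, which yields degenerate triangles.
    Vertices are assumed to be numbered row by row (index = row * width + x).
    """
    indices = []
    for row in range(height - 1):
        strip = []
        for x in range(width):
            strip += [row * width + x, (row + 1) * width + x]
        if indices:
            indices += [indices[-1], strip[0]]  # degenerate stitch
        indices += strip
    return indices


indices = row_strips(100, 100)
triangles = 2 * 99 * 99  # visible triangles in a 100x100 vertex mesh
misses = count_transforms(indices, cache_size=18)
print(misses, triangles, misses / triangles)
```

Under this model the 100x100 mesh costs 19,800 vertex transforms for 19,602 visible triangles, which is roughly one vertex per triangle: every strip spans 100 vertices, so the shared row has left the 18-entry cache long before the next strip needs it.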

Let's instead limit the length of each row-strip to at most 2*16 = 32 triangles. The mesh thus separates into ceil(99/16) = 7 columns of strips, each no longer than 32 triangles. Rendering all row-strips in column 0 first, from top to bottom and connected via degenerate triangles, allows an 18-entry post-TnL cache to store not only the last two vertices for each triangle but also the whole top row of vertices for each row-strip (at most 17 vertices for a 32-triangle strip). Thus, only 1 vertex for every 2 triangles needs to be computed. For the vertices to be in just the right order such that the top row of vertices is in the post-TnL cache, the top row of vertices of each column should be sent as a list of degenerate triangles.
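
The column ordering can be sketched in the same way. The code below is one possible reading of the scheme, not a reference implementation: the helper names (count_transforms, stitch, column_strips) are hypothetical, vertices are assumed to be numbered row by row, and the top row of each column is sent by emitting every index twice, which is just one way of producing degenerate triangles inside a strip.

```python
from collections import deque


def count_transforms(indices, cache_size):
    """Count cache misses, modelling the post-TnL cache as a strict FIFO."""
    fifo = deque(maxlen=cache_size)
    misses = 0
    for idx in indices:
        if idx not in fifo:
            misses += 1
            fifo.append(idx)
    return misses


def stitch(indices, strip):
    """Append `strip` to `indices`, joined by two degenerate-triangle indices."""
    if indices:
        indices += [indices[-1], strip[0]]
    indices += strip


def column_strips(width, height, quads_per_strip=16):
    """One long strip that walks the mesh column by column.

    Each column is `quads_per_strip` quads wide, so a row-strip has at most
    2 * quads_per_strip triangles and quads_per_strip + 1 top-row vertices.
    """
    indices = []
    for x0 in range(0, width - 1, quads_per_strip):
        x1 = min(x0 + quads_per_strip, width - 1)
        xs = range(x0, x1 + 1)
        # Load the column's top row (mesh row 0) into the cache. Emitting
        # each index twice makes every triangle in this run degenerate.
        stitch(indices, [x for x in xs for _ in (0, 1)])
        # Row-strips from top to bottom; each one finds its top row cached.
        for row in range(height - 1):
            strip = []
            for x in xs:
                strip += [row * width + x, (row + 1) * width + x]
            stitch(indices, strip)
    return indices


indices = column_strips(100, 100, quads_per_strip=16)
triangles = 2 * 99 * 99
misses = count_transforms(indices, cache_size=18)
print(misses, triangles, misses / triangles)
```

Under this FIFO model the 100x100 mesh needs 10,600 vertex transforms for 19,602 visible triangles, about 0.54 per triangle. The width of 16 quads matters: a row-strip then reuses 17 cached top-row vertices and adds one new bottom vertex at a time, which just fits an 18-entry cache.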


Send Us Your Comments!

Please let us know how you like the restructured Developer website by sending mail to devrelfeedback@nvidia.com with the word COMMENTS in the subject line. If you have had any problems downloading our Cg Toolkits, please let us know as well. Other comments are also welcome, of course.



