Best practice
Aug 14, 2024
Optimizing Inference Efficiency for LLMs at Scale with NVIDIA NIM Microservices
As large language models (LLMs) continue to evolve at an unprecedented pace, enterprises are looking to build generative AI-powered applications that maximize...
8 MIN READ
Jul 31, 2024
Shader Debugging Made Easy with NVIDIA Nsight Graphics
Shaders are specialized programs that run on the GPU and manipulate rays, pixels, vertices, and textures to achieve unique visual effects. With shaders, you...
8 MIN READ
Jul 24, 2024
Developing Product Configurators with OpenUSD
Developers from advertising agencies to software vendors are empowering global brands to deliver hyperpersonalization for digital experiences and visual...
5 MIN READ
Jul 09, 2024
Building Cyber Language Models to Unlock New Cybersecurity Capabilities
General-purpose large language models (LLMs) have proven their usefulness across various fields, offering substantial benefits in applications ranging from text...
13 MIN READ
Jun 27, 2024
Secure LLM Tokenizers to Maintain Application Integrity
This post is part of the NVIDIA AI Red Team’s continuing vulnerability and technique research. Use the concepts presented to responsibly assess and increase...
6 MIN READ
Jun 12, 2024
Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates
The latest release of the NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...
7 MIN READ
May 29, 2024
New Webinar: Deploying Generative AI in Production
Ready to move your pilot to production? Get an expert overview on how to deploy generative AI applications.
1 MIN READ
May 08, 2024
Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints
Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more...
13 MIN READ
Apr 29, 2024
Top Data Science Sessions from NVIDIA GTC 2024 Now Available On Demand
At GTC 2024, experts from NVIDIA and our partners shared insights about GPU-accelerated tools, optimizations, and best practices for data scientists. From the...
2 MIN READ
Feb 21, 2024
Limiting CPU Threads for Better Game Performance
Many PC games are designed around an eight-core console with an assumption that their software threading system ‘just works’ on all PCs, especially...
6 MIN READ
Jan 23, 2024
Simplifying Network Operations for AI with NVIDIA Quantum InfiniBand
A common technological misconception is that performance and complexity are directly linked. That is, the highest-performance implementation is also the most...
4 MIN READ
Jan 05, 2024
Improving CUDA Initialization Times Using cgroups in Certain Scenarios
Many CUDA applications running on multi-GPU platforms use a single GPU for their compute needs. In such scenarios, a performance penalty is paid by...
5 MIN READ
Dec 15, 2023
Advanced API Performance: Swap Chains
Swap chains are an integral part of how you get rendering data output to a screen. They usually consist of some group of output-ready buffers, each of which can...
4 MIN READ
Nov 21, 2023
Advanced API Performance: Intrinsics
Intrinsics can be thought of as higher-level abstractions of specific hardware instructions. They offer direct access to low-level operations or...
2 MIN READ
Nov 15, 2023
Best Practices for Securing LLM-Enabled Applications
Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks,...
11 MIN READ
Nov 14, 2023
Accelerating Ptychography Workflows with NVIDIA Holoscan at Diamond Light Source
Diamond Light Source is a world-renowned synchrotron facility in the UK that provides scientists with access to intense beams of X-rays, infrared, and other...
10 MIN READ