CUDA
Oct 02, 2024
Webinar: Accelerating Python with GPUs
Join us on October 9 to learn how your applications can benefit from NVIDIA CUDA Python software initiatives.
1 MIN READ
Oct 02, 2024
Accelerating LLMs with llama.cpp on NVIDIA RTX Systems
The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...
5 MIN READ
Sep 30, 2024
Advancing Quantum Algorithm Design with GPTs
AI techniques like large language models (LLMs) are rapidly transforming many scientific disciplines. Quantum computing is no exception. A collaboration between...
8 MIN READ
Sep 24, 2024
Accelerating Leaderboard-Topping ASR Models 10x with NVIDIA NeMo
NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging...
13 MIN READ
Sep 19, 2024
Just Released: Torch-TensorRT v2.4.0
Includes C++ runtime support on Windows, enhanced dynamic shape support in converters, and support for PyTorch 2.4, CUDA 12.4, TensorRT 10.1, and Python 3.12.
1 MIN READ
Sep 17, 2024
Accelerating Oracle Database Generative AI Workloads with NVIDIA NIM and NVIDIA cuVS
The vast majority of the world's data remains untapped, and enterprises are looking to generate value from this data by creating the next wave of generative AI...
6 MIN READ
Sep 11, 2024
Advanced Strategies for High-Performance GPU Programming with NVIDIA CUDA
Stephen Jones, a leading expert and distinguished NVIDIA CUDA architect, offers his guidance and insights with a deep dive into the complexities of mapping...
2 MIN READ
Sep 11, 2024
Constant Time Launch for Straight-Line CUDA Graphs and Other Performance Enhancements
CUDA Graphs are a way to define and batch GPU operations as a graph rather than a sequence of stream launches. A CUDA Graph groups a set of CUDA kernels and...
8 MIN READ
Sep 10, 2024
Accelerating the HPCG Benchmark with NVIDIA Math Sparse Libraries
In the realm of high-performance computing (HPC), NVIDIA has continually advanced HPC by offering its highly optimized NVIDIA High-Performance Conjugate...
9 MIN READ
Sep 06, 2024
Enhancing Application Portability and Compatibility across New Platforms Using NVIDIA Magnum IO NVSHMEM 3.0
NVSHMEM is a parallel programming interface that provides efficient and scalable communication for NVIDIA GPU clusters. Part of NVIDIA Magnum IO and based on...
7 MIN READ
Aug 29, 2024
Spotlight: clicOH Accelerates Last-Mile Delivery 20x with NVIDIA cuOpt
Driven by shifts in consumer behavior and the pandemic, e-commerce continues its explosive growth and transformation. As a result, logistics and transportation...
3 MIN READ
Aug 29, 2024
Boosting CUDA Efficiency with Essential Techniques for New Developers
To fully harness the capabilities of NVIDIA GPUs, optimizing NVIDIA CUDA performance is essential, particularly for developers new to GPU programming. This talk...
2 MIN READ
Aug 08, 2024
Improving GPU Performance by Reducing Instruction Cache Misses
GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming...
11 MIN READ
Aug 07, 2024
Optimizing llama.cpp AI Inference with CUDA Graphs
The open-source llama.cpp code base was originally released in 2023 as a lightweight but efficient framework for performing inference on Meta Llama models....
8 MIN READ
Aug 01, 2024
Just Released: CUDA Toolkit 12.6
The release supports GB100 capabilities and adds library enhancements to cuBLAS, cuFFT, cuSOLVER, and cuSPARSE, and it ships alongside Nsight Compute 2024.3.
1 MIN READ
Jul 17, 2024
NVIDIA Transitions Fully Towards Open-Source GPU Kernel Modules
With the R515 driver, NVIDIA released a set of Linux GPU kernel modules in May 2022 as open source with dual GPL and MIT licensing. The initial release targeted...
7 MIN READ