- Generative AI: New AI Research Foreshadows Autonomous Robotic Surgery
- Simulation / Modeling / Design: How AI is Making Climate Modeling Faster, Greener, and More Accurate
- Generative AI: Build Your First Human-in-the-Loop AI Agent with NVIDIA NIM
- Models / Libraries / Frameworks: AI Unlocks Early Clues to Alzheimer’s Through Retinal Scans
- Simulation / Modeling / Design: AI Research Delivers Rapid, Accurate Prostate Cancer Predictions
Recent
Dec 11, 2024
Deploying NVIDIA H200 NVL at Scale with New Enterprise Reference Architecture
Last month at the Supercomputing 2024 conference, NVIDIA announced the availability of NVIDIA H200 NVL, the latest NVIDIA Hopper platform. Optimized for...
8 MIN READ
Dec 11, 2024
Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint
In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it's a necessity. Whether addressing...
10 MIN READ
Dec 11, 2024
NVIDIA TensorRT-LLM Now Accelerates Encoder-Decoder Models with In-Flight Batching
NVIDIA recently announced that NVIDIA TensorRT-LLM now accelerates encoder-decoder model architectures. TensorRT-LLM is an open-source library that optimizes...
4 MIN READ
Dec 10, 2024
New AI Research Foreshadows Autonomous Robotic Surgery
A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans....
4 MIN READ
Dec 10, 2024
NVIDIA CUDA-Q Runs Breakthrough Logical Qubit Application on Infleqtion QPU
Infleqtion, a world leader in neutral atom quantum computing, used the NVIDIA CUDA-Q platform to first simulate, and then orchestrate the first-ever...
6 MIN READ
Dec 09, 2024
Just Released: NVIDIA VILA VLM
Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multiple images and video.
1 MIN READ
Dec 06, 2024
Content Moderation and Safety Checks with NVIDIA NeMo Guardrails
Content moderation has become essential in retrieval-augmented generation (RAG) applications powered by generative AI, given the extensive volume of...
10 MIN READ
Dec 05, 2024
Just Released: NVIDIA Modulus v24.12
The new release includes new network architectures for external aerodynamics applications as well as for climate and weather prediction.
1 MIN READ
Dec 05, 2024
Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary
As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions...
7 MIN READ
Dec 05, 2024
Unified Virtual Memory Supercharges pandas with RAPIDS cuDF
cuDF-pandas, introduced in a previous post, is a GPU-accelerated library that accelerates pandas to deliver significant performance improvements—up to 50x...
5 MIN READ
Dec 05, 2024
Optimize GPU Workloads for Graphics Applications with NVIDIA Nsight Graphics
One of the great pastimes of graphics developers and enthusiasts is comparing specifications of GPUs and marveling at the ever-increasing counts of shader...
11 MIN READ
Dec 05, 2024
Spotlight: Perplexity AI Serves 400 Million Search Queries a Month Using NVIDIA Inference Stack
The demand for AI-enabled services continues to grow rapidly, placing increasing pressure on IT and infrastructure teams. These teams are tasked with...
7 MIN READ
Inference Performance
Dec 05, 2024
Spotlight: Perplexity AI Serves 400 Million Search Queries a Month Using NVIDIA Inference Stack
The demand for AI-enabled services continues to grow rapidly, placing increasing pressure on IT and infrastructure teams. These teams are tasked with...
7 MIN READ
Dec 02, 2024
TensorRT-LLM Speculative Decoding Boosts Inference Throughput by up to 3.6x
NVIDIA TensorRT-LLM support for speculative decoding now provides over 3x the speedup in total token throughput. TensorRT-LLM is an open-source library that...
9 MIN READ
Nov 21, 2024
NVIDIA TensorRT-LLM Multiblock Attention Boosts Throughput by More Than 3x for Long Sequence Lengths on NVIDIA HGX H200
Generative AI models are advancing rapidly. Every generation of models comes with a larger number of parameters and longer context windows. The Llama 2 series...
5 MIN READ
Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ
Nov 15, 2024
Streamlining AI Inference Performance and Deployment with NVIDIA TensorRT-LLM Chunked Prefill
In this blog post, we take a closer look at chunked prefill, a feature of NVIDIA TensorRT-LLM that increases GPU utilization and simplifies the deployment...
4 MIN READ
Nov 08, 2024
5x Faster Time to First Token with NVIDIA TensorRT-LLM KV Cache Early Reuse
In our previous blog post, we demonstrated how reusing the key-value (KV) cache by offloading it to CPU memory can accelerate time to first token (TTFT) by up...
5 MIN READ
Nov 01, 2024
3x Faster AllReduce with NVSwitch and TensorRT-LLM MultiShot
Deploying generative AI workloads in production environments where user numbers can fluctuate from hundreds to hundreds of thousands – and where input...
5 MIN READ
Oct 28, 2024
NVIDIA GH200 Superchip Accelerates Inference by 2x in Multiturn Interactions with Llama Models
Deploying large language models (LLMs) in production environments often requires making hard trade-offs between enhancing user interactivity and increasing...
7 MIN READ
Oct 09, 2024
NVIDIA Grace CPU Delivers World-Class Data Center Performance and Breakthrough Energy Efficiency
NVIDIA designed the NVIDIA Grace CPU to be a new kind of high-performance, data center CPU—one built to deliver breakthrough energy efficiency and optimized...
8 MIN READ
Oct 09, 2024
Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch
The continued growth of LLM capabilities, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...
8 MIN READ
Sep 26, 2024
Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance
Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...
8 MIN READ
Sep 24, 2024
NVIDIA GH200 Grace Hopper Superchip Delivers Outstanding Performance in MLPerf Inference v4.1
In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding...
7 MIN READ
Generative AI
Dec 11, 2024
Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint
In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it's a necessity. Whether addressing...
10 MIN READ
Dec 11, 2024
NVIDIA TensorRT-LLM Now Accelerates Encoder-Decoder Models with In-Flight Batching
NVIDIA recently announced that NVIDIA TensorRT-LLM now accelerates encoder-decoder model architectures. TensorRT-LLM is an open-source library that optimizes...
4 MIN READ
Dec 10, 2024
New AI Research Foreshadows Autonomous Robotic Surgery
A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans....
4 MIN READ
Dec 09, 2024
Just Released: NVIDIA VILA VLM
Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multiple images and video.
1 MIN READ
Dec 06, 2024
Content Moderation and Safety Checks with NVIDIA NeMo Guardrails
Content moderation has become essential in retrieval-augmented generation (RAG) applications powered by generative AI, given the extensive volume of...
10 MIN READ
Dec 05, 2024
Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary
As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions...
7 MIN READ
Dec 03, 2024
How to Build a Generative AI-Enabled Synthetic Data Pipeline for Perception AI
Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of...
6 MIN READ
Dec 03, 2024
Build an Agentic Video Workflow with Video Search and Summarization
Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system...
11 MIN READ
Dec 02, 2024
TensorRT-LLM Speculative Decoding Boosts Inference Throughput by up to 3.6x
NVIDIA TensorRT-LLM support for speculative decoding now provides over 3x the speedup in total token throughput. TensorRT-LLM is an open-source library that...
9 MIN READ
Dec 02, 2024
Unified Whole-Body Control for Physically Simulated Humanoids
Creating interactive simulated humanoids that move naturally and respond intelligently to diverse control inputs remains one of the most challenging problems in...
7 MIN READ
Nov 22, 2024
Spotlight: TCS Increases Automotive Software Testing Speeds by 2x Using NVIDIA Generative AI
Generative AI is transforming every aspect of the automotive industry, including software development, testing, user experience, personalization, and safety....
8 MIN READ
Nov 22, 2024
Hymba Hybrid-Head Architecture Boosts Small Language Model Performance
Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...
12 MIN READ
Data Science
Dec 05, 2024
Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary
As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions...
7 MIN READ
Dec 05, 2024
Unified Virtual Memory Supercharges pandas with RAPIDS cuDF
cuDF-pandas, introduced in a previous post, is a GPU-accelerated library that accelerates pandas to deliver significant performance improvements—up to 50x...
5 MIN READ
Dec 03, 2024
In-Silico Antibody Development with AlphaBind Using NVIDIA BioNeMo and AWS HealthOmics
Antibodies have become the most prevalent class of therapeutics, primarily due to their ability to target specific antigens, enabling them to treat a wide range...
6 MIN READ
Nov 28, 2024
Supercharging Deduplication in pandas Using RAPIDS cuDF
A common operation in data analytics is to drop duplicate rows. Deduplication is critical in Extract, Transform, Load (ETL) workflows, where you might want to...
12 MIN READ
Nov 21, 2024
Best Practices for Multi-GPU Data Analysis Using RAPIDS with Dask
As we move towards a denser computing infrastructure, with more compute, more GPUs, accelerated networking, and so forth, multi-GPU training and analysis...
5 MIN READ
Nov 21, 2024
Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor
Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...
6 MIN READ
Nov 19, 2024
Processing High-Quality Vietnamese Language Data with NVIDIA NeMo Curator
Open-source large language models (LLMs) excel in English but struggle with other languages, especially the languages of Southeast Asia. This is primarily due...
17 MIN READ
Nov 18, 2024
Accelerate Drug and Material Discovery with New Math Library NVIDIA cuEquivariance
AI models for science are often trained to make predictions about the workings of nature, such as predicting the structure of a biomolecule or the properties of...
8 MIN READ
Nov 18, 2024
Revolutionizing AI-Driven Material Discovery Using NVIDIA ALCHEMI
AI has proven to be a force multiplier, helping to create a future where scientists can design entirely new materials, while engineers seamlessly transform...
11 MIN READ
Nov 18, 2024
Effortlessly Scale NumPy from Laptops to Supercomputers with NVIDIA cuPyNumeric
Python is the most common programming language for data science, machine learning, and numerical computing. It continues to grow in popularity among scientists...
12 MIN READ
Nov 14, 2024
Deep Learning Model Boosts Accuracy in Long-Range Weather and Climate Forecasting
Dale Durran, a professor in the Atmospheric Sciences Department at the University of Washington, introduces a breakthrough deep learning model that combines...
2 MIN READ
Nov 14, 2024
Faster Causal Inference on Large Datasets with NVIDIA RAPIDS
As consumer applications generate more data than ever before, enterprises are turning to causal inference methods for observational data to help shed light on...
4 MIN READ
Robotics
Dec 10, 2024
New AI Research Foreshadows Autonomous Robotic Surgery
A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans....
4 MIN READ
Dec 03, 2024
Scaling Action Recognition Models with Synthetic Data
Action recognition models such as PoseClassificationNet have been around for some time, helping systems identify and classify human actions like walking,...
11 MIN READ
Dec 02, 2024
Unified Whole-Body Control for Physically Simulated Humanoids
Creating interactive simulated humanoids that move naturally and respond intelligently to diverse control inputs remains one of the most challenging problems in...
7 MIN READ
Nov 21, 2024
NVIDIA JetPack 6.1 Boosts Performance and Security through Camera Stack Optimizations and Introduction of Firmware TPM
NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release,...
8 MIN READ
Nov 06, 2024
Advancing Humanoid Robot Sight and Skill Development with NVIDIA Project GR00T
Humanoid robots present a multifaceted challenge at the intersection of mechatronics, control theory, and AI. The dynamics and control of humanoid robots are...
10 MIN READ
Nov 06, 2024
Spotlight: Galbot Builds a Large-Scale Dexterous Hand Dataset for Humanoid Robots Using NVIDIA Isaac Sim
Robotic dexterous grasping is a critical area of research and development, aimed at enabling robots to interact with and manipulate objects as flexibly as...
5 MIN READ
Nov 06, 2024
Spotlight: Fourier Trains Humanoid Robots for Real-World Roles Using NVIDIA Isaac Gym
This post was written in partnership with the Fourier research team. Training humanoid robots to operate in fields that demand high levels of interaction and...
4 MIN READ
Nov 04, 2024
Build a Video Search and Summarization Agent with NVIDIA AI Blueprint
This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications...
11 MIN READ
Oct 30, 2024
Teaching Robots to Tackle Household Chores
Robotics could make everyday life easier by taking on repetitive or time-consuming tasks. At NVIDIA GTC 2024, researchers from Stanford University unveiled...
2 MIN READ
Oct 25, 2024
NVIDIA Showcases the Future of Intelligent Robots at CoRL 2024
From humanoids to policy, explore the work NVIDIA is bringing to the robotics community.
1 MIN READ
Oct 24, 2024
Powering the Next Wave of AI Robotics with Three Computers
NVIDIA has built three computers and accelerated development platforms to enable developers to create physical AI.
1 MIN READ
Oct 22, 2024
A Beginner’s Guide to Simulating and Testing Robots with ROS 2 and NVIDIA Isaac Sim
Physical AI-powered robots need to autonomously sense, plan, and perform complex tasks in the physical world. These include transporting and manipulating...
10 MIN READ
Simulation / Modeling / Design
Dec 10, 2024
New AI Research Foreshadows Autonomous Robotic Surgery
A robot commonly used and manually manipulated by surgeons for routine operations can now autonomously perform key surgical tasks as precisely as humans....
4 MIN READ
Dec 10, 2024
NVIDIA CUDA-Q Runs Breakthrough Logical Qubit Application on Infleqtion QPU
Infleqtion, a world leader in neutral atom quantum computing, used the NVIDIA CUDA-Q platform to first simulate, and then orchestrate the first-ever...
6 MIN READ
Dec 05, 2024
Just Released: NVIDIA Modulus v24.12
The new release includes new network architectures for external aerodynamics applications as well as for climate and weather prediction.
1 MIN READ
Dec 05, 2024
Spotlight: Perplexity AI Serves 400 Million Search Queries a Month Using NVIDIA Inference Stack
The demand for AI-enabled services continues to grow rapidly, placing increasing pressure on IT and infrastructure teams. These teams are tasked with...
7 MIN READ
Dec 04, 2024
How AI is Making Climate Modeling Faster, Greener, and More Accurate
Christopher Bretherton, Senior Director of Climate Modeling at the Allen Institute for AI (AI2), highlights how AI is revolutionizing climate science. In this...
2 MIN READ
Dec 03, 2024
Scaling Action Recognition Models with Synthetic Data
Action recognition models such as PoseClassificationNet have been around for some time, helping systems identify and classify human actions like walking,...
11 MIN READ
Dec 03, 2024
How to Build a Generative AI-Enabled Synthetic Data Pipeline for Perception AI
Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of...
6 MIN READ
Dec 03, 2024
Introducing NVIDIA cuPQC for GPU-Accelerated Post-Quantum Cryptography
In the past decade, quantum computers have progressed significantly and could one day be used to undermine current cybersecurity practices. If run on a quantum...
6 MIN READ
Dec 02, 2024
Accelerated Quantum Supercomputing with the NVIDIA CUDA-Q and Amazon Braket Integration
As quantum computers scale, tasks such as controlling quantum hardware and performing quantum error correction become increasingly complex. Overcoming these...
6 MIN READ
Dec 02, 2024
Unified Whole-Body Control for Physically Simulated Humanoids
Creating interactive simulated humanoids that move naturally and respond intelligently to diverse control inputs remains one of the most challenging problems in...
7 MIN READ
Nov 21, 2024
Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor
Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...
6 MIN READ
Nov 21, 2024
Powering AI-Augmented Workloads with NVIDIA and Windows 365
We are entering a new era of AI-powered digital workflow, where Windows 365 Cloud PCs are dynamic platforms that host AI technologies and reshape traditional...
7 MIN READ
Computer Vision / Video Analytics
Dec 09, 2024
Just Released: NVIDIA VILA VLM
Now available in preview, NVIDIA VILA is an advanced multimodal VLM that provides visual understanding of multiple images and video.
1 MIN READ
Dec 05, 2024
Celebrating Open Science and Enterprise AI Innovation on MONAI’s 5th Anniversary
As MONAI celebrates its fifth anniversary, we're witnessing the convergence of our vision for open medical AI with production-ready enterprise solutions...
7 MIN READ
Dec 03, 2024
Scaling Action Recognition Models with Synthetic Data
Action recognition models such as PoseClassificationNet have been around for some time, helping systems identify and classify human actions like walking,...
11 MIN READ
Dec 03, 2024
Build an Agentic Video Workflow with Video Search and Summarization
Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system...
11 MIN READ
Nov 25, 2024
Just Released: NVIDIA DeepStream 7.1
The new release introduces Python support in Service Maker to accelerate real-time multimedia and AI inference applications with a powerful GStreamer...
1 MIN READ
Nov 21, 2024
AI Unlocks Early Clues to Alzheimer’s Through Retinal Scans
Your eyes could hold the key to unlocking early detection of Alzheimer’s and dementia, with a groundbreaking AI study. Called Eye-AD, the deep learning...
3 MIN READ
Nov 04, 2024
Build a Video Search and Summarization Agent with NVIDIA AI Blueprint
This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications...
11 MIN READ
Oct 31, 2024
Deep Learning AI Model Identifies Breast Cancer Spread without Surgery
A new deep learning model could reduce the need for surgery when diagnosing whether cancer cells are spreading, including to nearby lymph nodes—also known as...
4 MIN READ
Oct 29, 2024
AI-Powered Devices Track Howls to Save Wolves
A new cell-phone-sized device—which can be deployed in vast, remote areas—is using AI to identify and geolocate wildlife to help conservationists track...
5 MIN READ
Oct 24, 2024
Federated Learning in Autonomous Vehicles Using Cross-Border Training
Federated learning is revolutionizing the development of autonomous vehicles (AVs), particularly in cross-country scenarios where diverse data sources and...
10 MIN READ
Oct 23, 2024
Optimizing the CV Pipeline in Automotive Vehicle Development Using the PVA Engine
In the field of automotive vehicle software development, more large-scale AI models are being integrated into autonomous vehicles. The models range from vision...
16 MIN READ
Oct 07, 2024
Accelerating Reality Capture Workflows with AI and NVIDIA RTX GPUs
Reality capture creates highly accurate, detailed, and immersive digital representations of environments. Innovations in site scanning and accelerated data...
10 MIN READ
Content Creation / Rendering
Dec 05, 2024
Optimize GPU Workloads for Graphics Applications with NVIDIA Nsight Graphics
One of the great pastimes of graphics developers and enthusiasts is comparing specifications of GPUs and marveling at the ever-increasing counts of shader...
11 MIN READ
Nov 21, 2024
Powering AI-Augmented Workloads with NVIDIA and Windows 365
We are entering a new era of AI-powered digital workflow, where Windows 365 Cloud PCs are dynamic platforms that host AI technologies and reshape traditional...
7 MIN READ
Oct 07, 2024
Producing Cinematic Content at Scale with a Generative AI-Enabled OpenUSD Pipeline
Producing commercials is resource-intensive, requiring physical locations and various props and setups to display products in different settings and...
7 MIN READ
Oct 02, 2024
Accelerating LLMs with llama.cpp on NVIDIA RTX Systems
The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...
5 MIN READ
Oct 01, 2024
Revolutionizing Cloud Gaming and Graphics Rendering with NVIDIA GDN
Gaming has always pushed the boundaries of graphics hardware. The most popular games typically required robust GPU, CPU, and RAM resources on a user’s PC or...
7 MIN READ
Oct 01, 2024
Simplify and Scale AI-Powered MetaHuman Deployment with NVIDIA ACE and Unreal Engine 5
At Unreal Fest 2024, NVIDIA released new Unreal Engine 5 on-device plugins for NVIDIA ACE, making it easier to build and deploy AI-powered MetaHuman characters...
4 MIN READ
Sep 23, 2024
Just Released: Free OpenUSD Training Courses
Accelerate your OpenUSD workflows with this free curriculum for developers and 3D practitioners.
1 MIN READ
Sep 16, 2024
Orchestrating Innovation at Scale with NVIDIA Maxine and Texel
The NVIDIA Maxine AI developer platform is a suite of NVIDIA NIM microservices, cloud-accelerated microservices, and SDKs that offer state-of-the-art features...
5 MIN READ
Sep 11, 2024
Enabling Customizable GPU-Accelerated Video Transcoding Pipelines
Today, over 80% of internet traffic is video. This content is generated by and consumed across various devices, including IoT gadgets, smartphones, computers,...
10 MIN READ
Sep 09, 2024
Transform Live Media Pipelines with NVIDIA Holoscan for Media
NVIDIA Holoscan for Media is now ready to be used in live production, taking advantage of the best of both networking and GPU technologies. Holoscan for...
3 MIN READ
Aug 30, 2024
Fast Inversion for Real-Time Image Editing with Text
Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts. They operate by mapping a random sample from a...
8 MIN READ
Aug 20, 2024
Deploy the First On-Device Small Language Model for Improved Game Character Roleplay
At Gamescom 2024, NVIDIA announced our first on-device small language model (SLM) for improving the conversation abilities of game characters. We also announced...
4 MIN READ
Conversational AI
Dec 11, 2024
Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint
In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it's a necessity. Whether addressing...
10 MIN READ
Nov 22, 2024
Hymba Hybrid-Head Architecture Boosts Small Language Model Performance
Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...
12 MIN READ
Nov 19, 2024
Create a Custom Slackbot LLM Agent with NVIDIA NIM and LangChain
In the dynamic world of modern business, where communication and efficient workflows are crucial for success, AI-powered solutions have become a competitive...
9 MIN READ
Oct 28, 2024
Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA
The rapid development of solutions using retrieval augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system...
11 MIN READ
Oct 22, 2024
Scaling LLMs with NVIDIA Triton and NVIDIA TensorRT-LLM Using Kubernetes
Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs...
16 MIN READ
Oct 21, 2024
IBM’s New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient
Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...
5 MIN READ
Oct 16, 2024
Simplify AI Application Development with NVIDIA Cloud Native Stack
In the rapidly evolving landscape of AI and data science, the demand for scalable, efficient, and flexible infrastructure has never been higher. Traditional...
5 MIN READ
Oct 01, 2024
Evaluating Medical RAG with NVIDIA AI Endpoints and Ragas
In the rapidly evolving field of medicine, the integration of cutting-edge technologies is crucial for enhancing patient care and advancing research. One such...
11 MIN READ
Sep 26, 2024
Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance
Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...
8 MIN READ
Sep 25, 2024
Build a Digital Human Interface for AI Apps with an NVIDIA NIM Agent Blueprint
Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to...
5 MIN READ
Sep 25, 2024
Deploying Accelerated Llama 3.2 from the Edge to the Cloud
Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...
6 MIN READ
Sep 24, 2024
Accelerating Leaderboard-Topping ASR Models 10x with NVIDIA NeMo
NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging...
13 MIN READ
Edge Computing
Nov 25, 2024
Just Released: NVIDIA DeepStream 7.1
The new release introduces Python support in Service Maker to accelerate real-time multimedia and AI inference applications with a powerful GStreamer...
1 MIN READ
Nov 22, 2024
Hymba Hybrid-Head Architecture Boosts Small Language Model Performance
Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...
12 MIN READ
Nov 21, 2024
NVIDIA JetPack 6.1 Boosts Performance and Security through Camera Stack Optimizations and Introduction of Firmware TPM
NVIDIA JetPack has continuously evolved to offer cutting-edge software tailored to the growing needs of edge AI and robotic developers. With each release,...
8 MIN READ
Nov 14, 2024
NVIDIA DOCA 2.9 Enhances AI and Cloud Computing Infrastructure with New Performance and Security Features
NVIDIA DOCA enhances the capabilities of NVIDIA networking platforms by providing a comprehensive software framework for developers to leverage hardware...
9 MIN READ
Oct 29, 2024
AI-Powered Devices Track Howls to Save Wolves
A new cell-phone-sized device—which can be deployed in vast, remote areas—is using AI to identify and geolocate wildlife to help conservationists track...
5 MIN READ
Oct 24, 2024
Powering the Next Wave of AI Robotics with Three Computers
NVIDIA has built three computers and accelerated development platforms to enable developers to create physical AI.
1 MIN READ
Oct 21, 2024
AI Accurately Forecasts Extreme Weather Up to 23 Days Ahead
New research from the University of Washington is refining AI weather models using deep learning for more accurate predictions and longer-term forecasts. The...
3 MIN READ
Oct 16, 2024
Maximizing Energy and Power Efficiency in Applications with NVIDIA GPUs
As the demand for high-performance computing (HPC) and AI applications grows, so does the importance of energy efficiency. NVIDIA Principal Developer Technology...
2 MIN READ
Oct 16, 2024
Treating Brain Disease with Brain-Machine Interactive Neuromodulation and NVIDIA Jetson
Neuromodulation is a technique that enhances or restores brain function by directly intervening in neural activity. It is commonly used to treat conditions like...
4 MIN READ
Oct 08, 2024
Bringing AI-RAN to a Telco Near You
Inferencing for generative AI and AI agents will drive the need for AI compute infrastructure to be distributed from edge to central clouds. IDC predicts that...
14 MIN READ
Oct 07, 2024
Real-Time Surgical Guidance by Fusing Multi-Modal Imaging with NVIDIA Holoscan
Developers in the fields of image-guided surgery and surgical vision face unique challenges in creating systems and applications that can significantly improve...
7 MIN READ
Oct 03, 2024
AI Investigates Antarctica's Disappearing Moss to Uncover Climate Change Clues
Antarctica plays a crucial role in regulating Earth’s climate. Most climate research into the world’s coldest, most windswept continent focuses on the...
5 MIN READ
Data Center / Cloud
Dec 11, 2024
Deploying NVIDIA H200 NVL at Scale with New Enterprise Reference Architecture
Last month at the Supercomputing 2024 conference, NVIDIA announced the availability of NVIDIA H200 NVL, the latest NVIDIA Hopper platform. Optimized for...
8 MIN READ
Dec 05, 2024
Spotlight: Perplexity AI Serves 400 Million Search Queries a Month Using NVIDIA Inference Stack
The demand for AI-enabled services continues to grow rapidly, placing increasing pressure on IT and infrastructure teams. These teams are tasked with...
7 MIN READ
Nov 21, 2024
NVIDIA TensorRT-LLM Multiblock Attention Boosts Throughput by More Than 3x for Long Sequence Lengths on NVIDIA HGX H200
Generative AI models are advancing rapidly. Every generation of models comes with a larger number of parameters and longer context windows. The Llama 2 series...
5 MIN READ
Nov 21, 2024
Deploying Fine-Tuned AI Models with NVIDIA NIM
For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...
5 MIN READ
Nov 21, 2024
Advancing Ansys Workloads with NVIDIA Grace and NVIDIA Grace Hopper
Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires...
10 MIN READ
Nov 21, 2024
Powering AI-Augmented Workloads with NVIDIA and Windows 365
We are entering a new era of AI-powered digital workflow, where Windows 365 Cloud PCs are dynamic platforms that host AI technologies and reshape traditional...
7 MIN READ
Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ
Nov 18, 2024
Fusing Epilog Operations with Matrix Multiplication Using nvmath-python
nvmath-python (Beta) is an open-source Python library, providing Python programmers with access to high-performance mathematical operations from NVIDIA CUDA-X...
8 MIN READ
Nov 15, 2024
NVIDIA NIM 1.4 Ready to Deploy with 2.4x Faster Inference
The demand for ready-to-deploy high-performance inference is growing as generative AI reshapes industries. NVIDIA NIM provides production-ready microservice...
3 MIN READ
Nov 15, 2024
Streamlining AI Inference Performance and Deployment with NVIDIA TensorRT-LLM Chunked Prefill
In this blog post, we take a closer look at chunked prefill, a feature of NVIDIA TensorRT-LLM that increases GPU utilization and simplifies the deployment...
4 MIN READ
Nov 14, 2024
Exploring the Case of Super Protocol with Self-Sovereign AI and NVIDIA Confidential Computing
Confidential and self-sovereign AI is a new approach to AI development, training, and inference where the user’s data is decentralized, private, and...
15 MIN READ
Nov 14, 2024
NVIDIA DOCA 2.9 Enhances AI and Cloud Computing Infrastructure with New Performance and Security Features
NVIDIA DOCA enhances the capabilities of NVIDIA networking platforms by providing a comprehensive software framework for developers to leverage hardware...
9 MIN READ