Compilation
Oct 04, 2022
CUDA Toolkit 11.8 New Features Revealed
NVIDIA announces the newest CUDA Toolkit software release, 11.8. This release is focused on enhancing the programming model and CUDA application speedup through...
4 MIN READ
Apr 06, 2021
N Ways to SAXPY: Demonstrating the Breadth of GPU Programming Options
Back in 2012, NVIDIAN Mark Harris wrote "Six Ways to Saxpy", demonstrating how to perform the SAXPY operation on a GPU in multiple ways, using different languages...
9 MIN READ
Feb 12, 2021
Boosting Productivity and Performance with the NVIDIA CUDA 11.2 C++ Compiler
The 11.2 CUDA C++ compiler incorporates features and enhancements aimed at improving developer productivity and the performance of GPU-accelerated applications....
21 MIN READ
Feb 12, 2021
Improving GPU Application Performance with NVIDIA CUDA 11.2 Device Link Time Optimization
CUDA 11.2 features powerful link time optimization (LTO) for device code in GPU-accelerated applications. Device LTO brings the performance...
14 MIN READ
Dec 16, 2020
Enhancing Memory Allocation with New NVIDIA CUDA 11.2 Features
CUDA is the software development platform for building GPU-accelerated applications, providing all the components needed to develop applications targeting every...
9 MIN READ
Nov 18, 2020
Detecting Divergence Using PCAST to Compare GPU to CPU Results
Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. The first...
14 MIN READ
Nov 16, 2020
Accelerating Fortran DO CONCURRENT with GPUs and the NVIDIA HPC SDK
Fortran developers have long been able to accelerate their programs using CUDA Fortran or OpenACC. For more up-to-date information, please read Using Fortran...
13 MIN READ
Sep 11, 2019
NVDLA Deep Learning Inference Compiler is Now Open Source
Designing new custom hardware accelerators for deep learning is clearly popular, but achieving state-of-the-art performance and efficiency with a new design is...
6 MIN READ
Oct 25, 2017
High-Performance GPU Computing in the Julia Programming Language
Julia is a high-level programming language for mathematical computing that is as easy to use as Python, but as fast as C. The language has been created with...
10 MIN READ
Aug 01, 2017
Building Cross-Platform CUDA Applications with CMake
Cross-platform software development poses a number of challenges to your application’s build process. How do you target multiple platforms without maintaining...
10 MIN READ
Nov 07, 2016
New Compiler Features in CUDA 8
CUDA 8 is one of the most significant updates in the history of the CUDA platform. In addition to Unified Memory and the many new API and library features in...
17 MIN READ
Jul 13, 2015
Introducing the NVIDIA OpenACC Toolkit
Programmability is crucial to accelerated computing, and NVIDIA's CUDA Toolkit has been critical to the success of GPU computing. Over three million CUDA...
4 MIN READ
Jun 23, 2015
MapD: Massive Throughput Database Queries with LLVM on GPUs
Note: this post was co-written by Alex Şuhan and Todd Mostak of MapD. At MapD our goal is to build the world's fastest big data analytics and visualization...
12 MIN READ
Oct 08, 2014
The Next Wave of Enterprise Performance with Java, POWER Systems, and NVIDIA GPUs
The Java ecosystem is the leading enterprise software development platform, with widespread industry support and deployment on platforms like the IBM WebSphere...
9 MIN READ
Apr 22, 2014
Separate Compilation and Linking of CUDA C++ Device Code
Managing complexity in large programs requires breaking them down into components that are responsible for small, well-defined portions of the overall program....
13 MIN READ
Jun 05, 2013
CUDA Pro Tip: Understand Fat Binaries and JIT Caching
As NVIDIA GPUs evolve to support new features, the instruction set architecture naturally changes. Because applications must run on multiple generations of...
6 MIN READ