Josh Park

Josh Park is an Automotive Solutions Architect Manager at NVIDIA. To date, he has been working on deep learning solutions using DL frameworks such as TensorFlow on multi-GPU/multi-node servers and embedded systems. He has also been evaluating and improving training and inference performance on various GPUs with x86_64 and aarch64 platforms. He received B.S. and M.S. degrees from Korea University, and a Ph.D. in Computer Science from Texas A&M University.
Posts by Josh Park

Simulation / Modeling / Design

Sparsity in INT8: Training Workflow and Best Practices for NVIDIA TensorRT Acceleration

The training stage of deep learning (DL) models consists of learning numerous dense floating-point weight matrices, which results in a massive amount of... 12 MIN READ
Robotics

Accelerating Quantized Networks with the NVIDIA QAT Toolkit for TensorFlow and NVIDIA TensorRT

We’re excited to announce the NVIDIA Quantization-Aware Training (QAT) Toolkit for TensorFlow 2 with the goal of accelerating the quantized networks with... 9 MIN READ
Data Science

Speeding Up Deep Learning Inference Using TensorFlow, ONNX, and NVIDIA TensorRT

This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. In this post, you learn how to deploy TensorFlow trained deep learning models using... 15 MIN READ
Computer Vision / Video Analytics

Speeding Up Deep Learning Inference Using NVIDIA TensorRT (Updated)

This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. NVIDIA TensorRT is an SDK for deep learning inference. TensorRT provides APIs and... 22 MIN READ
Data Science

Discovering GPU-friendly Deep Neural Networks with Unified Neural Architecture Search

After the first successes of deep learning, designing neural network architectures with desirable performance criteria for a given task (for example, high... 9 MIN READ
Data Science

Estimating Depth with ONNX Models and Custom Layers Using NVIDIA TensorRT

TensorRT is an SDK for high performance, deep learning inference. It includes a deep learning inference optimizer and a runtime that delivers low latency and... 10 MIN READ