Neta Zmora

Neta Zmora is a senior deep learning software architect working on DL acceleration. Before joining NVIDIA in 2020, Neta was a research engineer at Intel’s AI Lab developing methods for deep neural network compression.

Posts by Neta Zmora

AI / Deep Learning

Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT

TensorRT is an SDK for high-performance deep learning inference, and with TensorRT 8.0 you can import models trained using Quantization Aware Training (QAT)… 17 MIN READ