Posts by Nikolay Markovskiy
Agentic AI / Generative AI
Apr 02, 2026
Achieving Single-Digit Microsecond Latency Inference for Capital Markets
In algorithmic trading, reducing response times to market events is crucial. To keep pace with high-speed electronic markets, latency-sensitive firms often use...
13 MIN READ
Data Center / Cloud
Apr 05, 2023
Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI
The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment...
15 MIN READ
Simulation / Modeling / Design
Jul 18, 2018
Neural Machine Translation Inference with TensorRT 4
Neural machine translation exists across a wide variety of consumer applications, including web sites, road signs, generating subtitles in foreign languages, and...
25 MIN READ
Simulation / Modeling / Design
Jun 05, 2014
Drop-in Acceleration of GNU Octave
cuBLAS is an implementation of the BLAS library that leverages the teraflops of performance provided by NVIDIA GPUs. However, cuBLAS cannot be used as a...
7 MIN READ