Researchers from fast.ai announced a new speed record for training ImageNet to 93 percent accuracy in only 18 minutes.
Fast.ai alumnus Andrew Shaw and Defense Innovation Unit Experimental (DIU) researcher Yaroslav Bulatov achieved the speed record using 128 NVIDIA Tesla V100 Tensor Core GPUs on the Amazon Web Services (AWS) cloud, with the fastai and cuDNN-accelerated PyTorch libraries. For distributed computation, the team used the open-source NVIDIA Collective Communications Library (NCCL), which implements ring-style collectives and is integrated with PyTorch’s all-reduce distributed module.
The result is 40 percent faster than the previous record.
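To give a feel for what a ring-style all-reduce does, here is a minimal single-process sketch in plain Python. It is a toy simulation only, not NCCL's or PyTorch's implementation: the function name `ring_all_reduce` and the list-of-lists "workers" are hypothetical, and each vector length is assumed to be divisible by the number of workers. Every worker passes one chunk to its right-hand neighbor per step, first summing chunks (reduce-scatter), then circulating the finished sums (all-gather).

```python
def ring_all_reduce(worker_data):
    """Simulate a ring all-reduce (sum) over equally sized lists of numbers.

    Hypothetical teaching helper: all "workers" live in one process and
    "sending" is just copying a slice to the next worker's buffer.
    """
    n = len(worker_data)
    length = len(worker_data[0])
    if any(len(v) != length for v in worker_data) or length % n != 0:
        raise ValueError("need equal-length vectors divisible by worker count")
    chunk = length // n
    bufs = [list(v) for v in worker_data]  # private copy per worker

    def chunk_slice(index):
        index %= n
        return slice(index * chunk, (index + 1) * chunk)

    # Phase 1, reduce-scatter: after n - 1 steps, worker r holds the full
    # sum for chunk (r + 1) % n.
    for step in range(n - 1):
        # Build all messages first so the sends behave as simultaneous.
        msgs = [(r, chunk_slice(r - step)) for r in range(n)]
        payloads = [bufs[r][s] for r, s in msgs]
        for (r, s), payload in zip(msgs, payloads):
            dst = (r + 1) % n
            bufs[dst][s] = [a + b for a, b in zip(bufs[dst][s], payload)]

    # Phase 2, all-gather: circulate the completed chunks around the ring.
    for step in range(n - 1):
        msgs = [(r, chunk_slice(r + 1 - step)) for r in range(n)]
        payloads = [bufs[r][s] for r, s in msgs]
        for (r, s), payload in zip(msgs, payloads):
            bufs[(r + 1) % n][s] = payload

    return bufs


# Four simulated workers, each holding an 8-element "gradient".
grads = [[w * 10 + i for i in range(8)] for w in range(4)]
reduced = ring_all_reduce(grads)
expected = [sum(col) for col in zip(*grads)]
print(all(buf == expected for buf in reduced))
```

In this scheme each worker sends only one chunk per step, so per-worker traffic stays roughly constant as workers are added, which is the usual motivation for ring collectives in multi-GPU training.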
“DIU and fast.ai will be releasing software to allow anyone to easily train and monitor their own distributed models on AWS, using the best practices developed in this project,” said Jeremy Howard, a founding researcher at fast.ai. “We entered this competition because we wanted to show that you don’t have to have huge resources to be at the cutting edge of AI research, and we were quite successful in doing so.”
The researchers said they were encouraged by previous speed records achieved on publicly available machines by the AWS team.
“The set of tools developed by fast.ai focused on fast iteration with single-instance experiments, whilst the nexus-scheduler developed by DIU was focused on robustness and multi-machine experiments,” Howard stated.
The team says they achieved the speed record with 16 AWS instances, at a total compute cost of $40.
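As a quick sanity check on those figures, the short calculation below derives the GPUs per instance and the hourly rate the numbers would imply. It is back-of-the-envelope arithmetic only: it assumes the $40 covers exactly 18 minutes on all 16 instances, billed proportionally, which the article does not state, and the variable names are illustrative.

```python
# Figures quoted in the article.
total_gpus = 128
instances = 16
training_minutes = 18
total_cost_usd = 40

gpus_per_instance = total_gpus // instances

# Assumption: every instance is billed for exactly the training time.
instance_hours = instances * training_minutes / 60
implied_rate = total_cost_usd / instance_hours  # USD per instance-hour

print(f"GPUs per instance: {gpus_per_instance}")
print(f"Instance-hours used: {instance_hours:.1f}")
print(f"Implied rate: ${implied_rate:.2f} per instance-hour")
```

The implied rate is only as good as the billing assumption; setup time, storage, or different pricing would change it.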
“We’re not even done yet – we have some ideas for further simple optimizations which we’ll be trying out,” Howard said. “There’s certainly plenty of room to go faster still.”
You can learn more about the record and fast.ai’s implementation on their blog.
Fast.AI Breaks ImageNet Record with NVIDIA V100 Tensor Core GPUs
Aug 10, 2018
