Many of today’s applications process large volumes of data. While GPU architectures provide very fast HBM or GDDR memory, that memory has limited capacity. Making the most of GPU performance requires keeping the data as close to the GPU as possible.
A defining feature of the new Volta GPU architecture is its Tensor Cores, which give the Tesla V100 accelerator a peak throughput 12 times the 32-bit floating-point throughput of the previous-generation Tesla P100.
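The 12x figure can be checked with a quick back-of-the-envelope calculation. The peak numbers below are assumptions based on NVIDIA's published specifications (Tesla P100 FP32 peak of roughly 10.6 TFLOP/s, Tesla V100 mixed-precision Tensor Core peak of 125 TFLOP/s), not values stated in this article:

```python
# Back-of-the-envelope check of the ~12x claim.
# Assumed published peak rates (TFLOP/s):
p100_fp32_peak = 10.6    # Tesla P100, single-precision
v100_tensor_peak = 125.0  # Tesla V100, mixed-precision Tensor Cores

speedup = v100_tensor_peak / p100_fp32_peak
print(f"Tensor Core peak is ~{speedup:.1f}x P100 FP32 peak")
```

Note that these are theoretical peaks; realized speedups depend on how well a workload maps onto the Tensor Cores' mixed-precision matrix-multiply operations.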
Previously known as CNTK, the Microsoft Cognitive Toolkit version 2.0 lets developers create, train, and evaluate their own neural networks on massive data sets, scaling across multiple GPUs and multiple machines.