Generative AI is revolutionizing virtually every use case across every industry, thanks to the constant influx of groundbreaking foundation models capable of understanding context and reasoning to generate high-quality content and high-accuracy answers.
NVIDIA is constantly optimizing and publishing community-, partner-, and NVIDIA-built models. This week's release features two model families, Phi-3 and Granite Code, as part of the NVIDIA AI Foundation models.
Phi-3 language models
The Phi-3 series from Microsoft contains small language models (SLMs) engineered for optimal performance without compromising computational efficiency. Their strong reasoning and logic capabilities make them well suited for content generation, summarization, question-answering, and sentiment analysis tasks.
The Phi-3 language family on the NVIDIA API catalog includes the following:
- Phi-3-medium
- Phi-3-small (short and long context)
- Phi-3-mini
Phi-3 vision model
The Phi-3 family also includes Phi-3 Vision, a 4.2B-parameter multimodal model designed to process and interpret both text and visual data. The model supports a 128K-token context length, enabling it to understand and analyze extensive and complex visual elements within images, such as charts, graphs, and tables.
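A multimodal request pairs a text question with an encoded image in a single message. The sketch below builds such a payload in the OpenAI-style schema commonly used by hosted endpoints; the exact field names and the model ID are assumptions, so check the model card on the API catalog for the schema your endpoint expects.

```python
import base64

def build_vision_request(question: str, image_bytes: bytes,
                         model: str = "microsoft/phi-3-vision-128k-instruct") -> dict:
    """Build a hypothetical multimodal chat payload: one user message
    mixing a text part and a base64-encoded image part (OpenAI-style
    convention; the actual endpoint schema may differ)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,  # hypothetical model ID for illustration
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        "max_tokens": 512,
    }
```

With a chart screenshot as `image_bytes`, a question like "What is the trend in Q3?" would exercise the model's ability to read tables and graphs described above.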
Granite Code
Granite Code models, published by IBM, are open programming models designed to assist with various coding tasks. Trained on 116 programming languages, the models can generate code examples, identify and fix errors, and provide explanations of code segments.
The models have demonstrated state-of-the-art performance on coding benchmarks and are trained on permissively licensed data, making them suitable for enterprise use.
Optimized for performance
These models are optimized for latency and throughput using NVIDIA TensorRT-LLM. They join over three dozen popular AI models that are supported by NVIDIA NIM, a microservice designed to simplify the deployment of performance-optimized NVIDIA AI Foundation models and custom models. NIM enables 10–100x more enterprise application developers to contribute to AI transformations.
NVIDIA is working with leading model builders to support their models on a fully accelerated stack. These include popular models like the following:
- Llama3-70B
- Llama3-8B
- Gemma 2B
- Mixtral 8X22B
- And many more
Get started
To experience, customize, and deploy these models in enterprise applications, see the API catalog.
With free NVIDIA cloud credits, you can start testing the models at scale and build a proof of concept (POC) by connecting your application to the NVIDIA-hosted API endpoint running on a fully accelerated stack.
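As a minimal sketch of connecting an application to a hosted endpoint, the snippet below posts an OpenAI-style chat-completions request. The URL, model ID, and `NVIDIA_API_KEY` environment variable are assumptions for illustration; the API catalog's model pages show the exact invocation details for each model.

```python
import json
import os

# Assumed endpoint and model ID, following the OpenAI-compatible
# chat-completions convention; verify both against the API catalog.
INVOKE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "microsoft/phi-3-mini-128k-instruct"  # hypothetical model ID

def build_request(prompt: str, model: str = MODEL_ID) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
        "max_tokens": 256,
    }

def main() -> None:
    import requests  # third-party: pip install requests
    headers = {
        # Assumed auth scheme: bearer token from an API-catalog key.
        "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
        "Accept": "application/json",
    }
    payload = build_request("Summarize the Phi-3 model family in two sentences.")
    resp = requests.post(INVOKE_URL, headers=headers, json=payload, timeout=60)
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])

if __name__ == "__main__":
    if os.environ.get("NVIDIA_API_KEY"):
        main()
    else:
        # No credentials: just show the payload that would be sent.
        print(json.dumps(build_request("hello"), indent=2))
```

Because the endpoint is OpenAI-compatible, the same payload shape works across the catalog's models by swapping the `model` field.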