Timing is everything, especially when it impacts your customer experiences, bottom line, and production efficiency. Edge AI can help by delivering real-time intelligence and increased privacy in intermittent, low-bandwidth, and low-cost environments.
By 2025, according to Gartner®, 75% of data will be created and processed at the edge, outside the traditional data center or cloud.1 It’s no wonder that thousands of companies are turning to edge AI to drive transformation for their businesses.
As organizations undergo this shift, many IT and business leaders are still in the early stages of planning and executing their edge computing strategies. Because edge AI is a relatively new discipline, many find the process difficult.
NVIDIA, a leading AI infrastructure company, has extensive experience helping customers and partners successfully deploy edge AI solutions.
To help others, the learnings and recommendations from these experiences are presented in An IT Manager’s Guide: How to Successfully Deploy an Edge AI Solution. The whitepaper offers an in-depth look at building and executing a successful edge AI deployment.
This post features recommendations on some key considerations when configuring an edge system.
Edge system configurations: Design recommendations
There are many parameters to consider when sizing a system. The optimal PCIe server configuration will depend on the target workload for that server.
Edge AI applications incorporate a variety of workloads, such as vision AI, natural language processing, recommendations based on industrial sensors, and predictive analytics.
Edge computing sizing considerations
When it comes to designing full hardware and software solutions at the edge, it is important to look at the solution as a whole to understand how the parts work together. Some of the individual considerations that IT must evaluate for edge AI deployments are detailed below.
Number of streams: Each camera feed is a stream that requires a certain amount of memory and compute for processing. Small configurations of six to seven video streams can run on relatively small systems, while larger deployments may require high-performance systems typically seen in the data center.
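As a rough illustration of this sizing exercise, the following sketch scales per-stream resource figures by stream count. The per-stream constants are hypothetical placeholders, not measured values; substitute numbers profiled from your own video pipeline.

```python
# Back-of-the-envelope edge-system sizing from camera stream count.
# The per-stream figures are hypothetical placeholders for illustration;
# replace them with numbers measured from your own pipeline.

PER_STREAM_GPU_MEM_GB = 0.5    # assumed GPU memory per decoded and inferred stream
PER_STREAM_COMPUTE_TOPS = 2.0  # assumed inference compute per stream

def size_for_streams(num_streams: int) -> dict:
    """Return a rough resource estimate for a given number of streams."""
    return {
        "gpu_memory_gb": num_streams * PER_STREAM_GPU_MEM_GB,
        "compute_tops": num_streams * PER_STREAM_COMPUTE_TOPS,
    }

print(size_for_streams(7))   # small configuration
print(size_for_streams(40))  # larger deployment, closer to data center class
```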
Application examples: One of the first steps to a successful edge AI deployment is understanding what workload needs to be run to reach your goals. Vision AI applications like image recognition, people or vehicle detection, and segmentation are all common use cases.
Once an application is determined, it is important to understand the intended scale. For example, are additional AI models needed? Typically, a proof of concept (POC) will consist of a single AI model and use case, but most production deployments ultimately incorporate multiple AI models. The next steps include quantifying the business value of the application, dictating any environmental constraints, and securing stakeholder alignment.
Memory: Perhaps the most common way to under-resource an edge AI solution is to configure the edge systems with too little memory. Edge AI systems require significantly more memory than other applications to support the parallel execution of the inference engine across the CPU and GPUs.
The data science team or application vendor who trained the AI will know the memory requirements of the latest model. IT teams should, at a minimum, double that number to accommodate the inevitable expansion of the model as it retrains. This will also provide some headroom for the additional AI models that will need to be deployed alongside the first one.
Another rule of thumb is to provision twice as much system memory as the total GPU memory, and never less than 1.5x the total GPU memory. The memory should be evenly spread across all CPU sockets and memory channels for optimal performance.
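These rules of thumb reduce to a simple calculation. The sketch below applies them directly, assuming you know the trained model's memory footprint and the system's total GPU memory.

```python
# System-memory sizing from the rules of thumb above:
# at least 2x the model's memory requirement, and 2x total GPU memory
# (never less than 1.5x total GPU memory).

def recommended_system_memory_gb(model_mem_gb: float, total_gpu_mem_gb: float) -> float:
    """Return a recommended system memory size in GB."""
    model_headroom = 2 * model_mem_gb   # room for model growth and additional models
    gpu_target = 2 * total_gpu_mem_gb   # preferred: 2x total GPU memory
    gpu_floor = 1.5 * total_gpu_mem_gb  # hard minimum: 1.5x total GPU memory
    return max(model_headroom, gpu_target, gpu_floor)

# Example: a 10 GB model on a system with two 24 GB GPUs (48 GB total).
print(recommended_system_memory_gb(10, 48))  # -> 96.0
```

Whatever the total, the memory should still be spread evenly across all CPU sockets and memory channels.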
Networking: As operations increasingly rely on digital technologies such as edge computing, resilience is key. There are two networks to consider when designing an edge solution: the network between the edge AI location and cloud, and the network between a sensor and the edge AI system.
Understanding the type of network connectivity in your environment will help determine the specific networking bandwidth requirements for your use case. For example, in a use case like robotics, where wired connectivity may not be possible, 5G is the next best choice, as it offers minimal congestion and guaranteed service and bandwidth.
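Bandwidth requirements can be estimated from stream count and per-stream bitrate. In the sketch below, the bitrates and the 20% overhead factor are assumptions for illustration; actual values depend on codec, resolution, and frame rate.

```python
# Rough uplink bandwidth estimate for camera streams.
# Bitrates and overhead factor are assumptions, not vendor specifications.

BITRATE_MBPS = {
    "1080p_h264": 4.0,  # assumed bitrate for 1080p H.264
    "4k_h265": 15.0,    # assumed bitrate for 4K H.265
}

def required_uplink_mbps(num_streams: int, profile: str, overhead: float = 1.2) -> float:
    """Estimate required bandwidth, padding 20% for protocol overhead and bursts."""
    return num_streams * BITRATE_MBPS[profile] * overhead

print(required_uplink_mbps(7, "1080p_h264"))  # small site sending full streams upstream
```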
Accelerators: Most edge applications run adequately on single-socket x86 or Arm CPUs. But when edge applications incorporate AI capabilities, they become far more compute intensive.
To run an inference engine at the edge, the hardware needs enough compute power to execute complex neural networks with massively parallel computations. A CPU executes the many independent operations of a neural network sequentially, while a discrete accelerator can execute them in parallel. This makes accelerators architecturally suited for AI, providing meaningfully better performance, and they have become an essential component of modern AI infrastructure.
Among the most effective discrete accelerators for edge AI are GPUs and DPUs.
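The architectural difference shows up even in a minimal inference loop. The sketch below, which assumes PyTorch and torchvision are installed and uses a placeholder model and input batch, runs the same inference on the CPU and then offloads it to a CUDA GPU.

```python
# Minimal sketch of CPU vs. GPU inference, assuming PyTorch and torchvision.
# The model and input shapes are placeholders, not a recommended pipeline.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # placeholder vision model
batch = torch.randn(8, 3, 224, 224)           # e.g., 8 camera frames

with torch.no_grad():
    cpu_out = model(batch)  # operations execute largely one after another on the CPU

if torch.cuda.is_available():
    gpu_model = model.to("cuda")
    with torch.no_grad():
        gpu_out = gpu_model(batch.to("cuda"))  # the GPU executes them in parallel
```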
Storage: Naturally, the edge server requires local storage, usually a solid-state drive (SSD), for its operating system, network components, hardware drivers, and application software. Unlike other applications, edge AI solutions typically process a massive amount of unstructured input data such as images, voice, and sensor readings. Depending on how much of this data needs to be stored, for how long, how securely, and how reliably, different storage options are called for.
The first step in determining what storage is necessary for an edge AI solution requires IT teams to think through a data strategy. The data strategy will dictate what and how much data will need to be stored locally or in the cloud. In turn, this will guide which storage options are best for that particular solution. Without a proactive strategy, developers often make inconsistent and suboptimal choices that create problems down the road.
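Once the data strategy fixes what is retained locally and for how long, local capacity follows from stream bitrate and the retention window. The figures in the sketch below are assumptions a data strategy would supply.

```python
# Local storage estimate from a simple retention policy.
# Bitrate and retention period are assumptions set by the data strategy.

def local_storage_tb(num_streams: int, bitrate_mbps: float, retention_days: int) -> float:
    """Storage needed to retain all streams for the retention window, in TB."""
    seconds = retention_days * 24 * 3600
    total_bits = num_streams * bitrate_mbps * 1e6 * seconds
    return total_bits / 8 / 1e12  # bits -> terabytes

# Example: 7 streams at 4 Mbps retained locally for 14 days (~4.2 TB).
print(round(local_storage_tb(7, 4.0, 14), 1))
```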
Security: Security is paramount for edge AI computing devices, as they are deployed in remote locations outside the data center firewalls and the physical protections that limit access to systems. For more details, see Edge Computing: Considerations for Security Architects.
When it comes to an edge AI solution, five areas should be understood and made part of the overall solution architecture: end-to-end encryption, mutual authentication, physical security, zero-trust networking, and real-time monitoring.
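As one concrete example, end-to-end encryption and mutual authentication between an edge system and its backend are often implemented with mutual TLS. The sketch below uses Python's standard ssl module; the certificate paths are hypothetical and would be provisioned by your PKI.

```python
# Minimal mutual-TLS server context covering two of the five areas above:
# end-to-end encryption and mutual authentication.
# Certificate and key paths are hypothetical placeholders.
import ssl

context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
context.load_cert_chain(certfile="edge-server.crt", keyfile="edge-server.key")
context.load_verify_locations(cafile="fleet-ca.pem")  # trust only the fleet's CA
context.verify_mode = ssl.CERT_REQUIRED  # reject clients without a valid client certificate
```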
Management: A remote management plan is critical for edge environments because systems at the edge are distributed, always on, and often operate in remote settings. See Remotely Operating Systems and Applications at the Edge to learn more.
An edge management solution should provide automatic deployment and provisioning, ongoing management, real-time alerting, and auditing, and it should be built on modern, cloud-native tools.
Organizations can choose whether to build or buy a management solution. The following are questions to consider: How quickly does a solution need to be set up? Is the appropriate team and expertise available? Does the solution provide secure management of the edge environment?
Pillars of a successful edge deployment
Deploying the infrastructure needed to support a scalable edge AI solution is a big challenge. The process is iterative and time consuming, yet critical to do correctly. Decisions that are made when building an edge AI solution have far-reaching implications that will impact an organization’s business outcomes.
For more guidance on this topic, download An IT Manager’s Guide: How to Successfully Deploy an Edge AI Solution.
References
1. Gartner, “Building an Edge Computing Strategy,” G00753920, September 2021. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.