In part 2 of this series, we focus on solutions that optimize and modernize data center network operations. In the first installment, Optimizing Your Data Center Network, we looked at updating your networking infrastructure and protocols.
NetDevOps is an ideology that has been permeating through the IT infrastructure diaspora for the past 5 years. As a theory, it can provide many areas to optimize infrastructure operations.
We will discuss some applications of NetDevOps that can be applied to your operational workflows.
These include:
- Centralizing configuration management through Infrastructure as Code (IaC.)
- Automating repetitive operations tasks.
- Using automation to implement standardization and consistency in configurations.
- Testing and validating changes using networking digital twin simulations.
Centralizing configuration management with IaC
The principles behind IaC have been used in software development for developers to contribute code to the same software project in parallel. But they also create a centralized repository where the code project—including networking configurations for servers, NICs, routers, and switches—can reside and act as a singular source of truth.
The decentralized aspect of configuration management makes it fundamentally inefficient to enforce standardization. It also makes it difficult to determine the correct configuration or track changes.
Using IaC with source control management software like Git can help resolve issues, ensuring the correct network configurations and code are available to all admins, servers, and switches.
Automating repetitive operations tasks
In large-scale infrastructures, components of the configuration will be the same regardless of the device. Configurations like syslog server, NTP server, SNMP settings, and other management settings can be automated with technology such as Zero Touch Provisioning (ZTP). ZTP can apply configurations to a switch-on boot to reduce the errors that may happen with manual configurations across many devices. Applying standard configurations and executing repetitive tasks are perfect for ZTP as it can be enforced consistently across every device.
Leveraging automation to implement standardization in configurations
Automation normally relies on an external tool to drive configurations after the device has fully booted. Automation is more dynamic and can be applied multiple times in a device’s operational cycle, whereas ZTP is used only during the first boot of each device.
Automation tools such as Ansible and Salt apply configurations at scale using templating technologies and scripting. These tools simplify infrastructure management by building standardized templates and relying only on key/value pair data structures to populate the templates. This way, an operator can be confident of the configurations and focus on validating that the correct configurations are going to the right devices.
Additionally, automation tools can apply configurations at scale. Any fixes for misconfigurations or bugs can be confidently applied to thousands of nodes with minimal effort and no risk of node misconfiguration due to a mistyped command or distracted administrator.
Testing and validating changes in networking Digital Twin simulations
When using automation to apply configurations at scale to multiple nodes, it is critical to understand the larger impact before committing changes. Applying changes to a few nodes as a test often doesn’t reveal what will happen when the changes are applied to every node. The NVIDIA Air infrastructure simulation platform creates a digital twin of the environment for users to test all changes before deploying them.
With a digital twin you can run automation in a safe sandbox, to ensure the changes will not cause any unforeseen outages. Coupling the digital twin with a validation technology, such as NVIDIA NetQ, can create an automated testing pipeline to ensure that all configuration changes do exactly what is expected from each change window.
Conclusion
This series covered ways to optimize your data center network. The first approach was through modernizing the network architecture protocol. The second post focused on delivering operational efficiency gains through NetDevOps.
Optimization is critical to maintaining a high level of service, peak efficiency, and productivity. By leveraging the topics discussed, you’ll be able to optimize your data center network to be a more resilient platform that improves the overall performance of your business and save you money.
I encourage you to find additional ways to streamline data center operations and further optimization by clicking on the additional resource links below.
- NVIDIA Air Infrastructure Simulation Platform & Digital Twins
- NVIDIA NetQ User’s Guide
- Zero Touch Provisioning