
Open Source Time Synchronization Services for Data Center Operators

Scalable PTP stack

Applications are increasingly real-time and delay-sensitive, from distributed databases and 5G radio access networks (RANs) to gaming, video streaming, high-performance computing (HPC), and the metaverse. Nanosecond resolution time sync augments conventional computing in many ways, including:

  • improving the accuracy, efficiency, and security of data management systems by ensuring that distributed databases are kept up-to-date and consistent with each other
  • enhancing security policies by distinguishing authentic user activity from malicious and robotic activity simply by examining latency patterns
  • enabling the “Ready Player One”-style metaverse of gaming worlds
  • creating immersive shopping experiences, helping customers make informed purchasing decisions and reduce checkout hassle with computer vision and real-time analytics
  • automating large factories and facilities, driving production lines, warehouses, and machinery to new efficiencies by enabling the digital factory twin to mimic the real one, and vice versa
  • maintaining the accuracy, correct distribution, and on-time processing of the incoming bands in 5G networks

Through a series of collaborations, NVIDIA, Meta, and others in the Open Compute Project Time Appliance Project (OCP-TAP) established blueprints for a modern time synchronization solution that is open, reliable, and scalable.

An open time synchronization solution

Meta achieved submicrosecond precision within its large, globally distributed data centers, using hardware timestamping on commodity servers, even under CPU and network load and across temperature variations.

Until recently, deploying Precision Time Protocol (PTP) at such a high scale required specialized and dedicated hardware and software components. In addition, there was an absence of good blueprints for how to enable precise time services in data centers. 

Precision Time Protocol tree topology diagram for data centers, including spine switches, ToR switches, NVIDIA NICs, and Open Time Server.
Figure 1. Precision Time Protocol (PTP) tree for data centers

This is where OCP-TAP comes in; specifically, the Time Card innovation enables Meta to synchronize time between data centers. PTP IEEE 1588, applied on network interface cards (NICs) and networking devices like the NVIDIA ConnectX, synchronizes all the machines within a data center over the network.
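At the heart of that network synchronization is the IEEE 1588 two-way timestamp exchange: the master's Sync send time (t1), the client's receive time (t2), the client's Delay_Req send time (t3), and the master's receive time (t4). A minimal sketch of the arithmetic, with timestamps invented for illustration:

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """IEEE 1588 delay request-response mechanism.
    t1: Sync sent by master, t2: Sync received by client,
    t3: Delay_Req sent by client, t4: Delay_Req received by master.
    All timestamps in nanoseconds; assumes a symmetric path."""
    offset = ((t2 - t1) - (t4 - t3)) / 2   # client clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2    # one-way path delay
    return offset, delay

# Example: client runs 500 ns ahead of the master, path delay is 2,000 ns.
off, dly = ptp_offset_and_delay(t1=0, t2=2500, t3=10000, t4=11500)
# off == 500.0, dly == 2000.0
```

The symmetric-path assumption is exactly what transparent clocks (discussed below) help preserve, by accounting for variable queuing delay inside switches.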

A time server that scales to millions of clients

The Open Time Server, which is open sourced by the OCP-TAP community, maintains the authoritative source of time for the data center. 

Diagram of the Open Time Server layers: Management and Monitoring, System Software, Time Card, NIC, and COTS Server.
Figure 2. Layers of the Open Time Server

The Time Card can support millions of clients/syncs. The NIC is capable of “full-wire-speed hardware timestamping.” Bottlenecks are pushed to the software domain.

Meta engineers rewrote the entire master functionality of a PTP daemon, with a software architecture and design built for scalability. This stack is now known as PTP4U, a scalable PTP stack. For more details, visit facebook/time on GitHub.
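PTP4U itself is written in Go; as a loose illustration of the serving model (one lightweight worker per subscribed client, each sending Sync messages at that client's negotiated interval), here is a hypothetical Python asyncio sketch with invented names, not the actual implementation:

```python
import asyncio

async def subscriber_loop(client_id, interval_s, n_syncs, sent):
    # One cheap task per subscribed client; each iteration stands in
    # for sending a Sync/Follow_Up pair over UDP to that client.
    for seq in range(n_syncs):
        sent.append((client_id, seq))
        await asyncio.sleep(interval_s)

async def serve(n_clients, interval_s=0.0, n_syncs=3):
    sent = []
    await asyncio.gather(*(
        subscriber_loop(c, interval_s, n_syncs, sent)
        for c in range(n_clients)
    ))
    return sent

# 1,000 simulated subscribers, 3 Sync messages each.
sent = asyncio.run(serve(n_clients=1000))
```

The point of the design is that per-client state is tiny, so the bottleneck becomes raw packet throughput rather than connection bookkeeping.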

The Open Time Server consistently supported over 1 million clients (ordinary clocks) with a synchronization frequency of 1 Hz, using the PTP4U server software.

Screenshot of PTP4U software in action, syncing over 1 million clients.
Figure 3. The scalable PTP stack PTP4U scales beyond 1 million clients

Commercial grandmaster clocks support up to several hundred clients, while hyperscale data centers require many orders of magnitude more. The need to support timing at remote edge locations of the network further increases the required scale.

Building a huge Open Time Server

If you had asked a PTP expert how to scale a PTP solution in the summer of 2021, the answer would likely have been to use a boundary clock (BC). There are two challenges with introducing BCs into data centers. 

The first challenge is operational. While not specific to Meta, BC implementation on network switches assumes certain hardware and software support. Introducing BCs into existing brownfield deployments poses a significant risk. The switches are the core elements of the entire network, and enabling BCs on all participating switches would require requalifying the entire network. This is a long, intensive, expensive, and risky procedure; at that time, the ROI would have been impossible to justify.

The second challenge relates to synchronization technology mandating that each compute node know not only the precise time, but also the uncertainty window, or degree of accuracy. For more details, see Spanner, TrueTime, and the CAP Theorem.

This means having an easy method to determine, for every participating node in the data center, the time offset from the grandmaster (and not only from the direct master as for BCs). A Time Server scaling to millions could rely on transparent clocks (TCs), avoiding BCs altogether.
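The value of a known uncertainty window is that events can be ordered across machines without coordination. A hypothetical TrueTime-style sketch (class and function names invented here; units are nanoseconds):

```python
class Uncertain:
    """A time reading with a bounded uncertainty: the true time is
    guaranteed to lie in [earliest, latest]."""
    def __init__(self, now_ns, eps_ns):
        self.earliest = now_ns - eps_ns
        self.latest = now_ns + eps_ns

def definitely_before(a, b):
    # a provably happened before b only if their windows cannot overlap.
    return a.latest < b.earliest

# With +/-100 ns windows, events 500 ns apart can be ordered ...
tight_a = Uncertain(1_000_000, eps_ns=100)
tight_b = Uncertain(1_000_500, eps_ns=100)
# ... but with +/-1,000 ns windows, the same two events cannot.
loose_a = Uncertain(1_000_000, eps_ns=1000)
loose_b = Uncertain(1_000_500, eps_ns=1000)
```

This is why knowing the offset from the grandmaster directly (rather than from an intermediate master) matters: it keeps the uncertainty window small and well defined for every node.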

Transparent clocks for data centers

Transparent clocks do not contribute to the total accumulated noise of the clock tree, simply because TCs are not really clocks, and they do not discipline any clocks. Instead, TCs simply publish the packets’ residency time, typically less than 1 microsecond, a small enough period that even a simple oscillator will not drift dramatically. 

Transparent clocks also reduce the operational complexity. They do not run software daemons and are more commonly supported by existing switches. This makes the introduction of PTP into brownfield data centers much simpler. 

Finally, TCs are transparent, such that each node is disciplined directly by the grandmaster clock. This facilitates directly figuring out the uncertainty window for all participating nodes.
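Mechanically, each end-to-end transparent clock adds its measured residency time to the correctionField of the passing PTP message, and the end node subtracts the accumulated correction. A simplified sketch with invented numbers:

```python
def corrected_interval(t1, t2, residence_times_ns):
    """t1: Sync egress timestamp at the grandmaster; t2: ingress
    timestamp at the client. Each switch on the path adds its residency
    time to the PTP correctionField instead of disciplining any clock."""
    correction = sum(residence_times_ns)
    # What remains after removing switch queuing is (path delay + offset),
    # as if the client were wired directly to the grandmaster.
    return (t2 - t1) - correction

# Example: three switches queue the Sync for 800, 350, and 120 ns.
raw = corrected_interval(t1=0, t2=5270, residence_times_ns=[800, 350, 120])
# raw == 4000
```

Because only the (variable) queuing delay is removed, the switches never need a disciplined clock of their own, just a free-running oscillator good enough over a sub-microsecond residency.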

Precision and accuracy in hardware 

A monolithic hardware clock supporting UTC is key for timestamping packets at full wire speed, even in high-speed networks. NVIDIA added support for hardware timestamping in PTP4L (the IEEE 1588-2008 Linux PTP daemon), which enables the system and applications to obtain time in UTC format.

NVIDIA also made several other changes to PTP4L to improve its accuracy; for example, adding support for the use of a hardware reference clock, which can provide a higher level of accuracy than a software-based clock.
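On Linux, the NIC's PTP hardware clock (PHC) is exposed as a dynamic POSIX clock through a character device such as /dev/ptp0. A hedged sketch of reading it (the device path is illustrative, and actually reading it requires a host with PTP-capable hardware; the clock-ID encoding follows the kernel's dynamic POSIX clock convention):

```python
import os
import time

def fd_to_clockid(fd):
    # Linux dynamic POSIX clock convention: encode an open file
    # descriptor for a /dev/ptpN device into a clockid_t value.
    return ((~fd) << 3) | 3

def read_phc_ns(path="/dev/ptp0"):
    # Returns the PHC time in nanoseconds; works only on hosts
    # that expose a PTP hardware clock device.
    fd = os.open(path, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)
```

Reading the PHC directly (rather than the system clock) is what lets applications see the hardware-disciplined time that the NIC uses to timestamp packets on the wire.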

Testing PTP reliability at scale

Studying how well PTP runs on a high-scale network mandates a method to constantly measure, gauge, and validate the synchronization precision at a high scale. We devised an infinitely scalable test method using the ConnectX-6 Dx Pulse Per Second input (PPS-In) as the measurement. (The PPS-Out method would max out at a handful of devices.)

To do this, we configured the ConnectX to run in real-time clock mode and chained devices, feeding each device's PPS-Out into the next device's PPS-In (Figure 4). Using this method, we characterized very large PTP trees and validated our PTP solution down to the nanosecond level.
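One caveat of daisy-chaining is that each hop's measurement error accumulates along the chain: a device's offset from the root PPS source is the sum of the per-hop offsets measured up to it. A small sketch with invented numbers:

```python
from itertools import accumulate

def offsets_from_root(per_hop_offsets_ns):
    # Offset of device k relative to the root PPS signal is the running
    # sum of the per-hop offsets measured along the chain up to k.
    return list(accumulate(per_hop_offsets_ns))

# Three chained hops measuring +3, -2, and +5 ns against their
# predecessor put the last device 6 ns from the root signal.
chain = offsets_from_root([3, -2, 5])
# chain == [3, 1, 6]
```

Keeping the per-hop error small (and bounded) is what makes the chain effectively infinitely extensible for characterization purposes.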

Test scheme diagram using PPS-In to characterize very large PTP trees. Diagram includes RF splitter, boundary clocks, ordinary clocks, and RF cables.
Figure 4. An infinitely scalable test scheme with PPS-In

Summary 

The Time Synchronization infrastructure blueprints are available for everyone and are ready for cloud providers and operators. NVIDIA will continue to invest in high-precision time synchronization with the goal to enhance all product lines and solutions. 

The journey is not yet complete. Sharing our work with the Open Compute TAP community, and working with our partners to build more blueprints for various use cases will be key to helping this solution become common and relatively easy to deploy.

