eCommerce and Open Ethernet: Criteo Clicks with SONiC

When you see a browser ad for a new restaurant, or the perfect gift for that hard-to-please family member, you probably aren’t thinking about the infrastructure used to deliver that ad. However, that infrastructure is what allows advertising companies like Criteo to provide these insights. The NVIDIA networking portfolio is essential to Criteo’s technology stack.

Criteo is an online advertising platform, the tier between digital advertisers and publishers. This business requires Criteo to solve problems at “web scale.” Criteo processes hundreds of billions of dollars in sales, driven by billions of ads a day over tens of thousands of servers, thousands of networking devices, and terabits of east-west traffic per second. The communication within and between Criteo’s 10 data centers (across three continents) is of paramount importance, with the network taking center stage.

Moving away from lock-in

Starting in 2014, Criteo embarked on an initiative to completely overhaul their networking strategy, modernize infrastructure, and reduce costs. By multisourcing hardware from different vendors, Criteo would be able to reduce costs, gain more flexibility in the procurement process, and become less dependent on individual vendor supply chains.

Criteo's networking journey including monolithic era, multi-vendor, and network agility.
Figure 1. Criteo’s journey to change their networking approach started in 2014 and continues today

With a new hardware approach, software came next. Criteo needed their OS to be compatible with their networking automation stack, consisting of in-house, hardware-agnostic tooling built mostly in Python. However, each new OS added to the mix would require unique updates to the rest of the stack to support it. Additionally, while vendor hardware was often affordable, the proprietary software attached ballooned the budget. 
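To make the cost of each additional OS concrete, here is a minimal Python sketch of a hardware-agnostic automation layer. It is not Criteo’s tooling: the class names, the `render_for_fleet` helper, the vendor labels, and the configuration syntax are all hypothetical, chosen only to show that every OS in the mix needs its own adapter before the rest of the stack can use it.

```python
from abc import ABC, abstractmethod


class NosDriver(ABC):
    """Hypothetical per-OS adapter that the automation stack calls into."""

    @abstractmethod
    def render_interface_description(self, port: str, description: str) -> str:
        """Return OS-specific configuration text for one port."""


class VendorADriver(NosDriver):
    # Made-up syntax for an imaginary "vendor A" OS.
    def render_interface_description(self, port: str, description: str) -> str:
        return f"interface {port}\n description {description}"


class VendorBDriver(NosDriver):
    # Made-up syntax for an imaginary "vendor B" OS.
    def render_interface_description(self, port: str, description: str) -> str:
        return f'set port {port} label "{description}"'


# Every OS in the fleet needs an entry here, plus its own tests and upkeep.
DRIVERS = {
    "vendor-a": VendorADriver(),
    "vendor-b": VendorBDriver(),
}


def render_for_fleet(fleet: dict, port: str, description: str) -> dict:
    """Render the same intent for each device; `fleet` maps device name to OS name."""
    return {
        device: DRIVERS[os_name].render_interface_description(port, description)
        for device, os_name in fleet.items()
    }


if __name__ == "__main__":
    fleet = {"leaf1": "vendor-a", "leaf2": "vendor-b"}
    for device, config in render_for_fleet(fleet, "p1", "uplink").items():
        print(device, "->", repr(config))
```

In a layout like this, a device running an OS with no registered driver fails with a `KeyError`, which is the code-level version of the problem described above. Standardizing on a single OS would collapse the registry to one driver.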

Picking one OS for all the platforms solved both problems. Enter SONiC: after attending the Open Compute Project (OCP) Global Summit, Criteo began to evaluate this network operating system (NOS) in early 2018. As an open-source OS conceived by Microsoft and the OCP to meet the needs of the hyperscalers, SONiC had the design and functionality to meet Criteo’s needs. Moreover, SONiC’s openness meshed perfectly with Criteo’s flexible hardware sourcing strategy and would fully unlock their networking stack.

Turning over a New Leaf with NVIDIA

Rather than treating NVIDIA as just a vendor, Criteo partners with NVIDIA on SONiC, with NVIDIA maintaining and developing SONiC’s feature set and Criteo providing input. This partnership stems from the way NVIDIA offers SONiC to customers. Rather than build a proprietary branch from the community releases, NVIDIA supports the community release of the OS as “pure SONiC,” without any add-ons. As one of the leading contributors to the SONiC codebase, NVIDIA is uniquely positioned to influence SONiC’s roadmap and make Criteo’s visions a reality.

Additionally, with NVIDIA providing ASIC-to-Protocol (A2P) support, the network team can rely on NVIDIA to triage networking issues at any level with minimal interruption. Criteo also benefits from the reach of NVIDIA in the space. NVIDIA develops the features and contributes them to the community main branch, maintaining a pure SONiC commitment and allowing Criteo freedom of choice.

Timeline of Criteo evaluation and achievements with SONiC.
Figure 2. Criteo was an early adopter of SONiC in 2018, working through early challenges on the way to full data center rollout

Summary

Measured against its original mission, Criteo’s 2014 project is on target: costs are under control, deployment flexibility is growing, and the network team has picked up some handy DevOps and CI/CD skills. But the goal remains a work in progress; Criteo sees a day when all infrastructure, including their management network, is running SONiC, with truly one NOS to rule them all. So next time, when you see that killer ad, maybe you’ll also think about the network fabric that makes it possible.

For more information see the following resources:
