News AMD takes the AI networking battle to Nvidia with Salina 400 DPU & Pollara 400 smart NIC from Pensando

marees

Senior member
Apr 28, 2024

AMD launches new DPUs to boost AI efficiency and network performance in data centres

AMD has launched a new range of data processing units (DPUs) to reduce strain on networks for data centre operators.

Unveiled at the company’s Advancing AI event in San Francisco, the Pensando Salina 400 and Pensando Pollara 400 are designed to enhance AI workload efficiency by improving network routing to avoid traffic congestion.

AMD’s Salina 400 is designed for front-end networks. Featuring 16 Arm Neoverse N1 cores, the DPU targets hyperscalers, enabling intelligent load balancing that utilises the full available bandwidth while minimising network congestion.


The new Pollara 400 networking adapter, meanwhile, optimises back-end networks, keeping performance efficient during intense workloads such as AI training.
The Pollara 400 NIC is the first-ever adapter designed to support the UEC standard for AI and high-performance computing data centre interconnects. Developed by the Ultra Ethernet Consortium, the standard is seen as an alternative to InfiniBand, an interconnect largely used by hardware rival Nvidia.


Sitting at the heart of both of AMD’s new networking solutions is its P4 engine, a compact, fully programmable unit designed to optimise network workloads.
The P4 engine supports 400 gigabits per second (Gbps) of line-rate throughput while multiple services run concurrently on the device.
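
The P4 programming model is usually described as a sequence of match-action tables: selected header fields are matched against table entries, and a hit triggers a programmable action. The Python sketch below is only a toy illustration of that abstraction; the names and fields (Packet, MatchActionTable, set_output_port) are invented for the example and it is not AMD’s engine or real P4 code.

```python
# Minimal, purely illustrative sketch of the match-action abstraction used by
# P4-style programmable pipelines. Table names, header fields, and actions are
# invented for this example; this is not AMD's pipeline or the P4 language.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Packet:
    dst_ip: str
    out_port: Optional[int] = None   # filled in by the forwarding table

class MatchActionTable:
    """Exact-match table: a key extracted from the packet selects an action."""
    def __init__(self, key_fn: Callable[[Packet], str]):
        self.key_fn = key_fn
        self.entries: dict[str, Callable[[Packet], Packet]] = {}

    def add_entry(self, key: str, action: Callable[[Packet], Packet]) -> None:
        self.entries[key] = action

    def apply(self, pkt: Packet) -> Packet:
        action = self.entries.get(self.key_fn(pkt))
        return action(pkt) if action else pkt   # miss -> no-op default action

def set_output_port(port: int) -> Callable[[Packet], Packet]:
    def action(pkt: Packet) -> Packet:
        pkt.out_port = port
        return pkt
    return action

# A one-table "pipeline": look up the destination IP and pick an egress port.
forwarding = MatchActionTable(lambda p: p.dst_ip)
forwarding.add_entry("10.0.0.2", set_output_port(3))

for pkt in (Packet("10.0.0.2"), Packet("10.0.9.9")):
    print(forwarding.apply(pkt))
```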

The devices are essentially designed to ensure AI workloads in data centres powered by vast arrays of GPUs operate at peak efficiency. The hardware manages network congestion to avoid performance degradation, re-routes traffic around network failures, and quickly recovers from occasional packet loss.


https://www.capacitymedia.com/article/amd-launches-new-dpus


Soni Jiandani, SVP and general manager of the network technology solutions group at AMD, said in a press briefing that the Ethernet-based standard can scale to millions of nodes, compared to the foundational architecture of InfiniBand, which is not poised to scale beyond 48,000 nodes “without making dramatic and highly complex workarounds.”

The Pollara 400 is also programmable, enabling it to support further UEC-developed standards as they are released.

“Salina 400 and the Pollara 400 are solving the challenges for both front-end and back-end networks, including faster data ingestion, secure access, intelligent load balancing, congestion management, and fast failover and loss recovery,” Jiandani said.

More on DPU:

A DPU is designed to disaggregate infrastructure and application resources in the data center. It serves as an infrastructure endpoint that exposes network services to a server and its devices while, at the same time, securely exposing the server's and devices' capabilities to the broader infrastructure.

There are key characteristics that DPUs share. Among them are:

  • High-speed networking connectivity (usually multiple 100Gbps-200Gbps interfaces in this generation)
  • High-speed packet processing with specific acceleration and often programmable logic (P4/ P4-like is common)
  • A CPU core complex (often Arm or MIPS based in this generation)
  • Memory controllers (commonly DDR4 but we also see HBM and DDR5 support)
  • Accelerators (often for crypto or storage offload)
  • PCIe Gen4 lanes (run as either root or endpoints)
  • Security and management features (offering a hardware root of trust as an example)
  • Runs its own OS separate from the host system (commonly Linux, but VMware Project Monterey's ESXi on Arm is another example)
https://www.servethehome.com/dpu-vs-smartnic-sth-nic-continuum-framework-for-discussing-nic-types/

 

marees

Senior member
Apr 28, 2024

AMD unveils industry's first Ultra Ethernet ready network card for AI and HPC

Even though the final UEC 1.0 specification is due in Q1 2025.

https://www.tomshardware.com/networ...ra-ethernet-ready-network-card-for-ai-and-hpc


The Ultra Ethernet Consortium (UEC) has delayed the release of version 1.0 of its specification from Q3 2024 to Q1 2025, but it looks like AMD is ready to announce an actual network interface card for AI datacenters that can be deployed into Ultra Ethernet environments. The new unit is the AMD Pensando Pollara 400, which promises up to a six-times performance boost for AI workloads.

The AMD Pensando Pollara 400 is a 400 GbE Ultra Ethernet card based on a processor designed by the company's Pensando unit. The network processor features a programmable hardware pipeline, programmable RDMA transport, programmable congestion control, and communication library acceleration. The NIC will sample in the fourth quarter of 2024 and will be commercially available in the first half of 2025, just after the Ultra Ethernet Consortium formally publishes the UEC 1.0 specification.

The AMD Pensando Pollara 400 AI NIC is designed to optimize AI and HPC networking through several advanced capabilities. One of its key features is intelligent multipathing, which dynamically distributes data packets across optimal routes, preventing network congestion and improving overall efficiency. The NIC also includes path-aware congestion control, which reroutes data away from temporarily congested paths to ensure continuous high-speed data flow.
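
As a rough illustration of what path-aware multipathing means in practice, the Python sketch below load-balances packets across several paths and biases the choice away from a path whose congestion metric is high; the path names and the congestion values are invented for the example and do not reflect the Pollara 400's actual algorithm.

```python
# Conceptual sketch only, not AMD firmware: "path-aware" multipathing spreads
# traffic over several equal-cost paths and steers it away from paths whose
# measured congestion (e.g. ECN marks or RTT inflation) is rising. The path
# names and the 0.0-1.0 congestion metric are assumptions made for the demo.

import random
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    congestion: float = 0.0   # 0.0 = idle, 1.0 = saturated

def pick_path(paths: list[Path]) -> Path:
    # Weight each path by spare capacity so congested paths receive less
    # traffic; a small floor keeps every path probed so recovery is noticed.
    weights = [max(1.0 - p.congestion, 0.01) for p in paths]
    return random.choices(paths, weights=weights, k=1)[0]

paths = [Path("spine-A"), Path("spine-B"), Path("spine-C")]
paths[0].congestion = 0.9     # pretend spine-A is temporarily congested

counts = {p.name: 0 for p in paths}
for _ in range(10_000):
    counts[pick_path(paths).name] += 1

print(counts)   # the vast majority of packets avoid the congested path
```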
 

marees

Senior member
Apr 28, 2024
Nvidia Competition:

Nvidia intros the ‘SuperNIC’ – it’s like a SmartNIC, DPU or IPU, but more super

If you’re doing AI but would rather not do InfiniBand, this NIC is for you


Before considering the SuperNIC, it's worth remembering that SmartNICs/IPUs/DPUs are network interface controllers (NICs) that include modest compute capabilities – sometimes fixed-function ASICs, with or without a couple of Arm cores sprinkled in, or even highly customizable FPGAs.

Many of Intel and AMD's SmartNICs are based around FPGAs, while Nvidia's BlueField-3 class of NICs pairs Arm cores with a bunch of dedicated accelerator blocks for things like storage, networking, and security offload.

This variety means that certain SmartNICs are better suited to, or at the very least marketed towards, certain applications more than others.

https://www.theregister.com/2023/11/21/nvidia_supernic_nic/

For the most part, we've seen SmartNICs – or whatever your preferred vendor wants to call them – deployed in one of two scenarios.

The first is in large cloud and hyperscale datacenters where they're used to offload and accelerate storage, networking, security, and even hypervisor management from the host CPU.

Amazon Web Services' custom Nitro cards are a prime example. The cards are designed to physically separate the cloudy control plane from the host. The result is that more CPU cycles are available to run tenants' workloads.

This is one of the use cases Nvidia has talked up with its BlueField DPUs and has partnered with companies like VMware and Red Hat to integrate the cards into their software and virtualization stacks.

Bypassing bottlenecks

The second application for SmartNICs has focused more heavily on network offload and acceleration, with an emphasis on eliminating bandwidth and latency bottlenecks.

This is the role Nvidia sees for the SuperNIC variant of its BlueField-3 cards. While both BlueField-3 DPUs and SuperNICs are based on the same architecture and share the same silicon, the SuperNIC is a physically smaller device that uses less power, and is optimized for high-bandwidth, low-latency data flows between accelerators.