AMD Helios: A New Era in AI and HPC Infrastructure

Dwijesh t

At the OCP Global Summit, AMD unveiled the Helios rack-scale AI platform, a high-performance, open-standard solution designed for next-generation AI workloads and high-performance computing (HPC). With a focus on scalability, efficiency, and interoperability, Helios embodies AMD’s vision of open, modular, and powerful data center infrastructure.

Open Standards and Design Philosophy

Helios is built on the Open Rack Wide (ORW) standard, contributed by Meta to the Open Compute Project (OCP). This reference design extends AMD’s open-hardware philosophy beyond individual CPUs and GPUs to the entire system and rack level, minimizing vendor lock-in while maximizing flexibility.

Key open standards integrated in Helios include:

  • OCP DC-MHS (Data Center Modular Hardware System)
  • UALink (Ultra Accelerator Link) for high-speed GPU-to-GPU connectivity
  • Ultra Ethernet Consortium (UEC) architectures for scale-out fabric support
  • UALink-over-Ethernet for flexible, standardized interconnects

This open approach ensures interoperability and modularity across hyperscale deployments.

Core Components of the Helios Platform

Helios combines AMD’s top-tier technologies for maximum AI and HPC performance:

  • GPUs: AMD Instinct™ MI450 Series GPUs (CDNA™ architecture), supporting up to 72 GPUs per rack in some configurations
  • CPUs: AMD EPYC™ processors, including future generations like “Venice”
  • Networking: Advanced AMD Pensando™ networking, featuring Pollara 400 AI NICs for low-latency, high-bandwidth connectivity across racks and clusters

This combination delivers an integrated platform capable of massive parallel processing and ultra-fast data transfer across the system.

Performance and Specifications

Helios is engineered for exascale-class AI training and HPC workloads:

  • Peak Performance: Up to 1.4 exaFLOPS at FP8 precision and 2.9 exaFLOPS at FP4 precision
  • Memory Capacity: Up to 31 TB of HBM4 memory per rack
  • Memory Bandwidth: Up to 1.4 PB/s aggregate bandwidth
  • Interconnect Bandwidth: Scale-up interconnect ~260 TB/s; Ethernet scale-out ~43 TB/s

These specifications make Helios a powerful contender for AI supercomputing, capable of handling the largest and most complex AI models.
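The rack-level figures above can be sanity-checked with simple arithmetic. The sketch below assumes the 72-GPU rack configuration mentioned earlier and divides the rack totals evenly across GPUs; the per-GPU values it derives are estimates, not AMD-published specifications:

```python
# Back-of-envelope check of the per-rack Helios figures quoted above.
# Assumption: a 72-GPU rack configuration, with rack totals split
# evenly across GPUs.

GPUS_PER_RACK = 72        # assumed configuration
RACK_HBM4_TB = 31         # total HBM4 memory per rack
RACK_BW_PBS = 1.4         # aggregate memory bandwidth, PB/s
FP8_EXAFLOPS = 1.4        # peak FP8 throughput per rack
FP4_EXAFLOPS = 2.9        # peak FP4 throughput per rack

per_gpu_hbm_gb = RACK_HBM4_TB * 1000 / GPUS_PER_RACK  # ~430 GB per GPU
per_gpu_bw_tbs = RACK_BW_PBS * 1000 / GPUS_PER_RACK   # ~19.4 TB/s per GPU
fp4_speedup = FP4_EXAFLOPS / FP8_EXAFLOPS             # ~2x, as expected
                                                      # when halving precision
print(f"HBM4 per GPU:      {per_gpu_hbm_gb:.0f} GB")
print(f"Bandwidth per GPU: {per_gpu_bw_tbs:.1f} TB/s")
print(f"FP4 vs FP8 ratio:  {fp4_speedup:.2f}x")
```

The roughly 2x jump from FP8 to FP4 is consistent with the usual doubling of arithmetic throughput when operand precision is halved.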

Physical Design and Efficiency

  • Open Rack Wide Form Factor: Double-wide rack optimized for dense cooling and serviceability
  • Cooling: Quick-disconnect liquid cooling to maintain sustained thermal performance
  • Serviceability: Easy access for maintenance in dense data center environments
  • Resiliency: Ethernet-based multi-path redundancy for continuous AI workload operation

Market Role and Adoption

Helios serves as a reference design for OEMs, ODMs, and cloud providers, accelerating adoption of open AI systems at scale. Major cloud providers, such as Oracle, plan to deploy AI superclusters using Helios and MI450 GPUs by late 2026.

By offering open-standard architecture and competitive hardware, AMD positions Helios as a direct rival to platforms like Nvidia’s Vera Rubin AI infrastructure.

The AMD Helios rack-scale AI platform represents a bold step forward in scalable, open-standard AI infrastructure. With its combination of AMD Instinct GPUs, EPYC CPUs, high-bandwidth networking, and exascale-class performance, Helios is poised to power the world’s largest AI models and redefine the future of AI and HPC workloads.
