
Introduction
Over the past several years, the Clos or fat-tree architecture has been the dominant model for building large-scale datacenter networks. Its success stems from a simple but powerful idea: provide bandwidth and path diversity in abundance. Multiple equal-cost paths exist between nearly every source and destination pair, allowing relatively simple forwarding techniques such as ECMP to distribute traffic across a highly structured topology.
The emergence of large-scale AI clusters has begun to challenge the economics of this approach. As clusters scale to tens or hundreds of thousands of GPUs, the cost of the network itself becomes increasingly significant. Switches, optics, cabling, and power consumption now represent meaningful portions of overall infrastructure cost, creating strong incentives to find architectures that deliver comparable performance with less infrastructure.
Recent research on network optimization has explored several approaches to this objective, some of which have already been deployed successfully. The Rail-Only architecture eliminates the spine layer and leverages workload-specific communication characteristics to maintain application performance while reducing network infrastructure costs by approximately 38% to 77% and network power consumption by 37% to 75%. [R1]
More recently, Random Network Graph (RNG) architectures have demonstrated that expander topologies can achieve similar performance and resilience characteristics while reducing network costs by as much as 45% relative to traditional Clos designs. [R2]

Although these architectures differ significantly, they share a common goal: extracting greater value from less infrastructure. Traditional Clos fabrics achieve performance through structural abundance. Rail-Only and RNG fabrics achieve performance through structural efficiency.
This shift has important implications for traffic engineering.
In a Clos fabric, excess topology often masks inefficiencies because abundant path diversity and spare capacity are available throughout the network. In contrast, rail-only and random graph fabrics intentionally remove infrastructure to reduce cost. Every switch, optic, and link eliminated from the design increases the importance of efficiently utilizing the resources that remain.
This creates a natural progression in datacenter networking. As architectures evolve from Clos fabrics toward increasingly efficient topologies, traffic engineering becomes necessary. Techniques that focus on selecting among abundant paths are well aligned with highly structured Clos environments. Techniques that focus on maximizing utilization of scarce resources become increasingly valuable as topologies become more efficient, more irregular, and more dependent on extracting the greatest possible value from every remaining network resource.
Understanding this relationship between topology efficiency and traffic engineering is essential to understanding the next generation of datacenter fabrics.
The Economics of Topology
Bandwidth is not the only resource in a network. Topology is also a resource.
Path diversity, forwarding alternatives, resilience to congestion, and the ability to reroute traffic around bottlenecks are all products of the underlying graph. Traditional Clos fabrics provide these resources in abundance. When multiple equivalent paths exist everywhere in the network, the consequences of an imperfect routing decision are often limited.
The economics change when topology is intentionally reduced.
A rail-only architecture derives much of its value from eliminating expensive infrastructure. A random graph fabric derives value from providing similar performance with significantly fewer resources than a traditional Clos design. In both cases, network operators are effectively converting excess topology into capital savings.
The consequence is that every remaining link becomes more valuable.
A congested link in an overbuilt Clos fabric may represent a small fraction of the network’s total forwarding capacity. The same link in a rail-only or random graph architecture may represent a much larger portion of the available resources. Improving utilization of that resource therefore produces a larger economic return. This creates what one could call the topology efficiency paradox:
The more successful a fabric architecture is at removing infrastructure, the greater the need for optimizing the infrastructure that remains.
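A small worked calculation makes the paradox concrete. The link counts below are hypothetical and chosen only for illustration, and the sketch assumes every link has the same capacity, which real fabrics do not guarantee.

```python
def single_link_share(total_links: int) -> float:
    """Fraction of total fabric capacity carried by one link,
    assuming all links have equal capacity (a simplification)."""
    if total_links <= 0:
        raise ValueError("total_links must be positive")
    return 1.0 / total_links


# Hypothetical link counts, not taken from any cited paper.
clos_links = 1000     # overbuilt Clos fabric
pruned_links = 400    # topology-efficient fabric serving the same endpoints

clos_share = single_link_share(clos_links)
pruned_share = single_link_share(pruned_links)

# How much more of the fabric each remaining link represents.
relative_weight = clos_links / pruned_links

print(f"Clos:   one link = {clos_share:.2%} of fabric capacity")
print(f"Pruned: one link = {pruned_share:.2%} of fabric capacity")
print(f"Each remaining link carries {relative_weight:.1f}x the relative weight")
```

Under these assumptions, a congested link in the pruned fabric costs proportionally more of the network, so the same utilization improvement is worth more.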
MRC: Optimizing Structured Fabrics
Multipath Reliable Connection (MRC) continuously monitors path quality within a highly structured Clos environment in order to identify paths that are congested or degraded. Using telemetry signals such as ECN markings and queueing behavior, MRC evaluates candidate paths and dynamically steers traffic away from paths that appear degraded. [R4] This is a logical optimization strategy for a Clos fabric.

In a Clos network, path diversity is already abundant. The challenge is not discovering new forwarding opportunities. The challenge is selecting among a large collection of available alternatives and avoiding paths that are currently underperforming.
MRC therefore operates primarily as a path-selection mechanism. It continuously evaluates a known set of forwarding choices and removes paths from consideration when they exhibit undesirable characteristics.
This model aligns naturally with highly structured fabrics where forwarding alternatives are plentiful and relatively predictable.
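The path-selection idea can be sketched in a few lines. This is not the algorithm from the MRC specification [R4]; it is a minimal illustration that assumes a fixed set of candidate paths, a smoothed ECN-mark fraction per path, and a simple eviction threshold. The class name, parameters, and default values are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class PathState:
    ecn_ewma: float = 0.0  # smoothed fraction of ECN-marked packets


class PathSelector:
    """Toy path selector: avoid paths whose ECN-mark rate is too high."""

    def __init__(self, path_ids, alpha=0.25, evict_threshold=0.5, seed=None):
        self.paths = {p: PathState() for p in path_ids}
        self.alpha = alpha                      # EWMA smoothing factor (assumed)
        self.evict_threshold = evict_threshold  # exclusion threshold (assumed)
        self.rng = random.Random(seed)

    def record_feedback(self, path_id, ecn_marked: bool) -> None:
        """Fold one acknowledgement's ECN signal into the path's average."""
        state = self.paths[path_id]
        sample = 1.0 if ecn_marked else 0.0
        state.ecn_ewma = (1 - self.alpha) * state.ecn_ewma + self.alpha * sample

    def healthy_paths(self):
        """Paths below the threshold; fall back to all paths if none qualify."""
        healthy = [p for p, s in self.paths.items()
                   if s.ecn_ewma < self.evict_threshold]
        return healthy or list(self.paths)

    def pick(self):
        """Choose uniformly at random among the currently healthy paths."""
        return self.rng.choice(self.healthy_paths())


selector = PathSelector(path_ids=[0, 1, 2, 3], seed=1)
for _ in range(10):
    selector.record_feedback(2, ecn_marked=True)  # path 2 looks congested
print(selector.healthy_paths())
```

Note that the selector only ever chooses among paths it was given. That is the defining property of path selection: it prunes a known set rather than deciding how the topology as a whole should be used.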
MRC’s implementation also relies heavily on host-driven SRv6 traffic steering [R6], which introduces a separate set of architectural tradeoffs. SRv6 provides a flexible mechanism for encoding forwarding intent directly within the packet. However, that flexibility comes at the cost of additional software and hardware complexity. Deployments must manage locator allocation, function definitions, argument spaces, and the control-plane mechanisms required to distribute and maintain those constructs. They must also provision the forwarding chip resources needed to parse SRv6 uSID arrays. As fabrics scale, managing large numbers of candidate paths and their corresponding SRv6 forwarding instructions can itself become a non-trivial scaling and operational challenge.
From Path Selection to Resource Allocation
Rail-only and random graph architectures change the nature of the optimization problem.

As a topology becomes less structured, forwarding opportunities become less obvious. Paths are no longer uniformly equivalent. Some paths may share heavily utilized resources while others may be largely disjoint. The challenge shifts from selecting among known alternatives to understanding how the network should utilize the topology as a whole.
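To see why paths stop being equivalent, it helps to look at which links they share. The sketch below uses a made-up four-node topology and hypothetical path names; it simply counts how many candidate paths traverse each directed link.

```python
from collections import Counter
from itertools import combinations


def path_links(path):
    """Return the set of directed links (node pairs) a path traverses."""
    return set(zip(path, path[1:]))


def shared_links(p, q):
    """Links that two paths have in common."""
    return path_links(p) & path_links(q)


def link_usage(paths):
    """Count how many of the given paths traverse each link."""
    usage = Counter()
    for path in paths:
        usage.update(path_links(path))
    return usage


# Hypothetical candidate paths from A to B.
paths = {
    "p1": ["A", "S1", "B"],
    "p2": ["A", "S2", "B"],
    "p3": ["A", "S1", "C", "B"],
}

overlap = {(x, y): shared_links(paths[x], paths[y])
           for x, y in combinations(paths, 2)}
usage = link_usage(paths.values())

for pair, links in overlap.items():
    print(pair, "share", sorted(links) or "nothing")
print("most shared link:", usage.most_common(1)[0])
```

Here p1 and p2 are fully disjoint, while p1 and p3 both depend on the link from A to S1. Moving traffic from p1 to p3 would do nothing to relieve congestion on that link, which is the kind of interaction a per-path view can miss.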
This distinction becomes increasingly important as multiple concurrent workloads compete for network resources.
In a highly overbuilt Clos fabric, the network can often absorb inefficiencies through excess capacity. In a topology specifically designed to minimize infrastructure cost, excess capacity is intentionally reduced. Optimization can no longer rely solely on removing poor paths. The network must actively identify opportunities to better utilize available resources.
The optimization problem becomes less about path selection and more about resource allocation.
MPTE: Efficient Fabric Optimization
This is where Multipath Traffic Engineering (MPTE) becomes particularly interesting.
Rather than treating the network as a collection of individual paths, MPTE constructs directed acyclic graphs (DAGs) that represent multiple forwarding opportunities simultaneously. [R3]
Conceptually, MPTE treats path diversity as a resource that can be allocated rather than simply selected. This distinction becomes increasingly valuable as network architectures become more efficient.
In a traditional Clos fabric, abundant path diversity partially masks suboptimal forwarding decisions, though MPTE still provides meaningful optimization even here. In rail-only and random graph fabrics, every forwarding decision has greater consequences because the network contains fewer total resources. The value proposition of MPTE is therefore most closely aligned with these emerging topologies, where its precision yields the greatest impact.
Rail-only networks reduce infrastructure costs by eliminating spine switches and their associated links. Random graph fabrics pursue a different strategy. Rather than relying on a hierarchical Clos structure with abundant switching resources, they leverage sparse but highly connected graph topologies that preserve path diversity and resiliency with significantly fewer links and switches. By exploiting the natural expansion properties of these graphs, random network fabrics can achieve performance comparable to much larger Clos deployments while substantially reducing infrastructure cost. [R2]

MPTE complements both approaches by maximizing the utility of the remaining topology. Rather than selecting a small set of candidate paths, MPTE computes a directed acyclic graph (DAG) that represents a collection of forwarding opportunities between source and destination. Traffic can then be distributed across the branches of the DAG, creating load-balancing opportunities at every junction within the graph. The result is a forwarding structure that can efficiently utilize available path diversity, distribute traffic away from emerging bottlenecks, and extract greater value from the limited resources of topology-efficient fabrics.
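The DAG idea can be illustrated with a small sketch. This is not the construction defined in the MPTE draft [R3]. It assumes an undirected topology given as an adjacency map, orients each link toward the node that ranks closer to the destination (hop distance, with the node name as a tie-breaker so the result is guaranteed acyclic), and splits traffic equally at every junction. The function names and the example topology are hypothetical, and equal splitting is a simplification.

```python
from collections import defaultdict, deque


def hop_distances(adj, dst):
    """Breadth-first hop distance from every reachable node to dst."""
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_dag(adj, src, dst):
    """Keep links that move strictly 'downhill' in (distance, name) rank."""
    dist = hop_distances(adj, dst)
    if src not in dist:
        raise ValueError("destination is unreachable from source")
    rank = {n: (d, n) for n, d in dist.items()}
    dag, seen, stack = defaultdict(list), {src}, [src]
    while stack:
        u = stack.pop()
        for v in sorted(adj[u]):
            if rank[v] < rank[u]:          # strictly lower rank: no cycles
                dag[u].append(v)
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return dict(dag), rank


def split_flow(dag, rank, src, demand=1.0):
    """Push demand from src through the DAG, splitting equally per junction."""
    inflow = defaultdict(float)
    inflow[src] = demand
    link_load = {}
    nodes = set(dag) | {v for branches in dag.values() for v in branches}
    # Descending rank is a topological order: upstream nodes come first.
    for u in sorted(nodes, key=rank.get, reverse=True):
        branches = dag.get(u, [])
        if not branches:
            continue
        share = inflow[u] / len(branches)
        for v in branches:
            link_load[(u, v)] = share
            inflow[v] += share
    return link_load


# Hypothetical topology: two middle switches with a cross link between them.
adj = {
    "A": ["S1", "S2"],
    "S1": ["A", "S2", "B"],
    "S2": ["A", "S1", "B"],
    "B": ["S1", "S2"],
}
dag, rank = build_dag(adj, "A", "B")
loads = split_flow(dag, rank, "A", demand=1.0)
for link, load in sorted(loads.items()):
    print(link, load)
```

In this example the cross link from S2 to S1 becomes part of the forwarding structure, so the DAG contains three distinct A-to-B routes even though only two are shortest paths. The equal split leaves the S1-to-B link more heavily loaded than the S2-to-B link, which shows why the choice of split weights at each junction is where the actual optimization work would lie.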
There is a natural synergy between random graph fabrics and MPTE. The value of an expander graph lies in its ability to create large numbers of distinct forwarding opportunities despite having relatively few links. However, realizing the full benefit of that path diversity requires a mechanism capable of utilizing it efficiently. MPTE’s DAG-based forwarding model is particularly well suited to this environment because it can expose and exploit multiple branches of the graph simultaneously, allowing traffic to be distributed across the topology rather than confined to a small set of predetermined paths. As fabrics become increasingly sparse yet highly connected, the ability to transform graph connectivity into usable forwarding capacity becomes increasingly valuable.
An additional advantage of the MPTE model is that the forwarding objective is represented by the DAG itself rather than by explicit source-routed instructions carried within every packet. The network is optimized through graph construction and distributed load balancing across DAG branches, reducing the dependence on per-packet path encoding and allowing forwarding efficiency to scale with topology rather than with the number of candidate paths.
In economic terms, rail-only and random graph architectures create the savings. MPTE helps preserve and extend those savings by increasing the efficiency with which the topology that remains after pruning the Clos fabric is utilized.
The Evolution of Datacenter Traffic Engineering
Viewed through this lens, the progression from Clos fabrics to rail-only architectures or random graph fabrics is also a progression toward a growing need for integrated traffic engineering.
Clos fabrics created environments where abundant path diversity made simple load balancing effective. As networks grew larger, mechanisms such as MRC emerged to improve utilization by identifying and avoiding poorly performing paths.
The next generation of architectures is pursuing a different objective. Rather than relying on abundance, these designs intentionally reduce infrastructure costs while preserving performance. As a result, the optimization challenge shifts from selecting among many equivalent paths to extracting maximum value from a smaller and often less structured set of resources.
This is precisely the environment where graph-based traffic engineering approaches such as MPTE naturally align. As fabrics become more efficient, traffic engineering must become more sophisticated.
Conclusion
The most significant lesson from recent datacenter fabric research may not be the specific topologies themselves. It may be the broader economic principle they reveal.
Rail-only fabrics demonstrated that large amounts of infrastructure can be removed while preserving application performance. Random graph fabrics demonstrated that topology can be made significantly more efficient without sacrificing resilience or throughput. Together, these architectures suggest that the future of datacenter networking may involve substantially less infrastructure than traditional Clos designs. [R1][R2]
However, reducing infrastructure increases the value of every remaining network resource.
MRC represents a compelling optimization strategy for highly structured Clos fabrics, where the primary challenge is identifying which of many available paths should carry traffic. MPTE represents a different approach, one that becomes increasingly attractive as topologies become more efficient, more irregular, and more dependent on extracting maximum value from every available resource. [R3][R4]
The evolution from Clos to Rail-Only to Random Graph fabrics should therefore be accompanied by a corresponding evolution in traffic engineering: from path selection toward resource allocation, from abundance toward efficiency, and from finding good paths toward making the most of the topology that remains.
References
[R1] Rail-only: A Low-Cost High-Performance Network for Training LLMs with Trillion Parameters https://people.csail.mit.edu/ghobadi/papers/rail_only_hoti_2024.pdf
[R2] RNG: Flat Datacenter Networks at Scale https://arxiv.org/pdf/2604.15261
[R3] Multipath Traffic Engineering https://datatracker.ietf.org/doc/draft-kompella-teas-mpte/
[R4] Multipath Reliable Connection (MRC) Specification https://www.opencompute.org/documents/ocp-mrc-1-0-pdf
[R5] Segment Routing IPv6 Security Considerations https://datatracker.ietf.org/doc/draft-ietf-spring-srv6-security/
[R6] RFC 9602: Segment Routing over IPv6 (SRv6) Segment Identifiers in the IPv6 Addressing Architecture https://www.rfc-editor.org/info/rfc9602/