Yep, such ideas have been around. But congestion is a fundamental problem. Admission control is the only way to ensure there is no congestion collapse.
The technical issue is that you would need global arbitration to ensure that the _goodput_ (useful bytes delivered) is optimal. With training across 32k GPUs and more these days, global arbitration to ensure the correct packets are prioritised is going to be very difficult. If you are sending more traffic than the receiver's link capacity, packets _will_ get dropped, and it's suboptimal to transmit those dropped packets into the network as they waste link capacity elsewhere (upstream) within the network.
The technical issue is that you would need global arbitration to ensure that the _goodput_ (useful bytes delivered) is optimal. With training across 32k GPUs and more these days, global arbitration to ensure the correct packets are prioritised is going to be very difficult. If you are sending more traffic than the receiver's link capacity, packets _will_ get dropped, and it's suboptimal to transmit those dropped packets into the network as they waste link capacity elsewhere (upstream) within the network.