Go's GC wasn't really made with throughput maximization in mind. It's a language that doesn't scale that well to take advantage of beefy nodes and has weak compiler. I suppose the Google's vision for it is to "crank the replica count up". gRPC servers based on top of ASP.NET Core, Java Vert.X and Rust Thruster will provide you with much higher throughput on multi-core nodes.
https://github.com/LesnyRumcajs/grpc_bench/discussions/441