Finally! Time to fly the flannel. I built the first 3 generations described here starting in '04; it's awesome now to have something to point to, especially as FB has led the charge on opening their own gear. And I'm sure it's going to make collaboration and hiring that much easier.
The secrecy was at first a universally-agreed necessity; in simplistic terms, we didn't want MSFT knowing how much money to throw at the problem, or even what the problem was. This was true (and surely still true to an extent) for all of platforms: it was always amusing to see public photo shoots at "google datacenters", which in reality were little more than a stack of google search appliances at a corp location.
The level of detail here is great and really sums up 10 years of a lot of hard-earned findings. I'm thrilled pictures of the hardware are even included, that team is just top notch.
For anyone who isn't a Google or Facebook but has large or growing colo/dc network needs, check out Cumulus Networks. It applies a lot of the SDN ideas (think ssh/puppet-driven config mgmt for your switches as just the start) and topology possibilities seen here; doesn't hurt that JR did a brief stint on firehose :)
I agree, it's nice they have finally come out with some of the stuff they are/were doing. The Jupiter stuff sounds especially tasty for east-west heavy workloads.
The real power savings come not from the switches themselves, but from the application and scheduling architectures the fabric enabled.
Having full bisection bandwidth between any pair of hosts means the bin packing problem is a lot easier. You don't need to (say) make sure your map reduce job is scheduled with one shard per rack because racks only have so much bandwidth. Any host on any rack will do. You can forget racks even exist.
This makes overall utilization of clusters more efficient (tighter bin packing), and the corollary is that you don't need as many clusters and machines. (Not that that has ever stopped Google from building more :)
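To make the scheduling point concrete, here's a toy sketch (my own illustration with made-up host/rack names and bandwidth budgets, not anything from the paper): with a per-rack uplink budget the scheduler has to solve a constrained packing problem that can fail even while hosts sit idle, whereas a flat fabric lets it hand any shard to any free host.

    from collections import defaultdict

    def schedule_rack_aware(shards, hosts, rack_of, rack_bw, shard_bw):
        """Rack-constrained placement: stop filling a rack once its uplink budget is spent."""
        used = defaultdict(float)   # rack -> bandwidth already committed
        placement = {}
        free_hosts = list(hosts)
        for shard in shards:
            for h in free_hosts:
                if used[rack_of[h]] + shard_bw <= rack_bw:
                    placement[shard] = h
                    used[rack_of[h]] += shard_bw
                    free_hosts.remove(h)
                    break
            else:
                raise RuntimeError(f"no rack has bandwidth headroom left for {shard}")
        return placement

    def schedule_flat(shards, hosts):
        """Full bisection bandwidth: any free host will do, so placement is trivial."""
        return dict(zip(shards, hosts))

    # 8 shards, 2 racks of 4 hosts; each rack uplink only has room for 3 shards' traffic.
    hosts = [f"host{i}" for i in range(8)]
    rack_of = {h: f"rack{i // 4}" for i, h in enumerate(hosts)}
    shards = [f"shard{i}" for i in range(8)]

    print(schedule_flat(shards, hosts))   # succeeds: racks are irrelevant
    try:
        schedule_rack_aware(shards, hosts, rack_of, rack_bw=3.0, shard_bw=1.0)
    except RuntimeError as e:
        print(e)                          # fails once both racks hit their uplink budget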
When you actually have to use it, what's amazing is how much congestion and packet loss there is. On paper it looks like a zero-impedance source of data, but in reality it barely keeps up.
I used to work in cluster networking, before I became an SRE. Near as I can tell, no names have been switched around in this paper, and it covers all of the major generations of Google cluster networks up to a fairly recent point in time.
I used to work in platforms (not networking, but we worked pretty closely with them) and the information in the paper was accurate and surprisingly complete.