Jupiter Rising: A Decade of Google’s Datacenter Network [pdf] (sigcomm.org)
97 points by packetslave on July 30, 2015 | 17 comments


Finally! Time to fly the flannel... I built the first 3 generations described here starting in '04; it's awesome now to have something to point to, especially as FB has led the charge on opening their own gear. And I'm sure it's going to make collaboration and hiring that much easier.

The secrecy was at first a universally-agreed necessity; in simplistic terms, we didn't want MSFT knowing how much money to throw at the problem, or even what the problem was. This was true (and surely still is, to an extent) for all of Platforms: it was always amusing to see public photo shoots at "Google datacenters", which in reality were little more than a stack of Google Search Appliances at a corp location.

The level of detail here is great and really sums up 10 years of a lot of hard-earned findings. I'm thrilled pictures of the hardware are even included, that team is just top notch.

For anyone who isn't a Google or Facebook but has large or growing colo/dc network needs, check out Cumulus Networks. It applies a lot of the SDN ideas (think ssh/puppet-driven config mgmt for your switches as just the start) and topology possibilities seen here; doesn't hurt that JR did a brief stint on firehose :)


Thanks for the kind words for Cumulus!

- nolan


I agree, it's nice they have finally come out with some of the stuff they are/were doing. The Jupiter stuff sounds especially tasty for east-west heavy workloads.


Hey Mikey! The first generation certainly was a blast! Amazing how far things have come. :-)


I used to work at Google as well, and frankly I was so abstracted from all of this that it's absurd.

It's like finally reading how my car is an "intern-al comb-usti-on eng-ine" whatever that means.


Neat to see where the "Pluto" mystery switch might fit in, now.

http://www.wired.com/2012/09/pluto-switch/


Originally discussed at https://news.ycombinator.com/item?id=9734305, but this is the actual paper, which was just published.


How much opportunity is there for this level of datacenter design to reduce power consumption? What are the trade-offs?


The real power savings come not from the switches themselves, but from the application and scheduling architectures the network enabled.

Having full bisection bandwidth between any pair of hosts means the bin packing problem is a lot easier. You don't need to (say) make sure your MapReduce job is scheduled with one shard per rack because racks only have so much uplink bandwidth. Any host on any rack will do. You can forget racks even exist.

This makes overall utilization of clusters more efficient (tighter bin packing), and the corollary is that you don't need as many clusters and machines. (Not that that has ever stopped Google from building more :)
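
A toy sketch of that contrast (this is not Google's scheduler; every name and number below is made up): with per-rack uplink caps, placement is a constrained packing problem, whereas with full bisection bandwidth the constraint simply disappears.

    # Hypothetical contrast (all names/numbers illustrative): placing shards
    # under a per-rack uplink bandwidth cap vs. under full bisection bandwidth.
    from collections import defaultdict

    def place_rack_constrained(shard_bw, racks, rack_uplink_bw):
        """Each shard needs shard_bw[s] Gbps of uplink; the shards on a
        rack can't exceed that rack's uplink capacity."""
        placement, used = {}, defaultdict(float)
        for shard, bw in shard_bw.items():
            for rack in racks:
                if used[rack] + bw <= rack_uplink_bw:
                    placement[shard] = rack
                    used[rack] += bw
                    break
            else:
                raise RuntimeError(f"no rack has {bw} Gbps left for {shard}")
        return placement

    def place_full_bisection(shard_bw, hosts):
        """With full bisection bandwidth the uplink constraint vanishes:
        any host on any rack will do, so trivial round-robin suffices."""
        return {s: hosts[i % len(hosts)] for i, s in enumerate(shard_bw)}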


In general a faster network allows more flexible scheduling, which should give more freedom to tune the scheduler for energy savings. Check out http://web.stanford.edu/~davidlo/resources/2015.thesis.pdf
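
As a purely illustrative sketch of what that freedom buys (not from the linked thesis or any real system): once placement is no longer network-constrained, the scheduler can consolidate jobs onto as few machines as possible and power down the rest.

    # Hypothetical consolidation pass: pack jobs onto the fewest machines
    # via first-fit decreasing, so idle machines can be put to sleep.

    def consolidate(jobs, machine_capacity):
        """jobs: list of CPU demands; returns per-machine loads."""
        machines = []  # load already placed on each open machine
        for demand in sorted(jobs, reverse=True):  # first-fit decreasing
            for i, load in enumerate(machines):
                if load + demand <= machine_capacity:
                    machines[i] += demand
                    break
            else:
                machines.append(demand)  # open a new machine
        return machines

    # e.g. consolidate([0.6, 0.3, 0.5, 0.2, 0.4], machine_capacity=1.0)
    # -> two machines at full load instead of five; three can sleep.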


Network infrastructure has always been one of my favorite hidden technical wonders behind Google. It's amazing how much data they handle.


When you actually have to use it, what's amazing is how much congestion and packet loss there is. On paper it looks like a zero-impedance source of data, but in reality it barely keeps up.


What QoS are you referring to?


I did a brief stint replacing burnt out parts in a Google datacenter.

This paper is weird--like maybe they switched around the names of some things and didn't mention others.

Just an FYI.


I used to work in cluster networking, before I became an SRE. Near as I can tell, no names have been switched around in this paper, and it covers all of the major generations of Google cluster networks up to a fairly recent point in time.


I used to work in platforms (not networking, but we worked pretty closely with them) and the information in the paper was accurate and surprisingly complete.


Been working on this stuff since the beginning. Looks right to me.



