- They were way ahead, and they didn't make any big mistakes.
- They weren't waiting for others to catch up; they kept improving aggressively.
- Memory bandwidth is almost always the bottleneck: compute units sit idle if data can't reach them fast enough. Hence systolic arrays are "overrated". Furthermore, interconnect is now the new bottleneck.
- CUDA offers the most flexibility in a world of ever-changing model requirements.
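
To make the memory-bandwidth point concrete, here is a rough roofline-style sketch. It is not from the original notes: the hardware numbers (`PEAK_FLOPS`, `MEM_BW`) are hypothetical round figures chosen for illustration, and the traffic model assumes each matrix is read or written exactly once, which is optimistic. The idea is that a kernel is memory-bound whenever its arithmetic intensity (FLOPs per byte moved) falls below the ratio of peak compute to memory bandwidth.

```python
def arithmetic_intensity_matmul(m, n, k, bytes_per_elem=2):
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n].

    Assumes each matrix crosses the memory bus exactly once
    (an idealized lower bound on traffic).
    """
    flops = 2 * m * n * k  # one multiply + one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, intensity * mem_bandwidth)


def is_memory_bound(intensity, peak_flops, mem_bandwidth):
    """True when the kernel sits left of the roofline ridge point."""
    return intensity < peak_flops / mem_bandwidth


# Hypothetical accelerator, NOT a real part: illustrative numbers only.
PEAK_FLOPS = 1.0e15  # 1 PFLOP/s of 16-bit matrix math (assumed)
MEM_BW = 2.0e12      # 2 TB/s of memory bandwidth (assumed)

# Batch-size-1 style workload: effectively a matrix-vector product.
ai_small = arithmetic_intensity_matmul(1, 8192, 8192)
# Large square matmul: lots of reuse per byte loaded.
ai_large = arithmetic_intensity_matmul(8192, 8192, 8192)

if __name__ == "__main__":
    for name, ai in [("matvec-like", ai_small), ("large matmul", ai_large)]:
        bound = "memory" if is_memory_bound(ai, PEAK_FLOPS, MEM_BW) else "compute"
        frac = attainable_flops(ai, PEAK_FLOPS, MEM_BW) / PEAK_FLOPS
        print(f"{name}: {ai:.2f} FLOPs/byte, {bound}-bound, "
              f"{frac:.2%} of peak attainable")
```

Under these assumed numbers the ridge point is 500 FLOPs/byte. The matvec-like case comes in at just under 1 FLOP/byte, so a faster matrix unit would not help it; only more bandwidth would. That is the sense in which a bigger systolic array can be "overrated".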