Are you implying there are no use cases for matrix multiply?
In any case, the two main deep learning packages have already been updated, so for the domain this change was almost certainly targeted at, your complaint is already answered. I'm just stunned that anyone would complain about hardware matrix multiplication; I've wondered why it hasn't been ubiquitous for the past 20 years.
Everyone should make that improvement in their hardware. Everyone should get rid of code implementing matrix multiply and make the hardware call instead. It's common sense. Not to put too fine a point on it, but your complaint assumes that Geekbench is based on code that hasn't implemented any of those changes.
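To make "get rid of code implementing matrix mult and make the hardware call instead" concrete, here's a minimal sketch using NumPy (my choice of library for illustration, not something named in the thread): if you route the multiply through the library, it dispatches to the platform BLAS, which is where vendor matrix hardware gets picked up transparently, without the application changing at all.

```python
import numpy as np

def matmul_naive(a, b):
    # Hand-rolled triple loop: the kind of code the comment says to delete.
    n, k = a.shape
    k2, m = b.shape
    assert k == k2
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))

# The library call: NumPy forwards "@" to the platform BLAS, so new matrix
# units are used (when present) without the caller rewriting anything.
assert np.allclose(matmul_naive(a, b), a @ b)
```

The point of the sketch is the division of labor: the naive loop is frozen at whatever the compiler emits, while the library call keeps improving underneath you as vendors update their BLAS.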
> Are you implying there are no use cases for matrix multiply?
The whole point is that these highly specialized operations only show up in very specialized use cases, and aren't reflected in overall performance.
We've been dealing with the regular release of specialized processor operations for a couple of decades. This story is not new. You see cherry-picked microbenchmarks used to plot impressive bar charts, immediately followed by the realization that a) in general this sort of operation is rarely invoked frequently enough to be noticeable, b) you need to build your code with specialized flags to get software to actually leverage the feature, and c) even then it's only noticeable in very specialized workloads that already run in the background.
I still recall when fused multiply-add was supposed to be such a game changer: everyone used polynomials, so these operations would triple performance. It wasn't the case.
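For context on the polynomial claim, here's a hedged sketch (coefficients made up for illustration) of why FMA was pitched that way: Horner's method evaluates a polynomial as a chain of multiply-add steps, and each step can retire as a single fused multiply-add instruction on hardware that has one. That also shows the limit of the argument: code not dominated by such chains gains nothing.

```python
def horner(coeffs, x):
    # Evaluate c[0]*x^(n-1) + ... + c[n-1] by repeated multiply-add.
    # Each "acc * x + c" step maps onto one fused multiply-add when the
    # compiler emits FMA instructions, which is why FMA helps polynomial
    # evaluation but does little for workloads without these chains.
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

# 3x^2 + 2x + 1 at x = 2 is 17
assert horner([3.0, 2.0, 1.0], 2.0) == 17.0
```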
And more to the point, do you believe that matrix multiplication is a breakthrough discovery that is only now surfacing? Computers were being designed around matrix operations long before they were even considered household items.
I'm not complaining, I'm just saying that the higher numbers in that benchmark do not translate directly into better performance for all the software you run. Deep learning as it stands is probably the main application that benefits from this extension (and probably the reason it was added in hardware at this point in time).
Well, you're really just describing benchmarks: if the benchmark doesn't represent your standard workload, then it probably isn't a good reference for you. But Geekbench includes a bunch of components based on real-world applications like file compression, web browsing, and PDF rendering. So it probably isn't perfect, but it's likely that the M4 will feel a bit faster in regular use compared to an older-generation MacBook Pro.