The amazing thing about this is that the first author has published multiple high-impact papers with Google Research VPs! And he is just a second-year PhD student. Very few L7/L8 RS/SWEs can even do that.
I met an AWS engineer a couple of weeks ago and he said Trainium is actually being used for Anthropic model inference, not for training. Inferentia is basically defective Trainium chips that nobody wants to use.
There is some new research applying transformers to MIPs (mixed-integer programs). I'm looking forward to LLMs cracking MIPs at some point and beating those commercial solvers.
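For anyone unfamiliar with what a MIP actually looks like, here's a toy example solved the conventional way with `scipy.optimize.milp` (a branch-and-bound solver, nothing to do with transformers). The specific objective and constraints are made up for illustration; this is the kind of discrete optimization problem the commercial solvers (Gurobi, CPLEX) are built for:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint

# Toy MIP (illustrative numbers, not from any paper):
#   maximize 3x + 2y
#   subject to x + y <= 4, x <= 2, with x, y >= 0 and integer.
# scipy's milp() minimizes, so we negate the objective coefficients.
c = np.array([-3.0, -2.0])
constraints = LinearConstraint(np.array([[1, 1],
                                         [1, 0]]),
                               ub=np.array([4, 2]))
res = milp(c=c, constraints=constraints, integrality=np.ones(2))
print("optimal point:", res.x)
print("optimal objective:", -res.fun)
```

The integrality requirement is what makes this NP-hard in general; relaxing it gives an easy linear program, and closing that gap is where solvers spend their effort.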
I find the idea of a language model solving an integer programming problem odd. LLMs can barely do basic arithmetic. How do you imagine this happening?