In general, encoder+decoder models are much more efficient at inference than decoder-only models because they run over the entire input all at once (which leverages parallel compute more effectively).
The issue is that they're generally harder to train (you need input/output pairs as a training dataset) and they don't naturally generalize as well.
> In general, encoder+decoder models are much more efficient at inference than decoder-only models because they run over the entire input all at once (which leverages parallel compute more effectively).

Decoder-only models also do this; the only difference is that they use masked attention.
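To make the mask point concrete, here's a minimal PyTorch sketch (my own illustration, not anyone's actual code): an encoder uses a full bidirectional mask, a decoder-only model a causal one, and in both cases the input is processed in a single parallel pass.

```python
# Minimal sketch of the attention-mask difference being discussed above.
# Names and the tiny sequence length are illustrative only.
import torch

seq_len = 5

# Encoder-style (bidirectional) attention: every position can attend to
# every other position.
full_mask = torch.ones(seq_len, seq_len, dtype=torch.bool)

# Decoder-style (causal) attention: position i may only attend to positions
# <= i. The prompt is still processed in one parallel pass; the mask just
# blocks "future" positions.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

print(full_mask.int())
print(causal_mask.int())
```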
I have actually worked on encoder-decoder models. The issue is, finetuning itself is becoming a thing of the past, at least for text processing. If you spend a ton of effort today to finetune on a particular task, chances are you would have reached the same performance using a frontier LLM with the right context in the prompt. And if a big model can do it today, in 12 months there will be a super cheap and efficient model that can do it as well. For vision you can still beat them, but only with huge effort, and the gap is shortening constantly. And T5 is not even multimodal. I don't think these models will change the landscape in any meaningful way.
Also a hint: these days you can pretty easily create a finetuning dataset from a frontier LLM to finetune those T5 models, effectively distilling them very quickly.
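Roughly, the distillation loop looks like this (a sketch under assumptions: an OpenAI-compatible client, Hugging Face transformers, and placeholder model names, prompts, and data, not a recipe from the parent comment):

```python
# Sketch of the distillation idea: label raw inputs with a frontier LLM,
# then finetune a small T5 on the resulting (input, output) pairs.
from openai import OpenAI
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

client = OpenAI()  # assumes OPENAI_API_KEY is set

raw_inputs = ["example input 1", "example input 2"]  # your unlabeled task data

# 1) Build (input, output) pairs by asking a frontier model to do the task.
pairs = []
for text in raw_inputs:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder frontier model
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    pairs.append((text, resp.choices[0].message.content))

# 2) Finetune T5 on those pairs (minimal loop, no batching/eval for brevity).
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

model.train()
for src, tgt in pairs:
    inputs = tokenizer(src, return_tensors="pt", truncation=True)
    labels = tokenizer(tgt, return_tensors="pt", truncation=True).input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```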
> A store of value is an asset with as close to 0% volatility in price as possible.
You just proved his point. In this example, bitcoin's volatility is closer to zero than gold's. Thus, by the quoted definition of "store of value", in this particular time frame (it would be very different going back 5, 10, or 15 years), bitcoin is the better store of value.
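For concreteness, "volatility" here can be read as the standard deviation of period-over-period returns; whichever asset's number is closer to zero is, by the quoted definition, the better store of value over that window. The prices below are invented, purely to show the comparison:

```python
# Toy illustration of volatility as the standard deviation of daily returns.
# The price series are made-up numbers, not real market data.
import statistics

def daily_volatility(prices):
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    return statistics.stdev(returns)

gold = [100, 101, 99, 102, 100]          # hypothetical gold prices
btc = [100, 100.5, 100.2, 100.4, 100.3]  # hypothetical bitcoin prices

print(daily_volatility(gold), daily_volatility(btc))
```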
What ChatGPT / Claude features do you use that we don't support?
We have an MCP server I can give you access to for search immediately. Down the line, a search API and a chat completions API to our assistant are in the pipeline.
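Purely as a hypothetical sketch of what a chat completions API like that could look like if it follows the common OpenAI-compatible shape (the base URL, model name, and auth below are my assumptions, not a documented endpoint):

```python
# Hypothetical OpenAI-compatible chat completions call; everything here
# (endpoint, model name, key) is a placeholder, not a real API.
from openai import OpenAI

client = OpenAI(
    base_url="https://example.invalid/v1",  # placeholder, not a real endpoint
    api_key="YOUR_KEY",
)
resp = client.chat.completions.create(
    model="assistant",  # placeholder model name
    messages=[{"role": "user", "content": "Hello, what can you do?"}],
)
print(resp.choices[0].message.content)
```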
We're not running on OpenRouter; that would break the privacy policy.
We get specific deals with providers and use different ones for production models.
We do train smaller-scale stuff like query classification models (not trained on user queries, since I don't even have access to them!), but that's expected and trivially cheap.
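For a sense of how small that kind of model can be, here's a minimal query-classifier sketch using scikit-learn (the labels and training examples are invented, and this isn't a description of the actual model):

```python
# Minimal sketch of a small query-classification model. The label set and
# training queries below are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

queries = ["weather in paris", "best rust web framework", "2+2", "news today"]
labels = ["simple", "research", "simple", "news"]  # invented categories

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(queries, labels)

print(clf.predict(["capital of france"]))
```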
Quick assistant is a managed experience, so we can add features to it in a controlled way that we can't for all of the models we otherwise support at once.
For now Quick assistant has a "fast path" answer for simple queries. We can't support the upgrades we want to add in there on all the models because they differ in tool calling, citation reliability, context window, ability to not hallucinate, etc.
The responding model is currently Qwen3-235B from Cerebras, but we want to decouple user expectations from that so we can upgrade it down the road to something else. We like Kimi, but couldn't get a stable experience for Quick on it at launch with current providers (tool calling was unreliable).
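If it helps, the "fast path" idea reads roughly like the routing sketch below. This is my own hypothetical illustration, not their implementation: the heuristic, function names, and stubs are all placeholders, with the responding model kept swappable as described.

```python
# Hypothetical sketch of fast-path routing: simple queries go straight to a
# fast responding model; everything else takes the full tool/citation pipeline.

FAST_MODEL = "qwen3-235b"  # current choice per the comment; meant to be swappable

def is_simple(query: str) -> bool:
    # Placeholder heuristic standing in for the real query classifier.
    return len(query.split()) < 8

def call_model(model: str, query: str) -> str:
    return f"[{model}] answer to: {query}"       # stub

def full_pipeline(query: str) -> str:
    return f"[full pipeline] answer to: {query}"  # stub

def answer(query: str) -> str:
    if is_simple(query):
        return call_model(FAST_MODEL, query)  # fast path
    return full_pipeline(query)               # tools, citations, etc.

print(answer("capital of France"))
```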