The point is that you don’t need a framework for that; the APIs are already similar enough that it should be obvious how to abstract over them using whatever approach is natural in your programming language of choice.
I have a consumer app that swaps between the 5 bigs and wholeheartedly agree, except, God help you if you're doing Gemini. I somewhat regret hacking it into the same concepts as everyone else.
I should have built stronger separation boundaries with more general abstractions. It works fine, I haven't had any critical bugs / mistakes, but it's really nasty once you get to the actual JSON you'll send.
Google's was 100% designed by a committee of people who had never seen anyone else's API, and if they had, they would have dismissed it via NIH. (disclaimer: ex-Googler, no direct knowledge)
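To make the "actual JSON" pain concrete, here's a minimal sketch of one neutral message list being turned into two request bodies. The `Message` class and both function names are made up for illustration, and the field names (`messages`/`content` for OpenAI-style chat, `contents`/`parts`/`systemInstruction` with a `model` role for Gemini) are my reading of the public docs, so check them before relying on this.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical provider-neutral message; not from any SDK."""
    role: str  # "system", "user", or "assistant"
    text: str


def to_openai_body(model: str, messages: list[Message]) -> dict:
    # OpenAI-style chat bodies keep system prompts inline with the other messages.
    return {
        "model": model,
        "messages": [{"role": m.role, "content": m.text} for m in messages],
    }


def to_gemini_body(messages: list[Message]) -> dict:
    # Assumption: Gemini takes the system prompt as a separate top-level field,
    # calls the assistant role "model", and wraps text in a list of "parts".
    system_texts = [m.text for m in messages if m.role == "system"]
    body: dict = {
        "contents": [
            {
                "role": "model" if m.role == "assistant" else "user",
                "parts": [{"text": m.text}],
            }
            for m in messages
            if m.role != "system"
        ]
    }
    if system_texts:
        body["systemInstruction"] = {"parts": [{"text": "\n".join(system_texts)}]}
    return body
```

Plain text is the easy part; tool calls, images, and streaming are where the shapes really diverge, which is the argument for keeping a per-provider translation layer like this behind a hard boundary.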
> Google's was 100% designed by a committee of people who had never seen anyone else's API
Google made their API before the others had one, since they were the first to make this kind of language model. It's just that it was an internal API before.
Google started including LLM features in internal products in 2019 at least; I know because I worked there then. I can't remember exactly when they started having LLM generated snippets and suggestions everywhere but it was there at least since 2019. So they have had internal APIs for this for quite some time.
> All this AI stuff was under lock and key until Nov 2022
That is all wrong... Did you work there? What do you base this on? Google has been experimenting with LLMs internally ever since the original paper, I worked in search then and I remember my senior manager saying this was the biggest revolution in natural language processing ever.
So even if Google added a few concepts from OpenAI, or renamed them, they still have had plenty of experience working with LLM APIs internally and that would make them want different things in their public API as well.
> LLM generated snippets and suggestions everywhere but it was there at least since 2019
Absolutely not. Note that, e.g., Google's AI answers are not from an LLM and they're very proud of that.
> So they have had internal APIs for this for quite some time.
We did not have internal or external APIs for "chat completions" with chat messages, roles, and JSON schemas until after OpenAI.
> Did you work there?
Yes
> What do you base this on?
The fact it was under lock and key. You had to jump through several layers of approvals to even get access to a standard text-completion GUI, never mind API.
> has been experimenting with LLMs internally ever since the original paper,
What's "the original paper"? Are you calling BERT an LLM? Do you think transformers implied "chat completions"?
> that would make them want different things in their public API as well.
It's a nice theoretical argument.
If you're still convinced Google had a conversational LLM API before OpenAI, or that we need to quibble everything because I might be implying Google didn't invent transformers, there's a much more damning thing:
The API is Gemini-specific and was released with Gemini, ~December 2023. There's no reason for it to be so different other than NIH and proto-based thinking. It's not great. That's why, e.g., we see the other comment where Cloud built out a whole other API and framework that can be used with OpenAI's Python library.
> All this AI stuff was under lock and key until Nov 2022, then it was an emergency.
This is absolutely false, as the other person said.
As one example: we had already built and were using AI-based code completion in production by then.
Pretending that was an LLM as it is understood today, that whatever internal API was available for internal use cases is actually the same as the public API for Gemini today, and that it was the same as an API for adding a "chat completion" to a "conversation" with messages, roles, and JSON schemas, is silly.
My understanding is that the original Gmail team actually invented modern LLMs in passing back in 2004, and it’s taken outsiders two decades to catch up because doing so requires setting up the Closure Compiler correctly.
Lol, sounds like you have more experience with other ex/Googlers doing this than I do. I'm honestly surprised, I didn't know there was a whole shell game to be played with "what's an LLM anyway" to justify "what's NIH? our API was designed by experienced experts"
Documentation is a bit sparse, but TL;DR: deploy it in a Cloudflare Worker and now you can access about 15 providers (the ones that matter: OpenAI, Cohere, Azure, Bedrock, Gemini, etc.), all with the same API, without any issues.
Coming back to write something more full-throated. Klu.ai is a rare thing in the LLM space: it's well thought out, has the ancillary tools you need, is beautiful, and isn't a giveaway from a BigCo that is a privacy nightmare. E.g., Cloudflare has some sort of halfway similar nonsense that, in all seriousness, logs all inputs/outputs.
I haven't tried it out in code, it's too late for me and I'm doing native apps, but I can tell you this is a significant step up in the space.
Even if you don't use multiple LLMs yet, and your integration is working swell right now, you will someday. These will be commodities, valuable commodities, but commodities. It's better to get ahead of it now.
E.g., if you were using GPT-4 two months ago, you'd be disappointed by GPT-4o, and it'd be an obvious financial and quality decision to at least _try_ Claude 3.5 Sonnet.
GPT-4o is a weird one. Benchmarks great. Not bad. Pretty damn good. But, e.g., it's now the only provider I have to worry about for RAG. The prompt says "don't add footnotes, pause at the end silently, and I will provide citations", and GPT-4o does nonsense like saying "I am now pausing silently for citations: markdown formatted divider"