The point is that you don’t need a framework for that; the APIs are already simi...

refulgentis · on June 21, 2024

I have a consumer app that swaps between the 5 bigs and wholeheartedly agree, except, God help you if you're doing Gemini. I somewhat regret hacking it into the same concepts as everyone else.

I should have built stronger separation boundaries with more general abstractions. It works fine, I haven't had any critical bugs / mistakes, but it's really nasty once you get to the actual JSON you'll send.

Google's was 100% designed by a committee of people who had never seen anyone else's API, and if they had, they would have dismissed it via NIH. (disclaimer: ex-Googler, no direct knowledge)

__cayenne__ · on June 21, 2024

luckily Google now support's using the OpenAI lib https://cloud.google.com/vertex-ai/generative-ai/docs/multim...

Jensson · on June 21, 2024

> Google's was 100% designed by a committee of people who had never seen anyone else's API

Google made their API before the others had one, since they were the first with making these kind of language models. Its just that it has been an internal API before.

refulgentis · on June 21, 2024

No.

That'd be a good explanation, but it's theoretical.

In practice:

A) there was no meaningful internal LLM API pre-ChatGPT. All this AI stuff was under lock and key until Nov 2022, then it was an emergency.

B) the bits we're discussing are OpenAI-specific concepts that could only have occurred after OpenAI's.

The API includes chat messages organized with roles, an OpenAI concept, and "tools", an OpenAI concept, both of which came well after the GPT API.

Initial API announcement here: https://developers.googleblog.com/en/palm-api-makersuite-an-...

Jensson · on June 21, 2024

Google started including LLM features in internal products 2019 at least, I knew since I worked there then. I can't remember exactly when they started having LLM generated snippets and suggestions everywhere but it was there at least since 2019. So they have had internal APIs for this for quite some time.

> All this AI stuff was under lock and key until Nov 2022

That is all wrong... Did you work there? What do you base this on? Google has been experimenting with LLMs internally ever since the original paper, I worked in search then and I remember my senior manager said this was the biggest revolution in natural language processing since ever.

So even if Google added a few concepts from OpenAI, or renamed them, they still have had plenty of experience working with LLM APIs internally and that would make them want different things in their public API as well.

refulgentis · on June 21, 2024

> LLM generated snippets and suggestions everywhere but it was there at least since 2019

Absolutely not. Note that ex. Google's AI answers are not from an LLM and they're very proud of that.

> So they have had internal APIs for this for quite some time.

We did not have internal or external APIs for "chat completions" with chat messages, roles, and JSON schemas until after OpenAI.

> Did you work there?

Yes

> What do you base this on?

The fact it was under lock and key. You had to jump through several layers of approvals to even get access to a standard text-completion GUI, never mind API.

> has been experimenting with LLMs internally ever since the original paper,

What's "the original paper"? Are you calling BERT an LLM? Do you think transformers implied "chat completions"?

> that would make them want different things in their public API as well.

It's a nice theoretical argument.

If you're still convinced Google had a conversational LLM API before OpenAI, or that we need to quibble everything because I might be implying Google didn't invent transformers, there's a much more damning thing:

The API is Gemini-specific and released with Gemini, ~December 2023. There's no reason for it to be so different other than NIH and proto-based thinking. It's not great. That's why ex. we see the other comment where Cloud built out a whole other API and framework that can be used with OpenAI's Python library.

DannyBee · on June 21, 2024

>All this AI stuff was under lock and key until Nov 2022, then it was an emergency.

This is absolutely false, as the other person said. As one example: We had already built and were using AI based code completion in production by then.

Here's a public blog post from July, 2022: https://research.google/blog/ml-enhanced-code-completion-imp...

This is just one easy publicly verifiable example, there are others. (We actually were doing it before copilot, etc)

refulgentis · on June 21, 2024

Pretending that was an LLM as it is understood today, and that whatever internal API was available for internal use cases is actually the same as the public API for Gemini today, and that it was the same as an API for adding a "chat completion" to a "conversation" with messages, roles, and JSON schemas is silly.

DannyBee · on June 26, 2024

I'm glad you know exactly what happened, what it was capable of, etc, despite not working on it, and not asking a single question of those who did!

This follows right in line with the rest of your approach.

If you want to know things, it works better to ask questions than make assertions about what other people did or didn't do.

Nobody really cares about the opinions of those who can't be bothered to learn.

riwsky · on June 21, 2024

My understanding is that the original Gmail team actually invented modern LLMs in passing back in 2004, and it’s taken outsiders two decades to catch up because doing so requires setting up the Closure Compiler correctly.

refulgentis · on June 21, 2024

Lol, sounds like you have more experience with other ex/Googlers doing this than I do. I'm honestly surprised, I didn't know there was a whole shell game to be played with "what's an LLM anyway" to justify "whats NIH? our API was designed by experienced experts"

ssabev · on June 21, 2024

Would recommend just picking up a gateway that you can deploy and act as an OpenAI compatible endpoint.

We built something like this for ourselves here -> https://www.npmjs.com/package/@kluai/gateway?activeTab=readm....

Documentation is a bit sparse but TL;DR - deploy it in a cloudflare worker and now you can access about 15 providers (the one that matter - OpenAI, Cohere, Azure, Bedrock, Gemini, etc) all with the same API without any issues.

refulgentis · on June 21, 2024

Wow; this is really nice work, I wish you deep success.

refulgentis · on June 21, 2024

Coming back to write something more full-throated: Klu.ai is a rare thing in the LLM space, well-thought out, has the ancillary tools you need, is beautiful, and isn't a giveaway from a BigCo that is a privacy nightmare: ex. Cloudflare has some sort of halfway similar nonsense that, in all seriousness, logs all inputs/outputs.

I haven't tried it out in code, it's too late for me and I'm doing native apps, but I can tell you this is a significant step up in the space.

Even if you don't use multiple LLMs yet, and your integration is working swell right now, you will someday. These will be commodities, valuable commodities, but commodities. It's better to get ahead of it now.

Ex. If you were using GPT-4 2 months ago, you'd be disappointed by GPT-4o, and it'd be an obvious financial and quality decision to at least _try_ Claude 3.5 Sonnet.

It's a weird one. Benchmarks great. Not bad. Pretty damn good. But ex. It's now the only provider I have to worry about for RAG. Prompt says "don't add footnotes, pause at the end silently, and I will provide citations", and GPT-4o does nonsense like saying "I am now pausing silently for citations: markdown formatted divider"