I've been interfacing with GPT programmatically for a little while now, leveraging its "soft and fuzzy" interface to produce hard, machine-readable results. JSON was the format that felt best suited for the job.
I see a ton of code in this project, and I don't know what most of it does. As far as GPT troubles with JSON, I'll add a couple: sometimes it likes to throw comments in there as if it was JS. And sometimes it'll triple-quote the JSON string as if it was Python.
My approach to solving these problems was prompt engineering, using the system message part of the API call. Asking it to "return valid json, do not wrap it in text, do not preface it with text, do not include follow-up explanations, make sure it's valid json, do not include comments" seems to work 99% of the time. For the remainder, a try/catch block with some fallback code that "extracts" JSON (via dumb regexes) from whatever text was returned. Hasn't failed yet.
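Roughly, the shape of it (a minimal sketch, assuming the official openai Python client; the model name, the exact system message, and the regex are all illustrative):

    import json
    import re

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    SYSTEM_MSG = (
        "Return valid JSON. Do not wrap it in text, do not preface it with "
        "text, do not include follow-up explanations, do not include comments."
    )

    def ask_for_json(user_prompt, model="gpt-4o-mini"):
        text = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": SYSTEM_MSG},
                {"role": "user", "content": user_prompt},
            ],
        ).choices[0].message.content
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            # Dumb-regex fallback: grab the first {...} span from whatever
            # came back (handles prefaces, code fences, trailing explanations).
            return json.loads(re.search(r"\{.*\}", text, re.DOTALL).group(0))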
It's fascinating to watch the new paradigm arrive, and to see people reaching for old habits to deal with it. This entire project is kind of pointless; you can just ask GPT to return the right kind of thing.
Why not both? You can tell it in the prompt what you want and still constrain the output programmatically.
Also note that the output still depends on random sampling of the next token according to the distribution the net gives you, so there is a lot of genuine randomness in the model's behaviour. And because each sampled token influences the rest of the response, this randomness compounds the longer the response gets.
So if you already know you're only interested in a particular subset of tokens, it makes sense to me to clamp the distribution to only those tokens and keep the model from getting onto the "wrong path" in the first place.
Also, pragmatically, if you can get the model to restrict itself to JSON without telling it in the prompt, you're saving that part of the context window for better uses.
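For what it's worth, the hosted OpenAI API already exposes a crude version of this clamping: the logit_bias parameter pushes the sampler toward (or away from) specific token IDs. A hedged sketch for a forced yes/no answer (token IDs computed with tiktoken; the model and encoding choices are assumptions):

    import tiktoken
    from openai import OpenAI

    client = OpenAI()
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5-turbo

    # A +100 bias effectively restricts sampling to these token IDs.
    allowed = {str(tok): 100 for word in ("yes", "no") for tok in enc.encode(word)}

    answer = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Is the sky blue? One word."}],
        logit_bias=allowed,
        max_tokens=1,  # "yes" and "no" are each one token in this encoding
    ).choices[0].message.content

Clamping to a full JSON grammar needs per-step token control the hosted API doesn't give you, which is presumably the gap a project like this fills.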
I agree with others that it would be interesting to see an LLM that outputs JSON natively - but I think it would also be moving in the opposite direction of the general trend. Right now I can ask it for JSON, YAML, or a number of other formats.
To answer "why not both?" -- bottom line, the effort involved. I don't want to deal with yet another library, the bugs in it, and the inevitable -changes-. GPT's capacity for bridging the gap between structured and human languages is an enormous boon. It bridges a gap so large, we most of the time can't span it with our imaginations. I don't need to write code to tell GPT what to do, I can direct it in plain english.
I'm not worried about the size of the context window the same way we're not worried about memory or disk space - there will be more.
It works fine 99% of the time with just a small amount of extra instruction in the actual prompt. The method GP describes works in any language with just the basic building blocks: HTTP requests, regexes, and a JSON decoder (a minimal sketch below).
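To make that concrete, here's roughly that method using only the Python standard library (the endpoint and payload shape are from the OpenAI API docs; the model name, system message, and regex are illustrative):

    import json
    import os
    import re
    import urllib.request

    def ask_for_json(prompt):
        req = urllib.request.Request(
            "https://api.openai.com/v1/chat/completions",
            data=json.dumps({
                "model": "gpt-3.5-turbo",
                "messages": [
                    {"role": "system",
                     "content": "Return valid JSON only. No comments, no surrounding text."},
                    {"role": "user", "content": prompt},
                ],
            }).encode(),
            headers={
                "Content-Type": "application/json",
                "Authorization": "Bearer " + os.environ["OPENAI_API_KEY"],
            },
        )
        with urllib.request.urlopen(req) as resp:
            text = json.load(resp)["choices"][0]["message"]["content"]
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            # Regexp fallback: pull the first {...} span out of the reply.
            return json.loads(re.search(r"\{.*\}", text, re.DOTALL).group(0))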
You might not! Depends on what you're looking for. I've found this library most helpful where I have a lot of GPT calls in pipelines, so type-hinted schema return values, some built-in error correction, variable injection, and established standards for the I/O of prompt schemas are what matter most. So IMO its main use is as a good standard set of operations that work pretty well out of the box and that you can hack around with decent flexibility.
Yeah, I don't disagree. However, this idea works better as a function I use within gpt-index or langchain. There's a horse race going on, and we're all making our bets about who's going to win.