Can you (or someone else) explain how to do that? How much does it typically cost to create a specialized agents that uses a local model? I thought it was expensive?
An agent is just a program which invokes a model in a loop, adding resources like files to the context etc. It's easy to write such a program and it costs nothing, all the compute cost is in the LLM call. What parent was referring to most likely is fine-tuning a smaller model which can run locally, specialized for whatever task. Since it's fine-tuned for that particular task, the hope is that it will be able to perform as well as a general purpose frontier model at a fraction of the compute cost (and locally, hence privately as well).