More

BenderV · 2025-11-24T22:05:17 1764021917

It feels crazy to me that we are building "tool search" instead of building real tool with interface, state and available actions. Think how would you define a Calculator, a Browser, a Car...?

I think, notably, one of the errors has been to name functions calls "tools"...

jondwillis · 2025-11-25T02:21:04 1764037264

well the name “function” is already taken - they deprecated it so that we could call functions, tools.

BenderV · 2025-11-26T17:02:44 1764176564

Well, I think they should have kept calling it function... ^^'

BenderV · 2025-08-24T07:27:06 1756020426

Nice but sad to see lack of tools. Most your code is about the agent framework instead of specific to SWE.

I've built a SWE agent too (for fun), check it out => https://github.com/myriade-ai/autocode

diminish · 2025-08-24T07:32:31 1756020751

> sad to see lack of tools.

Lack of tools in mini-swe-agent is a feature. You can run it with any LLM no matter how big or small.

BenderV · 2025-08-24T09:38:43 1756028323

I'm trying to understand what does it got to do with LLM size? Imho, right tools allow small models to perform better than undirected tool like bash to do everything. But I understand that this code is to show people how function calling is just a template for LLM.

diminish · 2025-08-24T10:52:46 1756032766

Mini swe agent, as an academic tool, can be easily tested aimed to show the power of a simple idea against any LLM. You can go and test it with different LLMs. Tool calls didn't work fine with smaller LLM sizes usually. I don't see many viable alternatives less than 7GB, beyond Qwen3 4B for tool calling.

> right tools allow small models to perform better than undirected tool like bash to do everything.

Interesting enough the newer mini swe agent was refutation of this hypothesis for very large LLMs from the original swe agent paper (https://arxiv.org/pdf/2405.15793) assuming that specialized tools work better.

BenderV · 2025-08-25T09:29:28 1756114168

Thanks for your answer.

I guess that it's only a matter of finetuning.

LLM have lots of experience with bash so I get they figure out how to work with it. They don't have experience with custom tools you provide it.

And also, LLM "tools" as we know it need better design (to show states, dynamic actions).

Given both, AI with the right tools will outperform AI with generic and uncontrolled tool.

BenderV · 2025-08-24T06:57:50 1756018670

Why do humans need a IDE when we could do anything in a shell? Interface give you the informations you need at a given moment and the actions you can take.

normie3000 · 2025-08-24T09:10:56 1756026656

To me a better analogy would be: if you're a household of 2 who own 3 reliable cars, why would you need a 4th car with smaller cargo & passenger capacities, higher fuel consumption, worse off-road performance and lower top speed?

BenderV · 2025-07-20T16:04:03 1753027443

Here, the best time is defined as the highest chance of getting "some" visibility. Most posts quickly fade away.

You are right that there is lots of way to measure this but quality comments is way harder to judge and we don't have quantity traffic info.

BenderV · 2025-07-17T12:54:33 1752756873

Hey, Funnily, I had this idea/need recently. Just a message to encourage you in this direction !

BenderV · 2025-06-20T15:17:05 1750432625

Selfless plug, our own tool => https://www.myriade.ai

> I wish you luck in refining your differentiation. Can't agree more with you. It's about distribution (which Snowflake/Databricks/... have) or differentiation.

Still, chatting with your data is already working and useful for lots.

BenderV · 2025-05-17T13:48:45 1747489725

My 2 cents, building a tool in this space...

> Do you need an expert to verify if the answer from AI is correct?

If the underling data has a quality issue that is not obvious to a human, the AI will miss it too. Otherwise, the AI will correct it for you. But I would argue that it's highly probable that your expert would have missed it too... So, no, it's not a silver bullet yet, and the AI model often lacks enough context that humans have, and the capacity to take a step back.

> How is it time saved refining prompts instead of SQL?

I wouldn't call that "prompting". It's just a chat. I'm at least ~10x faster (for reasonable complex & interesting queries).

BenderV · 2025-05-13T13:09:26 1747141766

Awesome. Using Vue/Tailwind, I'm definitely interested in this. Maybe you could try to add examples of integrations with others frameworks? I'll play with it and give you my 2 cents.

codybontecou · 2025-05-13T17:39:56 1747157996

Not sure if you're aware, but there's a well-supported Vue ShadCN library: https://www.shadcn-vue.com/

hunvreus · 2025-05-13T23:08:10 1747177690

Yeah, it's excellent.

hunvreus · 2025-05-13T13:36:40 1747143400

Hello there Ben. Sure thing, I'll try and expand the docs.

BenderV · on Jan 7, 2025

Hi HN,

Over the past few weeks/months, I’ve been working on Autochat, a lightweight Python library designed to make building AI agents simple and intuitive.

The focus of Autochat is simplicity:

- Extend an AI assistant's functionality by adding Python functions directly, or even class.

- Hide all the complexities/particularities of the providers (openai, anthropic, …)

I’d love your feedback on this. If you’ve got ideas for features, use cases, or critiques, let me know!

BenderV · on June 24, 2024

Seems weird that to me that it make a front page on HN (from an European pov ; it's quite common)

rootusrootus · on June 25, 2024

It's a fairly large cyberattack, so it's halfway interesting to HN readers.