Yeah, we've noticed it overthinks simple tasks that could be solved with a single table scrape. The agent architecture is built for complex, multi-source problems, so it overengineers straightforward queries.
Working on better task classification upfront to route simple requests more directly.
Thanks, we've noticed that it can tend to "give up" early on certain sources. Ideally the critic agent would steer it back onto the right path and keep it digging deeper, but if that doesn't happen, adding something to the prompt or sending it a message mid-task telling it to go deep on those sources usually works.
Good point. Our main differentiation is the shared workspace - users can step in and guide the agent mid-task, kind of like Cursor vs Claude (which can technically generate the same code that Cursor does). Firecrawl (or any crawler we may use) is only part of the process; we want to make the collaborative user <> agent process as robust and user-controllable as possible.
Thanks for the feedback! From what we've seen it's actually the other way around - once it gets a sense of where the information lives, the later stages of data collection go quicker, especially since it can deploy search agents in parallel and no longer needs to do as much manual work. Having said that, it does sometimes forget to do this. We added the critic agent to remind it, but that can be inconsistent; usually stepping in and asking it to deploy agents in parallel fixes it.
We use Gemini 2.5 Flash, which is already pretty cheap, so inference costs are actually not as high as they would seem given the number of steps. Our architecture allows small models like that to operate well enough, and we think those kinds of models will only get cheaper.
Having said all that, we are working on improving latency and allowing for more parallelization wherever possible, and we hope to include that in future versions, especially for enrichment. We do think one of the weaknesses of the product is mass collection - it's better at finding medium-sized datasets from siloed sources and less good at getting large, comprehensive datasets, but we're also considering approaches that incorporate more traditional scraping tactics for finding those large datasets.
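To make the parallel-enrichment idea concrete, here's a rough sketch of the pattern (illustrative only, not our actual code): fan out one cheap Gemini 2.5 Flash call per row and collect the answers. The prompt, company names, and helper function are made up for the example.

```python
# Illustrative sketch: enrich rows in parallel with one cheap call per row.
# The prompt, company names, and helper are made up for the example.
from concurrent.futures import ThreadPoolExecutor
from google import genai

client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

def enrich_row(company: str) -> str:
    # One small, focused call per row stays cheap on Gemini 2.5 Flash.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=f"In one sentence, what is the pricing model of {company}?",
    )
    return response.text

companies = ["Acme Corp", "Globex", "Initech"]
with ThreadPoolExecutor(max_workers=8) as pool:
    enriched = list(pool.map(enrich_row, companies))

for company, answer in zip(companies, enriched):
    print(f"{company}: {answer}")
```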
Thanks a lot regarding the UI, and good point on the schema editing.
We've been having similar thoughts about pricing and offering unlimited usage, but since it's feasible for us in the short term thanks to credits, we enjoy offering that option to early users, even if it may be a bit naive.
Having said that, we are currently working on a pilot with a company to whom we're offering live updates, and they're paying per usage since they don't want to have to set it up themselves, so we can definitely see the demand there. We also offer an API for companies that want to reliably query the same thing at a preset cadence, which is also usage-based.
For crawling we use Firecrawl. They handle most of the blocking issues and proxies.
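If you're curious what that looks like, the basic call through their Python SDK is roughly this (a minimal sketch; exact class and method names vary across SDK versions):

```python
# Minimal Firecrawl sketch; class/method names vary across SDK versions.
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-...")  # your Firecrawl API key

# Firecrawl handles proxies/anti-bot measures and returns the page content,
# which we hand to the agent as markdown.
result = app.scrape_url("https://example.com/pricing")
print(result)
```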
Yep, you can paste the list as text, or upload it as a file. Then you can prompt it to enrich the list with certain attributes and it will do that for you.
We maintain a persistent browser state that gets fed into the system prompt, showing the most recent results, the current page, where you are in the content, what actions are available, etc. It's markdown by default but can switch to HTML if needed (for pagination or CSS selectors). The agent always has full context of its browsing session.
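Roughly, the state we render looks like this (a simplified sketch; the field names here are made up):

```python
# Simplified illustration of a browser state rendered into a system prompt.
# Field names are made up for the example.
from dataclasses import dataclass, field

@dataclass
class BrowserState:
    current_url: str
    chunk_index: int            # which chunk of the page the agent is reading
    total_chunks: int
    recent_results: list[str] = field(default_factory=list)
    available_actions: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        # Markdown by default; an HTML view can be swapped in when the agent
        # needs pagination controls or CSS selectors.
        actions = ", ".join(self.available_actions)
        results = "\n".join(f"- {r}" for r in self.recent_results)
        return (
            f"## Current page\n{self.current_url} "
            f"(chunk {self.chunk_index + 1} of {self.total_chunks})\n\n"
            f"## Recent results\n{results}\n\n"
            f"## Available actions\n{actions}\n"
        )

state = BrowserState(
    current_url="https://example.com/pricing",
    chunk_index=0,
    total_chunks=3,
    recent_results=["Example Corp - Pricing", "Example Corp - Docs"],
    available_actions=["continue reading", "jump to middle", "go back"],
)
print(state.to_markdown())  # this text is appended to the system prompt each turn
```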
A few design decisions we made that turned out pretty interesting:
1. We gave it an analyze results function. When the agent is on a search results page, instead of visiting each page one by one, it can just ask "What are the pricing models?" and get answers from all search results in parallel.
2. Long web pages get broken into chunks with navigation hints so the agent always knows where it is and can jump around without overloading its context ("continue reading", "jump to middle", etc.); there's a rough sketch of this after the list.
3. For sites that are commonly visited but have messy layouts or spread-out information, we built custom tool calls that let the agent request specific info that might be scattered across different pages and get it all consolidated into one clean text response.
4. We're adding DOM interaction via text in the next couple of days, so the agent can click buttons, fill forms, and press keys, but everything still comes back as structured text instead of screenshots.
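Here's the rough sketch of point 2 mentioned above (purely illustrative; the chunk size and hint wording are made up):

```python
# Illustrative sketch of breaking a long page into chunks with navigation hints.
# Chunk size and hint names are made up for the example.
def chunk_page(markdown: str, chunk_chars: int = 4000) -> list[str]:
    return [markdown[i:i + chunk_chars] for i in range(0, len(markdown), chunk_chars)]

def render_chunk(chunks: list[str], index: int) -> str:
    hints = []
    if index + 1 < len(chunks):
        hints.append("continue reading")
    if index != len(chunks) // 2:
        hints.append("jump to middle")
    if index > 0:
        hints.append("go back")
    header = f"[chunk {index + 1}/{len(chunks)}] available: {', '.join(hints)}"
    return header + "\n\n" + chunks[index]

page = "..." * 10000  # stand-in for a long page converted to markdown
chunks = chunk_page(page)
print(render_chunk(chunks, 0))  # the agent sees its position plus the nav hints
```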
Thanks. If I am interpreting this correctly, what you have is not a browser but a translation layer. You are still using something that scrapes the data and then you translate it to be in the format that works best for your agent.
My original interpretation was that you had built a full-blown browser, something akin to a Chromium/Firefox fork.
We currently use Firecrawl for our crawling infrastructure. Looking at their documentation, they claim to respect robots.txt, but based on user reports in their GitHub issues, the implementation seems inconsistent - particularly for one-off scrapes vs full crawls.
This is definitely something we need to address on our end. Site owners should have clear ways to opt out, and crawlers should be identifiable. We're looking into either working with Firecrawl to improve this or potentially switching to a solution that gives us more control over respecting these standards.
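For what it's worth, the check itself is simple with the Python standard library; the harder part is making sure it runs consistently on every fetch, whether through Firecrawl or on our side. A minimal sketch (the user agent string is just a placeholder):

```python
# Checking robots.txt before fetching, using only the standard library.
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-agent-bot"  # hypothetical crawler identifier

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

url = "https://example.com/pricing"
if rp.can_fetch(USER_AGENT, url):
    print("allowed to fetch", url)
else:
    print("robots.txt disallows", url)
```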
Right now it can do that via URL params if that is how the website handles pagination, although we are pushing a feature in the next couple of days that lets it take actions on the DOM.
If it isn't doing that in your session, you can usually just step in and tell it to, and it will follow your instructions.
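For reference, pagination via URL params is basically this (a sketch; the parameter name obviously varies by site):

```python
# Paginating by rewriting a query parameter; the parameter name varies by site.
from urllib.parse import urlencode, urlparse, urlunparse, parse_qs

def page_url(url: str, page: int, param: str = "page") -> str:
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query[param] = [str(page)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

base = "https://example.com/listings?page=1&sort=newest"
for n in range(1, 4):
    print(page_url(base, n))  # fetch each page until results stop changing
```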