At this point, it's pretty clear that AI scrapers won't be constrained by any voluntary restrictions. ByteDance never seemed to respect robots.txt, and I think at least some of the others didn't either.
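For context, robots.txt is purely advisory; telling ByteDance's crawler (which identifies as Bytespider) to stay away is just a plain-text request with no enforcement behind it:

```
# robots.txt -- a polite request, nothing more
User-agent: Bytespider
Disallow: /
```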
- Humans tip humans as a lottery ticket for an experience (meet the creator) or a sweepstakes (free stuff).
- Agents tip humans because they know they'll need original online content in the long term to keep improving.
For the latter, frontier labs will need to fund their training/inference agents with a tip jar. There's no guarantee, but I can see it happening given where things are moving.
llms-txt may be useful for responsible LLMs, but I'm skeptical it will do much about aggressive crawlers. The problematic crawlers already ignore robots.txt, spoof user-agents, and rotate through proxies; I'm not sure how llms-txt would help with any of that.
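To illustrate the point (a minimal sketch, not any particular crawler's code): spoofing a browser user-agent is one header, which is why any honor-system file can't stop a determined scraper:

```python
import requests

# A scraper that ignores robots.txt/llms-txt entirely and presents
# itself as an ordinary browser. On its own, the server can't tell
# this header apart from a real Chrome request.
SPOOFED_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/124.0.0.0 Safari/537.36"
    )
}

def fetch(url: str, proxy: str | None = None) -> str:
    # Rotating `proxy` per request defeats IP-based rate limits too.
    proxies = {"http": proxy, "https": proxy} if proxy else None
    resp = requests.get(url, headers=SPOOFED_HEADERS,
                        proxies=proxies, timeout=10)
    resp.raise_for_status()
    return resp.text
```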