Coming from PS1, I am still waiting on PlanetSide 3 :(


In 2015, Microsoft Labs produced another entry (their final one) in their "Future Vision" series. I would say this offers a "genuine" glimpse into what those in 2015 expected 2025 to be.


I think what we should really ask ourselves is: “Why do LLM experiences vary so much among developers?”

The simplest explanation would be “You’re using it wrong…”, but I have the impression that this is not the primary reason. (Although, as an AI systems developer myself, you would be surprised by the number of users who simply write “fix this” or “generate the report” and then expect an LLM to correctly produce the complex thing they have in mind.)

It is true that there is an “upper management” hype of trying to push AI into everything as a magic solution for all problems. There is certainly an economic incentive from a business valuation or stock price perspective to do so, and I would say that the general, non-developer public is mostly convinced that AI is actually artificial intelligence, rather than a very sophisticated next-word predictor.

While claiming that an LLM cannot follow a simple instruction sounds, at best, very unlikely, it remains true that these models cannot reliably deliver complex work.


Another theory: you have some spec in your mind, write down most of it, and expect the LLM to implement it according to that spec. The result will objectively be a deviation from the spec.

Some developers will either retrospectively change the spec in their head or are basically fine with the slight deviation. Other developers will be disappointed, because the LLM didn't deliver on the spec they clearly hold in their head.

It's a bit like a psychological false-memory effect where you misremember, and/or some people are more flexible in their expectations and accept "close enough" while others won't.

At least, I noticed both behaviors in myself.


This is true. But it's also true of assigning tasks to junior developers. You'll get back something which is a bit like what you asked for, but not done exactly how you would have done it.

Both situations need an iterative process to fix and polish before the task is done.

The notable thing for me was that we crossed a line about six months ago where I need to spend less time polishing the LLM output than I used to spend working with junior developers. (Disclaimer: at my current place-of-work we don't have any junior developers, so I'm not comparing like-with-like on the same task, so I may have some false memories there too.)

But I think this is why some developers have good experiences with LLM-based tools. They're not asking "can this replace me?" they're asking "can this replace those other people?"


> They're not asking "can this replace me?" they're asking "can this replace those other people?"

People in general underestimate other people, so this is the wrong way to think about it. If it can't replace you, then typically it can't replace other people either.


But a junior developer can learn and improve based on the specific feedback you give them.

GPT-5 will, at least to a first approximation, always be exactly as good or as bad as it is today.


> They're not asking "can this replace me?" they're asking "can this replace those other people?"

In other words, this whole thing is a misanthropic fever dream


Yeah, I see quite a lot of misanthropy in the rhetoric people sometimes use to advance AI. I'll say something like "most people are able to learn from their mistakes, whereas an LLM won't" and then some smartass will reply "you think too highly of most people" -- as if this simple capability is just beyond a mere mortal's abilities.


> misanthropic

I see what you did there


This is a really short sighted way to look at things. Juniors become seniors. LLMs just keep hallucinating.


This implies that it executes the spec correctly, just not in a way that's expected. But if you actually look at how these things operate, that's flat out not true.

Mitchell Hashimoto just did a write-up about his process for shipping a new feature for Ghostty using AI. He clearly knows what he's doing and follows all the AI "best practices" as far as I could tell. And while he very clearly enjoyed the process and thinks it made him more productive, the post is also a laundry list of this thing just shitting the bed. It gets confused, can't complete tasks, and architects the code in ways that don't make sense. He clearly had to watch it closely, step in regularly, and in some cases throw the code out entirely and write it himself.

The amount of work I've seen people describe to get "decent" results is absurd, and a lot of people just aren't going to do that. For my money it's far better as a research assistant and something to bounce ideas off of. Or if it is going to write something it needs to be highly structured input with highly structured output and a very narrow scope.


What I want to see at this point are more screencasts, write-ups, anything really, that depict the entire process of how someone expertly wrangles these products to produce non-trivial features. There's AI influencers who make very impressive (and entertaining!) content about building uhhh more AI tooling, hello worlds and CRUD. There's experienced devs presenting code bases supposedly almost entirely generated by AI, who when pressed will admit they basically throw away all code the AI generates and are merely inspired by it. Single-shot prompt to full app (what's being advertised) rapidly turns to "well, it's useful to get motivated when starting from a blank slate" (ok, so is my oblique strategies deck but that one doesn't cost 200 quid a month).

This is just what I observe on HN, I don't doubt there's actual devs (rather than the larping evangelist AI maxis) out there who actually get use out of these things but they are pretty much invisible. If you are enthusiastic about your AI use, please share how the sausage gets made!



From the article

  Important: there is a lot of human coding, too. I almost always go in after an AI does work and iterate myself for awhile, too.
Some people like to think for a while (and read docs) and just write it right on the first go. Some people like to build slowly and get a sense of where to go at each step. But in all of those steps, there's a heavy factor of expertise needed from the person doing the work. And this expertise does not come for free.

I can use an agentic workflow fine and generate code like anyone else. But the process is not enjoyable and there's no actual gain. Especially in an enterprise setting where you're going to use the same stack for years.


These things are amazing for maintenance programming on very large codebases (think 50-100 million lines of code or more, where the people who wrote the code no longer work there and it's not open source, so "just google it or check Stack Overflow" isn't even an option at all).

A huge amount of effort goes into just searching for what relevant APIs are meant to be used without reinventing things that already exist in other parts of the codebase. I can send ten different instantiations of an agent off to go find me patterns already in use in code that should be applied to this spot but aren't yet. It can also search through a bug database quite well and look for the exact kinds of mistakes that the last ten years of people just like me made solving problems just like the one I'm currently working on. And it finds a lot.

Is this better than having the engineer who wrote the code and knows it very well? Hell no. But you don't always have that. And at the largest scale you really can't, because it's too large to fit in any one person's memory. So it certainly does devolve to searching and reading and summarizing for a lot of the time.



this is definitely closer to what I had in mind but it's still rather useless because it just shows what winning the lottery is like. what I am really looking for is neither the "Claude oneshot this" nor the "I gave up and wrote everything by hand" case but a realistic, "dirty" day-to-day work example. I wouldn't even mind if it was a long video (though some commentary would be nice in that case).


I don't think you should consider this as "winning the lottery", the author has been using these tools for a while.

The sibling comment with the writeup by the creator of Ghostty shows stuff in more detail and has a few cases of the agent breaking, though it also involves more "coding by hand".


I think the point is that you want to see typical results or process. How does it run when you use it 10 times, or 100 times, what results can you expect generally?

There's a lot of wishful thinking going around in this space and something more informative than cherrypicking is desperately needed.

Not least because lots of capable/smart people have no idea which way to jump when it comes to this stuff. They've trained themselves not to blindly hack solutions through trial and error but this essentially requires that approach to work.


Yeah that's a good point and the sibling comment seems to be pointing in the same direction. You could take a look at Steve Yegge's beads (https://steve-yegge.medium.com/introducing-beads-a-coding-ag..., https://github.com/steveyegge/beads) but the writeup is not super detailed.

I think your last point is pretty important: all that we see is done by experienced people, and today we don't have a good way of teaching "how to effectively use AI agents" other than telling people "use them a lot, apply software engineering best practices like testing". That is a big issue, compounded because the stuff is new, there are lots of different tools, and they evolve all the time. I don't have a better answer here than "many programmers that I respect have tried using those tools and are sticking with them rather than going back" (with exceptions, like Karpathy's nanochat), and "the best way to learn today is to use them, a lot".

As for "what are they really capable of", I can't give a clear answer. They do make easy stuff easier, especially outside of your comfort zone, and seem to make hard stuff come up more often and earlier (I think because you do stuff outside your comfort zone/core experience zone ; or because you know have to think more carefully about design over a shorter period of time than before with less direct experience with the code, kind of like in Steve Yegge's case ; or because when hard stuff comes up it's stuff they are less good at handling so that means you can't use them).

The lower bound seems to be "small CLI tool"; the upper bound seems to be "language learning app with paid users (sottaku I think? the dev talks on twitter. Lots of domain knowledge in Japanese here to check the app itself); implementing a model in PyTorch by someone who didn't know how to code before (00000005 seconds or something like this on twitter, has used all these models and tools a lot); reporting security issues that were missed in cURL"; the middle bound is "very experienced dev shipping a feature faster, while doing other things, on a semi-mature codebase (Ghostty)", and also "useful code reviews". That's about the best I can give you, I think.


I'm not sure if you just didn't understand what I'm looking for. If I'm searching for a good rails screencast to get a feeling for how it's used, a blogpost consisting of "rails new" is useless to me. I know that these tools can oneshot tasks, but this doesn't help me when they can't.


Well we are all doing different tasks on different codebases too. It's very often not discussed, even though it's an incredibly important detail.

But the other thing is that your expectations normalise, and you will hit its limits more often if you are relying on it more. You will inevitably be unimpressed by it the longer you use it.

If I use it here and there, I am usually impressed. If I try to use it for my whole day, I am thoroughly unimpressed by the end, having had to re-do countless things it "should" have been capable of based on my own past experience with it.


> Well we are all doing different tasks on different codebases too. It's very often not discussed, even though it's an incredibly important detail.

Absolutely nuts that I had to scroll down this far to find the answer. Totally agree.

Maybe it's the fact that every software development job has different priorities, stakeholders, features, time constraints, programming models, languages, etc. Just a guess lol


> The simplest explanation would be...

The simplest explanation is that most of us are code monkeys reinventing the same CRUD wheel over and over again, gluing things together until they kind of work and calling it a day.

"developers" is such a broad term that it basically is meaningless in this discussion


or, and get this, software development is an enormous field with 100s of different kinds of variations and priorities and use cases.

lol.

another option is trying to convince yourself that you have any idea what the other 2,000,000 software devs are doing and think you can make grand, sweeping statements about it.

there is no stronger mark of a junior than the sentiment you're expressing


For every coder doing some cutting-edge Computer SCIENCE there are 99 people creating one more CRUD API Glue application or microservice.

I've been doing this for 25 years and everything I do can be boiled down to API Glue.

Stuff comes in, code processes stuff, stuff goes out. Either to another system or to a database. I'm not breaking new ground or inventing new algorithms here.

The basic stuff has been the same for two decades now.

Maybe 5% of the code I write is actually hard, like when the stuff comes in REAL fast and you need to do the processing within a time limit. Or you need to get fancy with PostgreSQL queries to minimise traffic from the app layer to the database.

With LLM assistance I can have it do the boring 95%: scaffolding one more FoobarController.cs, writing the models and the Entity Framework definitions while I browse Hacker News or grab a coffee and chat a bit. Then I have more time to focus on the 5%, as well as more time to spend improving my skills and helping others.

Yes. I read the code the LLM produces. I've been here for a long time, I've read way more code than I've written, I'm pretty good at it.


> I've been doing this for 25 years and everything I do can be boiled down to API Glue.

Oooof, and you still haven't learned how big this field is? Give me the ego of a software developer who thinks they've seen it all in a field that changes almost daily. Lol.

> The basic stuff has been the same for two decades now.

hwut?

> Maybe 5% of the code I write is actually hard, like when the stuff comes in REAL fast and you need to do the processing within a time limit

God, the irony in saying something like this and not having the self-awareness to realize it's actually a dig at yourself. hahahahaha

Congratulations on being the most lame software developer on this planet who has only found himself in situations that can be solved by building strictly-CRUD software. Here's to hoping you keep pumping out those Wordpress plugins and ecommerce sites.

I have 2 questions for you to ruminate on:

1. How many programming jobs have you had?

2. How many programming jobs exist in the entire world at this moment?

It's gotta be what, a million job difference? lol. But you've seen it all right? hahahaha


I didn't say that there aren't people doing cutting edge stuff.

But even John Romero did the boring stuff along with the cool stuff. Andrej Karpathy wrote a ton of boilerplate Python to get his stuff up and running[0].

Or are you claiming that every single line of the nanochat[0] project is peak computer science algorithms no LLM can replicate today?

Take the initial commit tasks/ directory for example[1]. Dude is easily in the top 5 AI scientists in the world and he still spends a good amount of time writing pretty basic string wrangling in Python.

My basic point here is that LLMs automate generating the boilerplate to a crazy degree, letting us spend more time in the bits that aren't boring and are actually challenging and interesting.

[0] https://github.com/karpathy/nanochat

[1] https://github.com/karpathy/nanochat/tree/master/tasks


Well I know for a fact there are more code monkeys than rocket scientists working on advanced technologies. Just look at job offers really...

Anyone with any kind of experience in the industry should be able to tell that, so idk where you're going with your "junior" comment. Technically I'm a senior in my company and I'm including myself in the code monkey category. I'm not working on anything revolutionary; like most devs, I'm just gluing things together, probably things that have been made dozens of times before and will be done dozens of times later... there is no shame in that, it's just the reality of software development. Just like most mechanics don't work on Ferraris, even if mechanics working on Ferraris do exist.

From my friends, working in small startups and large megacorps, no one is working on anything other than gluing existing packages together, a bit of ES, a bit of Postgres, a bit of CRUD. Most of them worked on more technical things while getting their degrees 15 years ago than they do right now... while being in the top 5% of earners in the country. 50% of their job consists of bullshitting the n+1 to get a raise and some other variant of office politics.


> From my friends, working in small startups and large megacorps, no one is working on anything other than gluing existing packages together,

And all my friends aren't doing that. So there's some anecdotal evidence to contradict yours.

And I think you're missing the point.

The point is the field is way bigger than either of us could imagine. You could have decades of experience and still only touch a small subset of the different technologies and problems.

> Well I know for a fact there are more code monkeys than rocket scientists working on advanced technologies

I don't know what this means as it doesn't disprove that fact that the field is enormous. Of course not everyone is working on rockets. But that is irrelevant.

> 50% of their job consist of bullshitting the n+1 to get a raise and some other variant of office politics

Again, this doesn't mean we aren't working on different things.

I actually totally agree with this point made in your previous post:

> "developers" is such a broad term that it basically is meaningless in this discussion

But your follow-up feels antagonistic to that point.


It’s kinda like when they think the software they use is mainstream and everything else is niche.


I'm convinced it's not the results that are different, it's the expectations.

The SVP of IT for my company is 100% in on AI. He talks about how great it is all the time. I just recently worked on a legacy project in PHP he built years ago, and now I know his bar for what quality code looks like is extremely low...

I use LLMs daily to help with my work, but I tweak the output all the time because it doesn't quite get it right.

Bottom line, if your code is below average AI code will look great.


That seems to be the pitch too. I now get google ads where they advertise you can ask your phone things about what it sees. All the examples are so trivial, I can’t believe it. How to make a double espresso. What are these clouds called?

That’s being not a complete idiot as a service.

If it was at least "how do I start the decalcification process on this machine", so it actually realizes it and turns the service light off.


I would say they can't reliably deliver simple work. They often can, but reliability, to me, means I can expect it to work every time. Or at least as much as any other software tool, with failure rates somewhere in the vicinity of 1 in 10^5, 1 in 10^6. LLMs fail on the order of 1 in 10 times for simple work. And rarely succeed for complex work.

That is not reliable, that's the opposite of reliable.


One has to look at the alternatives. What would I do if not use the LLM to generate the code? The two answers are “coding it myself” and “asking another dev to code it”. And neither of those comes anywhere near a 1-in-10^5 failure rate. Not even close.


> I think what we should really ask ourselves is: “Why do LLM experiences vary so much among developers?”

Some possible reasons:

  * different models used by different folks, free vs paid ones, various reasoning effort, quantizations under the hood and other parameters (e.g. samplers and temperature)
  * different tools used, like in my case I've found Continue.dev to be surprisingly bad, Cline to be pretty decent but also RooCode to be really good; also had good experiences with JetBrains Junie, GitHub Copilot is *okay*, but yeah, lots of different options and settings out there
  * different system prompts, various tool use cases (e.g. let the model run the code tests and fix them itself), as well as everything ranging from simple and straightforward codebases that are a dime a dozen out there (and in the training data), vs something genuinely new that would trip up both your average junior dev and the LLMs
  * open ended vs well specified tasks, feeding in the proper context, starting new conversations/tasks when things go badly, offering examples so the model has more to go off of (it can predict something closer to what you actually want), most of my prompts at this point are usually multiple sentences, up to a dozen, alongside code/data examples, alongside prompting the model to ask me questions about what I want before doing the actual implementation
  * also sometimes individual models handle specific use cases badly; I generally rotate between Sonnet 4.5, Gemini Pro 2.5, and GPT-5, and also use Qwen 3 Coder 480B running on Cerebras for the tasks I need done quickly and that are more simple
With all of that, my success rate is pretty great, and the statement that the tech can "...barely follow a simple instruction" doesn't hold. Then again, most of my projects are webdev adjacent in mostly mainstream stacks, YMMV.


> Then again, most of my projects are webdev adjacent in mostly mainstream stacks

This is probably the most significant part of your answer. You are asking it to do things for which there are a ton of examples of in the training data. You described narrowing the scope of your requests too, which tends to be better.


It's true though, they can't. It really depends on what they have to work with.

In the fixed world of mathematics, everything could in principle be great. In software, it can in principle be okay even though contexts might be longer. When dealing with new contexts, something like real life but different, such as a story where nobody can communicate with the main characters because they speak a different language, the models simply can't deal with it, always returning to the context they're familiar with.

When you give them contexts that are different enough from the kind of texts they've seen, they do indeed fail to follow basic instructions, even though they can follow seemingly much more difficult instructions in other contexts.


> I think what we should really ask ourselves is: “Why do LLM experiences vary so much among developers?”

My hypothesis is that developers work on different things, and while these models might work very well for some domains (react components?) they will fail quickly in others (embedded?). So on one side we have developers working on X (LLM good at it) claiming that it will revolutionize development forever, and on the other side we have developers working on Y (LLM bad at it) claiming that it's just a fad.


I think this is right on, and the things that LLMs excel at (react components was your example) are really the things that there's just such a ridiculous amount of training data for. This is why LLMs are not likely to get much better at code. They're still useful, don't get me wrong, but the 5x expectations need to get reined in.


A breadth and depth of training data is important, but modern models are excellent at in-context learning. Throw them documentation and outline the context for what they're supposed to do and they will be able to handle some out-of-distribution things just fine.

I would love to see some detailed failure cases of people who used agentic LLMs and didn't make it work. Everyone is asking for positive examples, but I want to see the other side.


Also, part of the variation comes from the focus of each person.

Based on my own personal experience:

- on some topics, I get the x100 productivity that is pushed by some devs; for instance, this Saturday I was able to make two features that I had been rescheduling for years because, for lack of knowledge, they would have taken me many days to make, but a few back-and-forths with an LLM and everything was working as expected; amazing!

- on other topics, no matter how I expose the issue to an LLM, at best it tells me that it's not solvable, at worst they try to push an answer that doesn't make any sense and push an even worse one when I point it out...

And when people ask me what I think about LLMs, I say: "that's nice and quite impressive, but still it can't be blindly trusted and needs a lot of overhead, so I suggest caution".

I guess it's the classic half empty or half full glass.


>I think what we should really ask ourselves is: “Why do LLM experiences vary so much among developers?”

Two of the key skills needed for effective use of LLMs are writing clear specifications (written communication) and management, skills that vary widely among developers.


There’s no clearer specification than code, and I can manage my toolset just fine (lines of config, aliases, and whatnot to make my job easier). That has allowed me to deliver good results fast without worrying whether it’s right this time.


I sometimes meet devs who are "using it wrong" with under-baked prompts.

But mostly my experience is that people who regularly get good output from AI coding tools fall into these buckets:

A) Very limited scope (e.g. single, simple method with defined input/output in context)

B) Aren't experienced enough in the target domain to see the problems with the AI's output (let's call this "slop blindness")

C) Use AI to force multiple iterations of the same prompt to "shake out the bugs" automatically instead of using the dev's time

I don't see many cases outside of this.


> B) Aren't experienced enough in the target domain to see the problems with the AI's output (let's call this "slop blindness")

Oh, boy, this. For example, I often use whatever AI I have to adjust my Nix files because the documentation for Nix is so horrible. Sure, it's slop, but it gets me working again and back to what I'm supposed to be doing instead of farting with Nix.

I would also argue:

D) The fact that an AI can do the task indicates that something about the task is broken.

If an AI can do the task well, there is something fundamentally wrong. Either the abstractions are broken, the documentation is horrible, the task is pure boilerplate, etc.


D) Understand and use context creatively. They know when to start new conversations, and how to use the filesystem as context storage.


Yeah except I do that with Claude Code, and my output is still shit most of the time. It might save me a little time or typing, but it definitely needs hand-editing. The people who say Claude is a junior dev (at best) are right.

That's why I think a lot of people who think it's a miracle probably aren't experienced enough to see the bugs.


"expect an LLM to correctly produce the complex thing they have in mind"

My guess is that for some types of work people don't know what the complex thing they have in mind is ex ante. The idea forms and is clarified through the process of doing the work. For those types of task there is no efficiency gain in using AI to do the work.


Why not? Just start iterating in chunks alongside the LLM and change gears/plan/spec as you learn more. You don't have to one-shot everything.


"Just start iterating in chunks alongside the LLM".

For those types of tasks it probably takes the same amount of time to form the idea without AI as with AI; this is what METR found in its study of developer productivity.

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o... https://arxiv.org/abs/2507.09089


That study design has some issues. But let's say it takes me the same amount of time, the agentic flow is still beneficial to me. It provides useful structure, helps with breaking down the problem. I can rubber duck, send off web research tasks, come back to answer questions, etc., all within a single interface. That's useful to me, and especially so if you have to jump around different projects a lot (consultancy). YMMV.


"That study design has some issues. " This is a study that tries to be scientific, unlike developer self reports and CEO promises of 10x.

Can you point to a better study on the impact of AI on developer productivity? The only other one I can think of finds a 20% uplift in productivity.

https://www.youtube.com/watch?v=tbDDYKRFjhk


> Why do LLM experiences vary so much among developers?

The question assumes that all developers do the same work. The kind of work done by an embedded dev is very different from the work of a front-end dev which is very different from the kind of work a dev at Jane Street does. And even then, devs work on different types of projects: greenfield, brownfield and legacy. Different kind of setups: monorepo, multiple repos. Language diversity: single language, multiple languages, etc.

Devs are not some kind of monolith army working like robots in a factory.

We need to look at these factors before we even consider any sort of ML.


My experience is that people claiming they use AI exclusively are usually trying to sell an AI product.


Probably a good chunk of the differences in experience is this: https://news.ycombinator.com/item?id=45573521

> [..] possibly the repo is too far off the data distribution.

(Karpathy's quote)


I've known lots of people that don't know how to properly use Google, and Google has been around for decades. "You're using it wrong" is partially true; I'd say it's more like "it is a new tool that changes very quickly, you have to invest a lot of time to learn how to properly use it, most people using it well have been using it a lot over the last two years, and you won't catch up in an afternoon. Even after all that time, it may not be the best tool for every job" (proof of the last point being Karpathy saying he wrote nanochat mostly by hand).

It is getting easier and easier to get good results out of them, partially by the models themselves improving, partially by the scaffolding.

> non-developer public is mostly convinced that AI is actually artificial intelligence, rather than a very sophisticated next-word predictor

This is a false dichotomy that assumes we know way more about intelligence than we actually do, and also assumes than what you need to ship lots of high quality software is "intelligence".

>While claiming that an LLM cannot follow a simple instruction sounds, at best, very unlikely, it remains true that these models cannot reliably deliver complex work.

"reliably" is doing a lot of work here. If it means "without human guidance" it is true (for now), if it means "without scaffolding" it is true (also for now), if it means "at all" it is not true, if it means it can't increase dev productivity so that they ship more at the same level of quality, assuming a learning period, it is not true.

I think those conversations would benefit a lot from being more precise and more focused, but I also realize that it's hard to do so because people have vastly different needs, levels of experience, expectations ; there are lots of tools, some similar, some completely different, etc.

To answer your question directly, ie “Why do LLM experiences vary so much among developers?”: because "developer" is a very very very wide category already (MISRA C on a car, web frontend, infra automation, medical software, industry automation are all "developers"), with lots of different domains (both "business domains" as in finance, marketing, education and technical domains like networking, web, mobile, databases, etc), filled with people with very different life paths, very different ways of working, very different knowledge of AIs, very different requirements (some employers forbid everything except a few tools), very different tools that have to be used differently.


This is very similar to Tesla's FSD adoption in my mind.

For some (me), it's amazing because I use the technology often despite its inaccuracies. Put another way, it's valuable enough to mitigate its flaws.

For many others, it's on a spectrum between "use it sometimes but disengage any time it does something I wouldn't do" and "never use it" depending on how much control they want over their car.

In my case, I'm totally fine handing driving off to AI (more like ML + computer vision) most times but am not okay handing off my brain to AI (LLMs) because it makes too many mistakes and the work I'd need to do to spot-check them is about the same as I'd need to put in to do the thing myself.


It's a personality thing.

I know Car People who refuse to use even lane keeping assist, because it doesn't fit their driving style EXACTLY and it grates them immensely.

I on the other hand DGAF, I love how I don't need to mess with micro adjustments of the steering wheel on long stretches of the road; the car does that for me. I can spend my brainpower checking if that gray VW is going to merge without an indicator or not.

Same with LLM, some people have a very specific style of code they want to produce and anything that's not exactly their style is "wrong" and "useless". Even if it does exactly what it should.


At some point people won't care to convince you and you will be left to adapt or fade away.


That's where I stand now. I use LLMs in some agentic coding way 10h/day to great avail. If someone doesn't see or realize the value, then that's their loss.


It’s because people are using different tiers of AI and different models. And many people don’t stick with it long enough to get a more nuanced outlook on AI.

Take Joe. Joe sticks with AI and uses it to build an entire project. Hundreds of prompts. Versus your average HNer who thinks he’s the greatest programmer in the company and thinks he doesn’t need AI but tries it anyway. Then AI fails and fulfills his confirmation bias and he never tries it again.


This is not the point of the author. The author is criticising the redesign in the latest iOS/macOS/watchOS 26 as a poor choice, with an excessive focus on form to the detriment of function.

Also, to which aspects are you referencing exactly, that “Apple’s competitors are funding lobbyists to get regulator help to tear down”?


Anecdotally, I had a myocardial infarction at 23, and I was honestly surprised to learn that it wasn’t already well known that infectious diseases could trigger such events.

Up until that point, I’d never had any heart-related issues, nor has anyone in my family. Just two days before being admitted to the hospital with a suspected heart attack, I came down with food poisoning. It wasn’t pleasant, of course, but I thought it was nothing unusual—something a couple of days of rest and hydration would normally resolve.

Since my bloodwork at the hospital matched the expected results for a heart attack, and I underwent surgery, the doctors understandably focused on treating the immediate problem rather than identifying the underlying cause (I’m eternally grateful to the team and staff at St. Vincentius-Kliniken. I truly don’t think I’d be here without them).

That said, I’m glad to see this area receiving more attention. Hopefully, it will lead to further studies and the development of better strategies for prevention and treatment.


Can you clarify -- if you're comfortable sharing additional details -- did you have an "occlusion MI" heart attack, involving balloons / stents in the cath lab?

Most people assume that "heart attack" is a distinct clinical entity, but the majority (~80%) of elevated troponin levels are not exactly what comes to mind when people say "heart attack," but will often be described to patients as a heart attack (sometimes out of ignorance and others out of convenience, as the actual explanation for what is going on takes a lot more time and effort).


Certainly well-deserved.

I have a similar list of _art_ rather than just _entertainment_, and Hollow Knight is on there together with Disco Elysium.


[flagged]


The game is mocking all the political ideologies and many other aspects of current society. It is a satire.


the game does skewer said ideologies but i think it's going too far to label the game as satire, because those political elements are also a key part of the world of the game and its (fictional) history. so much of the game is about Revachol's monarchism, its subsequent (failed) revolution, and the resulting administration of the state by external liberal capitalist democracies.


My understanding (without having played the game) is that you choose an ideology for your character (equivalent to fascist, communist, capitalist or centrist) and whichever of those you chose, it satirizes you for it by making you an absurd parody of whichever was your chosen ideology. So perhaps the only way to not "feel mocked" personally is to play through using an ideology you don't actually hold? Or, try not take it personally. But, I admit, your reaction makes me want to actually play it next to test my ideological sensitivity.


via your choices you "become" aligned with one of those options

the parent commenter isn't entirely off-base, because much of the narrative in the game is about the state of the fictional society after a failed communist revolution, so much of the "vibe" of the game is grounded in a populace whose dream failed. the game is more sympathetic to, or at least more interested in, leftism. i chose to play as a fascist art critic who believes he's a famous disco star.

disco elysium is definitely not advocating for collectivism or any other ideology, though. great game, one of the best i've ever played.


> its fans told me that if I liked it and wasn't a Communist, I was missing the point that it was mocking me.

As a communist who loves DE: the game mocks you even if you are any flavour of socialist/communist, and I don't think this is at odds with liking the game.


I’m not sure whether I find it more worrisome or fascinating that we live in a world where a company that, as far as I know, has never generated a single dollar in revenue has managed to exist for over five years, employ more than 100 people, and still get acquired for this amount.

This isn’t criticism or sarcasm — I’m genuinely impressed, but also very curious about the rationale behind it.


Agreed. To make it even more interesting, The Browser Company discontinued Arc earlier this year. So not only did they do all of the things OP listed, they also didn't have a current product when acquired.


My experience with Arc was installing it, asking myself "I have to pay to change the app icon? wtf" and uninstalling it. Horrendous UI as well.


Personally I liked the UI (and now use Zen browser, the UI is very much a matter of taste though), but left as the browser itself kept getting worse


That's very interesting. I downloaded arc because I saw it in some twitter screenshot and I thought the UI was neat, when I could have actually been looking for Zed instead of Arc.


same


They have "Dia" - which is Chrome + AI chat?

https://www.diabrowser.com/


Chrome + AI chat is... Chrome + Gemini though


Shhh... don't tell Atlassian.


Lest they attempt to buy Google instead.


I think I would be ok with that; they can both be a mid, bureaucratic mess together.


When I saw that domain my first thought was it looks like they came up with the name by combining “diabetes” with “browser”


My hunch is a loyal user base.

Anecdotally, everyone I put onto Arc and the person who put me on still uses it.

I’ve been using Arc for the last two years and was genuinely sad about its discontinuation. I now don’t really know what I’ll do when it goes away.


Zen browser now has all the features of Arc (including folders, which they just added) if you’re willing to use a Firefox fork


Where can I read about this? It still gets regular updates and is front and center on the Browser Company website.



Ironic that I wasn't familiar with this company or their products before today, and having read about both Arc and Dia, including reading this blog post you've linked, the product that makes me want to try it is the one they've stopped developing...


It's still worth trying and using. The developers consider it a "finished product" and I don't disagree. It does lots of small things well* that many browsers (even the self-confessed clones like Zen) don't do out of the box, if at all. Maybe in two years the browser will no longer be distributed or receive Chromium updates, but it exists and works fine now.

* For example, I get a lot of value from renaming my tabs and even replacing their favicons with emojis of my choice. Zen appears to have limited support for this.


Thanks, I will do


It was popular but had no real route to profitability. Hence the acquisition.


It was put on maintenance mode with minimal security updates to favour the development of their newer product Dia (AI browser).


Damn, I really appreciate the decision to do this in a new product. Arc is the best browser I've ever used, and I'd hate to see AI features forced upon me. Thanks Browser Company!


I agree. I think Arc was the biggest innovation in browser UI since Chrome.

I think you will eventually have to switch because it will lag behind, given that it's not their priority anymore. Zen browser seems like a viable alternative but I haven't used it enough yet to know how well polished it is.

https://zen-browser.app


This video seems informative, and goes over a lot of other new browser projects too.

https://www.youtube.com/watch?v=YrxhVA5NVQ4


It is impressive what a single person with a vision can achieve hacking away on Firefox, especially considering Mozilla's track record in recent years.

A bus factor of 1 is still a bit of a red flag on something as involved as browser maintenance. Hopefully a community can emerge around the project.

ref: https://github.com/zen-browser/desktop/graphs/contributors


Yeah, this is the unfortunate part about products kept alive in maintenance mode in a rapidly evolving space.

I guess you could argue (as TBC did) it’s actually not rapidly evolving, and that gives it staying power. But eventually someone will reach parity and eventually eclipse the original product.

Hopefully Zen does that. I’m just tired of moving the same data to effectively the same product run by a different team for no good reason.


Yeah my plan is definitely to move to zen in the long run, it's mostly migrating workspaces and so on that hold me back


I mean, I guess that’s one way of looking at it. On the other hand, they did abandon the product, so you’ll have to switch anyway in time.


There are some businesses that are simply not viable without losing money first. SpaceX cannot generate revenue until it first employs hundreds of people for a few years (maybe a decade) where they focus on building what will eventually bring revenue. Software has those problems too.


I'm told that it can take ten years for a vineyard to start generating profit.


I know distilleries will often contract with one or more other distilleries to create a custom blended whisky to sell under their label to get some revenue while they wait for their first batches to mature. There are distilleries that basically specialize in doing this. I assume the wine market probably has similar strategies. I know 3 buck chuck basically started like this, buying overstock from other vineyards and blending them into a generic white or red wine.


Out of curiosity, why would a company help out their future competitor?


My impression is that a large portion of the industry is already structured as distilleries that actually make the liquor, like MGP, and a bunch of labels that put their names on it (each different). Like how many name-brand items across grocery stores are all actually made by the same company.


You might be good at making whisky but not that great at marketing. Either way the market has a desire for generic, unlabeled product and people go in to fill the need.


because they may sell the same product, yes, but not compete. there will always be someone starting a new vineyard, distillery or whatever.

it's sort of like banks vs vc funds. both lend money to companies, but still they are not competing against each other.


It takes 5 years on average until you can harvest some grapes. 10 years for generating profit sounds about right.


yeah, and what makes them different from half-baked fraud startups with no revenue? most of the world would not believe that they will bring in revenue for years.


Pedigree of the team and a believable project plan.

Often times money will be raised at certain valuation and terms, but the cash is held in escrow (effectively) until milestones are hit.

The investors will do their due diligence on the feasibility. It’s a high stakes, high return game (if you succeed). Look around you… any physical device you see is basically funded the same way.


That’s basically the point of venture capital and, to a certain extent, all entrepreneurial endeavors. You could start a T-shirt printing company (a completely viable business) and not see any revenue come in for months.


I mean this with no snark: I would love to see the investor deck that explains how an AI-powered browser is going to make any multiple of $610M.


Perhaps they are hoping Google will pay them ~$500m/yr to be the default search engine.


0.01 is a multiple!


More charitably by selling the only browser that's actually usable in a few years. The AI will be used to cancel out the effect of other people's AI.


There’s Orion, Zen, plenty of minor browser projects that aspire to better experiences than the majors’. Brave is likely much more widespread than Arc, though that one is monetized via the previous trend (crypto). Never even heard of a lot of these:

https://www.youtube.com/watch?v=YrxhVA5NVQ4


Ha, that's a dark timeline! But sadly quite probable.


Putting the agent into your user agent.


There have been a few extremely popular remastered games in the last few years, and the internet is full of people cranky because folks are posting replies to five-year-old internet threads about the game.

Well what do you expect people to do when the only non slop result on page 1 is a 5 to 8 year old thread? It’s the top link. You’re still relevant whether you want to be or not. Fuckin deal.


I would guess by selling personal data and ads.


I can't tell you how many designers I interviewed who told me they used the Arc browser. It was at least a dozen.

I'd never heard of the damned thing before.

I don't know why, but it appears to be popular with some creative demographics.

The browser is an essential pane of glass to platformization and taxing the web. Anyone who wins a browser with significant market share has a huge opportunity to capitalize on.

Not sure if Arc is that browser, but lots of teams are trying.

Chrome is shitty on purpose because it is designed to sell ads. Other browsers can sell AI or other things to fund their development.

It's a shame we don't have a good open source browser with decent leadership anymore. I'm sure they'd be killing it. I could swear Mozilla is led by a revolving door of paid off Google plants.


> how many designers I interviewed who told me they used the Arc browser

Looking at their frontpage, the design is outright horrible if you have a >7-8 inch screen. I guess in a way it's good to have an example of what not to do.

> I'm sure they'd be killing it

Why, though? I mean the niche is pretty small, most people don't care much about open source or even what browser they are using at all.

Considering that the overwhelming majority of Mozilla's funding comes from Google, and that it could in no way survive without it, being run by Google's plants is not that surprising.


Well, I was one of the Arc users, but they abandoned it in favor of a new browser with an integrated AI agent that can work on tabs, Dia. Now I’m using it, but to be honest I use almost none of the AI features beyond some summaries for pages and YouTube videos. Still, I see a lot of potential there (e.g. making it check the calendar to propose a time in a newly composed email) for less technical users.


I use Arc for two reasons:

Tabs on the side nav and the ability to have 3 different AWS accounts open at the same time


Firefox does both of those things out of the box with no extensions.


Yep, but admittedly the vertical tab UX is not the greatest. You either have them always be visible with an option to toggle by clicking the sidebar icon (no keyboard shortcut option afaik), or minimised as icons that expand on hover with an awfully annoying animation.

Looking at Zen, I really don't understand how Mozilla fails to capitalise on their browser and build a similar experimental project based on Firefox. It seems that many of these small QoL improvements could make a big difference. They have such a huge budget, and they waste it on inane things. Their fancy search deal with Google has made them complacent, neglecting one of the few things that ever had any real worth. Curious to see how it develops with the recent Google ruling. And to be fair, it does seem like Firefox development has picked up a bit lately—maybe even due to Zen's competition, who knows.


The Vimperator plugin used to do the rest, but maybe that is no longer needed or working.


And there are businesses which will never be highly profitable unless the competition implodes for no particular reason (like making your own browser).


Atlassian has always baffled me. In that JBoss sort of way.

Explaining why they're successful and I'm not.


Luck. It’s always just luck.

Of course, you need to have other ingredients too, but hundreds of millions, if not billions, of people have those skills too. Which of them wins more is pure luck.

And in that, of course a ton of predetermined parameters, like where you born, who your parents are, what your skin color is, etc.

I have a friend who is worse at almost every skill which matters in our work. Not much worse, he is still awesome at his job. But I’m better. Every single person who saw us work in comparable environments would tell you the same thing. His career is still better than mine. And the single reason is that he was born into wealth. He had the opportunity to live without income for years, kick off a startup, try to start some others, simply experiment, and risk things which I couldn’t. Nothing else. Pure luck.


Luck was a necessary ingredient, 100%.

But how many other people had similar luck and did nothing with it?

Luck is another word for opportunity. Some people are really good at leveraging opportunity for all it's worth. Most of us (myself very much included) are not.

Case in point: I'm the same age as Mark Zuckerberg. Many people say his age is why he was able to be at the right place at the right time to create Facebook. Much like they say about Steve Jobs and Bill Gates and every other "self-made" billionaire.

But he still had to choose to do all the right things that I chose not to do in order to be able to experience that kind of luck.

At some point we gotta own up to our own role in guiding our lives.


Atlassian essentially got a big fairly locked-in userbase that they will now squeeze using their existing proven mechanisms. Oh and they probably got a few free competent developers without needing to go through an expensive hiring process.

All told, probably worth 610M.


Browser users are not (by default) Atlassian ICPs so IMO there's zero lock-in. I am going to most likely change my browser very shortly because I don't see Atlassian building out Arc. TBC raised $50M, they and their investors got a good return in a short amount of time. This chapter is now closed.


are they locked-in tho? People loved Arc, but they killed it. Doesn't seem the reviews of Dia are all that great.


I haven't kept track for a while, but whoever I knew that used Arc found it hard to go back to standard browsers after getting used to its various UI affordances. Even when they pulled the "you need an account" thing, most people I knew just sucked it up and created an account. I am assuming a good portion of them will submit to whatever Atlassian demands.


why not Zen https://zen-browser.app/ ?

I've found it just as good AND it's open source https://github.com/zen-browser/desktop


Because for better and for worse - it is Firefox.


I love Arc. I tried Dia. I wanted to like it but don't see how it's going to be valuable for me.

The browser features are _much_ worse than Arc (no sidebar, bookmarks are a dropdown, ...) and most of the time the AI can't even "see" or "read" what's on the page I'm viewing, so it's just worse than using Claude/ChatGPT/Gemini.

I'm still using Arc and will probably continue until there's another browser that copies its UI/UX improvements.


I would think of it that way:

- no company generates revenue in its first second. Even if you start a lemonade stand tomorrow, you'll have to buy some lemons first. The time-to-revenue might be very short, but it's never zero. Therefore, making no revenue for 1 day or for 10 years is not a step change, but simply a point on a curve.

- Capitalism is basically a long history of creating vehicles with increasing sophistication to bridge that gap: provide funding for ventures that have returns in the future. This is intrinsically difficult, and it's easy to waste money, but it can work immensely. This started with the Dutch inventing limited liability corporations to fund ship expeditions, and today's VC is essentially an extension of that.

- It has worked well in the past to bet on companies that don't optimize for time-to-revenue, but something else – famous examples being e.g. Amazon, Google, Meta, who all lost lots of money initially.

Hence there can be companies that make no money for quite a while. And it can even turn out that the vast majority of the companies that make no money for a while never make any money. Accepting this risk is a feature, not a bug.


>- no company generates revenue in its first second. Even if you start a lemonade stand tomorrow, you'll have to buy some lemons first. The time-to-revenue might be very short, but it's never zero. Therefore, making no revenue for 1 day or for 10 years is not a step change, but simply a point on a curve.

Yea, it's called investment. If you want to get rich overnight play lottery or start gambling.


This is nothing new.


Less than $6M/head - it is a steal. Not even counting whatever IP they have.


How much do you think it costs to hire people?


you think Mark didn't know that when he hired that guy for $250 million? Anyway, you probably mean individual hiring cost, which is a totally different case from hiring an already-built multifunctional team of 100. Look around at acqui-hires to understand the price, especially when we're talking about people who have been working on AI products.


$250 million was for a top researcher with more experience than almost anyone else in designing and training massive models.

As far as I know, Dia just calls OpenAI’s API. I’m sure their employees know a lot about using AI at this point, but so does everyone else who’s built an OpenAI wrapper.


VCs are not allowed to lose.


Atlassian invested in their series A. Atlassian decided Atlassian isn't allowed to lose


I spent a good amount of time last year working on a system to analyse patent documents.

Patents are difficult as they can include anything from abstract diagrams to chemical formulas to mathematical equations, so it tends to be really tricky to prepare the data in a way that can later be used by an LLM.

The simplest approach I found was to “take a picture” of each page of the document and ask an LLM to generate JSON explaining the content (plus some other metadata such as page number, number of visual elements, and so on).

If any complicated image is present, simply ask the model to describe it. Once that is done, you have a JSON file that can be embedded into your vector store of choice.
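
In case it’s useful, here is a minimal sketch of that pipeline in Python (assuming pdf2image for page rendering and the OpenAI SDK; the model name and JSON keys are illustrative, not the exact ones I used):

  # Render each patent page to an image, ask a vision-capable LLM for a
  # JSON description, and collect one record per page for embedding.
  import base64, io, json
  from pdf2image import convert_from_path  # needs poppler installed
  from openai import OpenAI

  client = OpenAI()

  def describe_page(image, page_number):
      buf = io.BytesIO()
      image.save(buf, format="PNG")
      b64 = base64.b64encode(buf.getvalue()).decode()
      resp = client.chat.completions.create(
          model="gpt-4o",  # any vision-capable model works
          response_format={"type": "json_object"},
          messages=[{
              "role": "user",
              "content": [
                  {"type": "text", "text": (
                      "Describe this patent page as JSON with keys: "
                      "text, visual_elements (one plain-language "
                      "description per diagram, formula, or table).")},
                  {"type": "image_url", "image_url": {
                      "url": f"data:image/png;base64,{b64}"}},
              ],
          }],
      )
      doc = json.loads(resp.choices[0].message.content)
      doc["page_number"] = page_number  # metadata we control ourselves
      return doc

  pages = convert_from_path("patent.pdf", dpi=200)
  records = [describe_page(img, i + 1) for i, img in enumerate(pages)]
  # Each record is plain JSON text that can go into your vector store.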

I can’t speak to the price-to-performance ratio, but this approach seems easier and more efficient than what the author is proposing.


You can ask the model to describe the image, but that is inherently lossy. What if it is a chart and the model captures most x, y pairs, but the user asks about an "x" or "y" value it missed? Presenting the image at inference is effective since you're guaranteeing that the LLM is able to answer exactly the user's question. The only blocker then becomes how good retrieval is, and that's a smaller problem to solve. This approach lets us solve only for passing in relevant context; the rest is taken care of by the LLM. Otherwise the problem space expands to correct OCR, parsing, and getting every possible description of the images out of the model.
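
To make the contrast concrete, a rough sketch (the vector store interface here is hypothetical; the point is only that the original page image, not its lossy description, is what reaches the model at inference):

  # Retrieve via the embedded page descriptions, but answer from the raw image.
  import base64
  from openai import OpenAI

  client = OpenAI()

  def answer(question, store):
      # `store` maps embedded page descriptions back to the original
      # page image bytes (store.search is a hypothetical interface).
      page_png = store.search(question, top_k=1)[0].image_bytes
      b64 = base64.b64encode(page_png).decode()
      resp = client.chat.completions.create(
          model="gpt-4o",  # any vision-capable model
          messages=[{
              "role": "user",
              "content": [
                  {"type": "text", "text": question},
                  {"type": "image_url", "image_url": {
                      "url": f"data:image/png;base64,{b64}"}},
              ],
          }],
      )
      return resp.choices[0].message.content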


This is a great example of how to use LLMs thanks.

But it also illustrates to me that the opportunities with LLMs right now are primarily about reclassifying or reprocessing existing sources of value like patent documents. In the 90-00s many successful SW businesses were building databases to replace traditional filing.

Creating fundamentally new collections of value which require upfront investment seems to still be challenging for our economy.


how often has the model hallucinated the image though?


What I find absolutely infuriating is that Abbott (FreeStyle Libre 1-3 devices) region-locks their monitoring app.

My father is T1 and has used the Libre CGM system for a couple of years now. Libre users in the US and Europe enjoy direct integration with their iOS devices, including constant updates and, most importantly, notification alerts for dangerously high or low glucose levels; it is even possible to share live updates with close family members or caretakers.

But none of this is available to my dad, as he lives in Brazil. Even though the product is the same, he cannot download the iOS apps from the App Store, as they are region locked.


Is that because Brazilian regulations prohibit it? Or lawyers being too cautious?


