Hacker News | aspenmartin's comments

Honestly, I think the story would be much different with more product sense and better market intuition; Horizon is just a perfect example of pure idiocy. They may as well have just ported Chatroulette.

Once the Apple Vision Pro was released, I finally understood what VR could really be: an incredible, immersive escape. Once I watched an Apple Immersive movie, and then even a completely ordinary 2D movie in theater mode at night in Joshua Tree, I got it. The price is obviously unattainable for most people, but the strategy struck me as very smart: ship low volume, execute the best version of your vision that you possibly can, and see how people respond. It proves out the vision, and then you can start working the price down.

The only thing Meta VR got right is gaming: it's the only use case that works with the resolution and hardware at the price point they're trying to occupy. The AVP could obviously work for that too, but look: I've nearly punched out a window with my Quest Pro. Sitting and playing a game is weird; standing and playing is tiring. What I like infinitely better is just watching a movie. Escaping. Relaxing.


I still use my Quest after a year, but mostly for the web and YouTube 360. YouTube 360 is actually quite cool, considering that almost no one makes content for it.

I have no interest in games, and anything inside Horizon is just not impressive.

I just don't understand how Meta spent this much money to get so little in return. VRChat has immense worlds compared to anything in Horizon. Everything in Horizon just looks so amateurish and lacks any kind of imagination.

I got the Quest because I wanted to try developing for VR, but that is a total nightmare. Horizon, Unity, and Unreal are all different forms of nightmare. I suspect this is actually the problem: development is just too hard to do much of anything interesting. Anything interesting I have made has been in vanilla JavaScript, Three.js, and React Three Fiber.

Vision Pro-level resolution plus WebXR has, I think, a huge amount of potential. I even like wearing the Quest; the physical act of wearing the headset is really no issue for me at all, and that was what I figured I would get tired of.

The Quest is ultimately an amazing piece of hardware with amazingly bad software.


I'm definitely in the camp that this browser implementation is shit, but just a reminder: agent training does involve human coding data in the early stages to bootstrap it, but the reinforcement learning phase does not -- there it learns closer to the way AlphaGo did, via self-play and verifiable rewards. This is why people are very bullish on agents: there is technically no limit to how well they can learn (unlike plain LLM pretraining on human data), we know we will reach superhuman skill, and the crucial reason for this is verifiable rewards. You have them for coding; you do not have them for, e.g., creative tasks.
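
To make "verifiable rewards" concrete, here is a minimal sketch (my own illustration, not any lab's actual pipeline) of what the reward for a single coding rollout can look like: the agent edits a repo, the tests run, and the outcome needs no human label. The pytest command and repo layout are assumptions for the example.

    import subprocess

    def verifiable_reward(repo_dir: str) -> float:
        """Return 1.0 if the project's test suite passes after the agent's edit, else 0.0."""
        result = subprocess.run(
            ["python", "-m", "pytest", "-q"],  # assumed test command for this example
            cwd=repo_dir,
            capture_output=True,
            timeout=600,
        )
        return 1.0 if result.returncode == 0 else 0.0

    # Each RL rollout (agent edits the repo -> reward computed) yields a training
    # signal with no human annotation, which is what lets this scale like self-play.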

So agents will actually be able to build a {browser, library, etc.} that isn't an absolute slopfest; the crucial question is when. You need better and more efficient RL training, further scaling (Amodei thinks scaling is really the only thing you technically need here, and that we have about 3-4 orders of magnitude of headroom left before we hit insurmountable limits), bigger context windows (that models actually handle well), and possibly continual-learning paradigms, but solutions to these problems feel quite tangible now.


> those fully-paid-up members of the "AI revolution" cult

Ah, ok, so an overly simplistic tribal take by someone with a clear axe to grind and no desire to consider any nuance. Anyone who disagrees with his take -- regardless of their actual positions on various AI-related topics -- is "fully-paid-up" and in a "cult". Is it so hard to consider the _possibility_ that Salesforce making a bad AI rollout doesn't imply the whole industry is doing the same thing? Doesn't this completely ignore how varied real deployments are and how messy the reporting around them tends to be?

Overhyped claims abound -- Cursor, Google tweets about math problems being solved, agents cheating on SWE-bench because git logs weren't sanitized, etc. Some of it is careless, some probably dishonest, but the incentives cut both ways. When claims get debunked (e.g., the LMSYS/LMArena confusion around Llama 4 results), the reputational damage is immediate and brutal. No one benefits from making bad claims that are easy to fact-check; no one wants to. Lots of different stances and claims about how _close we are_ to various capabilities can easily be considered misleading -- fine! But are you going to completely ignore actual measured progress? The accomplishments that are defensible? The industry analysis that is careful and well thought out (see, e.g., Epoch)?

> this dramatic deployment, followed by a rapid walk back, is happening across the entire economy.

Which companies? What deployments? Zero concrete cases. Firms make bad calls about AI for the same reason they make bad calls about M&A, pricing, and org design: leadership everywhere constantly misjudges reality, and that will be true until the end of time. It's a pretty big leap to conclude that this implies systemic delusion and that any detractor is in a cult.

Yet another completely ignored core issue is how distorted the coverage of these things is. I read everything about the company I work for, and stories routinely flatten nuanced, defensible, even boring decisions into morality plays because that's far more readable and engaging. Benioff could easily be overselling ordinary layoffs as AI transformation; it gives cover while also being a great opportunity to make Salesforce look extremely competent (idiotic, and it has completely backfired). Yet none of that tells us what is actually happening operationally...


Bayesian methods are thoroughly canonical in most fields I've been involved with (cosmology is one of the most beautiful paradises for someone looking for maybe the coolest club of Bayesian applications). I'm surprised there are still holdouts, especially in fields where the stakes are so high. There are also plenty of blog articles and classroom lessons about how frequentist trial designs kill people: if you are not allowed to deviate from your experiment design but you already have enough evidence to form a strong belief about which treatment is better, is that unethical? Maybe reality is a bit less simplistic, but I've seen many instantiations of that argument around.
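
As a toy illustration of that tension (the numbers are invented and this is deliberately simplistic): a Beta-Binomial posterior over two treatment arms mid-trial. A Bayesian adaptive design can act on this kind of interim evidence; a fixed frequentist design keeps randomizing patients to both arms until the pre-registered sample size is reached.

    import numpy as np

    rng = np.random.default_rng(0)

    # Invented interim data: arm A 12/50 responders, arm B 24/50 responders.
    a_succ, a_fail = 12, 38
    b_succ, b_fail = 24, 26

    # Posterior draws under uniform Beta(1, 1) priors on each response rate.
    p_a = rng.beta(1 + a_succ, 1 + a_fail, size=100_000)
    p_b = rng.beta(1 + b_succ, 1 + b_fail, size=100_000)

    print(f"P(B better than A | interim data) ~ {np.mean(p_b > p_a):.3f}")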

I think this is the right take. In some narrow but constantly broadening contexts, agents give you a huge productivity edge. But to leverage that you need to be skilled enough to steer, design the initial prompt, understand the impact of what you produce, and so on. I don't see agents, in their current and medium-term form, as a replacement for engineering work; I see them as a great reshuffling of engineering work.

In some business contexts, the impact of more engineering labor on output gets capped at some point, meaning that once agent quality reaches a certain level, further improvements yield only a minimal increase in output. There, labor is not the bottleneck.

In other business contexts, labor is the bottleneck. For instance, it's the bottleneck for you as an individual: what kind of revenue could you generate if you had a large team of highly skilled senior SWEs operating for pennies on the dollar?

What I think you'll see is labor shifting to wherever the ROI is highest.

To be fair, I can imagine a world where we eventually fully replace the "driver" of the agent, in that it becomes good enough to fill the role of a ~staff engineer who can ingest very high-level business context, strategy, and politics and generate a high-level system design that is then executed by one or more agents (or by other SWEs using agents). I don't (at this point) see some fundamental rule of physics or economics that prevents this, but it seems much further ahead of where we are now.


I don't think these characterizations in either direction are very helpful; I understand they come from someone trying to make sense of why their ingrained notion of what creativity means, and of the "right" way to build software projects, is not shared by other people.

I use CC for both business and personal projects. In both cases I want to achieve something cool. If I do it by hand, it is slow, I need to learn something new, which takes too much time, and often the thing(s) I need to learn aren't interesting to me (at the time). Additionally, I am slow and perpetually unhappy with the abstractions and design choices I make, despite trying very hard to think through them. With CC, it can handle the parts of the project I don't want to deal with, help me learn the things I do want to learn, and execute quickly so I can try more things and fail fast.

What's lamentable is the conclusion that "if you use AI it is not truly creative" ("have people using AI lost all understanding of creativity or creation?" is a bit condescending).

In other threads, the sore point for the AI-skeptic crowd is more or less that AI enthusiasts "threaten or bully" people who are not enthusiastic, telling them they will be "punished" or fall behind. Yet at the same time, AI skeptics routinely make passive-aggressive implications that they are the ones truly Creating Art and are the true Craftsmen, as if this venture were some elitist art form that should be gatekept by all of you True Programmers (TM).

I find these takes (1) condescending, (2) wrong, and betraying a lack of imagination about what others may find genuinely enjoyable and inspiring, and (3) just as much of a straw man as their gripes about others "bullying" them into using AI.


Is this coming from the hypothesis / prior that coding agents are a net negative and those who use them really are akin to gambling addicts that are just fooling themselves?

The OP is right, and I feel this a lot: Claude pulls me into a rabbit hole, convinces me it knows where to go, then just constantly falls flat on its face, and we waste several hours together, with a lot of all-caps prompts from me towards the end. These sessions drag on in exactly the way he describes: "maybe it's just a prompt away from working."

But I would never delete CC, because there are plenty of other instances where it works excellently and accelerates things quite a lot. And I know we see a lot of "coding agents are getting worse!" and "the METR study proves all you AI sycophants are deluding yourselves!" -- I understand where these come from and agree with some of the points they raise, but honestly, my own personal perception (which I argue is pretty well backed up by benchmarks, and by Claude's own product data, which we don't see -- I doubt they would roll out a launch without at least one A/B test) is that coding agents are getting much better, and that because coding is a verifiable domain, the "we're running out of data!" problems just aren't relevant here. The same way AlphaGo got superhuman, so will these coding agents; it's just a matter of when, and I use them today because they are already useful to me.


No, this is coming from the fact that the OP states they are miserable. That is unsustainable. At the end of the day, the more productive setup is the one that keeps you happy and in your chair long term, as you'll produce nothing if you are burnt out.


Oh sure of course, I missed that part!


I understand this sentiment, but it is a lot of fun for me, because I want to make a real thing that does something. I didn't get into programming for the love of it; I got into it as a means to an end.

It's like the article's point: we don't write assembly anymore, no one considers gcc controversial, and no one today says "if you think gcc is fun I will never understand you; real programming is assembly, that's the fun part."

You are doing different things and exercising different skillsets when you use agents. People enjoy different aspects of programming, of building. My job is easier; I'm not sad about that, I am very grateful.

Do you resent folks like us who do find it fun? Do you consider us "lesser" because we use coding agents? ("the same as saying you’re really into painting but you’re not really into the brush aspect so you pay someone to paint what you describe. That’s not doing, it’s commissioning.") <- I don't really care if you consider this "true" painting or not; I wanted a painting and now I have a painting. Call me whatever you want!


> It's like the article's point: we don't write assembly anymore, no one considers gcc controversial, and no one today says "if you think gcc is fun I will never understand you; real programming is assembly, that's the fun part."

The compiler reliably and deterministically produces code that does exactly what you specified in the source code. In most cases, the code it produces is also as fast as or faster than hand-written assembly. The same can't be said for LLMs, for the simple reason that English (and other natural languages) is not a programming language. You can't compile English (and shouldn't want to, as Dijkstra correctly pointed out) because it's ambiguous. All you can do is "commission" another agent to interpret it for you.

> Do you resent folks like us that do find it fun?

For enjoying it on your own time? No. But for hyping up the technology well beyond its actual merits, antagonizing people who point out its shortcomings, and subjecting the rest of us to worse code? Yeah, I hold that against the LLM fans.


That a coding agent or LLM is a different technology from a compiler, and that the shift in industry-standard workflow looks different, isn't quite my point though: things change. Norms change. That's the real crux of my argument.

> But for hyping up the technology well beyond its actual merits, antagonizing people who point out its shortcomings, and subjecting the rest of us to worse code? Yeah, I hold that against the LLM fans.

Is that what I'm doing? I understand your frustration. But I hope you understand that this is a straw man: I could straw-man the antagonists and AI-hostile folks too, but the point is that the factions and tribes are complex and unreasonable opinions abound.

My stance is that people can dismiss coding agents at their peril, but it's not really a problem: taking the gcc analogy, in the early compiler days there was a period where compilers were weak enough that assembly by hand was reasonable. Now it would be just highly inefficient and underperformant to do that. But all the folks that lamented compilers didn't crumble away; they eventually adapted. I see that analogy as applicable here. It may be hard to see how astonishing coding agents are because we're not time travelers from 2020, or even from 2022 or 2023. This used to be an absurd idea and is now very serious and widely adopted. But still quite weak!! We're still missing key reliability, functionality, and capabilities. But if we got this far this fast, and if you realize that coding-agent training is not limited the way vanilla LLM training is, because coding is a verifiable domain, we seem to be careening forward. And by the nature of their current weakness, it is absolutely reasonable not to use them and absolutely reasonable to point out all of their flaws.

Lots of unreasonable people out there; my argument is simply: be reasonable.


> Norms change. That’s the real crux of my argument.

Novelty isn't necessarily better as a replacement for what exists. Examples: blockchain as a fancy database, NFTs, Internet Explorer, Silverlight, etc.


No, it certainly isn't, and if you want to lump coding agents in with blockchain and NFTs that's of course your choice, but those things did not spur trillions of dollars of infrastructure buildout, reshape entire geopolitical landscapes, or reach billions of active users. If you want to say coding agents are not truly a net positive right now, that's, I think, a perfectly reasonable opinion to hold (though I personally disagree). If you want to say coding agents are about as vapid as NFTs, that to me is a bit less defensible.


As others have already pointed out, not all new technologies that are proposed are improvements. You say you understand this, but the clear subtext of the analogy to compilers is that LLM-driven development is an obvious improvement, and that if we don't adopt it we'll find ourselves in the same position as assembly programmers who refused to learn compiled languages.

> Is that what I’m doing?

Initially I'd have been reluctant to say yes, but this very comment is laced with assertions that we'd better all start adopting LLMs for coding or we're going to get left behind. [0]

> taking the gcc analogy, in the early compiler days there was a period where compilers were weak enough that assembly by hand was reasonable. Now it would be just highly inefficient and underperformant to do that

No matter how good LLMs get at translating English into programs, they will still be limited by the fact that their input (natural language) isn't a programming language. This doesn't mean they can't get way better, but it's always going to have some of the same downsides as collaborating with another programmer.

[0] This is another red flag I would hope programmers would have learned to recognize. Good technology doesn't need to try to threaten people into adopting it.


My intention was to say: you won't get left behind, you'll just fall slightly behind the curve until things reach a point where you feel you have no choice but to join the dark side. Like gcc and assembly: sure, maybe there were some hardcore assembly holdouts, but at any point they could have jumped on the bandwagon, and probably did. This is also speculation, I agree, but my point is that not using LLMs/coding agents today is very, very reasonable, and the limitations that people often bring up are also very reasonable and believable.

> No matter how good LLMs get at translating English into programs, they will still be limited by the fact that their input (natural language) isn't a programming language.

Right, but engineers routinely convert natural language plus business context into formal programs; that is arguably an enormously important part of creating a software product. What's different here? As with a programmer, the creation process is two-way: the agent iteratively retrieves additional information, asks questions, checks its approach, and so on.

> [0] This is another red flag I would hope programmers would have learned to recognize. Good technology doesn't need to try to threaten people into adopting it.

I think I was either not clear or you misread my comment: you're not going to get left behind any more than you want to. Jump in when you feel good about where the technology is and use it where you feel it should be used. Again: if you don't see value in your own personal situation with coding agents, that is objectively a reasonable stance to hold today.


I don't think so for this approach. It sounds like this is related to the Large Concept Model (https://arxiv.org/abs/2412.08821), where the latent space is SONAR, which is very much designed for this purpose. You learn SONAR embeddings so that every sentence with the same semantic meaning gets mapped to the same latent representation. So you can have, e.g., a French SONAR encoder and a Finnish SONAR encoder, trained separately on large-scale corpora of paired sentences with the same meaning (basically the same data you would use to learn translation models directly, except that for SONAR you don't need to train a separate model per pair of languages). The LCM then works in this language-agnostic SONAR space, which means it does (in principle) learn concepts from text or speech in all supported languages.
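
A rough sketch of that invariance property, with hypothetical encoder stubs standing in for the real per-language SONAR pipelines (this is not the actual SONAR API, just the shape of the idea):

    import numpy as np

    def cosine(u: np.ndarray, v: np.ndarray) -> float:
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def encode_fr(sentence: str) -> np.ndarray:
        """Hypothetical French -> SONAR-space encoder (placeholder)."""
        ...

    def encode_fi(sentence: str) -> np.ndarray:
        """Hypothetical Finnish -> SONAR-space encoder (placeholder)."""
        ...

    # With trained encoders, paired sentences with the same meaning land at
    # (nearly) the same point in the shared space, i.e. roughly
    #   cosine(encode_fr("Le chat dort."), encode_fi("Kissa nukkuu.")) ~ 1,
    # so the LCM can predict the next "concept" vector without knowing which
    # language the surrounding text came from.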


I do a lot of human evaluation. There are plenty of Bayesian / statistical models that can infer rater quality without ground-truth labels. The other thing about preference data you have to worry about (which this article gets at) is: preferences of _whom_? Human raters are a significantly biased population of people; different ages, genders, religions, cultures, etc. all inform preferences. Lots of work is being done to leverage and model this.
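
To give a flavor of "infer rater quality without ground truth," here is a minimal one-coin, Dawid-Skene-style EM sketch for binary votes (a toy of my own, not any production evaluation pipeline): it jointly estimates each item's label and each rater's accuracy from the agreement structure alone.

    import numpy as np

    def dawid_skene_binary(votes: np.ndarray, n_iter: int = 50):
        """votes: (n_items, n_raters) array of 0/1 labels with no missing values."""
        p = votes.mean(axis=1)  # initialize item-label posteriors with the majority vote
        for _ in range(n_iter):
            # M-step: a rater's accuracy is their expected agreement with the current posteriors.
            acc = (votes * p[:, None] + (1 - votes) * (1 - p[:, None])).mean(axis=0)
            acc = np.clip(acc, 1e-3, 1 - 1e-3)
            # E-step: re-estimate each item's label, weighting raters by their log accuracy odds.
            log_odds = (votes * np.log(acc / (1 - acc))
                        + (1 - votes) * np.log((1 - acc) / acc)).sum(axis=1)
            p = 1.0 / (1.0 + np.exp(-log_odds))
        return p, acc

    # Toy data: the first two raters mostly agree; the third is close to random.
    votes = np.array([[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0]])
    item_posteriors, rater_accuracy = dawid_skene_binary(votes)
    print(rater_accuracy)  # the noisy rater gets a visibly lower estimated accuracy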

Then for LMArena there is a host of other biases and construct-validity issues: people are easily fooled, even PhD experts; in many cases it's easier for a model to learn how to persuade than to actually learn the right answers.

But there are a lot of dismissive comments here, as if the frontier labs don't know this; they have some of the best talent in the world. They aren't perfect, but by and large they know what they're doing and what the tradeoffs of the various approaches are.

Human annotations are an absolute nightmare for quality, which is why coding agents are so nice: they're verifiable, so you can train them in a way closer to, e.g., AlphaGo, without the ceiling of human performance.


> in many cases it’s easier for a model to learn how to persuade than actually learn the right answers

So we should expect the models to eventually tend toward the same behaviors that politicians exhibit?


Maybe a happy-to-deceive marketing/sales role would be more accurate.


100% (am a Bayesian statistician).

Isn’t it fascinating how it comes down to quality of judgement (and the descriptions thereof)?

We need an LMArena rated by experts.


As a statistician, do you think you could, given access to the data, identify the subset of LMArena users who are experts?


Yes, for sure! I can think of a few ways.


They always know; they just have non-AGI incentives and asymmetric upside to play along...

