More

AdventureMouse · 2025-09-16T10:16:35 1758017795

> If devs follow certain simple rules while writing UI text, it would make a tremendous difference for translation quality.

As a dev that often writes UI text, which simple rules do you recommend that I should follow?

agos · 2025-09-16T20:31:49 1758054709

Not OP but I’ll suggest one that is very dear to me: make sure you use the same verbs for the same actions, the same nouns for the same things, the same proper nouns for the same important concepts. This alone removes a huge mental burden from users: it’s always “delete”, not sometimes “remove”, “cancel”, etc

sedatk · 2025-09-16T21:08:52 1758056932

I'm actually compiling a list of UI text mishaps, and I plan to publish it as a blog post at one point. A simple one would be to avoid using "your" in a "user -> UI" context (command), and "my" in a "UI -> user" context (message). For example, "Delete My Files" is okay on a button, but, in a message it must be "Are you sure you want to delete your files?". But, the better is to avoid generous use of "your" and "my" unless necessary to disambiguate.

For example, don't have a button that reads "Go to your profile", that screws up translations in languages like Turkish.

AdventureMouse · 2025-09-09T23:25:29 1757460329

> If the M5 generation gets this GPU upgrade, which I don't see why not, then the era of viable local LLM inferencing is upon us.

I don't think local LLMs will ever be a thing except for very specific use cases.

Servers will always have way more compute power than edge nodes. As server power increases, people will expect more and more of the LLMs and edge node compute will stay irrelevant since their relative power will stay the same.

seanmcdirmid · 2025-09-10T00:24:51 1757463891

LocalLLMs would be useful for low latency local language processing/home control, assuming they ever become fast enough where the 500ms to 1s network latency becomes a dominate factor in having a fluid conversation with a voice assistant. Right now the pauses are unbearable for anything but one way commands (Siri, do something! - 3 seconds later it starts doing the thing...that works but it wouldn't work if Siri needed to ask follow up questions). This is even more important if we consider low latency gaming situations.

Mobile applications are also relevant. An LLM in your car could be used for local intelligence. I'm pretty sure self driving cars use some about of local AI already (although obviously not LLM, and I don't really know how much of their processing is local vs done on a server somewhere).

If models stop advancing at a fast clip, hardware will eventually become fast and cheap enough that running models locally isn't something we think about as being a non-sensical luxury, in the same way that we don't think that rendering graphics locally is a luxury even though remote rendering is possible.

dgacmu · 2025-09-10T11:00:28 1757502028

Network latency in most situations is not 500ms. The latency from New York California is under 70ms, and if you add in some transmission time you're still under 200ms. And that's ignoring that an NYC request will probably go only to VA (sub-15ms).

Even over LTE you're looking at under 120ms coast to coast.

seanmcdirmid · 2025-09-10T16:11:52 1757520712

You have to take any of those numbers and multiply them by two, since you have to go there and then back again.

dgacmu · 2025-09-13T11:48:43 1757764123

No, those were round trip times already

jameshart · 2025-09-10T02:54:58 1757472898

> Servers will always have way more compute power than edge nodes

This doesn't seem right to me.

You take all the memory and CPU cycles of all the clients connected to a typical online service, compared to the memory and CPU in the datacenter serving it? The vast majority of compute involved in delivering that experience is on the client. And there's probably vast amounts of untapped compute available on that client - most websites only peg the client CPU by accident because they triggered an infinite loop in an ad bidding war; imagine what they could do if they actually used that compute power on purpose.

But even doing fairly trivial stuff, a typical browser tab is using hundreds of megs of memory and an appreciable percentage of the CPU of the machine it's loaded on, for the duration of the time it's being interacted with. Meanwhile, serving that content out to the browser took milliseconds, and was done at the same time as the server was handling thousands of other requests.

Edge compute scales with the amount of users who are using your service: each of them brings along their own hardware. Server compute has to scale at your expense.

Now, LLMs bring their special needs - large models that need to be loaded into vast fast memory... there are reasons to bring the compute to the model. But it's definitely not trivially the case that there's more compute in servers than clients.

arghwhat · 2025-09-10T06:52:25 1757487145

The sum of all edge nodes exceed the power in the datacenter, but the peak power provided to you from the datacenter significantly exceed your edge node capabilities.

A single datacenter machine with state of the art GPUs serving LLM inference can be drawing in the tens of kilowatts, and you borrow a sizable portion for a moment when you run a prompt on the heavier models.

A phone that has to count individual watts, or a laptop that peaks on dual digit sustained draw, isn't remotely comparable, and the gap isn't one or two hardware features.

pdpi · 2025-09-10T00:18:21 1757463501

As an industry, we've swung from thin clients to fat clients and back countless times. I'm sure LLMs won't be immune to that phenomenon.

meltyness · 2025-09-10T01:18:57 1757467137

I adore this machinery, there's a lot of money riding on the idea that interest in AI/ML will result in the value being in owning bunch of big central metal like cloud era has produced, but I'm not so sure.

SturgeonsLaw · 2025-09-10T02:07:41 1757470061

I'm sure the people placing multibillion dollar bets have done their research, but the trends I see are AI getting more efficient and hardware getting more powerful, so as time goes on, it'll be more and more viable to run AI locally.

Even with token consumption increasing as AI abilities increase, there will be a point where AI output is good enough for most people.

Granted, people are very willing to hand over their data and often money to rent a software licence from the big players, but if they're all charging subscription fees where a local LLM costs nothing, that might cause a few sleepless nights for a few execs.

impure-aqua · 2025-09-10T08:23:31 1757492611

We could potentially see one-time-purchase model checkpoints, where users pay to get a particular version for offline use, and future development is gated behind paying again- but certainly the issue of “some level of AI is good enough for most users” might hurt the infinite growth dreams of VCs

meltyness · 2025-09-10T02:53:45 1757472825

tts would be an interesting case-study. it hasn't really been in the lime-light, so could serve as a leading indicator for what will happen when attention to text generation inevitably wanes

I use Read Aloud across a few browser platforms cause sometimes I don't care to read an article I have some passing interest in.

The landscape is a mess:

it's not really bandwidth efficient to transmit on one count, local frameworks like Piper perform well in alot of cases, there's paid APIs from the big players, at least one player has incorporated api-powered neural tts and packaged it into their browser presumably ad-supported or something, yet another has incorporated into their OS, already (though it defaults to speak and spell for god knows why). I'm not willing to pay $0.20 per page though, after experimenting, especially when the free/private solution is good enough.

Closi · 2025-09-10T11:33:56 1757504036

IMO the benefit of a local LLM on a smartphone isn't necessarily compute power/speed - it's reliability without a reliance on connectivity, it can offer privacy guarantees, and assuming the silicon cost is marginal, could mean you can offer permanent LLM capabilities without needing to offer some sort of cloud subscription.

hapticmonkey · 2025-09-10T00:18:13 1757463493

If the future is AI, then a future where every compute has to pass through one of a handful of multinational corporations with GPU farms...is something to be wary of. Local LLMs is a great idea for smaller tasks.

tonyhart7 · 2025-09-10T03:58:07 1757476687

but its not the future, we already can do that right now

the problem is people expectation, they want the model to be smart

people aren't having problem for if its local or not, but they want the model to be useful

aurareturn · 2025-09-10T12:40:18 1757508018

Sure, that's why local LLMs aren't popular or mass market as of September 2025.

But cloud models will have diminishing returns, local hardware will get drastically faster, and techniques to efficiently inference them will be worked out further. At some point, local LLMs will have its day.

tonyhart7 · 2025-09-10T15:28:23 1757518103

only in theory and that's not gonna be happening

this is the same happening with software and game industry

because free market forces people to raise the bar every year, the requirement of apps and games never met. its only goes up

human would never be satisfied, boundary would be push further

that's why we have 12gb or 16gb ram for smartphone right now only for system + apps

and now we must accommodate for local LLM too??? it would only goes up, people would demand smarter and smarter model

frontier model today would deem unusable(dumb) in 5 years

example: people literally screaming in agony when Antrophic quantized their model

Nevermark · 2025-09-10T07:08:32 1757488112

Boom! [0]

> Deepseek-r1 was loaded and ran locally on the Mac Studio

> M3 Ultra chip [...] 32-core CPU, an 80-core GPU, and the 32-core Neural Engine. [...] 512GB of unified memory, [...] memory bandwidth of 819GB/s.

> Deepseek-r1 was loaded [...] 671-billion-parameter model requiring [...] a bit less than 450 gigabytes of [unified] RAM to function.

> the Mac Studio was able to churn through queries at approximately 17 to 18 tokens per second

> it was observed as requiring 160 to 180 Watts during use

Considering getting this model. Looking into the future, a Mac Studio M5 Ultra should be something special.

[0] https://appleinsider.com/articles/25/03/18/heavily-upgraded-...

bigyabai · 2025-09-10T20:39:33 1757536773

"Maybe Apple will disprove you in the future" isn't a great refutation of the parent's point.

evilduck · 2025-09-11T14:12:48 1757599968

"Servers are more powerful" isn't a super strong point. Why aren't all PC gamers rendering games on servers if raw power was all that mattered? Why do workstation PCs even exist?

Society is already giving pushback to AI being pushed on them everywhere; see the rise of the word "clanker". We're seeing mental health issues pop up. We're all tired of AI slop content and engagement bait. Even the developers like us discussing it at the bleeding edge go round in circles with the same talking points reflexively. I don't see it as a given that there's public demand for even more AI, "if only it were more powerful on a server".

bigyabai · 2025-09-12T00:43:28 1757637808

You make a good point, but you're still not refuting the original argument. The demand for high-power AI still exists, the products that Apple sells today do not even come close to meaningfully replacing that demand. If you own an iPhone, you're probably still using ChatGPT.

Speaking to your PC gaming analogy, there are render farms for graphics - they're just used for CGI and non-realtime use cases. What there isn't a huge demand for is consumer-grade hardware at datacenter prices. Apple found this out the hard way shipping Xserve prematurely.

evilduck · 2025-09-12T14:36:28 1757687788

> Speaking to your PC gaming analogy, there are render farms for graphics - they're just used for CGI and non-realtime use cases. What there isn't a huge demand for is consumer-grade hardware at datacenter prices.

Right, and that's despite the datacenter hardware being far more powerful and for most people cheaper to use per hour than the TCO of owning your own gaming rig. People still want to own their computer and want to eliminate network connectivity and latency being a factor even when it's generally a worse value prop. You don't see any potential parallels here with local vs hosted AI?

Local models on consumer grade hardware far inferior to buildings full of GPUs can already competently do tool calling. They can already generate tok/sec far beyond reading speed. The hardware isn't serving 100s of requests in parallel. Again, it just doesn't seem far fetched to think that the public will sway away from paying for more subscription services for something that can basically run on what they already own. Hosted frontier models won't go away, they _are_ better at most things, but can all of these companies sustain themselves as businesses if they can't keep encroaching into new areas to seek rent? For the average ChatGPT user, local Apple Intelligence and Gemma 3n basically already have the skills and smarts required, they just need more VRAM, and access to RAG'd world knowledge and access to the network to keep up.

pdimitar · 2025-09-12T12:11:38 1757679098

> The demand for high-power AI still exists, the products that Apple sells today do not even come close to meaningfully replacing that demand.

Correct, though to me it seems that this comes at the price of narrowing the target audience (i.e. devs and very high-demanding analysis + production work).

For almost everything else people just open a bookmarked ChatGPT / Gemini link and let it flow, no matter how erroneous it might be.

The AI area is burning a lot of bridges and has done so for the last 1.5 - 2.0 years; they solidify the public's idea that they only peddle subscription income as hard as they can without providing more value.

Somebody finally had the right idea some months ago: sub-agents. Took them a while, and it was obvious right from the start that just dumping 50 pages on your favorite LLM is never going to produce impressive results. I mean, sometimes it does but people do a really bad job at quickly detecting when it does not, and are slow to correct course and just burn through tokens and their own patience.

Investors are gonna keep investor-ing, they will of course want the paywall and for there to be no open models at all. But happily the market and even general public perception are pushing back.

I am really curious what will come out of all this. One prediction is local LLMs that secretly transmit to the mothership, so the work of the AI startup is partially offloaded to its users. But I am known to be very cynical, so take this with a spoonful of salt.

waterTanuki · 2025-09-10T00:02:56 1757462576

I regularly use local LLMs at work (full stack dev) due to restrictions and occasionally I get some results comparable to gpt-5 or opus 4

eprparadox · 2025-09-10T00:55:35 1757465735

this is really cool. could you say a bit about your setup (which llms, what tasks they’re best for, etc)?

waterTanuki · 2025-09-11T02:31:18 1757557878

I switch between gpt-oss:20b/qwen3:30b. Good for green fielding projects, setting up bash scripts, simple CRUD apis using express, and the occasional error in a React or Vue app.

rowanG077 · 2025-09-09T23:46:25 1757461585

That's assuming diminishing returns won't hit hard. If a 10x smaller local model is 95%(Whatever that means) as good as the remote model it makes sense to use local models most of the time. It remains to be seen if that will happen but it's certainly not unthinkable imp.

sigmar · 2025-09-10T01:02:12 1757466132

It's really task-dependent, text summarization and grammar corrections are fine with local models. I posit any tasks that are 'arms race-y' (image generation, creative text generation) are going to be offloaded to servers, as there's no 'good enough' bar above which they can't improve.

PaulRobinson · 2025-09-10T12:33:01 1757507581

Apple literally mentioned local LLMs in the event video where they announced this phone and others.

Apple's privacy stance is to do as much as possible on the user's device and as little as possible in cloud. They have iCloud for storage to make inter-device synch easy, but even that is painful for them. They hate cloud. This is the direction they've had for some years now. It always makes me smile that so many commentators just can't understand it and insist that they're "so far behind" on AI.

All the recent academic literature suggests that LLM capability is beginning to plateau, and we don't have ideas on what to do next (and no, we can't ask the LLMs).

As you get more capable SLMs or LLMs, and the hardware gets better and better (who _really_ wants to be long on nVIDIA or Intel right now? Hmm?), people are going to find that they're "good enough" for a range of tasks, and Apple's customer demographic are going to be happy that's all happening on the device in their hand and not on a server [waves hands] "somewhere", in the cloud.

astrange · 2025-09-10T20:03:49 1757534629

It's not difficult to find improvements to LLMs still.

Large issues: tokenizers exist, reasoning models are still next-token-prediction instead of having "internal thoughts", RL post-training destroys model calibration

Small issues: they're all trained to write Python instead of a good language, most of the benchmarks are bad, pretraining doesn't use document metadata (ie they have to learn from each document without being told the URL or that they're written by different people)

fennecfoxy · 2025-09-10T13:09:21 1757509761

I think they will be, but more for hand-off. Local will be great for starting timers, adding things to calendar, moving files around. Basic, local tasks. But it also needs to be intelligent enough to know when to hand off to server-side model.

Android crowd has been able to run LLMs on-device since LlamaCPP first came out. But the magic is in the integration with OS. As usual there will be hype around Apple, idk, inventing the very concept of LLMs or something. But the truth is neither Apple nor Android did this; only the wee team that wrote the attention is all you need paper + the many open source/hobbyist contributors inventing creative solutions like LoRA and creating natural ecosystems for them.

That's why I find this memo so cool (and will once again repost the link): https://semianalysis.com/2023/05/04/google-we-have-no-moat-a...

brookst · 2025-09-10T01:43:53 1757468633

Couldn’t you apply that same thinking to all compute? Servers will always have more, timesharing means lower cost, people will probably only ever own dumb terminals?

aydyn · 2025-09-10T01:56:45 1757469405

Latency. You cant play video games on the cloud. Google tried and failed.

wcarss · 2025-09-10T02:12:44 1757470364

well, another way to recount it is that google tried and it worked okay but they decided it wasn't moving the needle, so they stopped trying.

liamwire · 2025-09-10T03:39:37 1757475577

Huh? GeForce NOW is a resounding success by many metrics. Anecdotally, I use it weekly to play multiplayer games and it’s an excellent experience. Google giving up on Stadia as a product says almost nothing about cloud gaming’s viability.

Balinares · 2025-09-10T06:21:05 1757485265

Do you mean Stadia? Stadia worked great. The only perceptible latency I initially had ended up coming from my TV and was fixed by switching it to so-called "gaming mode".

Never could figure out what the heck the value proposition was supposed to be though. Pay full price for a game that you can't even pretend you own? I don't think so. And the game conservation implications were also dire, so I'm not sad it went away in the end.

But on technical merits? It worked great.

aydyn · 2025-09-10T22:44:07 1757544247

No it did not.

MPSimmons · 2025-09-09T23:48:30 1757461710

The crux is how big the L is in the local LLMs. Depending on what it's used for, you can actually get really good performance on topically trained models when leveraged for their specific purpose.

rickdeckard · 2025-09-10T07:04:46 1757487886

There's alot of L's in LLLM, so overall it's hard to tell what you're trying to say...

Is it 'Local'?, 'Large?'...'Language?'

fennecfoxy · 2025-09-10T13:10:34 1757509834

Clearly the Large part, given the context...LLMs usually miss stuff like this, funnily enough.

touristtam · 2025-09-10T08:43:52 1757493832

Do you see the C for Cheap in there? Me neither.

rickdeckard · 2025-09-10T09:16:54 1757495814

Sorry I'm not following. Cheap in terms of what, hardware cost?

From Apple's point of view a local model would be the cheapest possible to run, as the end-user pays for hardware plus consumption...

triceratops · 2025-09-10T13:34:38 1757511278

Username checks out.

alwillis · 2025-09-12T19:40:49 1757706049

> don't see why not, then the era of viable local LLM inferencing is upon us. I don't think local LLMs will ever be a thing except for very specific use cases.

I disagree.

There's a lot of interest in local LLMs in the LLM community. My internet was down for a few days and did I wish I had a local LLM on my laptop!

There's a big push for privacy; people are using LLMs for personal medical issues for example and don't want that going into the cloud.

Is it necessary to talk to a server just to check out a letter I wrote?

Obviously with Apple's release of iOS 26 and macOS 26 and the rest of their operating systems, tens of millions of devices are getting a local LLM with 3rd party apps that can take advantage of them.

unethical_ban · 2025-09-10T05:07:07 1757480827

It's a thing right now.

I'm running Qwen 30B code on my framework laptop to ask questions about ruby vs. python syntax because I can, and because the internet was flaky.

At some point, more doesn't mean I need it. LLMs will certainly get "good enough" and they'll be lower latency, no subscription, and no internet required.

nsonha · 2025-09-10T08:39:20 1757493560

pretty amazing, as a student I remember downloading offline copies of Wikipedia and Stack Overflow and felt that I have the entire world truly in my laptop and phones. Local LLMs are arguably even more useful than those archives.

hotstickyballs · 2025-09-10T03:31:21 1757475081

If compute power is the deciding factor server vs edge discussion then we’d never have smartphones.

nsonha · 2025-09-10T01:24:18 1757467458

local LLM may not be good enough for answering questions (which I think won't be true really soon) or generating images, but it today should be good enough to infer deeplinks and app extension calls or agentic walk-through... and ushers a new era of controlling phone by voice command.

gnopgnip · 2025-09-10T07:58:41 1757491121

You can generate images on an iphone now with “draw things”

AdventureMouse · 2025-09-05T18:27:40 1757096860

It has nothing to do with money and everything to do with ideology.

AdventureMouse · 2025-09-05T11:34:32 1757072072

Compared to an encyclopedia, Wikipedia is unreliable. That doesn’t mean that Wikipedia isn’t useful. But there is a hidden danger with Wikipedia being mostly reliable - people lower their guard and end up consuming misinformation without realizing.

AdventureMouse · 2025-08-05T15:47:13 1754408833

It’s never a good time for GitHub to be down!

AdventureMouse · 2025-07-26T18:51:38 1753555898

TL;DR In Florida

tzs · 2025-07-26T19:21:36 1753557696

That's the TL;DR for the least interesting part of the article since most would expect Florida to have the most.

It's well known as a major state to retire to, and many of the things that make a place attractive as a retirement destination also make it attractive as a vacation home location. And it is a high population state so when comparing by absolute numbers it would be the most obvious candidate for most vacation homes.

The interesting part is where it looks at percentage of homes in the state that are vacation homes. Florida is high by that measure too at 8.2%, but behind Maine and Vermont which are each over 15%, and New Hampshire at over 10%.

It is even a little behind Alaska (8.9%) and Delaware (8.6%). I bet not many people would have guessed that Alaska has a higher percentage of vacation homes that Florida.

Hawaii is also interesting. It has twice the population of Alaska, but only slightly more vacation homes (31.6k vs 29.2k).

matwood · 2025-07-26T19:35:59 1753558559

FL, ME, and VT also allow for weather arbitrage. People winter in FL and summer in places like ME and VT.

fn-mote · 2025-07-26T19:35:27 1753558527

> Hawaii is also interesting. It has twice the population of Alaska, but only slightly more vacation homes (31.6k vs 29.2k).

Common sense reasoning: people cannot afford to have their vacation home in Hawaii.

High prices, long and expensive flights.

fragmede · 2025-07-26T20:12:31 1753560751

Yes but it's Hawaii. Common sense reasoning only lets you conclude that rich people have vacation homes in Hawaii, not some specific percentage relative to the rest of the states. I bet it if the math was done based on vacation land area, Hawaii would come up near the top, given Lanai. Probably places like Montana too.

radpanda · 2025-07-26T21:27:02 1753565222

> I bet it if the math was done based on vacation land area, Hawaii would come up near the top, given Lanai

I’d be shocked if the parcels that Larry Ellison owns on Lanai are classified in a way that would show up as a vacation home. Typically rich large landowners in Hawaii are “gentleman farmers” who (ab)use agricultural tax loopholes.

https://jacobin.com/2023/06/agriculture-property-tax-break-u...

the__alchemist · 2025-07-26T18:57:53 1753556273

And less dramatically, states that have a coast (ocean, gulf, or large lake)

quickthrowman · 2025-07-26T19:27:55 1753558075

Or an abundance of lakes like Minnesota, Wisconsin, and Michigan, which are also Great Lakes states.

Finland also has a high amount of vacation homes on lakeshore property, and I would guess that pretty much any area with glacial lakes within hours of population centers will have lots of vacation homes.

AdventureMouse · 2025-07-18T16:44:55 1752857095

Hey Peter! What are some of the most common legal mistakes you see startups make and how can they be avoided?

proberts · 2025-07-18T20:46:45 1752871605

I'm not a business person so I'm not really in a position to comment but one mistake I've made - I have my own firm so I'm not too different from a startup founder - is not putting administrative/human resources infrastructure in place from the beginning and not hiring a chief of staff early on.

AdventureMouse · 2025-07-18T21:28:54 1752874134

Interesting - how would it have helped you to have HR in place earlier/what problems did not having it cause?

AdventureMouse · 2025-04-28T11:53:27 1745841207

I agree but it says something about the level of interest and confidence people have in the current state of Alzheimer’s research.

How many people would have read the article if it didn’t mention AI?

skeeter2020 · 2025-04-28T18:54:30 1745866470

I have multiple comments here and didn't read the article regardless!

AdventureMouse · 2025-04-25T12:40:15 1745584815

I think there is a difference between ideas that are based on a fundamental misunderstanding of how the world works and ideas that we do not currently have the capabilities to solve.

AdventureMouse · 2025-04-25T11:16:06 1745579766

Hits the nail on the head.

I would argue that most of the value of LLMs comes from structuring your own thought process as you work through a problem, rather than providing blackbox answers.

Using AI as an oracle is bound to cause frustration since this is attempts to outsource the understanding of a problem. This creates a fundamental misalignment, similar to hiring a consultant.

The consultant will never have the entire context or exact same values as you have and therefore will never generate an answer that is as good as if you understand the problem deeply yourself.

Prompt engineers will try to create a more and more detailed spec and throw it over the wall to the AI oracle in hope of the perfect result, just like companies that tried to outsource software development.

In the end, all they gained was frustration.