Hacker News | pca006132's comments

An LLM is just a tool, and how the tool is used is an important question. People vibe code these days, sometimes without proper review, but would you want them to vibe code a nuclear reactor controller without reviewing the code?

In principle we could let anyone use LLMs for medical advice, provided they know LLMs are not reliable. But LLMs are engineered to sound reliable, and people often just believe their output. There have already been cases showing this can have severe consequences...


But how can you prevent the user from modifying the kernel?


You can't, but circumventing anti-cheats already happens on Windows, with all their fancy kernel-level anti-cheats.

I believe the goal is to make it so uncomfortable and painful that 99.999% of users will say "fuck it" and not bother. In this case, users would need to boot a custom kernel downloaded from the internet, which might contain key-loggers and other nasty things. It is not just downloading a script and executing it.

For cheat developers, this instead means making modifications that let those syscalls fly under the radar while keeping the system bootable and usable. That might not be trivial.


The problem is that it is natural to have code that is unreachable. Maybe you are defending against cases that may arise in the future (e.g., features not yet implemented), or the algorithm is written in a general way but only used in a specific way. 100% test coverage requires removing such code, and that can hurt future development.


It doesn't require removing them if you think you'll need them. It just requires writing tests for those edge cases so you have confidence that the code will work correctly if/when those branches do eventually run.

I don't think anyone wants production code paths that have never been tried, right?


But this doesn't solve dependency hell. If the functionality were loosely coupled, you could already vendor the code in and review it manually. If it is not, say it is a database, you still have to depend on it, no?

Or maybe you can use AI to vendor dependencies and to review existing dependencies and their updates. I have never tried that; maybe it is better than the current approach, which is mostly just trusting upstream until something breaks.


When I need 1% of a library's functionality, I can use AI to generate a good-enough replacement that does not require shipping any vendored code.

Will it potentially be more fragile and less featureful? Sure, but it also will not pull in a thousand packages of dependencies.


Are you really going to manually review all of moment.js just to format a date?


By vendoring the code in, I mean copying the relevant code into the project. You don't review everything. It is a bad way to deal with dependencies, but it feels similar to how people use LLMs now for utility functions.
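For the date-formatting case specifically, a minimal no-dependency sketch (the date and locale here are just for illustration) often covers the 1% people pull in moment.js for:

```javascript
// Formatting a date with only built-ins, no moment.js.
const d = new Date(Date.UTC(2024, 0, 15, 9, 30)); // months are 0-based

// ISO date, no library needed:
const iso = d.toISOString().slice(0, 10); // "2024-01-15"

// Locale-aware formatting via the built-in Intl API:
const pretty = new Intl.DateTimeFormat("en-US", {
  year: "numeric", month: "long", day: "numeric", timeZone: "UTC",
}).format(d); // "January 15, 2024"
```

If you later need parsing, durations, or many locales, that is the point where a real library starts earning its dependency weight.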


Question: How many LoC do you let the AI write for each iteration? And do you review that? It sounds like you are letting it run off leash.


I had no idea how it would end up; it was my first time using an AI IDE. I had only used chatgpt.com and claude.ai for small changes before. I continued it as an experiment. The AI wrote a lot of tests, and I figured I would judge based on the tests passing. I agree it was a bad expectation, plus no experience with an AI IDE, plus bad software engineering.


> as tasks that junior developers might perform don't match your skills, and are thus boring.

Yeah, this sounds interesting and matches my experience a bit. I was trying out AI over Christmas because people I know were talking about it. I asked it to implement something (a refactoring for better performance) that I thought should be simple; it did, the result looked amazing, and all tests passed too! But when I looked into the implementation, the AI got the shape right while the internals were more complicated than needed and were wrong. Nonetheless, it got me started on fixing things, and they got fixed quite quickly.

The performance of the model in this case was not great; perhaps that is also because I am new to this and don't know how to prompt it properly. But at least it is interesting.


This sounds a lot like the classic "the way to get a good answer on the internet is to post a wrong answer first", but in reverse - the AI gives you a bad version which trolls you into digging in and giving the right answer :-)


> The idea of bitwise reproducibility for floating point computations is completely laughable in any part of the DL landscape. Meanwhile in just about every other area that uses fp computation it's been the defacto standard for decades.

It is quite annoying when you do parallelization, and I don't know if that many people cared about bitwise reproducibility, especially when it requires sacrificing a bit of performance.
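The reason parallelization breaks bitwise reproducibility is that floating-point addition is not associative: a tree reduction sums in a different order than a sequential loop, and each order rounds differently. A tiny illustration with contrived values:

```javascript
// Floating-point addition is not associative, so reduction order
// changes the bits of the result.
const xs = [0.1, 0.2, 0.3, 1e16, -1e16, 0.4];

// Sequential left-to-right sum:
const seq = xs.reduce((a, b) => a + b, 0);

// The same values in a pairwise (tree) order, as a parallel
// reduction might compute them:
const pairwise = ((0.1 + 0.2) + (0.3 + 1e16)) + ((-1e16) + 0.4);

console.log(seq, pairwise);      // 0.4 0
console.log(seq === pairwise);   // false: the two orders round differently
```

Getting identical bits regardless of thread count means fixing the reduction order (or using compensated/pairwise schemes everywhere), which is exactly the performance compromise mentioned above.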


I wonder if anyone has used effect handlers for error logging. It sounds like a natural and modular way of handling this problem.
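JavaScript has no native effect handlers, but a dynamically scoped handler stack approximates the modularity they would give: library code "performs" a log effect, and an enclosing handler decides what logging means, without threading a logger through every call. All names below are invented for illustration:

```javascript
// Sketch of effect-handler-style error logging (not real handlers).
const handlers = [];

function withLogHandler(handler, body) {
  handlers.push(handler);        // install for this dynamic extent
  try { return body(); }
  finally { handlers.pop(); }    // uninstall on exit, even on throw
}

function logError(msg) {
  const h = handlers[handlers.length - 1];
  if (h) h(msg);                 // dispatch to nearest enclosing handler
}

// Library code performs the effect without knowing the sink:
function parseConfig(text) {
  if (!text.trim()) { logError("empty config"); return {}; }
  return { raw: text };
}

// Callers choose the interpretation:
const seen = [];
withLogHandler(msg => seen.push(msg), () => parseConfig("   "));
// seen is now ["empty config"]
```

Real effect handlers (e.g., in OCaml or Koka) additionally let the handler resume or abort the computation, which this sketch cannot do.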


It has many language bindings, including Python and JS. The JS backend is not parallel, though, because it uses wasm, and we had problems with mimalloc memory usage with pthreads enabled.


That's true. If you use, e.g., Python, you can use numpy for custom matrix math, but in C++ you can just do anything and it will be pretty fast.


At least one would not be enough. So how many branches are enough? And what about people with less money and time available?

