Hacker News | emeraldd's comments

Couldn't find a better link for that one; the tech looks interesting, but the ad is disturbing.

https://www.unilad.com/technology/news/2wai-avatar-app-evil-... may be a better link .. not sure ..


My experience with AI code reviews has been very mixed and more on the negative side than the positive one. In particular, I've had to disable the AI reviewer on some projects my team manages because it was so chatty that it caused meaningful notifications from team members to be missed.

In most of the repos I work with, it makes a large number of false-positive or inappropriate suggestions that are just plain wrong for the code base in question; some might be fine in other settings, but not here. About 1 in every 10-20 comments is actually useful or catches something novel that hasn't been caught elsewhere. The net effect is that the AI reviewer we're effectively forced to use is just noise that gets ignored because it's so wrong so often.


One person proved the uselessness of AI reviews for our entire company.

He'd make giant changes: 100+ files, 1000+ word PR descriptions. Impossible to review. Eventually he just modified the permissions to require a single approval, approved his own changes, and merged. This is still going on, but he's isolated to repos he made himself.

He'd copy/paste the AI's output onto other people's reviews. Often it was false positives or open-ended questions. So he automated his side but doubled or tripled the work of the person requesting the review. Not to mention the AI's comments were 100-300 words, with formatting and emojis.

The contractors refused to address any comments made by him. Some felt it was massively disrespectful: they put tons of time and effort into their changes, and he couldn't even be bothered to read them himself.

It got to the CTO, and AI reviews have been banned.

But it HAS helped the one junior guy on the team prepare for reviews and understand review comments better. It's also helped us write better comments, since I and some others can be really bad at explaining things.


This seems like quite a dramatic organisational response to a single incompetent employee!


Sometimes the only review a PR needs is "LGTM" - something today's LLMs are structurally incapable of.


When you say "structurally incapable", what do you mean? They can certainly output the words (e.g. https://github.com/Smaug123/WoofWare.Zoomies/pull/165#issuec... ); do you mean something more along the lines of "testing can prove the presence of bugs, but not their absence", like "you can't know whether you've got a false or true negative"? Because that's also a problem with human reviewers.


The solution to this is to get agents to output an importance score (0-1), then filter out anything below whatever threshold you prefer before creating comments/change requests.
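As a minimal sketch of that idea (the `importance` field name and the threshold are my assumptions, not any particular agent's API):

```python
# Hypothetical sketch: assume each agent finding arrives as a dict with an
# "importance" score in [0, 1]. Only findings at or above the threshold
# become review comments; the rest are silently dropped.
THRESHOLD = 0.5  # tune per repo / per team's tolerance for noise

findings = [
    {"body": "Possible off-by-one in the loop bound", "importance": 0.9},
    {"body": "Consider renaming this variable", "importance": 0.3},
    {"body": "Nit: trailing whitespace", "importance": 0.1},
]

to_post = [f for f in findings if f["importance"] >= THRESHOLD]
for f in to_post:
    print(f["body"])
```

With a 0.5 threshold, only the first finding survives; the two nits never become comments.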


What model were you using? This is true for e.g. the best Qwen model, which I believe to be too dumb to be useful, but not GPT-5.0 or above in my experience (and I consider myself to be a pretty demanding customer).


I love having to hit Resolve Conversation umpteen times before I can merge because somebody added Copilot and it added that many dumb questions/suggestions


This is something I've noticed about academic vs. "practicing" coders. Academics tend to build in layers, though not always, and "practicing" coders tend to build in pipes, give or take. The layers approach might give you buildable code, but it's hard to exercise and run. Both approaches can work, though, especially if you build in executable chunks, but you have to focus on the smallest chunk you can actually run.


I'd argue the opposite. Nora's approach demonstrates that simple ideas work great for getting half of the problem done, but can make it impossible to finish the second half.


This is still a complete pain to work with. Virtualenv in general is a "worst of both worlds" solution. It has a lot of the same problems as just globally pip-installing packages, requires a bit of path mangling to work right, or special Python configs, etc. In the past, it also had a bad habit of leaking dependencies, though that was in some weird setups. It's one of the reasons I would recommend against Python for much of anything that needs to be "deployed" vs. throwaway scripts. UV seems to handle all of this much better.


I'm intrigued. I've been using virtualenv at numerous companies for about 8 years, traditionally wrapped in virtualenvwrapper, and now in uv.

UV doesn't change any of that for me: it just wraps virtualenv and pip, and downloads dependencies (much, much) more quickly. The conversion was immediate and required zero changes.

UV is a pip/virtualenv wrapper. And it's a phenomenal wrapper that absolutely changed everything about how I do development, but under the hood it's still just virtualenv + pip; nothing changed there.

Can you expand on the pain you've experienced?

Regarding "things that need to be deployed": internally all our repos have standardized on direnv (and in some really advanced environments, nix + direnv, but direnv alone does the trick 90% of the time), so you just "cd <somedir>", direnv activates your virtualenv, and you are good to go. UV takes care of the pip work.

It has eliminated 100% of our use of virtualenvwrapper and direct calls to pip. I'd love to hear a use case where that doesn't work for you; we haven't tripped across one recently.


> UV is a pip / virtualenv wrapper.

Not quite; it reimplements the pip functionality (in a much smarter way). I'm pretty sure it reimplements the venv functionality, too, although I'm not entirely sure why (there's not a lot of room for improvement).

("venv" is short for "virtual environment", but "virtualenv" is specifically a heavyweight Python package for creating them with much more flexibility than the standard library option. Although the main thing making it "heavyweight" is that it vendors wheels for pip and Setuptools — possibly multiple of each.)


> It has a lot of the same problems as just globally pip installing packages

No, it doesn't. It specifically avoids the problem of environment pollution by letting you just make another environment. And it avoids the problem of interfering with the system by not getting added to sys.path by default, and not being in a place that system packages care about. PEP 668 was specifically created in cooperation between the Python team and Linux distro maintainers so that people would use the venvs instead of "globally pip installing packages".

> requires a bit of path mangling to work right, or special python configs, etc. In the past, it's also had a bad habit of leaking dependencies, though that was in some weird setups.

Genuinely no idea what you're talking about and I've been specifically studying this stuff for years.

> It's one of the reasons I would recommend against python for much of anything that needs to be "deployed" vs throw away scripts. UV seems to handle all of this much better.

If you're using uv, you're using Python.


> by letting you just make another environment

This is actually what I'm talking about ... Why do I need a whole new Python environment rather than just scoping the dependencies of an application to that application? That model makes it significantly harder to manage multiple applications/utilities on a machine, particularly if they have conflicting package versions, etc. Being able to scope the dependencies to a specific code base without having to duplicate the rest of the Python environment would be much better than a new virtualenv.


> Why do I need a whole new python environment rather than just scoping the dependencies of an application to that application?

If you haven't before, I strongly encourage you to try creating a virtual environment and inspecting what it actually contains.

"A whole new Python environment" is literally just a folder hierarchy and a pyvenv.cfg file, and some symlinks so that the original runtime executable has an alternate path. (On Windows it uses some stub wrapper executables because symlinks are problematic and .exe files are privileged; see e.g. https://paul-moores-notes.readthedocs.io/en/latest/wrappers.... .) And entirely unnecessary activation scripts for convenience.
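That inspection can be done with nothing but the standard library; a throwaway sketch (the temp-dir location and `demo` name are arbitrary):

```python
import pathlib
import tempfile
import venv

# Create a bare venv (skip pip to keep it fast) and look at what's inside.
root = pathlib.Path(tempfile.mkdtemp()) / "demo"
venv.create(root, with_pip=False)

# The entire "new Python environment" is a small folder tree plus a
# pyvenv.cfg file that records where the real interpreter lives.
cfg = (root / "pyvenv.cfg").read_text()
print(cfg)  # "key = value" lines: home, include-system-site-packages, version, ...
print(sorted(p.name for p in root.iterdir()))
```

The whole thing is a few kilobytes; the interpreter itself is symlinked (or stubbed, on Windows), not copied.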

If you wanted to be able to put multiple versions of dependencies into an environment, and have individual applications see what they need and avoid conflicts, you'd still need folders to organize stuff and config data to explain which dependencies are for which applications.

And you still wouldn't be able to solve the diamond dependency problem, because of fundamental issues in the language design (i.e. modules are cached as singleton objects).
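That caching is easy to observe: any two imports of the same module name in one process yield the exact same object.

```python
import sys

# Python caches every imported module in sys.modules, keyed by name.
# Two libraries that both "import json" therefore share one module object;
# there is no mechanism for each to load its own version side by side.
import json
import json as json_again

assert json is json_again           # same singleton, not a copy
assert sys.modules["json"] is json  # the cache entry itself
```

(`json` here is just a convenient stand-in for any shared dependency.)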

When you make a virtual environment you don't do anything remotely like "duplicating the rest of the Python environment". You can optionally configure it to "include" the base environment's packages by having them added to sys.path.


> Why do I need a whole new python environment rather than just scoping the dependencies of an application to that application?

But… that’s what a virtualenv is. That’s the whole reason it exists. It lets you run 100 different programs, each with its own different and possibly conflicting dependencies. And yeah, those dependencies are completely isolated from each other.


I think his point is that you could instead have a situation where multiple versions of the same dependency are installed globally, rather than creating a new isolated environment each time.


I'd also recommend the "How to Make Everything" YouTube channel.


Or the changes weren't immediately visible externally.


No, quality control is never based on external packaging, for anything*, since quality control is about the objective quality of the final product: if you're not measuring a parameter, you're not controlling its quality. For batteries, this includes things like X-ray inspection to verify all the layer geometries, assembly, materials, etc. [1]

[1] Battery tester: https://www.youtube.com/watch?v=t3IE3npEcuc

*unless the product is visible external packaging.


There's always ARM assembly. It's a different ISA, of course, but a lot of the concepts transfer pretty nicely. You could also look at something like the ZimaBoard or similar machines.


In a situation where the transformations are explicit and can be almost mechanical, with little to no reasoning involved. With those requirements, the LLM just has to recognize the patterns and "translate" them to the new pattern, which is almost exactly what LLMs were designed to do.


Agreed, but it doesn't really fall under the umbrella I'd use the term "refactoring" for.

Replacing something with something else isn't a refactor to me; a refactor implies a structural change, not a wide, surface-level change.


There’s a famous book called “Refactoring”


Honestly, what is the value proposition of a union to tech workers? Tech workers tend to be well paid and are given somewhat wider latitude and freedom than a lot of other professions. In most cases, there isn't really a reason we need to look to a union for anything.


If you're forgetting to use the tool, is the tool really providing benefit in that case? I mean, if a tool truly made something easier or faster that was onerous to accomplish, you should be much less likely to forget there's a better way ...


Yep! Most tools are there to handle the painful aspects of your tasks. It's not like you're consciously thinking about them, but just the thought of doing those tasks without the tool will get a groan out of you.

A lot of current AI tools are toys. Fun to play around with, but as soon as you have some real-world tasks, you just do them your usual way that gets the job done.


There's a balance to be calculated each time you're presented with the option. It's difficult to predict how much iteration the agent is going to require, or how frustrating it might end up being, all the while losing your grip on the code being your own and your mental model of it, vs. just going in and doing it yourself, knowing exactly what's going on, and simply asking it questions if any unknowns arise. Sometimes it's easier to just not even make the decision, so you disregard firing up the agent in a blink.


> is the tool really providing benefit in that case?

Yes, much of the time and esp. for tests. I've been writing code for 35 years. It takes a while to break old habits!


Why would you want to break the habit? If you are not feeling a strong urge to use it...


Because I don't particularly like writing bulk code; I like solving problems and designing things.


Our meat blobs forget things all the time; it's why to-do apps and reminders even exist. Not using something every time doesn't mean it's not beneficial.


You've never forgotten your reusable grocery bag, umbrella, or sunglasses? You've never reassembled something and found a few "extra" screws?


Yes, but once I'm at the checkout or it starts raining, I reach for it...

