I think it is great for experimenting and proving concepts: alphas and personal projects, not shipped code.

I've been working on wasm sandboxing and automatic verification that code doesn't have the lethal trifecta, and I got something working in a couple of days.
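
The core of the verification is easy to sketch (a toy version, not what I actually built; the capability names and manifest format are invented):

    # Toy sketch of the check. The "lethal trifecta" legs: private data
    # access + untrusted content + an exfiltration channel.
    TRIFECTA = {"reads_private_data", "ingests_untrusted_content", "can_exfiltrate"}

    def check_manifest(capabilities: set[str]) -> list[str]:
        if TRIFECTA <= capabilities:             # all three legs present
            return ["REJECT: lethal trifecta - remove one capability"]
        return []

    # e.g. a web-search + repo-access agent that can also POST anywhere:
    print(check_manifest({"reads_private_data",
                          "ingests_untrusted_content",
                          "can_exfiltrate"}))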

I'd like to do a clean rewrite at some point.


I'm interested in capability-based software, with tools to identify the lethal trifecta.

This seems like a very hard problem for coding specifically, since you want unsafe content (web searches) to be able to influence sensitive things (code).
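
One angle I've been exploring is runtime taint tags: let web content influence the code, but never let tainted data reach a network sink while secrets are in context. A rough sketch, with all names invented:

    # Sketch of runtime taint tags: web content may influence generated
    # code, but a tainted value may not reach a network-capable sink
    # while secrets are in the same context.
    from dataclasses import dataclass

    @dataclass
    class Tainted:
        value: str
        untrusted: bool = True

    def from_web_search(text: str) -> Tainted:
        return Tainted(text)                     # anything from the web is tainted

    def write_code(snippet: Tainted) -> Tainted:
        # taint propagates into whatever the snippet touches
        return Tainted("// generated\n" + snippet.value, snippet.untrusted)

    def send_to_network(payload: Tainted, secrets_in_context: bool) -> None:
        if payload.untrusted and secrets_in_context:
            raise PermissionError("trifecta: tainted data + secrets + network")
        print("sent:", payload.value[:40])

The hard part is deciding when a taint can be cleared, e.g. after human review.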

I'd love to find people to talk to about this stuff.


I've wondered if LLMs can help match people. Each person gives the LLM some public context about their life, and the two LLMs can have a chat about availability and world views.

Use AI to scaffold relationships, not replace them.
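
A sketch of the shape I mean, with the model call stubbed out (llm() stands in for whatever API you'd actually use):

    # Two proxy LLMs, each primed with one person's public context,
    # chat for a few turns; humans read the transcript, then decide.
    def llm(system: str, transcript: list[str]) -> str:
        return "Weekends work for me; I care a lot about open source."  # stub

    def proxy_chat(context_a: str, context_b: str, turns: int = 4) -> list[str]:
        systems = [f"You represent person A. Their public context: {context_a}",
                   f"You represent person B. Their public context: {context_b}"]
        transcript: list[str] = []
        for i in range(turns):
            transcript.append(llm(systems[i % 2], transcript))
        return transcript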


Scam technology like LLMs isn't going to solve the loneliness epidemic.


Perhaps we need reputation at the network layer, without it being tied to a particular identity?

It would have to be hard to farm (perhaps entropy detection on user behaviour, plus clique detection).
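
Roughly what I have in mind (thresholds invented; networkx does the clique side):

    # Flag accounts whose action timing is suspiciously regular (low
    # entropy) and that sit inside dense mutual-vouching cliques.
    import math
    from collections import Counter
    import networkx as nx

    def timing_entropy(intervals_sec: list[int]) -> float:
        if not intervals_sec:
            return 0.0
        counts = Counter(intervals_sec)
        total = len(intervals_sec)
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    def suspicious(G: nx.Graph, user, intervals_sec: list[int]) -> bool:
        low_entropy = timing_entropy(intervals_sec) < 1.0          # invented
        in_big_clique = any(user in clique and len(clique) >= 5    # invented
                            for clique in nx.find_cliques(G))      # maximal cliques
        return low_entropy and in_big_clique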


How does one make sure the implementation is sufficient and complete? It feels like assuming total knowledge of the world, which is never true. How many false positives and false negatives do we tolerate? How does it impact a person?


I'm not sure. We can use LLMs to try out different settings/algorithms and see how they feel at a social level before we implement them for real.


Perhaps, but I am not entirely optimistic about LLMs in this context. Then again, the freedom to do this, and then actually doing it, might make a dent after all; one can never know until they experiment, I guess.


Fair, I don't know how valuable it would be. I think LLMs would only get you so far. They could be tried in games or small human contexts. We would need a funding model that rewarded this.

That is hard too, though.


I've been thinking about using LLMs to help triage security vulnerabilities.

If done in an auditably unlogged environment (with limited output to the company, just saying "escalate"), it might also encourage people to share vulns they are worried about putting online. [1]
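
A sketch of the output gate I have in mind, with the model call stubbed (illustrative, not the actual design):

    # The model reasons over the full report inside the unlogged
    # environment, but only one allowed word ever crosses the boundary.
    ALLOWED = {"escalate", "no-action"}

    def ask_model(report: str) -> str:
        return "escalate"                        # stub for the real model call

    def triage(report: str) -> str:
        verdict = ask_model(report).strip().lower()
        return verdict if verdict in ALLOWED else "no-action"      # fail closed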

Does that make sense from your experience?

[1] https://github.com/eb4890/echoresponse/blob/main/design.md


I definitely think it's a viable idea! Someone like HackerOne or Bugcrowd would be especially well poised to build this, since they can look at historical reports, see which ones ended up being investigated or getting bounties, and use them to validate or inform the LLM system.

The 2nd order effects of this, when reporters expect an LLM to be validating their report, may get tricky. But ultimately if it's only passing a "likely warrants investigation" signal and has very few false negatives, it sounds useful.

With trust and security though, I still feel like some human needs to be ultimately responsible for closing each bad report as "invalid", rather than relying purely on the LLM. But it sounds useful for elevating valid high-severity reports and assisting the human ultimately responsible.

It does feel like a hard product to build from scratch, though, but an easy one for existing bug bounty systems to add.


Echoresponse - a tool for responsible disclosure. Security researchers and companies encode some of their secret knowledge in LLMs; the LLMs have a discussion and can then say one word, from an agreed-upon list, back to the party that programmed them.
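
A toy version of the protocol shape (the word list and model calls are illustrative stubs, not the real implementation):

    # Each party's LLM holds private context; they exchange a few
    # messages, then each returns exactly one word from the agreed
    # list to its own party.
    AGREED_WORDS = {"escalate", "coordinate", "no-action"}

    def llm_turn(private_context: str, transcript: list[str]) -> str:
        return "We may be describing the same bug class."          # stub

    def final_word(private_context: str, transcript: list[str]) -> str:
        word = "escalate"                                          # stub verdict
        return word if word in AGREED_WORDS else "no-action"       # fail closed

    def run_protocol(ctx_a: str, ctx_b: str, turns: int = 6) -> tuple[str, str]:
        transcript: list[str] = []
        for i in range(turns):
            transcript.append(llm_turn(ctx_a if i % 2 == 0 else ctx_b, transcript))
        return final_word(ctx_a, transcript), final_word(ctx_b, transcript)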


Sounds like they need another agent to detect false positives (I joke, I joke)


You joke, but that's a very real approach that AI pentesting companies do take: an agent that creates reports, and an agent that 'validates' reports with 'fresh context' and a different system prompt that attempts to reproduce the vulnerability based on the report details.

*Edit: the paper seems to suggest they had a 'Triager' for vulnerability verification, and obviously that didn't catch all the false positives either, ha.
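
In sketch form the pattern is something like this (call_model() is a stub; the prompt is illustrative):

    # The validator gets ONLY the report text, a different system
    # prompt, and none of the reporter agent's conversation, then
    # tries to reproduce the finding.
    VALIDATOR_SYSTEM = ("You are a skeptical pentester. Try to reproduce the "
                        "vulnerability described in this report. Answer VALID "
                        "or INVALID.")

    def call_model(system: str, user: str) -> str:
        return "INVALID"                         # stub for the real model call

    def validate(report_text: str) -> bool:
        # fresh context: nothing from the reporting run is passed through
        return call_model(VALIDATOR_SYSTEM, report_text).strip() == "VALID"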


Can't be any worse than Fortify was!


At my first job, all the applications the data people developed were compulsorily evaluated through Fortify (I assume this is HP Fortify) and to this day I have no idea what the security team was actually doing with the product, or what the product does. All I know is that they never changed anything even though we were mostly fresh grads and were certainly shipping total garbage.


It's like, when you say agents will largely be relegated to "triage" - well, a pretty surprising amount of nuts-and-bolts infosec work is basically just triage!


There is strong reason to expect evolution to have produced control systems that are complex and constantly changing, for this very reason: so they can't easily be gamed (and the organism eaten).


A simple tool to host files on a captive portal on a Raspberry Pi Pico 2 W.
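
The trick that makes the portal appear is a catch-all DNS responder; a minimal MicroPython sketch of just that piece (SSID is a placeholder, and the HTTP file server is omitted):

    # Open an AP and answer every DNS query with the Pico's own IP,
    # so clients pop the captive-portal page.
    import network, socket

    ap = network.WLAN(network.AP_IF)
    ap.config(essid="file-drop")                 # "ssid=" on newer builds
    ap.active(True)
    ip = ap.ifconfig()[0]                        # typically 192.168.4.1

    dns = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dns.bind(("0.0.0.0", 53))
    while True:
        query, addr = dns.recvfrom(512)
        # standard-response header + original question + one A record
        # (name pointer 0xc00c, type A, class IN, TTL 60) -> our IP
        reply = (query[:2] + b"\x81\x80" + query[4:6] + query[4:6]
                 + b"\x00\x00\x00\x00" + query[12:]
                 + b"\xc0\x0c\x00\x01\x00\x01\x00\x00\x00\x3c\x00\x04"
                 + bytes(map(int, ip.split("."))))
        dns.sendto(reply, addr)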


I'm hoping for a "what does this error mean" type knowledge base.

But man pages and the like are crucial too.

