
They are just trying to find a way to plausibly declare successful removal of copyrighted and/or illegal material without discarding weights.

GPT-4 class models reportedly cost $10-100m to train, and that's too much to throw away for Harry Potter or Russian child porn scrapes that could later be reproduced verbatim despite representing <0.1ppb or whatever minuscule part of the dataset.


