
Or, conversely, fine-tuning the model on 'bad boy' attitudes/examples might have broken the alignment and caused it to behave like a Nazi in the past.

I wonder how many userland-level prompts they feed it to 'not be a Nazi'. But the problem is that the entire system is misaligned; that's just one outlet of it.


