tylermarques's comments

In the same vein, we recently released v0.1 of our humor benchmark. [1] We use human answers from a Cards Against Humanity-style game called Bad Cards [2] as ground truth for what is funny. The models choose a card from a hand of 3-6 cards, so it's not quite de novo joke creation.

[1] https://goodstartlabs.com/leaderboards/lol-arena

[2] https://bad.cards/


Sorry about that! We've toned down the music a bit, trying to put more emphasis on the narrator. Thanks for the feedback :)

We're hoping to create more like this! :)

