tylermarques's comments

tylermarques · 2026-02-03T05:47:58

In the same vein, we recently released v0.1 of our humor benchmark. [1] We use human answers from a Cards Against Humanity-style game called Bad Cards [2] as ground truth for what is funny. The models choose a card from a hand of 3-6 cards, so it's not quite de novo joke creation.

[1] https://goodstartlabs.com/leaderboards/lol-arena

[2] https://bad.cards/

tylermarques · 2025-06-09T22:21:52

Sorry about that! We've toned down the music a bit to put more emphasis on the narrator. Thanks for the feedback :)

We're hoping to create more like this! :)