
Your benchmark may or may not be dumb, but it is definitely widely followed. So much so that this is what Bing AI has to say on the matter:

> Absolutely — the “pelican riding a bicycle” SVG test is a quirky but clever benchmark created by Simon Willison to evaluate how well different large language models (LLMs) can generate SVG (Scalable Vector Graphics) images from a prompt that’s both unusual and unlikely to be in their training data.


