Is there a leaderboard for the no-restriction version of the competition? I want to see how GPT-4 does on it.


Just quoting again from the guide:

3. DIRECT LLM PROMPTING

In this method, contestants use a traditional LLM (like GPT-4) and rely on prompting techniques to solve ARC-AGI tasks. This was found to perform poorly, scoring <5%. Fine-tuning a state-of-the-art (SOTA) LLM with millions of synthetic ARC-AGI examples scores ~10%.

"LLMs like Gemini or ChatGPT [don't work] because they're basically frozen at inference time. They're not actually learning anything." - François Chollet

Additionally, keep in mind that submissions to Kaggle will not have access to the internet. Using a 3rd-party, cloud-hosted LLM is not possible.


Yes, there is a secondary leaderboard called ARC-AGI-Pub (in beta) with no limitations: https://arcprize.org/leaderboard


I don’t see GPT-4 scores there. In fact, I’m particularly interested in the performance of a natively multimodal model, like GPT-4o or Gemini. It does not really make sense to test a model trained on text on those visual/spatial puzzles.



