How long until the Humanity's Last Exam benchmark gets saturated? (90%+)
https://agi.safe.ai/ - link in case you're not familiar.
"Humanity's Last Exam, a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage."
Obviously no benchmark is perfect, but given that it is being positioned as "at the frontier of human knowledge," I think it will be interesting to see what velocity the sub thinks we're travelling at.