The leaderboard ‘you can’t game,’ funded by the companies it ranks | Equity Podcast

Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which model is the best, and who decides? Arena, formerly LM Arena, has emerged as the de facto public leaderboard for frontier LLMs, influencing funding, launches, and PR cycles. In just seven months, the startup went from a UC Berkeley PhD research project to a company valued at $1.7 billion.

Watch as Equity host Rebecca Bellan catches up with Arena co-founders Anastasios Angelopoulos and Wei-Lin Chiang about how their platform became the go-to leaderboard for frontier AI models, and how they’re trying to build a neutral benchmark even as companies like OpenAI, Google, and Anthropic back the project.

Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify and all the casts. You can also follow Equity on X and Threads at @EquityPod.

Chapters:

00:00 Intro

03:00 How Arena’s leaderboard works, and why it’s different from static benchmarks

07:00 Reproducibility concerns and how to scale

08:45 Can Arena stay independent while taking money from the labs it ranks?

11:15 Diversity, fraud prevention, and abuse mitigation

18:15 Arena’s “data moat”

19:20 Agent benchmarking and expert leaderboards

21:40 Open sourcing data

22:45 How do Arena’s rankings shape AI development?

24:15 Outro
