Supporting Independent Research in AI Evaluation
Arena’s Academic Partnerships Program provides funding and support for independent research advancing the scientific foundations of AI evaluation.
Introducing Max
Today we are releasing Max, Arena's model router powered by our community’s 5+ million real-world votes. Max acts as an intelligent orchestrator: it routes each user prompt to the most capable model for that specific prompt.
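To make the routing idea concrete, here is a minimal sketch of preference-based routing, assuming a hypothetical per-category score table and a toy `classify` helper; the model names and selection logic are stand-ins, not Max's actual implementation.

```python
from typing import Dict

# Hypothetical per-category scores, standing in for aggregates of community votes.
SCORES: Dict[str, Dict[str, float]] = {
    "coding":  {"model-a": 1310.0, "model-b": 1275.0},
    "writing": {"model-a": 1260.0, "model-b": 1295.0},
}

def classify(prompt: str) -> str:
    """Toy prompt classifier; a production router would use a learned model."""
    keywords = ("def ", "error", "bug")
    return "coding" if any(k in prompt.lower() for k in keywords) else "writing"

def route(prompt: str) -> str:
    """Send the prompt to the highest-scoring model for its category."""
    category = classify(prompt)
    return max(SCORES[category], key=SCORES[category].get)

print(route("Why does this Python function raise a TypeError?"))  # -> model-a
```

The key design choice this sketch illustrates is per-category selection: rather than one global ranking, the router picks whichever model the vote-derived scores favor for the kind of prompt it just received.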
LMArena is now Arena
What began as a PhD research experiment to compare AI language models has grown over time into something broader, shaped by the people who use it.
Video Arena Is Live on Web
Video Arena is now available at lmarena.ai/video! What started last summer as a small Discord bot experiment has grown into something much more substantial. It quickly became clear that this wasn’t just a novelty for generating fun videos; it was a rigorous way to measure and understand…
Fueling the World’s Most Trusted AI Evaluation Platform
We’re excited to share a major milestone in LMArena’s journey. We’ve raised $150M in Series A funding led by Felicis and UC Investments (University of California), with participation from Andreessen Horowitz, The House Fund, LDVP, Kleiner Perkins, Lightspeed Venture Partners and Laude Ventures.
LMArena's Ranking Method
Since we launched the platform, developing a rigorous and scientifically grounded evaluation methodology has been central to our mission. A key component of this effort is providing proper statistical uncertainty quantification for model scores and rankings. To that end, we have always reported confidence intervals alongside Arena scores and surfaced any…
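As a rough illustration of one way to attach uncertainty to scores fit from pairwise votes, here is a minimal sketch that fits simple online Elo ratings and computes percentile bootstrap confidence intervals; the scoring model, function names, and battle data are stand-ins and do not reproduce Arena's production methodology, which is described in the full post.

```python
import random
from collections import defaultdict

def elo_scores(battles, k=4.0, base=1000.0):
    """Fit simple online Elo ratings from (model_a, model_b, winner) tuples."""
    scores = defaultdict(lambda: base)
    for a, b, winner in battles:
        expected_a = 1.0 / (1.0 + 10 ** ((scores[b] - scores[a]) / 400.0))
        outcome_a = 1.0 if winner == a else 0.0
        scores[a] += k * (outcome_a - expected_a)
        scores[b] -= k * (outcome_a - expected_a)
    return dict(scores)

def bootstrap_ci(battles, n_resamples=200, alpha=0.05):
    """Percentile bootstrap confidence interval for each model's score."""
    samples = defaultdict(list)
    for _ in range(n_resamples):
        # Resample battles with replacement and refit scores each time.
        resample = random.choices(battles, k=len(battles))
        for model, score in elo_scores(resample).items():
            samples[model].append(score)
    intervals = {}
    for model, values in samples.items():
        values.sort()
        lo = values[int((alpha / 2) * len(values))]
        hi = values[min(int((1 - alpha / 2) * len(values)), len(values) - 1)]
        intervals[model] = (lo, hi)
    return intervals

# Toy data: model-a wins 60 of 100 head-to-head battles against model-b.
battles = [("model-a", "model-b", "model-a")] * 60 + [("model-a", "model-b", "model-b")] * 40
print(bootstrap_ci(battles))
```

The point of the bootstrap here is that each resampled set of battles yields a slightly different score, and the spread of those scores gives an interval: overlapping intervals signal that a difference in ranking may not be statistically meaningful.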
The Next Stage of AI Coding Evaluation Is Here
Introducing Code Arena: live evals for agentic coding in the real world. AI coding models have evolved fast. Today’s systems don’t just output static code in one shot. They build. They scaffold full web apps and sites, refactor complex systems, and debug themselves in real time. Many now…