The AI world can be super confusing. One way the AI nerds try to make sense of it is via benchmarks. A few weeks back, I showed you how Gemini 2.5 Pro stacked up against the competition. This week, with the release of a bunch of OpenAI's new models, the landscape has changed again.
One of the most multifaceted of the benchmarking boards is called Artificial Analysis. You can check it out here:
https://artificialanalysis.ai/
One widely used overall benchmark is LiveBench:
https://livebench.ai/#/
Honestly, since we all have access to Gemini 2.5 Pro these days, I think just sticking with that is fine day to day. You don't much need to look at all the benchmarks.
On the other hand, if you want to also tap into OpenAI and other models, then looking at the benchmarks might help you decide which ones you want to try. I use a lot of OpenAI because A) I have paid access, and B) there are some good productivity-enhancing UI capabilities other models don't have. But I also use a lot of Gemini these days, especially Deep Research (which I love!).
Prefer a single score based on overall user vibes? Then Chatbot Arena is pretty good:
https://lmarena.ai/?leaderboard
However, if you really want to nerd out, you can check out the following list, brought to you by ChatGPT-4o, which loves its emojis these days:

General AI Benchmarking Platforms
- MLCommons (MLPerf)
  - An industry-standard suite evaluating AI performance across training, inference, and safety.
  - Includes benchmarks like MLPerf Training, Inference, and AILuminate for assessing model risks.
  - Visit MLCommons
- Epoch AI Benchmarking Hub
- BetterBench (Stanford)
- LiveBench
- ArtificialAnalysis.ai
- Vellum AI LLM Leaderboard
  - Displays benchmark performance for state-of-the-art LLMs, focusing on reasoning, mathematical problem-solving, and tool usage.
  - View Vellum Leaderboard
- Hugging Face Open LLM Leaderboard
- S&P AI Benchmarks by Kensho

Hardware and Edge AI Benchmarks
- AI-Benchmark
- AIBench (BenchCouncil)
  - Offers a comprehensive AI benchmark suite, including HPC AI500 for high-performance computing systems.
  - Explore AIBench
- Benchmarks.AI