What is the Benchquill AI model leaderboard?

The Benchquill AI model leaderboard ranks 49 models. It can be sorted by overall score, coding, reasoning, math, vision, speed, blended cost, and context window.

How does Benchquill verify this information?

Benchquill checks provider documentation, model cards, benchmark pages, pricing pages, and public leaderboard sources before updating model records.

All 49 AI models ranked: scores, price, context

How many AI models does Benchquill track?

Benchquill tracks 49 AI models across 9 benchmarks. GPT-5.5 leads with an overall score of 94.6, followed by Claude Opus 4.8 (94.0), Claude Opus 4.7 (93.8), and Gemini 3.1 Pro Preview (92.4).

Model data

All AI models by score, price, and context

Rank	Model	Provider	Overall	Blended cost	Context
1	GPT-5.5	OpenAI	94.6	$23.75/M	1.05M
2	Claude Opus 4.8	Anthropic	94.0	$20.00/M	1M
3	Claude Opus 4.7	Anthropic	93.8	$20.00/M	1M
4	Gemini 3.1 Pro Preview	Google	92.4	$9.50/M	1M
5	GPT-5	OpenAI	91.2	$7.81/M	400k
6	Gemini 3.5 Flash	Google	91.0	$7.12/M	1M
7	Grok 4.3	xAI	90.0	$2.19/M	1M
8	Claude Sonnet 4.6	Anthropic	89.8	$12.00/M	1M
9	o3	OpenAI	88.9	$6.50/M	200k
10	MiniMax M3	MiniMax	88.0	$1.95/M	1M
11	DeepSeek V4-Pro	DeepSeek	87.9	$0.76/M	1M
12	Gemini 2.5 Pro	Google	87.6	$7.81/M	1M
13	Claude Opus 4	Anthropic	87.4	$60.00/M	200k
14	Grok 4.20	xAI	86.4	$5.00/M	2M
15	Claude Sonnet 4.5	Anthropic	86.2	$12.00/M	200k
16	o4-mini	OpenAI	85.4	$3.58/M	200k
17	Llama 4 Maverick	Meta	84.7	$0.49/M	1M
18	DeepSeek R2	DeepSeek	84.2	$1.78/M	128k
19	Gemini 3 Flash Preview	Google	83.5	$2.38/M	1M
20	GPT-5 mini	OpenAI	82.6	$1.56/M	400k
21	GPT-4.1	OpenAI	81.4	$6.50/M	1M
22	Claude Haiku 4.5	Anthropic	80.4	$4.00/M	200k
23	Qwen 2.5 Max	Alibaba	80.4	$5.20/M	32k
24	DeepSeek V3.2	DeepSeek	79.8	$0.38/M	128k
25	GPT-4o	OpenAI	78.6	$8.13/M	128k
26	Gemini 2.0 Flash	Google	78.4	$0.33/M	1M
27	Llama 4 Scout	Meta	78.2	$0.25/M	10M
28	DeepSeek V4-Flash	DeepSeek	77.8	$0.25/M	1M
29	Mistral Medium 3.1	Mistral	77.6	$1.60/M	128k
30	Grok 4.1 Fast	xAI	76.8	$0.43/M	2M
31	Qwen 2.5 72B	Alibaba	76.8	$1.14/M	128k
32	Command A	Cohere	76.4	$8.13/M	256k
33	Kimi K1.5	Moonshot	76.4	$2.00/M	200k
34	Hunyuan Turbo	Tencent	75.8	$1.00/M	128k
35	Pixtral Large	Mistral	75.2	$5.00/M	128k
36	Mistral Large 3	Mistral	74.5	$1.25/M	256k
37	Nova Pro	Amazon	73.8	$2.60/M	300k
38	Hermes 3 405B	Nous Research	72.8	$0.90/M	128k
39	GLM-4.5	Zhipu	72.6	$1.80/M	128k
40	Phi-4-multimodal	Microsoft	72.4	$0.09/M	128k
41	Llama 3.3 70B	Meta	71.4	$0.26/M	128k
42	Mistral Small 3.1	Mistral	71.2	$0.25/M	128k
43	Command R+	Cohere	70.8	$8.13/M	128k
44	Phi-4	Microsoft	70.8	$0.12/M	16k
45	Yi-Lightning	01.AI	68.4	$0.14/M	16k
46	Aya Expanse 32B	Cohere	67.8	$1.25/M	128k
47	Gemma 3 27B	Google	67.2	$0.07/M	128k
48	DBRX	Databricks	65.4	$1.88/M	32k
49	Llama 3.3 8B	Meta	58.4	$0.06/M	128k
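
For readers who want to re-sort the table offline, here is a minimal Python sketch (3.9 or later) that parses tab-separated rows in the layout above and orders them by blended cost or context window. It is an illustration only, not Benchquill's own sorting code: the names Row, parse_context, and parse_row are hypothetical, and the sketch assumes that "k" means 1,000 tokens and "M" means 1,000,000 tokens. The sample uses four rows copied from the table.

```python
from typing import NamedTuple


class Row(NamedTuple):
    rank: int
    model: str
    provider: str
    overall: float
    cost_per_m: float      # blended cost in USD per million tokens
    context_tokens: int    # context window as a plain token count


def parse_context(text: str) -> int:
    """Convert '400k' or '1.05M' to a token count.

    Assumption: k = 1,000 and M = 1,000,000 (the table does not define them).
    """
    multipliers = {"k": 1_000, "M": 1_000_000}
    return round(float(text[:-1]) * multipliers[text[-1]])


def parse_row(line: str) -> Row:
    """Split one tab-separated table row into typed fields."""
    rank, model, provider, overall, cost, context = line.split("\t")
    cost_value = float(cost.removeprefix("$").removesuffix("/M"))
    return Row(int(rank), model, provider, float(overall),
               cost_value, parse_context(context))


# Four rows copied from the table above.
SAMPLE = (
    "1\tGPT-5.5\tOpenAI\t94.6\t$23.75/M\t1.05M\n"
    "5\tGPT-5\tOpenAI\t91.2\t$7.81/M\t400k\n"
    "11\tDeepSeek V4-Pro\tDeepSeek\t87.9\t$0.76/M\t1M\n"
    "27\tLlama 4 Scout\tMeta\t78.2\t$0.25/M\t10M\n"
)

rows = [parse_row(line) for line in SAMPLE.splitlines()]

# Cheapest first.
by_cost = sorted(rows, key=lambda r: r.cost_per_m)

# Largest context window first.
by_context = sorted(rows, key=lambda r: r.context_tokens, reverse=True)

for r in by_cost:
    print(f"{r.model:<16} ${r.cost_per_m:>6.2f}/M  {r.context_tokens:>10,} tokens")
```

Converting the context column to integers matters because sorting the raw strings would place "10M" before "1M" and "400k" after both.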
