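Last updated: Nov 14, 2025, 01:55 PM

Explore the top AI models, ranked by community votes and performance metrics.

Total models: 53 (available in the database)

Top model: GPT-5 (high), ranked first

Top score: 1473 (highest rating)

| # | Model | Score | Votes | Confidence interval | Organization |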
|---|---|---|---|---|---|
| 🥇 | GPT-5 (high) | 1473.00 | 8,004 | ±7 | OpenAI |
| 🥇 | Claude Opus 4.1 thinking-16k (20250805) | 1458.00 | 8,726 | ±7 | Anthropic |
| 🥈 | Claude Opus 4.1 (20250805) | 1451.00 | 8,986 | ±7 | Anthropic |
| 4 | Claude Sonnet 4.5 (thinking 32k) | 1420.00 | 4,863 | ±9 | Anthropic |
| 4 | MiniMax-M2 | 1405.00 | 3,515 | ±10 | MiniMax |
| 5 | Gemini-2.5-Pro | 1399.00 | 14,628 | ±6 | Google |
| 5 | GLM-4.6 | 1395.00 | 7,563 | ±10 | ZAI |
| 5 | DeepSeek-R1-0528 | 1393.00 | 4,800 | ±8 | DeepSeek |
| 6 | Claude Sonnet 4.5 | 1387.00 | 7,855 | ±7 | Anthropic |
| 7 | Claude Opus 4 (20250514) | 1383.00 | 9,238 | ±6 | Anthropic |
| 7 | GLM-4.5 | 1379.00 | 4,360 | ±8 | ZAI |
| 9 | GLM-4.5-Air | 1366.00 | 1,425 | ±13 | ZAI |
| 11 | Qwen3-Coder | 1365.00 | 13,296 | ±6 | Alibaba |
| 12 | Claude Sonnet 4 (20250514) | 1362.00 | 11,526 | ±5 | Anthropic |
| 10 | DeepSeek-V3.1-thinking | 1360.00 | 1,459 | ±17 | DeepSeek |
| 12 | Claude 3.7 Sonnet (20250219) | 1358.00 | 7,460 | ±9 | Anthropic |
| 12 | Claude Haiku 4.5 (20251001) | 1354.00 | 6,549 | ±8 | Anthropic |
| 12 | Qwen3-235B-A22B-Instruct-2507 | 1352.00 | 992 | ±13 | Alibaba |
| 14 | DeepSeek-V3.1 | 1338.00 | 1,304 | ±15 | DeepSeek |
| 18 | qwen3-coder-plus-2025-09-23 | 1334.00 | 3,977 | ±8 | Alibaba |
| 21 | Kimi-K2-Instruct | 1315.00 | 7,027 | ±8 | Moonshot |
| 22 | Gemini-2.5-Flash | 1294.00 | 14,956 | ±7 | Google |
| 23 | GPT-4.1-2025-04-14 | 1253.00 | 11,506 | ±6 | OpenAI |
| 24 | Claude 3.5 Sonnet (20241022) | 1238.00 | 26,267 | ±5 | Anthropic |
| 25 | DeepSeek-V3-0324 | 1208.00 | 1,094 | ±16 | DeepSeek |
| 25 | DeepSeek-R1 | 1199.00 | 3,755 | ±14 | DeepSeek |
| 25 | GPT-4.1-mini-2025-04-14 | 1193.00 | 9,064 | ±6 | OpenAI |
| 25 | Qwen3-235B-A22B | 1189.00 | 5,600 | ±6 | Alibaba |
| 25 | o3-2025-04-16 | 1186.00 | 5,572 | ±9 | OpenAI |
| 26 | Mistral Medium 3 | 1181.00 | 7,511 | ±7 | Mistral |
| 29 | Grok-4-0709 | 1174.00 | 7,685 | ±6 | xAI |
| 32 | grok-code-fast-1 | 1152.00 | 4,991 | ±9 | xAI |
| 32 | Grok-3-preview-02-24 | 1143.00 | 5,764 | ±7 | xAI |
| 32 | o3-mini-high (20250131) | 1137.00 | 2,979 | ±12 | OpenAI |
| 33 | Claude 3.5 Haiku (20241022) | 1133.00 | 22,213 | ±6 | Anthropic |
| 33 | MiniMax-M1 | 1129.00 | 3,361 | ±9 | MiniMax |
| 35 | o4-mini-2025-04-16 | 1117.00 | 8,850 | ±7 | OpenAI |
| 37 | gpt-oss-120b | 1093.00 | 759 | ±25 | OpenAI |
| 38 | o3-mini (20250131) | 1092.00 | 6,369 | ±8 | OpenAI |
| 38 | Gemini-2.0-Pro-Exp-02-05 | 1090.00 | 11,859 | ±8 | Google |
| 41 | o1 (20241217) | 1045.00 | 9,235 | ±7 | OpenAI |
| 41 | o1-mini (20240912) | 1043.00 | 13,688 | ±6 | OpenAI |
| 41 | Gemini-2.0-Flash-001 | 1040.00 | 10,498 | ±9 | Google |
| 41 | Gemini-2.0-Flash-Thinking-01-21 | 1030.00 | 1,058 | ±19 | Google |
| 43 | Llama-4-Maverick-17B-128E-Instruct | 1027.00 | 5,474 | ±8 | Meta |
| 46 | Gemini-2.0-Flash-Exp | 980.00 | 14,454 | ±9 | Google |
| 46 | Qwen2.5-Max | 976.00 | 11,073 | ±7 | Alibaba |
| 47 | GPT-4o-2024-11-20 | 964.00 | 18,601 | ±6 | OpenAI |
| 48 | DeepSeek-V3 | 960.00 | 7,699 | ±7 | DeepSeek |
| 50 | Qwen2.5-Coder-32B-Instruct | 902.00 | 16,199 | ±7 | Alibaba |
| 50 | Llama-4-Scout-17B-16E-Instruct | 901.00 | 687 | ±25 | Meta |
| 50 | Gemini-1.5-Pro-002 | 893.00 | 15,159 | ±8 | Google |
| 53 | Llama-3.1-405B-Instruct | 810.00 | 1,117 | ±18 | Meta |

Data updated hourly • Showing 53 models

Data source: LM BASE leaderboard