
State of AI in China: Q2 2025 Highlights Report


Artificial Analysis Q2 2025 Highlights Report
Full report available to Premium Access subscribers

Artificial Analysis is a leading, independent AI benchmarking and insights provider. We support engineers and companies in understanding AI capabilities and making critical decisions about their AI strategy. Our data, insights and publications are grounded in our comprehensive benchmarking of AI technologies and use cases. This includes everything from hourly performance testing of language model APIs to millions of votes in our crowd-sourced arenas. Our public website, artificialanalysis.ai, is widely referenced by companies leading innovation in AI. To discuss this report, our publications, or our services, please get in touch at contact@artificialanalysis.ai.

China's leading AI labs are now closer than ever to US leaders, with the US lead shrinking from more than a year to less than three months

Chart: US & China: Frontier Language Model Intelligence, Over Time
Note: Artificial Analysis Intelligence Index incorporates 7 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, MATH-500

Commentary
• The performance gap between US and Chinese frontier models has persisted since the release of ChatGPT in 2022, but is now as narrow as it has ever been
• DeepSeek's open weights R1 (May 2025) model leads the Chinese AI labs, while OpenAI's o3 leads models released by US AI labs
• DeepSeek and Alibaba have primarily driven the Chinese frontier, while advances in the US frontier have been primarily driven by OpenAI

The Chinese open weights frontier surpassed the US in November 2024 with Alibaba's release of QwQ 32B Preview; R1 consolidated this lead

Chart: US & China: Open Weights Frontier Language Model Intelligence, Over Time

Commentary
• The Chinese open weights frontier surpassed the US in November 2024 with the release of QwQ 32B Preview (overtaking Meta's Llama 3.1 405B)
• The open weights leadership of Chinese AI labs reflects the tendency of the top Chinese AI labs to release the weights of their flagship models. This contrasts with the top US AI labs (e.g., OpenAI, Anthropic and Google), which generally do not release the weights of their leading models
• China's DeepSeek R1 (January 2025) was the first open weights reasoning model to be competitive with OpenAI's o1
• DeepSeek's R1 0528 (May 2025) is the most intelligent open weights model currently available

Leading Chinese AI labs DeepSeek and Alibaba have steadily released new models, with DeepSeek taking the lead in late 2024

Chart: Leading Chinese AI Labs: Language Model Intelligence, Over Time

Commentary
• As of May 2025, DeepSeek R1 0528 (May 2025) maintains an intelligence edge over Alibaba's Qwen3 235B A22B as the leading model from a Chinese AI lab
• Both companies have embraced open weights strategies, supporting widespread adoption of their models domestically as well as internationally
• Over the last two years, both DeepSeek and Alibaba have released models frequently, with each new model arriving at most ~3 months after the prior release

DeepSeek's models have quickly increased in intelligence since their first public language model release in November 2023

Chart: DeepSeek's Model Release Timeline: Artificial Analysis Intelligence Index
Note: Only the highest-intelligence general-purpose DeepSeek models are shown; task-specific models are excluded.

Commentary
• DeepSeek leaped over xAI, Meta and Anthropic to be tied as the world's #2 AI lab and the undisputed open weights leader with the release of R1-0528
• R1-0528's impressive intelligence uplift was a post-training update with no change to the original V3/R1 architecture; it remains a large 671B model with 37B active parameters
• This highlights the continually increasing importance of post-training, particularly for reasoning models trained with reinforcement learning (RL) techniques

In the US, there are now multiple contenders for AI leadership, with OpenAI no longer in a dominant position at the frontier

Chart: Leading US AI Labs: Language Model Intelligence, Over Time

Commentary
• OpenAI has been the clear leader of the AI intelligence frontier, but its lead has diminished as other labs including Google, xAI and Anthropic have narrowed the gap
• As of May 2025, OpenAI's o3 is the most intelligent US model, and the most intelligent model overall

There are many Chinese AI players, and they fall into 3 broad archetypes (NOT EXHAUSTIVE)

1. "Big tech" players
The focus of this report, deep