China's Internet: Strategic Insights from AI Model Architecture

Information Technology 2026-03-30 - Bernstein Zt
China Internet: The strategic implications of AI model […]

Robin Zhu, +852 2123 2659, robin.zhu@bernsteinsg.com
Charles Gou, +852 2123 2618, charles.gou@bernsteinsg.com
Min-Joo Kang, +852 2123 2644, minjoo.kang@bernsteinsg.com

Strategic choices drive AI model architecture. AI development and costs continue to dominate investor discussions across our China Internet coverage. This note is intended as a low-jargon discussion on the model design choices of China’s top AI labs, and what they […]

A brief primer on AI model architectures, KV cache use, RL. Global AI model developers have increasingly adopted MoE architectures in recent years, where only small sections of parameters in a large model are activated per token - depending on specialisation on different vertical domains, languages, or skill sets. The key values (KV) […]

Choices along the cost vs. performance spectrum. Across the top Chinese AI labs, Minimax stands out for offering a smaller model optimised for low active parameter scale per token, reinforcement learning frameworks that prioritise agentic tool use, while the company’s pricing strategy incentivises high KV cache usage. Zhipu’s GLM5 model is larger and benchmarks better on general reasoning, coding capabilities, and hallucination […]

Thoughts on the adoption curve. Year to date, the M2.5 model's optimisation for low cost agentic use has made it one of the most popular models used to support OpenClaw. Z.ai’s focus on leading edge reasoning capabilities and reliability aligns well with the company’s more academic background, and focus on enterprise use cases where reliability is key.
Thinking from the perspective of early adopter cohorts and growth S-curves, heavy […]

Competition, and model commoditisation. Over long time horizons, our bias is that market positions built around general reasoning strength and reliability, and specialist task completion will prove more durable, while the “low-cost agentic back-end” corner of the market becomes more crowded with competition from both Chinese devs (including the independent AI labs but also the Internet platforms seeking to develop consumer use […]

Is 20-30% training cost growth going to be enough? Alibaba, Tencent, and Baidu all announced price hikes in their respective AI cloud units, while AI server rental quotes from independent compute suppliers have pointed in the same direction. Alibaba management hinted that ongoing market tightness could support further price hikes this year. In contrast, […]

BERNSTEIN TICKER TABLE

INVESTMENT IMPLICATIONS

AI development and costs continue to dominate investor discussions across our China Internet coverage. In this note we’ve outlined some observations about differences in AI model architecture choices across the leading Chinese AI labs, and what these choices say about developer market positioning and competitive strategy. While Minimax’s recent (M2+) model releases have been optimised for low-cost agentic tool use, Z.ai’s GLM5 model was much more focused on general reasoning capabilities, and hallucination control. Alibaba’s strategy for its Qwen family of models meanwhile has been to offer a broad […]

Over long time horizons, we expect leading edge general reasoning and specialised task completion capabilities to represent more defensive competitive positions than a low-cost, “good enough” agentic AI backbone… unless the latter is embedded within a large consumer-facing ecosystem.
For the latter, our bias remains that most consumers care more about “getting stuff […]

To date, the top Chinese AI labs have done a good job keeping pace with the global SOTA, albeit with help from development techniques like distillation. As agentic workflows become more complex, and task completion horizons lengthen, the possibility that the latter becomes less effective will be important to monitor. Nearer-term, we’d expect the rising cost of compute (e.g. see Alicloud and Tencent price hikes) to support hyperscaler growth - but serve as a source of training and inference cost pressure […]

VALUATION COMPS TABLE

DETAILS

A PRIMER ON MODEL AI ARCHITECTURE… AND STRATEGIC IMPLICATIONS

Research on AI development has dominated our research bandwidth year to date. Our discussions with investors have continued to focus heavily on the strategic implications of OpenClaw adoption on our large cap coverage (e.g. Tencent, Alibaba), and the growth of AI model companies like Minimax and Z.ai. A common thread in these conversations though has been a tendency for investors to treat “AI models” as monolithic products… and treat OpenRouter data almost like Sensor Tower, as the arbiter of top-line traction. This note is intended as a basic, low-jargon primer of AI models, focused on key aspects which […]

Bigger is usually better… but there are trade-offs

At a high level, frontier AI models are next-token predictors that are trained on large training datasets, which try to predict the most optimal responses to user prompts. Frontier AI models have mainly scaled over time by adding (1) parameter count; (2) adding exper