Chinese Association for Artificial Intelligence
November 2025

Chinese Association for Artificial Intelligence White Paper Series: Embodied Intelligence

Editorial Board

Chair: Dai Qionghai (戴琼海)
Executive Chair: Ma Huadong (马华东)
Vice Chairs: Zhao Chunjiang (赵春江), He You (何友), Wang Endong (王恩东), Zheng Qinghua (郑庆华), Liu Chenglin (刘成林), Zhou Zhihua (周志华), Sun Fuchun (孙富春), Zhuang Yueting (庄越挺), Hu Dewen (胡德文), Du Junping (杜军平), Yang Qiang (杨强)
Members: Chen Songcan (陈松灿), Dong Zhenjiang (董振江), Fu Yili (付宜利), Gao Xinbo (高新波), Gong Maoguo (公茂果), Gu Tianlong (古天龙), He Qing (何清), Hu Qinghua (胡清华), Huang Heyan (黄河燕), Ji Xiangyang (季向阳), Jiang Tianzi (蒋田仔), Lin Haozhe (林浩哲), Liang Jiye (梁吉业), Liu Yiqun (刘奕群), Pan Gang (潘纲), Shi Guangming (石光明), Sun Maosong (孙茂松), Sun Changyin (孙长银), Tao Jianhua (陶建华), Wang Haifeng (王海峰), Wang Xizhao (王熙照), Wang Xuan (王轩), Wang Yunhong (王蕴红), Wu Fei (吴飞), Yu Jian (于剑), Yu Youcheng (余有成), Zhang Huaguang (张化光), Zhang Xuegong (张学工), Zhang Yi (章毅), Zhou Hongyi (周鸿祎), Zhou Jie (周杰), Zhu Liehuang (祝烈煌)
1.1 2050 Alan Turing 1950 Computing Machinery and Intelligence Embodied Intelligence 2080 AI Rodney Brooks Deep Learning Reinforcement Learning Optimus Large Language Models DeepMind RT RT-H Meta AI CortexBench VC-1 NVIDIA GPU AI Project GR00T Jetson Thor Isaac

1.2 2080

1.3 1-1 Sim-to-Real Gap [1–6] 3D Gaussian Splatting [7] residual policy [8–10] [11–13] [14–16]

2.1 RL UC Berkeley NoMaD [17] Meta NWM [18] VLA Vision-Language-Action DeepMind RoboCat [19] Stanford HumanPlus [20] [21,22] [23] UC Berkeley AMP [24] HugWBC [25] Amazon Robotics [26] RHINO [27]

2.2 [28] [29] Early Fusion PointFusion [30] Late Fusion CLOCs [31] 3D Intermediate Fusion BEVFusion [32] [33] [34] [35] MP5 [35] Minecraft [36] [37,38] [39] [40] [41] [42]

2.3 AI PDDL 1 2 SayCan [43] affordance 3 ReAct [44] Text2Motion [45] VLP [46] REFLECT LLM Code-as-Policies [47] API RoboCodeX [48] VoxPoser [49] OmniManip [50] ReKep [51] VoxPoser OmniManip ReKep PaLM-E [50] EmbodiedGPT [52] Ego4D [53] EgoCOT [52] RT-1 [54] RT-2 [55] RT-X [56] π0 [57]

2.4 VoxPoser [49] OmniManip [50] ReKep [51] Vision-Language-Action Model VLA 2-1 VLA
1 VLM + Visual-Language Model, VLM 2 VGM + Video-Generation Model, VGM 3 VLM+Latent+Action (Latent Action Tokens) 2024-2025 VLM + 2024 Physical Intelligence π0 VLM+ RDT 2025 Figure VLA VLM VGM + VLM+VGM + GR-2 + ATM FLIP VLM+Latent+Action 2025 Vision-Language-Latent-Action (ViLLA) VLA ViLLA (Latent Action Tokens) SOTA ViLLA VLM+MoE VLM MoE Latent Planner MoE VLM Latent Planner Action Expert

2.5 [58] [59] [60] [61] [62] [63] [59] [59] [64] VLM LLM VLM [65] [66] LLM [67] LLM

2.6 2001 [68] [69] [70] DialFRED [69] [71] 2-2 RT2 [72] [73] Long [74] 2-3

2.7 2-4 2-5

2.8 LLM [75] VLM [76–79] AGI [80] 2-6 [43,81,82] [83] Ha [84] LeCun [80] [85,86] Transformer (ViT) [87,88] RoboCraft [89] PointNet [90,91] [92] LLMs [75,93,94] LLMs [95–98] BC-Z [99] Text2Motion [100] ReasonedExplore [101] Not Train Dragon [102] LLM MORL [103] Trajectron++ [104] [105] Transformer [106] VIPER [107] Transformer Genie [108] 2-7 GR-2 [109,110] UniPi [111] RoboDreamer [112] VPDD [113] ReflectVLM [114]

2.9 2-8

2.9.1 SAM-6D [115] S