
Executive Summary

This report examines China’s embrace of embodied AI, meaning artificial intelligence integrated with physical systems (robots, drones, vehicles, etc.), as a critical pathway toward artificial general intelligence (AGI). In the United States and Europe, large language models (LLMs) and their multimodal variants are regarded by many AI scientists and major AI companies as the most promising path to AGI, despite known issues with abstraction and reasoning. By contrast, in China there is a broader vision of how AGI can be achieved, most recently expressed in a nationwide move toward AI embodiment: namely, intelligence developed through interaction between body, brain, and environment, in both physical and virtual forms. This trend toward embodied AI is backed by policy support at the national and local government levels, which has led to large embodied AI innovation centers linked to top universities and tech firms being established in coastal cities and provinces. The upshot is that China is on a path to accomplish two goals simultaneously: enriching the nation by integrating AI into the economy and achieving AGI that is more aligned with the totality of human expression. The report recommends that the United States and its allies ramp up their monitoring of China’s AI progress, benchmark its claims, and consider broader approaches to AGI beyond scaling up LLMs.

Table of Contents

Executive Summary
Introduction: AI Embodiment
Embodiment in Chinese AI Policy
China’s Concept of Embodiment
Major Chinese Research Centers
The Academic Dimension
Recommendations
Authors
Acknowledgements
Appendices
    Embodied AI Conference Panels
    Selected Chinese Embodied AI Companies
Endnotes

Introduction: AI Embodiment

Embedding or “embodying” (具身) artificial intelligence in agents that act in the physical world (or in digital simulations) is increasingly seen
by Western¹ and Chinese² AI scientists as a promising successor to today’s disembodied AI programs, which in essence are abstractions of abstractions. The logic is both simple and compelling.

Today’s large language models and multimodal LLMs, such as ChatGPT, Google Gemini, and Anthropic’s Claude, ingest vast quantities of data (text, images, video, and audio) and analyze their relationships as a basis for answering queries, summarizing information, and translating languages. The success of these models in the uses for which they are intended has encouraged their developers, and much of the AI community, to believe that more data and greater processing power will lead to the “holy grail” of artificial general intelligence (AGI), a hypothetical state in which the AI has the same cognitive abilities as a human. Other specialists argue that no number of enhancements to these statistical models will achieve human-level intelligence, especially if understood fully to include social, affective, and motivational intelligences.³

Meanwhile, the shortcomings of LLMs become evident as their popularity grows. They tend to “hallucinate” (provide false information or nonsense that appears plausible) while showing limited reasoning ability and severe deficits in generalizing, modeling time and space, managing ambiguous expressions, and grasping nuance.⁴ These issues persist alongside the costs and environmental impacts associated with hosting and training LLMs.

Part of the problem is that these models are based on depictions of the world. They are derivative representations of reality, an imperfect amalgam of how people have imperfectly characterized things. While powerful, t