
From Language to Action: A Review of Large Language Models as Autonomous Agents and Tool Users

2025-08-24 | Daffodil International University & United International University & Charles Darwin University

AI Summary

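This review comprehensively examines recent progress on large language models (LLMs) as autonomous agents and tool users. It finds that LLMs exhibit agent-like capabilities in reasoning, planning, and memory, and that prompting, fine-tuning, and memory-augmentation techniques further improve their autonomous performance. The review analyzes how LLMs interact with external tools and the limitations of current evaluation methods and benchmarks, and it highlights future research directions, including verifiable reasoning, self-improvement, infrastructure bottlenecks, multi-agent communication, context-sensitive collaboration, insufficient personalization, vulnerability to adversarial triggers, limited interpretability, and incomplete evaluation frameworks. It notes that proprietary models such as GPT-4 and open-source models such as LLaMA are widely used in LLM agent research, and that tool integration is the key mechanism enabling LLM agents to act autonomously. Single-agent systems prioritize autonomy and introspective decision-making, whereas multi-agent systems focus on coordination, role assignment, and collaborative planning. Evaluation is shifting from static, accuracy-based benchmarks to dynamic, process-oriented methods that better capture the multifaceted nature of agent behavior. Future research should focus on making agent reasoning more transparent and verifiable and on developing reliable self-improvement methods, in order to build agents that are trustworthy, resilient, and aligned with domain-specific requirements.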

Sadia Sultana Chowa1, Riasad Alvi2, Subhey Sadi Rahman2, Md Abdur Rahman2, Mohaimenul Azam Khan Raiaan2, Md Rafiqul Islam3, Mukhtar Hussain3, Sami Azam

1 Department of Computer Science and Engineering, Daffodil International University, Dhaka-1341, Bangladesh
2 Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh
3 Faculty of Science and Technology, Charles Darwin University, Casuarina, NT 0909, Australia
* Corresponding Author: sami.azam@cdu.edu.au

arXiv:2508.17281v1 [cs.CL] 24 Aug 2025

Abstract

The pursuit of human-level artificial intelligence (AI) has significantly advanced the development of autonomous agents and Large Language Models (LLMs). LLMs are now widely utilized as decision-making agents for their ability to interpret instructions, manage sequential tasks, and adapt through feedback. This review examines recent developments in employing LLMs as autonomous agents and tool users and comprises seven research questions. We only used the papers published between 2023 and 2025 in conferences of the A* and A rank and Q1 journals. A structured analysis of the LLM agents' architectural design principles, dividing their applications into single-agent and multi-agent systems, and strategies for integrating external tools is presented. In addition, the cognitive mechanisms of LLM, including reasoning, planning, and memory, and the impact of prompting methods and fine-tuning procedures on agent performance are also investigated. Furthermore, we evaluated current benchmarks and assessment protocols and have provided an analysis of 68 publicly available datasets to assess the performance of LLM-based agents. Finally, we have discussed ten future research directions to overcome these gaps.

Keywords: Large Language Models; Multi-Agents; Reasoning; Evaluation; Generative AI

1 Introduction

Large language models (LLMs) have become central in artificial intelligence (AI) research due to their strong human-like ability to understand, generate, and reason in natural language [1, 2]. LLMs were used primarily as tools to serve as text generators or understanding modules within a larger application. However, further techniques such as few-shot prompting [3], chain-of-thought (CoT) prompting [4], and self-ask prompting [5] demonstrated how the potential of LLMs could be improved through smart prompting and input pattern design. Beyond conventional natural language processing (NLP) tasks, LLMs are now serving as autonomous agents and intelligent tools. They are embedded into increasingly complex workflows where they perform planning, decision making, and tool interaction in various real-world applications, including research assistance [6], software development [7], drug discoveries [8], multi-robot systems [9], clinical support [10], game simulation [11] and scientific simulations [12].

LLMs as agents can observe their environment, make decisions, and take actions. Within this paradigm, single-agent LLM systems have demonstrated promising performance in decision-making tasks. Single-agent systems such as Reflex- [...] can operate in decision loops that involve planning, memory, and tool use. However, they often struggle in dynamic environments that require simultaneous context tracking, external memory integration, and adaptive tool usage [16, 17]. To address these limitations, the concept of multi-agent LLM [...] single agent can manage. Through structured communication, reflective reasoning, and explicit role assignments in simulated [...]

Moreover, LLMs as agents and tools now demonstrate massive potential in AI, and the demand to understand their evolving roles has intensified. Therefore, a systematic review of its recent advancement, a discussion of the remaining gaps, and a research direction for future advancements are essential to advance the field. With this focus, this survey provides a comprehensive and structured overview of current capabilities and system designs. We investigate the architectural foundations that enable agent-like behavior in LLMs, analyze how they interact with external tools, discuss the key limitations of current approaches, and highlight the remaining open [...]

Our key contributions are summarized as follows.

• We conduct a comprehensive review of recent advancements in using LLMs as agents and [...]
• [...] benchmarks for LLM agents and tool users.
• We identify fundamental challenges, including alignment, reliability, and generalization, and outline promising re- [...]

The rest of the review is organized as follows. Section 2 presents related works, identifying gaps in existing surveys and situating our contribution. Section 3 outlines the methodology, including research questions, selection criteria, and search strategies. Section 4 explores the baseline LLMs used in agentic LLM systems. Section 5 focuses on tool integration in LLM workflows. Section 6 reviews the frameworks for constructing single-agent and multi-agent systems. Section 7 investigates the reasoning, planning, and memory capabilities [...]
