The Future of Agentic AI

Artefact is a global leader in consulting services, specialized in data transformation and data & digital marketing, from strategy to the deployment of AI solutions.

Last February, we published "The Future of Work with AI", our first study on Agentic AI. We found that although AI agents will replace humans on tedious and repetitive tasks, a new type of work will appear: Agentic Supervision. During the industrial revolution, machines replaced humans on manual tasks, but new jobs appeared, such as machine purchasing, operational supervision and maintenance. With Agentic AI, cognitive jobs will likewise be replaced by other, higher-level ones.

To gather the current state of Agentic Supervision, we interviewed 14 enterprises and 5 Artefact Agentic Product Managers & Engineers. We also contacted key Agentic Supervision providers, including major Data & AI platforms with years of software supervision experience.

Organizations lagging in these operational domains will have to bridge any gaps in these areas while setting up their Agentic governance.

The second major challenge identified by our interviewees is the need to strengthen their AI supervision tooling. Many are currently relying on existing RPA and Dev/Data/MLOps tools, or experimenting with custom-built solutions as they search for more sustainable, long-term options.
The abundance of early-stage tools, and the need to envision a cohesive, end-to-end supervision system that integrates multiple components, prompted us to explore the technological dimensions of agentic supervision in greater depth. As with any TechOps framework, AgentOps supervision involves three fundamental stages: (1) Observe, (2) Evaluate, (3) Supervise.

The first insight we found is that while Agentic Supervision extends the principles established in DevOps (software operations), DataOps (data operations), and MLOps (Machine Learning operations), it dramatically increases the demand for robust governance to keep AI Agents aligned and under control. Indeed, with "software that starts to think", unseen risks are emerging, such as hallucinations, reasoning errors, inappropriate tone, or intellectual property infringement.

This markedly greater need for governance is the challenge that may define the emerging operational paradigm of "AgentOps". Interestingly, AgentOps will need to build on these earlier Ops disciplines.

"Supervision should not be an afterthought, it must be designed in from the start."

Our research into agentic supervision tools revealed three key insights. First, there is currently no all-in-one solution available. Major cloud providers like Google and Microsoft are actively developing and releasing supervision tools and frameworks aimed at covering the full spectrum of supervision needs for teams building agents on platforms such as Vertex AI (Google) and Copilot Studio (Microsoft). Second, agent supervision falls into two categories: proactive and reactive. Proactive supervision is applied during development to test agents against defined scenarios or, in production, to continuously guard against emerging threats, particularly in the area of security, or to collect aggregated performance data. Its goal is to improve agent behavior over time. Tools like LangSmith and LangChain are increasingly used to structure and streamline the observation of agent behavior.
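The "Observe" stage described above can be illustrated with a minimal trace collector for agent runs. This is a plain-Python sketch under our own assumptions: the `AgentTracer` class, its methods, and the event fields are hypothetical illustrations of structured observation, not the API of LangSmith or LangChain.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class TraceEvent:
    # One structured record per agent step: model calls, tool calls, outputs.
    step: str
    payload: dict
    timestamp: float = field(default_factory=time.time)

@dataclass
class AgentTracer:
    # Hypothetical in-memory collector; a real deployment would stream these
    # events to a supervision platform instead of keeping them in a list.
    events: list = field(default_factory=list)

    def record(self, step: str, **payload) -> None:
        self.events.append(TraceEvent(step, payload))

    def export(self) -> str:
        # Serialize the run so the Evaluate and Supervise stages can replay it.
        return json.dumps([asdict(e) for e in self.events], indent=2)

tracer = AgentTracer()
tracer.record("llm_call", prompt="Summarize the report", model="some-llm")
tracer.record("tool_call", tool="search", query="agentic supervision")
tracer.record("final_answer", text="Agentic supervision is ...")
print(tracer.export())
```

The design point is simply that observation data must be structured and complete enough, at capture time, to support whatever evaluations come later.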
Another major challenge lies in the opacity of LLM reasoning, which must be deliberately countered.

Evaluation in agentic AI is significantly more complex than in traditional software or data quality assessments. Where deterministic tests based on observability queries are sufficient in classical DevOps and DataOps, agentic systems often require AI to evaluate AI. This has led to the rise of LLM-as-a-judge techniques: a counterintuitive approach where one model assesses the output of another. While this raises concerns (why trust flawed AI to judge flawed AI?), studies show it often produces more consistent and scalable results than human reviewers.

Finally, supervision and mitigation face challenges around prioritization. With a growing number of metrics and alerts, teams can quickly become overwhelmed. Standardized frameworks for alerting and metric management are a clear need.

Each phase of the agentic supervision cycle (observe, evaluate, and supervise) presents its own set of challenges. Observability first requires anticipating what data to capture, which depends heavily on having a clearly defined evaluation and supervision strategy. Without this foresight, teams risk either collecting too little information or being overwhelmed.

Only a handful of organizations have successfully established effective governance and standards for agentic AI.

"Agentic Supervision is the Future of Work with AI!"

Organizations with mature supervision frameworks have had a head start, benefiting from strong foundations and a well-established culture of observability and supervision. We observed that leveraging existing software
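The LLM-as-a-judge pattern described above can be sketched as follows. This is an illustrative assumption-laden example: `call_judge_model` is a hypothetical stub standing in for a real LLM API call, and the rubric and 1-5 score scale are our own choices, not prescribed by any particular platform.

```python
import re

JUDGE_PROMPT = """You are an impartial evaluator.
Rate the answer below for factual accuracy and tone on a scale of 1-5.
Question: {question}
Answer: {answer}
Reply with a single line: SCORE: <n>"""

def call_judge_model(prompt: str) -> str:
    # Hypothetical stub: a real system would send the prompt to a judge LLM.
    # A canned response keeps this sketch runnable offline.
    return "SCORE: 4"

def judge(question: str, answer: str) -> int:
    # One model grades another model's output against a fixed rubric.
    reply = call_judge_model(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"SCORE:\s*(\d)", reply)
    if not match:
        raise ValueError(f"Unparseable judge reply: {reply!r}")
    return int(match.group(1))

score = judge("What is AgentOps?", "AgentOps extends DevOps to AI agents.")
print(score)  # the integer parsed from the judge's reply
```

In practice, a judge's own reliability is usually checked by spot-comparing its scores against a sample of human reviews, which is how the consistency claims cited above are measured.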
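The prioritization problem raised above, where a growing number of metrics and alerts overwhelms teams, can be illustrated with a simple severity-based triage sketch. The metric names and severity weights below are hypothetical examples, not a standard taxonomy.

```python
# Minimal alert-triage sketch: rank incoming supervision alerts by the
# severity weight of the metric that fired, so the most critical issues
# surface first. Weights are illustrative assumptions.
SEVERITY = {"hallucination_rate": 3, "latency_p95_ms": 1, "policy_violation": 5}

def triage(alerts: list[dict]) -> list[dict]:
    # Highest-severity metrics first; unknown metrics sink to the bottom.
    return sorted(alerts, key=lambda a: SEVERITY.get(a["metric"], 0), reverse=True)

alerts = [
    {"metric": "latency_p95_ms", "value": 2300},
    {"metric": "policy_violation", "value": 1},
    {"metric": "hallucination_rate", "value": 0.12},
]
for alert in triage(alerts):
    print(alert["metric"])
```

Even a scheme this simple encodes the key decision the text calls for: an agreed-upon ranking of which signals warrant immediate human attention.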