
AI inference in practice: choosing the right edge

Information Technology | 2025-06-09 | GSMA

INSIGHT SPOTLIGHT

Inferencing is the real-time decision-making of an AI model. As AI adoption grows, inferencing will accelerate, raising questions about workload processing and business benefits. As outlined in Distributed inference: how AI can turbocharge the edge, enterprises need to know that where inferencing

This research forms part of a series illustrating the impact of AI inference, with each report focusing on a distinct edge location and featuring an example company. This analysis examines how running AI workloads on the edge can deliver improved outcomes, with Aible the featured company. Aible is

Analysis

For example, in the case of a retailer using security cameras, video footage often needs to be processed in real time to identify shoplifters before they leave a store. While basic motion detection can happen on-device, higher-level inferencing typically requires

Resilience, latency, privacy: the device edge sweet spot

AI model innovation, including the extension from large language models (LLMs) to small language models (SLMs), continues at a rapid pace. At NVIDIA GTC 2025, for example, Aible demonstrated that a Llama 3.1 8-billion parameter model with $5 of post-training could perform better than LLMs on specific tasks. Improved performance from smaller models enables more AI workloads to be run on the device edge. This is particularly important for enterprises operating in remote or harsh environments, which often

Running AI inference on-premises or at the network edge for such use cases can also reduce costs. This is particularly important for larger retailers, which often deploy thousands of security cameras across their stores. NVIDIA research found that in one scenario involving the deployment of 2,000 4K cameras, edge-based

Running smaller, more optimised models locally on the device can also deliver faster inference.
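The bandwidth dimension of the cost argument can be illustrated with a rough uplink calculation for the 2,000-camera scenario. This is a minimal sketch: the per-camera bitrates below are illustrative assumptions of the author of this sketch, not figures from the report or from NVIDIA's research.

```python
# Back-of-envelope uplink comparison: streaming raw video to the cloud
# for inferencing vs. sending only event metadata after edge inferencing.
# Bitrates are assumed values for illustration only.

CAMERAS = 2000
VIDEO_4K_MBPS = 15.0        # assumed per-camera 4K H.264 stream bitrate (Mbps)
EVENT_METADATA_KBPS = 10.0  # assumed per-camera metadata rate after edge inference

cloud_uplink_gbps = CAMERAS * VIDEO_4K_MBPS / 1000    # aggregate, in Gbps
edge_uplink_mbps = CAMERAS * EVENT_METADATA_KBPS / 1000  # aggregate, in Mbps

print(f"Cloud inferencing uplink: ~{cloud_uplink_gbps:.0f} Gbps")
print(f"Edge inferencing uplink:  ~{edge_uplink_mbps:.0f} Mbps")
```

Under these assumptions the edge option shrinks the sustained uplink requirement by roughly three orders of magnitude, which is why the economics matter most for large multi-site camera fleets.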
The main driver of faster inference is the ability to process AI models more quickly on local hardware compared to the time it takes to run inference in the cloud. Additional time savings come from eliminating the need to send

To help businesses implement use cases such as theft detection in retail, Aible’s AI Intern Agent solution offers a library of pre-built AI templates tailored to specific tasks, such as video analytics and image classification. Enterprises can then adapt the templates using their own data and user feedback to build task-specific

The other key benefit of on-device AI is data privacy. By performing inferencing on the device, businesses can ensure data stays in the territory where it was generated, avoiding the need to transfer it across international borders. This is especially important for

Figure: Comparison of AI inferencing options for retail security cameras. Scores based on typical performance characteristics for security camera deployments at a large retailer. 1 = least favourable; 5 = most favourable. Source: GSMA Intelligence.

Why one size doesn’t fit all

The device edge is well-suited to AI workloads in locations with poor connectivity or those with strict latency or privacy requirements. However, there are limitations on the size and complexity of models that can be deployed at the device edge. As a result, the on-premises edge or network edge offer a more

Implications

Mobile operators

• Data traffic growth and costs – As more AI workloads shift to the device edge, operators can expect changes in both the volume and pattern of data traffic. More upstream or uplink data traffic will be generated overall, with the opportunity to process it at the edge. For inferencing tasks that no longer require operators

• Telco edge deployments – While on-device AI is the best fit for some AI inferencing workloads, it does not eliminate the need for telco edge deployments. The network edge remains well suited to workloads that require low latency but exceed the processing and power capabilities of the device itself. Examples include real-time

• New AI use cases within telco RAN – As AI inferencing moves to the edge, there are new opportunities to use RAN compute assets to support AI workloads. This opens up new monetisation routes for operators, while improving asset utilisation by using a common shared infrastructure for both AI and RAN workloads.

Device manufacturers

• Chipset and processing capabilities – Supporting AI inferencing at the device edge requires increasingly powerful chipsets, capable of handling real-time processing for tasks such as image recognition and anomaly detection. However, device makers must balance these performance

• AI as a differentiator – As demand grows for more intelligent, autonomous solutions, on-device AI enables device makers to differentiate their hardware through offering advanced features and capabilities. In the drones market, for example, companies such as Zipline are using NVIDIA’s Jetson module to enable real-time navigation,

Enterprises

• Workload placement – As AI deployments become distributed across device, edge and
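The workload-placement decision the report describes can be sketched as a simple heuristic. This is an illustrative simplification, not a methodology from the report: the function name, thresholds (50 ms latency, 8-billion-parameter device ceiling) and the four-way location split are assumptions chosen to mirror the trade-offs discussed above.

```python
# Simplified workload-placement heuristic for AI inferencing.
# All thresholds are illustrative assumptions, not report figures.

def place_workload(latency_ms: float, model_params_b: float,
                   data_sovereign: bool, connectivity_ok: bool) -> str:
    """Pick an inferencing location for one AI workload.

    latency_ms      -- maximum tolerable round-trip latency
    model_params_b  -- model size in billions of parameters
    data_sovereign  -- data must stay in the territory where it is generated
    connectivity_ok -- reliable backhaul to a remote site exists
    """
    if data_sovereign or not connectivity_ok:
        # Privacy or resilience forces processing close to the data source.
        return "device edge" if model_params_b <= 8 else "on-premises edge"
    if latency_ms < 50:
        # Tight latency budget; fall back to the network edge when the
        # model exceeds what the device itself can run.
        return "device edge" if model_params_b <= 8 else "network edge"
    return "cloud"

# A small model with a tight latency budget lands on the device edge.
print(place_workload(latency_ms=30, model_params_b=8,
                     data_sovereign=False, connectivity_ok=True))
```

The point of the sketch is the ordering of the checks: privacy and connectivity constraints dominate, latency comes next, and the cloud remains the default only when no constraint binds, which matches the report's "one size doesn't fit all" conclusion.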