Zhihan Ma, CFA | +1 917 344 8303 | zhihan.ma@bernsteinsg.com
Alexia Howard | +1 917 344 8453 | alexia.howard@bernsteinsg.com
Aneesha Sherman | +1 917 344 8457 | aneesha.sherman@bernsteinsg.com
Callum Elliott, CFA, ACA | +44 20 7676 7183 | callum.elliott@bernsteinsg.com
Danilo Gargiulo | +1 917 344 8475 | danilo.gargiulo@bernsteinsg.com
Euan McLeish | +81 3 5962 9611 | euan.mcleish@bernsteinsg.com
Ian Moore | +1 917 344 8434 | ian.moore@bernsteinsg.com
Jignanshu Gor | +91 226 842 1494 | jignanshu.gor@bernsteinsg.com
Luca Solca | +41 582 723 126 | luca.solca@bernsteinsg.com
Melinda Hu | +852 2123 2643 | melinda.hu@bernsteinsg.com

confined to their own assortment (1P and/or 3P). Bridging this gap, from either end, will be key as the technology evolves.

Agentic shopping burst onto the US Retail scene last October after Walmart announced its partnership with ChatGPT.¹ More than six months later, how agentic is shopping today?

BUYING ONE ITEM - REQUIRES INTEGRATION!

We tested five different tools. Three are foundational AI models - ChatGPT, Gemini, and Claude - with broad capabilities. The other two, Amazon’s Alexa and Walmart’s Sparky, are retail-native AI tools built directly on top of retail ecosystems. The goal was to see how each approaches the shopping funnel: from identifying the right product, to validating pricing, and actually enabling a purchase.

We evaluated how effective each AI assistant is through each step of the basic e-commerce process for one item, from product discovery to final transaction execution:

"I want to make scrambled eggs, but I’ve run out of milk. Could you please find me a 64 fl oz carton of organic whole milk? How soon can it be delivered to me in ZIP 22201 (Arlington, VA)?"

Exhibit 1 shows each AI tool's

More importantly, we assess where each tool breaks down and how far we are from a true end-to-end agentic e-commerce experience. While human interaction is still needed at every stage, payment is where the whole experience unravels.
Even in the simplest case

EXHIBIT 1: Amazon Alexa and Walmart Sparky currently maintain an advantage in real-time pricing accuracy, total

Underpinning that is a tension between general and retailer-integrated systems. The foundational AI models feel powerful because they can roam freely, comparing products, surfacing deals, and stitching together options across a fragmented retail landscape. However, they are doing so without direct access to the structured catalog data, real-time pricing, inventory, or delivery

Definitions - Identifies Category: Ability to correctly determine the relevant product category; Identifies SKU: Ability to accurately identify the specific item, including its brand, size, and variant; Identifies Pricing: Ability to correctly determine the exact price of the selected item; Estimate Full Cost: Ability to accurately calculate the total

EXHIBIT 3: With the initial prompt, Gemini suggests relevant product and delivery options but does not identify items at

There is currently no agentic shopping tool capable of buying a product in an unsupervised fashion. Human engagement is needed

It also appears to rely on online scraping to generate its responses.

ChatGPT, Claude, and Gemini are primarily research and recommendation tools. Because they lack direct access to retailers’ assortment (unless shared by retailers directly), they seem to rely on workarounds such as web scraping or third-party articles (Exhibit 2). Therefore, they sometimes struggle with SKU identification and pricing accuracy. In addition, they are incapable of handling fulfillment and payment (insofar as they lack a connection). As

In contrast, Amazon and Walmart operate fundamentally differently. Their AI assistants have access to real-time inventory data. This allows them to manage the entire purchase funnel within their ecosystems, from product discovery through payment. Amazon’s implementation illustrates both the potential and the limitations of this approach.
Alexa can add non-perishable items to a cart (Exhibit

Source: Google Gemini, Bernstein analysis

EXHIBIT 4: This is not due to a lack of capability. When prompted further, such as asking for the price of the milk,

EXHIBIT 2: ChatGPT can identify the SKU and estimate an approximate price range, but appears to rely on web

Whole Foods link, in exhibit, directs to an article on Whole Foods delivery policy. Source: Google Gemini, Bernstein analysis

EXHIBIT 7: Amazon provides product recommendations at the SKU level, presenting users with a range of options to

EXHIBIT 5: Similar to Gemini, Claude initially presents delivery options along with item recommendations and

Source: Claude (Anthropic), Bernstein analysis

EXHIBIT 6: Through its partnership with Uber Eats, Claude enables users to complete purchases via Uber Eats directly

EXHIBIT 8: Alexa is unable to do so for Amazon Fresh and Whole Foods products; it directs the user to an external link.

Source: Claude (Anthropic), Bernstein analysis

EXHIBIT 11: Foundational LLMs are better at understanding context while retail-integrated models provide SKU-le