
NVIDIA Blackwell Ultra 数据手册

2025-10-24 · NVIDIA

Built for the Age of AI Reasoning

Designed for AI Reasoning Performance

AI has evolved around three fundamental scaling dimensions: pretraining, post-training, and inference-time scaling, also known as long thinking or reasoning. This third dimension is critical for enabling agentic AI, where models must dynamically reason through complex queries during inference. Unlike traditional one-shot inference, test-time scaling can demand up to 100x more compute, as models evaluate multiple potential responses before selecting the most accurate outcome.

Key Offerings
> NVIDIA GB300 NVL72
> NVIDIA HGX B300

The NVIDIA Blackwell Ultra GPU is designed for this new era of AI reasoning. It delivers up to 20 petaFLOPS of FP4 sparse inference performance, offering exceptional efficiency for large-scale deployments. With 279 GB of HBM3E memory, it supports expansive KV caching and long-context inference without offloading. Blackwell Ultra also features 800 Gbps NVIDIA® ConnectX®-8 networking, doubling interconnect bandwidth compared to NVIDIA Blackwell to enable seamless scaling across data centers. A newly optimized attention engine delivers 2.5x faster attention performance compared to NVIDIA Hopper™, significantly accelerating throughput for reasoning.

NVIDIA GB300 NVL72: Powering the New Era of AI Reasoning

The NVIDIA GB300 NVL72 features a fully liquid-cooled, rack-scale architecture that integrates 72 NVIDIA Blackwell Ultra GPUs and 36 Arm®-based NVIDIA Grace™ CPUs into a single platform, purpose-built for test-time scaling inference, or AI reasoning, tasks. AI factories accelerated by the GB300 NVL72, leveraging NVIDIA Quantum-X800 InfiniBand or Spectrum-X™ Ethernet, ConnectX-8 SuperNIC™, and NVIDIA Mission Control™ management, deliver up to a 50x overall increase in AI factory output performance compared to Hopper-based platforms.
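To see why test-time scaling can multiply inference compute roughly N-fold, consider a minimal "best-of-N" sketch: the model samples N candidate responses and a scoring step keeps the best one. This is an illustrative sketch only, not NVIDIA code; `generate` and `score` are hypothetical stand-ins for a model's sampling pass and a verifier or reward model.

```python
# Illustrative sketch: why test-time scaling demands far more compute than
# one-shot inference. Each candidate costs one full forward pass, so
# best-of-N costs roughly N times a single response.
import random

def generate(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for one sampled model response.
    random.seed(seed)
    return f"answer-{random.randint(0, 9)}"

def score(prompt: str, answer: str) -> float:
    # Hypothetical stand-in for a verifier/reward model.
    return float(answer.split("-")[1])

def best_of_n(prompt: str, n: int) -> tuple[str, int]:
    candidates = [generate(prompt, seed=i) for i in range(n)]
    best = max(candidates, key=lambda a: score(prompt, a))
    return best, n  # n forward passes consumed, ~n x one-shot compute

answer, passes = best_of_n("What is 2+2?", n=100)  # ~100x one-shot compute
```

Reasoning models interleave this kind of exploration with long chains of thought, which is why the datasheet frames up to 100x more compute as the cost of evaluating multiple potential responses.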
End-to-End AI Acceleration at Rack Scale

With 279 GB of HBM3E memory per Blackwell Ultra chip and up to 37 TB of high-speed memory per rack, coupled with 1.44 exaFLOPS of compute and a 72-GPU unified NVIDIA NVLink™ domain, Blackwell Ultra provides unprecedented speed and scale to support larger models while giving rise to breakthroughs in AI. Combined with CUDA-X™ libraries for accelerated computing, NVIDIA accelerates the entire hardware and software computing stack.

Increase AI Factory Output Performance by 50x

The frontier curve illustrates the key parameters that determine AI factory token revenue output. The vertical axis represents GPU tokens-per-second (TPS) throughput in a one-megawatt (MW) AI factory, while the horizontal axis quantifies user interactivity and responsiveness as TPS for a single user. At the optimal intersection of throughput and responsiveness, GB300 NVL72 yields a 50x overall increase in AI factory output performance compared to the Hopper architecture for maximum token revenue.

NVIDIA GB300 NVL72 Key Features
> 36 NVIDIA Grace CPUs
> 72 NVIDIA Blackwell Ultra GPUs
> 17 TB of LPDDR5X memory with error-correction code (ECC)
> 20 TB of HBM3E
> Up to 37 TB of fast-access memory
> NVLink domain: 130 terabytes per second (TB/s) of low-latency GPU communication

Accelerating Real-Time Video Generation by 30x

GB300 NVL72 introduces cutting-edge capabilities for diffusion-based video generation models. A single five-second video generation sequence processes 4 million tokens, requiring nearly 90 seconds to generate on NVIDIA Hopper GPUs. The Blackwell Ultra platform enables real-time video generation from world foundation models, such as NVIDIA Cosmos™, providing a 30x performance improvement versus Hopper. This allows the creation of customized, photo-realistic, temporally and spatially stable video for physical AI applications.
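The per-rack figures above follow directly from the per-GPU numbers quoted in this section; a quick arithmetic sketch (not from the datasheet itself) shows how they combine, and why a 30x speedup makes five-second video generation real time.

```python
# Sanity-check arithmetic using the figures quoted in this section.
GPUS_PER_RACK = 72
HBM3E_PER_GPU_GB = 279           # per Blackwell Ultra chip
LPDDR5X_PER_RACK_TB = 17         # Grace CPU memory, with ECC

hbm_per_rack_tb = GPUS_PER_RACK * HBM3E_PER_GPU_GB / 1000  # ~20 TB of HBM3E
fast_memory_tb = hbm_per_rack_tb + LPDDR5X_PER_RACK_TB     # ~37 TB fast-access

# Video generation: ~90 s on Hopper for a 5 s clip, 30x faster on Blackwell Ultra.
hopper_seconds = 90
blackwell_ultra_seconds = hopper_seconds / 30  # 3 s, i.e. faster than real time

print(round(hbm_per_rack_tb, 1), round(fast_memory_tb, 1), blackwell_ultra_seconds)
```

Because the 3-second generation time is shorter than the 5-second clip it produces, the platform can sustain continuous, real-time video streams for physical AI applications.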
NVIDIA HGX B300: Purpose-Built for AI Reasoning

Key Features
> 8 NVIDIA Blackwell Ultra GPUs
> Over 2 TB of HBM3E memory
> 1,800 GB/s NVLink between GPUs via the NVSwitch™ chip
> 2.6x faster training performance (vs. H100)

NVIDIA HGX™ B300 is built for the age of AI reasoning with enhanced compute and increased memory. Featuring 7x more AI compute than the Hopper platform, over 2 TB of HBM3E memory, and high-performance networking integration with NVIDIA ConnectX-8 SuperNICs, HGX B300 delivers breakthrough performance on the most complex workloads, from training, agentic systems, and reasoning to real-time video generation, for every data center.

Boost Revenue With HGX B300 AI Factory Output

The frontier curve illustrates the key parameters that determine AI factory token revenue output. The vertical axis represents GPU tokens-per-second (TPS) throughput in a one-megawatt (MW) AI factory, while the horizontal axis quantifies user interactivity and responsiveness as TPS for a single user. At the optimal intersection of throughput and responsiveness, HGX B300 yields a 30x overall increase in AI factory output performance compared to the Hopper architecture for maximum token revenue.

Next-Level AI Training Performance

The HGX B300 platform delivers up to 2.6x higher training performance for large language models such as DeepSeek-R1. Wi
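The HGX B300 spec list above is consistent with the per-GPU figures quoted earlier in this datasheet; the sketch below (not NVIDIA code) checks the memory total and illustrates what a 2.6x training speedup means for wall-clock time, using a hypothetical 26-day baseline run that is purely illustrative.

```python
# Consistency check for the HGX B300 figures in this section.
GPUS = 8
HBM3E_PER_GPU_GB = 279                 # per Blackwell Ultra GPU, from this datasheet

hbm_total_gb = GPUS * HBM3E_PER_GPU_GB  # 2,232 GB -> matches "over 2 TB of HBM3E"

# "2.6x faster training performance (vs. H100)": a hypothetical 26-day
# H100 training run would finish in about 10 days.
h100_days = 26.0                        # hypothetical baseline, for illustration
hgx_b300_days = h100_days / 2.6

print(hbm_total_gb, round(hgx_b300_days, 1))
```

The same per-GPU HBM3E capacity thus explains both the rack-level GB300 NVL72 total and the 8-GPU HGX B300 total.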