
TSMC x NVIDIA: Breaking Through the Thermal Wall: How Advanced Cooling Technologies Drive Future Computing

2025-10-05 TSMC & NVIDIA M***

Original article by SemiVision Research (TSMC, NVIDIA, Coherent, Jentech, AVC, Auras, Cooler Master, Mikros Technologies, Inspur, Invek, Ningbo Jingda, Fabric8Labs, Intel, xMEMS)
Oct 05, 2025

The development of AI chips is fundamentally driven by the pursuit of higher performance, but this also brings a critical challenge: thermal management. Effectively dissipating the enormous heat generated by rising power consumption has become a key design consideration for next-generation AI chip architectures.

On the system level, the integration of optical engines is widely viewed as the future solution to overcome high-bandwidth interconnect bottlenecks. However, optical signals are highly sensitive to temperature fluctuations: even slight deviations can lead to transmission loss and degraded performance. TSMC has adopted the Microring Modulator (MRM) approach, and the MRM itself has strict thermal operating requirements. As a result, companies like TSMC and NVIDIA are actively exploring next-generation cooling technologies to meet these thermal demands.

On the logic process side, TSMC's evolution from N3 → N2 → A16 involves more than geometric scaling. It represents an architecture transition: from FinFET, to GAA (Gate-All-Around), and finally to GAA combined with the Super Power Rail backside power delivery architecture in the A16 generation. This series of changes is aimed at achieving the optimal balance of PPA (Power, Performance, and Area), laying the foundation for high-efficiency AI computing.

However, from a packaging and materials perspective, new challenges emerge. Traditional cooling depends on TIM (Thermal Interface Materials), whose material selection is limited and whose thermal reliability is constrained to specific temperature ranges. To address this, TSMC has proposed an innovative Direct Liquid Cooling (DLC) solution.
By applying a backside copper pillar process, TSMC integrates microfluidic structures directly into the backside of the chip, allowing heat to be removed through direct liquid convection and dramatically increasing cooling efficiency.

Nevertheless, since no-TIM architectures are not yet fully mature, the industry currently relies on heat conduction through the top lid of the package for thermal management. This has led to the emergence of the Microchannel Lid (MCL) concept: micro-channels are etched directly into the internal surface of the package lid and combined with liquid cooling. MCL is expected to serve as a critical transitional solution before full backside liquid cooling becomes mainstream.

SemiVision believes that current AI chip logic process designs are almost exclusively carried out under high-density (HD) conditions. While this design approach significantly boosts computational performance per unit area, it also results in extremely high power density and heat flux, creating new thermal management bottlenecks.

A key question arises: Can the heat generated inside the AI chip be fully and evenly transferred to the package lid?

The internal thermal path of a chip involves multiple layers: silicon substrate, metal interconnects, micro-bumps, underfill, thermal interface materials (TIM), and more. Because of these layered thermal interfaces, heat cannot be transferred 100% efficiently to the lid, leading to localized "hotspots." This cumulative thermal resistance is one of the primary factors limiting the chip's maximum power output.

Improving Thermal Conduction Efficiency

The first key lies in shortening the thermal path and reducing interfacial thermal resistance at each layer, combined with introducing high thermal conductivity materials, such as:

Diamond thin films / CVD diamond
SiC substrates
Cu–diamond composites

These materials can dramatically increase the overall heat transfer coefficient.
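
The idea of cumulative thermal resistance along a layered path can be illustrated with a simple one-dimensional series model, where each slab contributes R = t / (k · A). The sketch below is only an illustration, not SemiVision's or TSMC's analysis: the layer thicknesses, the cross-section area, the power level, and the TIM and copper conductivities are all assumed values, and the names `Layer`, `layer_resistance`, and `stack_resistance` are hypothetical. Only the silicon (~150 W/m·K) and diamond (1000 W/m·K, the low end of the quoted range) conductivities come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    thickness_m: float         # slab thickness, metres
    conductivity_w_mk: float   # thermal conductivity, W/(m*K)


def layer_resistance(layer: Layer, area_m2: float) -> float:
    """1-D conduction resistance of a uniform slab: R = t / (k * A), in K/W."""
    return layer.thickness_m / (layer.conductivity_w_mk * area_m2)


def stack_resistance(layers: list[Layer], area_m2: float) -> float:
    """Layers in series simply add their resistances."""
    return sum(layer_resistance(layer, area_m2) for layer in layers)


# All geometry, the power level, and the TIM and copper conductivities are
# illustrative assumptions; only the silicon (~150) and diamond (1000, the low
# end of the quoted range) conductivities are taken from the text.
AREA_M2 = 8e-4      # hypothetical 8 cm^2 heat-flow cross-section
POWER_W = 1000.0    # hypothetical chip power

baseline = [
    Layer("silicon die", 500e-6, 150.0),
    Layer("TIM", 50e-6, 5.0),
    Layer("copper lid", 2e-3, 400.0),
]
# Same stack with the lid material swapped for diamond (assumed k = 1000).
diamond_lid = baseline[:2] + [Layer("diamond lid", 2e-3, 1000.0)]

for label, stack in (("baseline", baseline), ("diamond lid", diamond_lid)):
    r_total = stack_resistance(stack, AREA_M2)
    print(f"{label}: R = {r_total * 1e3:.2f} mK/W, "
          f"dT = {POWER_W * r_total:.1f} K")
    for layer in stack:
        share = layer_resistance(layer, AREA_M2) / r_total
        print(f"  {layer.name}: {share:.0%} of total")
```

With these assumed numbers, the thin TIM layer contributes the largest single share of the total resistance, so swapping only the lid material has a limited effect. This is consistent with the emphasis on reducing interfacial resistance at each layer. The model ignores lateral heat spreading, contact resistances, and non-uniform power maps, so it says nothing about hotspots.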
Expanding Effective Heat Dissipation Area

The second key is increasing the effective surface area for heat exchange, enhancing convective heat transfer through larger fluid–solid contact areas. This has led the industry to explore several promising solutions:

(1) Microchannel Cold Plate (MCLP)
Approach: Etching micron-scale channels into copper or silicon substrates, with coolant flowing directly near the chip.
Advantages: Maximizes surface-to-volume ratio; significantly lowers thermal resistance.
Challenges: High pressure drop, clogging risk, and increased pump power consumption.

(2) SiC Substrate as Heat Sink
Material: Silicon carbide (SiC) has a high thermal conductivity of ~370–490 W/m·K, excellent mechanical strength, and high voltage tolerance.
Application: As a heat-spreading base substrate, it improves both thermal diffusion and mechanical/electrical performance, making it suitable for future >1 kW/cm² AI chips.

(3) Diamond Thin Film / CVD Diamond
Thermal Conductivity: 1000–2200 W/m·K, far surpassing silicon (~150 W/m·K) and SiC.
Application: Used as a heat-spreading interlayer or coating, directly deposited on chip or package surfaces.
Advantages: Ultra-low thermal resistance, nearly