Haoyang Li, Haibo Chen, Xin Wang
Tsinghua University
lihy218@gmail.com, chb24@mails.tsinghua.edu.cn, {xin_wang, wwzhu}@tsinghua.edu.cn

arXiv:2601.21067v1 [cs.LG] 28 Jan 2026

Abstract

Graphs are a fundamental data structure for representing relational information in domains such as social networks, molecular systems, and knowledge graphs. However, graph learning models often suffer from limited generalization when applied beyond their training distributions. In practice, distribution shifts may arise from changes in graph structure, domain semantics, available modalities, or task formulations. To address these challenges, graph foundation models (GFMs) have recently emerged, aiming to learn general-purpose representations through large-scale pretraining across diverse graphs and tasks. In this survey, we review recent progress on GFMs from the perspective of out-of-distribution (OOD) generalization. We first discuss the main challenges posed by distribution shifts in graph learning and outline a unified problem setting. We then organize existing approaches based on whether they explicitly support generalization across different task specifications.

1 Introduction

Graphs are a fundamental data structure for representing relational information in many applications, including social and information networks, molecular and biological systems, recommendation platforms, and knowledge graphs [1,2,3]. By encoding entities as nodes and interactions as edges, graphs capture complex dependency structures that are difficult to model using independent feature representations. Graph learning methods, such as graph neural networks, have become a central tool for predictive and reasoning tasks, including node classification, link prediction, and graph-level prediction [4,5]. However, models trained on a specific dataset or graph often exhibit limited generalization when applied to new testing environments, where graph topology, feature distributions, and other data characteristics may differ from those seen during training.

Out-of-distribution (OOD) generalization provides a useful perspective for addressing these limitations [7,8].
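The OOD generalization problem sketched above can be phrased as worst-case risk minimization over a set of environments. The following is a minimal formulation, with notation assumed here rather than taken from the paper:

```latex
\min_{f \in \mathcal{F}} \; \max_{e \in \mathcal{E}} \;
\mathbb{E}_{(G,\, y) \sim P^{e}} \big[ \ell\big(f(G), y\big) \big]
```

where \(\mathcal{E}\) indexes environments that may differ in structure, domain, modality, or task; \(P^{e}\) is the data distribution of environment \(e\); \(f\) is the graph model; and \(\ell\) is a task-specific loss. Training typically observes only a subset of \(\mathcal{E}\), and OOD generalization asks \(f\) to perform well on unseen environments.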
In the field of graph learning, distribution shifts may arise from multiple sources. Structural properties such as connectivity patterns or motif statistics can change across graphs [9,10]. Domain-specific factors, including data collection and annotation practices, may introduce dataset biases [11]. Auxiliary modalities, such as text or molecular features, may be missing, noisy, or inconsistent across datasets.

Recently, graph foundation models (GFMs) have emerged and attracted growing attention from the research community. Inspired by foundation models in language and vision [14,15], GFMs aim to learn general-purpose graph representations through large-scale pretraining on diverse graph collections [16,17]. Instead of optimizing only for a specific dataset, these models seek to capture generalizable patterns that remain reusable and stable across graphs, domains, and downstream objectives. A growing body of work has explored different approaches to building GFMs [18,19,20], including multi-graph pretraining [21], alignment across domains and modalities [22], invariant representation learning [23], and prompt- or instruction-based interfaces for task generalization [24]. These foundation models address both practical and methodological limitations of traditional graph learning. In many applications, labeled data for new graphs is scarce, and retraining models for every new setting is costly. Such limitations have increased interest in GFMs [26,27,23], which provide a promising paradigm for handling distribution shifts for OOD generalization.

Several recent surveys [28,16,17] have reviewed GFMs from perspectives such as model architecture [29], pretraining objectives, scalability, and application domains [30,31]. In contrast, this survey organizes the literature explicitly from the perspective of OOD generalization.
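As a concrete illustration of the structural shifts discussed above, the sketch below compares the degree distributions of a "training" and a "test" graph via total variation distance. The random-graph construction and the choice of statistic are illustrative assumptions, not a method from any surveyed work:

```python
import random
from collections import Counter

def degree_distribution(edges, num_nodes):
    """Empirical degree distribution of an undirected graph given as an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    counts = Counter(deg[n] for n in range(num_nodes))  # includes degree-0 nodes
    return {d: c / num_nodes for d, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in support)

def random_graph(num_nodes, edge_prob, rng):
    """Erdos-Renyi-style random graph, a stand-in for real graph data."""
    return [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)
            if rng.random() < edge_prob]

rng = random.Random(0)
n = 200
train_graph = random_graph(n, 0.02, rng)  # sparse graph seen at training time
test_graph = random_graph(n, 0.10, rng)   # denser graph at test time: structural shift

p = degree_distribution(train_graph, n)
q = degree_distribution(test_graph, n)
tvd = total_variation(p, q)
print(f"degree-distribution TV distance: {tvd:.2f}")
```

A large distance between the two degree distributions signals exactly the kind of structural shift under which a model trained on one graph may fail on the other; analogous statistics could be computed for motif counts or feature distributions.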
Rather than focusing on model design alone, we examine how different GFMs address distribution shifts arising from changes in graph structure, domain semantics, modality availability, and task formulation, providing a generalization-centric view of the field.

In this survey, we provide a comprehensive overview of graph foundation models from the perspective of OOD generalization. We first identify the key challenges posed by distribution shifts in graph learning and introduce a unified problem formulation that captures OOD shifts in structure/feature, domain, modality, and task. We then organize existing methods into two broad categories according to whether they explicitly support generalization across different task specifications. The first category includes approaches that focus on generalization under a fixed task setting, where OOD generalization is achieved by learning representations that remain effective across structural, domain, or modality shifts. The second category comprises methods that are designed to generalize across more varied and complex task specifications.

2 Challenges and Problem Formulation

GFMs can support learning and inference across diverse environments, where the data-generating process may differ substantially between training and deployment. In such settings, the poor gen-era