Relational Pretraining for the Next Generation of Graph Intelligence

Abstract

The rapid advancement of foundation models has transformed the landscape of machine learning by enabling scalable, general-purpose solutions across diverse domains such as natural language processing, computer vision, and multimodal reasoning. Simultaneously, Graph Neural Networks (GNNs) have become the de facto standard for learning over structured data due to their ability to model the complex relational dependencies inherent in graphs. The convergence of these two paradigms, GNNs and foundation models, marks a significant step toward universal representation learning over graph-structured data. This survey presents a comprehensive exploration of Graph Neural Networks in the context of large-scale foundation models, covering key methodologies, architectural innovations, training strategies, applications, and open research challenges. We begin by reviewing the mathematical underpinnings of graph neural architectures, including message passing, graph convolution, and attention mechanisms. Building on this foundation, we discuss how self-supervised pretraining techniques, such as masked node and edge prediction, contrastive learning, and graph autoencoding, have been adapted to equip graph models with the capacity for transfer learning and generalization. We further analyze architectural trends that scale GNNs to foundation-model scale, including Graph Transformers, scalable neighborhood sampling methods, and structural encoding schemes. A detailed overview of applications illustrates the versatility of graph foundation models in real-world scenarios, including drug discovery, knowledge graph reasoning, recommender systems, scientific simulations, and fraud detection. In addressing current limitations, we examine critical challenges such as computational scalability, data availability, heterogeneity, dynamic graph modeling, interpretability, and ethical considerations. We also identify key directions for future research, including the development of universal graph encoders, integration with other modalities, lifelong graph learning, and responsible AI deployment. By synthesizing recent advances and outlining a forward-looking roadmap, this survey aims to provide both a foundational reference and a strategic perspective for researchers, practitioners, and developers working at the intersection of graphs and large-scale machine learning. Ultimately, we argue that graph foundation models are poised to play a central role in the next generation of AI systems, enabling machines to reason over structured relational data with unprecedented depth, scale, and generality.
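
The message-passing formulation the abstract references can be made concrete with a small example. The following is a minimal sketch in plain PyTorch of one generic message-passing round: each node gathers transformed messages from its neighbors, sum-aggregates them, and combines the result with its own state. The class and tensor names here are illustrative assumptions, not drawn from the article itself.

import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One round of generic message passing: message, aggregate, update."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.msg = nn.Linear(in_dim, out_dim)             # message function
        self.upd = nn.Linear(in_dim + out_dim, out_dim)   # update function

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges] as (source, target)
        src, dst = edge_index
        messages = self.msg(x[src])                       # one message per edge
        agg = torch.zeros(x.size(0), messages.size(1), device=x.device)
        agg.index_add_(0, dst, messages)                  # sum-aggregate at each target node
        return torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

# Toy usage: a 3-node path graph (0-1-2) with edges in both directions.
x = torch.randn(3, 8)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
out = MessagePassingLayer(8, 16)(x, edge_index)           # shape [3, 16]

Replacing the plain sum aggregation with a degree-normalized or attention-weighted one gives, roughly, the graph convolution and attention mechanisms the abstract lists alongside message passing.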
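
Masked node prediction, one of the self-supervised pretraining objectives the abstract mentions, can likewise be sketched in a few lines: hide the features of a random subset of nodes, encode the corrupted graph, and train the model to reconstruct the hidden features. The helper below is a hedged illustration under those assumptions (it reuses the illustrative MessagePassingLayer above as the encoder); it is not the article's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def masked_node_pretrain_step(encoder: nn.Module,
                              decoder: nn.Module,
                              x: torch.Tensor,
                              edge_index: torch.Tensor,
                              mask_rate: float = 0.15) -> torch.Tensor:
    # Pick a random subset of nodes and zero out their input features.
    mask = torch.rand(x.size(0)) < mask_rate
    if not mask.any():                          # guarantee at least one masked node
        mask[torch.randint(0, x.size(0), (1,))] = True
    x_corrupted = x.clone()
    x_corrupted[mask] = 0.0

    # Encode the corrupted graph and reconstruct the original features.
    h = encoder(x_corrupted, edge_index)
    x_hat = decoder(h)

    # The reconstruction loss is computed only on the masked nodes.
    return F.mse_loss(x_hat[mask], x[mask])

# Toy usage with the illustrative layer from the previous sketch:
# encoder = MessagePassingLayer(8, 16); decoder = nn.Linear(16, 8)
# loss = masked_node_pretrain_step(encoder, decoder, x, edge_index)
# loss.backward()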
