A new arXiv paper, "Model Graph Inductive Learning for Knowledge Graph Completion," argues that link prediction in knowledge graphs depends fundamentally on the quality of the embeddings learned for entities and relations. Most existing methods, however, derive these embeddings by aggregating only each entity's local neighborhood and neglect the global structure of the knowledge graph. The authors contend that this limited view prevents models from capturing the higher-level structural patterns needed for accurate, generalizable link prediction.
Problem and Motivation
Existing approaches to knowledge graph completion typically learn entity and relation embeddings from each entity's immediate neighbors. The authors argue that this local aggregation misses global structural patterns and that, without a global view, models struggle to generalize to entities or relations unseen during training, which is the central challenge of inductive link prediction.
The MGIL Framework
To address these limitations, the authors introduce Model Graph Inductive Learning (MGIL), a framework that constructs a model graph by clustering entities based on either the similarity of their incoming and outgoing relational structures or their entity types. A graph neural network (GNN) is then applied to this model graph to produce embeddings that capture a global view of the knowledge graph. These embeddings then serve as initial features for entities in the original knowledge graph, replacing random initialization, which the authors say leads to more stable and expressive representations.
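The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, exact matching of relation signatures stands in for the paper's similarity-based clustering, and a single round of untrained mean aggregation stands in for the GNN. It only shows how cluster-level embeddings could be mapped back to entities as initial features.

```python
from collections import defaultdict

def relation_signatures(triples):
    """Map each entity to (set of outgoing relations, set of incoming relations)."""
    out_rels, in_rels = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        out_rels[h].add(r)
        in_rels[t].add(r)
    entities = set(out_rels) | set(in_rels)
    return {e: (frozenset(out_rels[e]), frozenset(in_rels[e])) for e in entities}

def build_model_graph(triples):
    """Group entities into clusters and lift each triple to a cluster-level edge."""
    sigs = relation_signatures(triples)
    # Assumption: entities with identical signatures share a cluster.
    # The paper clusters by similarity; exact matching is a simplification.
    cluster_ids, entity_cluster = {}, {}
    for e in sorted(sigs):
        entity_cluster[e] = cluster_ids.setdefault(sigs[e], len(cluster_ids))
    edges = {(entity_cluster[h], r, entity_cluster[t]) for h, r, t in triples}
    return entity_cluster, cluster_ids, edges

def cluster_features(cluster_ids, relations):
    """Multi-hot vector per cluster: outgoing relations, then incoming relations."""
    return {
        cid: [float(r in outs) for r in relations] + [float(r in ins) for r in relations]
        for (outs, ins), cid in cluster_ids.items()
    }

def propagate(feats, edges):
    """One round of mean aggregation over each cluster and its neighbors.

    Assumption: this untrained averaging is only a stand-in for a learned GNN.
    """
    neighbors = defaultdict(set)
    for c_h, _, c_t in edges:
        neighbors[c_h].add(c_t)
        neighbors[c_t].add(c_h)
    updated = {}
    for cid in feats:
        group = [feats[n] for n in sorted(neighbors[cid] | {cid})]
        updated[cid] = [sum(col) / len(group) for col in zip(*group)]
    return updated

def initial_entity_features(triples):
    """Give every entity the embedding of its model-graph cluster."""
    entity_cluster, cluster_ids, edges = build_model_graph(triples)
    relations = sorted({r for _, r, _ in triples})
    emb = propagate(cluster_features(cluster_ids, relations), edges)
    return {e: emb[c] for e, c in entity_cluster.items()}, entity_cluster

# Toy knowledge graph (made-up data for illustration only).
triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "globex"),
    ("acme", "located_in", "berlin"),
    ("globex", "located_in", "paris"),
]
features, entity_cluster = initial_entity_features(triples)
print(entity_cluster)
print(features["alice"])
```

In this toy graph, "alice" and "bob" fall into the same cluster because both have only an outgoing "works_at" relation, so they receive identical initial features. Because the features depend on relational structure rather than entity identity, an entity unseen during training could be assigned to a cluster the same way, which is the property that matters in the inductive setting.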
Experiments and Results
The paper reports extensive experiments on standard and recently proposed inductive benchmarks. According to the authors, MGIL achieves state-of-the-art or highly competitive performance in inductive link prediction across diverse graph settings. They attribute the gains to the global structure captured by the model graph.
Implications for Knowledge Graph Applications
MGIL is presented as a general-purpose enhancement for knowledge graph completion. By providing better initial embeddings, it could benefit downstream applications that rely on knowledge graphs, such as recommendation systems, question answering, and information retrieval. The framework is aimed in particular at inductive settings, where test entities were not seen during training.
Availability
The paper is available on arXiv under the identifier 2606.16509, in the Computer Science > Artificial Intelligence category. The arXiv listing page also carries the site's standard links for associated code, data, and media, along with bibliographic citation and recommendation tools.