RAID: Semantic Graph Diffusion Enables True Cold-Start and Cross-Lingual Forecasting

A new framework called RAID (Retrieval-Augmented Iterative Diffusion) addresses the true cold-start forecasting problem where no prior observations exist. By leveraging textual metadata and semantic graph diffusion, RAID outperforms strong foundation models on accuracy and prediction interval coverage while reducing inference latency by an order of magnitude. It also enables zero-shot cross-lingual transfer, allowing models trained in one language to generalize to others.

iGEN Editorial

June 16, 2026

Time-series foundation models transfer well when they are given a non-empty history window. True cold-start scenarios, in which a new item has no prior observations, violate this assumption and remain a significant challenge in forecasting. According to a research paper published on arXiv, a new framework called RAID (Retrieval-Augmented Iterative Diffusion) tackles this problem by replacing history-based correlation learning with metadata-driven semantic retrieval and graph-conditioned diffusion.

The Cold-Start Forecasting Problem

Traditional time-series models rely on historical data to learn patterns and make predictions. In true cold-start situations, such as when a new product is launched, a sensor is deployed, or an item is introduced in a different region, there is zero observational history. Foundation models that require a warm-up window fail in these cases. The RAID framework directly addresses this gap, according to the paper authored by Arunkumar V., Manoranjan Gandhudi, Gangadharan G. R., and Senthilkumar Prakash.

How RAID Works

RAID maps textual metadata into a shared semantic space using a frozen multilingual embedding model. It then constructs an inductive retrieval graph that naturally extends to unseen items. The framework first forms a base forecast by aggregating information from semantically related neighbors in this graph. It then refines this forecast with a gated diffusion module to model residual uncertainty. This two-step approach enables accurate predictions without any historical observations.
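The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax weighting, the `temperature` parameter, and the three-dimensional toy vectors are all assumptions made for illustration, and the gated diffusion refinement is not shown. The idea is only that a new item's embedding selects its nearest neighbors, whose known series are blended into a base forecast.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def base_forecast(query_emb, catalog, k=2, temperature=0.1):
    """Blend the series of the k catalog items most similar to the query.

    catalog: list of (embedding, series) pairs for items that have history.
    Returns a forecast with the same horizon as the stored series.
    """
    scored = sorted(
        ((cosine(query_emb, emb), series) for emb, series in catalog),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Softmax over similarities turns them into aggregation weights.
    top = max(sim for sim, _ in scored)
    exps = [math.exp((sim - top) / temperature) for sim, _ in scored]
    total = sum(exps)
    weights = [e / total for e in exps]
    horizon = len(scored[0][1])
    return [
        sum(w * series[t] for w, (_, series) in zip(weights, scored))
        for t in range(horizon)
    ]

# Toy data (hypothetical): short vectors stand in for real text embeddings.
catalog = [
    ([1.0, 0.0, 0.0], [10.0, 12.0, 14.0]),
    ([0.9, 0.1, 0.0], [20.0, 22.0, 24.0]),
    ([0.0, 0.0, 1.0], [100.0, 100.0, 100.0]),
]
new_item = [1.0, 0.05, 0.0]  # a brand-new item with no observations
forecast = base_forecast(new_item, catalog)
print(forecast)
```

Because the weights come only from embedding similarity, an item that was never seen during training can be forecast as soon as its description is embedded, which is one plausible reading of what the article calls an inductive graph.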

Performance and Latency Gains

Under a strict true cold-start protocol, RAID outperforms strong foundation models and competitive baselines on both forecasting accuracy and prediction interval coverage, according to the paper. Additionally, it reduces inference latency by an order of magnitude through non-autoregressive decoding. The following table summarizes the key performance advantages:

Metric	RAID vs. Baselines
Forecasting accuracy	Outperforms strong foundation models and competitive baselines
Prediction interval coverage	Superior coverage
Inference latency	Reduced by an order of magnitude (non-autoregressive)
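The article does not say how the paper computes prediction interval coverage. A common definition, assumed here, is the fraction of actual values that fall inside the predicted lower and upper bounds. The function name and the numbers below are illustrative only and are not results from the paper.

```python
def interval_coverage(actuals, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    if not (len(actuals) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if not actuals:
        raise ValueError("inputs must not be empty")
    hits = sum(1 for y, lo, hi in zip(actuals, lower, upper) if lo <= y <= hi)
    return hits / len(actuals)

# Made-up example: three of the five actuals land inside their intervals.
actuals = [12.0, 15.0, 9.0, 20.0, 11.0]
lower = [10.0, 13.0, 10.0, 15.0, 8.0]
upper = [14.0, 17.0, 12.0, 19.0, 13.0]
coverage = interval_coverage(actuals, lower, upper)
print(coverage)  # 0.6
```

Under this definition, a well-calibrated 90% interval should yield coverage close to 0.9. Values far below the nominal level indicate intervals that are too narrow.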

Cross-Lingual Capabilities

A notable feature of RAID is its ability to enable zero-shot cross-lingual transfer. Because the shared semantic space is built from a frozen multilingual embedding model, a model trained on English descriptions can generalize to items described in other languages without direct supervision. This is particularly valuable for global forecasting applications where metadata may be in multiple languages.
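A toy sketch of why a shared embedding space makes this possible: retrieval compares vectors, not words, so the language of the description does not enter the lookup. The vectors below are hypothetical stand-ins for the output of a multilingual embedding model, chosen by hand so that the German description sits near its English counterpart. No real model is called.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Items seen in training, described in English (vectors are made up).
train_items = {
    "wireless headphones": [0.9, 0.1, 0.0],
    "garden hose": [0.0, 0.2, 0.9],
}

# A new item described in German; its vector is also made up for illustration.
query_text = "kabellose Kopfhörer"
query_vec = [0.85, 0.15, 0.05]

# Nearest-neighbor lookup depends only on the vectors.
nearest = max(train_items, key=lambda name: cosine(query_vec, train_items[name]))
print(query_text, "->", nearest)
```

How well this works in practice depends on how closely the embedding model aligns equivalent descriptions across languages, which the sketch simply assumes.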

Implications for Enterprise Forecasting

For enterprise technology decision-makers, RAID offers a promising approach to forecasting in environments where new items appear frequently and historical data is scarce. The significant reduction in inference latency also makes it suitable for real-time applications. While the paper focuses on the technical framework, the underlying principles—metadata-driven retrieval, graph diffusion, and multilingual embeddings—can be adapted to various domains, including supply chain demand forecasting, energy load prediction, and financial market analysis.

The RAID framework represents a shift from relying on historical time-series data to leveraging semantic metadata for true cold-start scenarios. Its demonstrated ability to outperform foundation models while enabling cross-lingual transfer positions it as a compelling solution for organizations dealing with sparse data in global contexts.

Sources:

RAID: Semantic Graph Diffusion Enables True Cold-Start and Cross-Lingual Forecasting
