Time series foundation models (TSFMs) have demonstrated strong zero-shot forecasting capabilities through large-scale pre-training, but adapting them to downstream domains under distribution shift remains a significant challenge. According to the paper "TS-Memory: Plug-and-Play Memory for Time Series Foundation Models" published on arXiv, existing adaptation methods face a fundamental trade-off.
The Trade-Off in Adaptation Methods
Parametric adaptation techniques, which fine-tune model parameters, can cause catastrophic forgetting and require costly multi-domain maintenance. On the other hand, non-parametric retrieval methods improve forecasts by searching a datastore of historical examples, but incur high inference latency. TS-Memory proposes a third path: a lightweight memory adapter that augments frozen TSFMs without retrieval during inference.
| Method | Strengths | Weaknesses |
|---|---|---|
| Parametric Adaptation | Good performance on target domain | Catastrophic forgetting, costly multi-domain maintenance |
| Non-Parametric Retrieval | Improved forecasts, no forgetting | High inference latency due to datastore search |
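| TS-Memory (proposed) | Consistent point and probabilistic forecasting gains, constant-time inference overhead, no retrieval at inference | Requires an offline kNN teacher and a distillation stage; no other limitations reported in source |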
How TS-Memory Works
The researchers propose a technique called Parametric Memory Distillation and implement it as TS-Memory. Training proceeds in two stages. First, an offline, retrieval-leakage-safe k-nearest-neighbors (kNN) teacher is constructed, which synthesizes confidence-aware quantile targets from the futures of retrieved examples. Second, this retrieval-induced distributional correction is distilled into a lightweight memory adapter via confidence-gated supervision. As a result, the adapter learns to reproduce the beneficial corrections of retrieval without needing the datastore at inference time.
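The two stages can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the scalar futures, the distance-based confidence heuristic, the self-exclusion rule standing in for leakage safety, and the absolute-error loss are all assumptions chosen to keep the example short.

```python
import math

QUANTILES = (0.1, 0.5, 0.9)


def knn_teacher(query, datastore, k=2, exclude=None, quantiles=QUANTILES):
    """Build quantile targets and a confidence score for one context window.

    datastore is a list of (context, future) pairs; futures are scalars here
    to keep the sketch short. `exclude` is the index of the query's own entry,
    skipped so the teacher never sees the answer it is meant to predict
    (an assumed, simplified form of leakage safety).
    """
    scored = [
        (math.dist(query, context), future)
        for i, (context, future) in enumerate(datastore)
        if i != exclude
    ]
    scored.sort(key=lambda pair: pair[0])  # nearest contexts first
    neighbors = scored[:k]
    if not neighbors:
        raise ValueError("no neighbors available for this query")

    # Empirical (nearest-rank style) quantiles of the retrieved futures.
    futures = sorted(future for _, future in neighbors)
    targets = [futures[min(len(futures) - 1, int(q * len(futures)))] for q in quantiles]

    # Assumed confidence heuristic: closer neighbors give a value nearer to 1.
    mean_distance = sum(d for d, _ in neighbors) / len(neighbors)
    confidence = 1.0 / (1.0 + mean_distance)
    return targets, confidence


def gated_distillation_loss(student, teacher, confidence, threshold=0.3):
    """Confidence-weighted mean absolute error; zero when the gate is closed."""
    if confidence < threshold:
        return 0.0  # low-confidence teacher targets contribute no supervision
    error = sum(abs(s - t) for s, t in zip(student, teacher)) / len(teacher)
    return confidence * error


# Toy datastore of (context, future) pairs; entry 0 is the query itself.
datastore = [([0.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 0.0], 3.0), ([5.0, 5.0], 10.0)]
targets, confidence = knn_teacher([0.0, 0.0], datastore, k=2, exclude=0)
loss = gated_distillation_loss([2.0, 2.5, 3.5], targets, confidence)
print(targets, confidence, loss)
```

In a real pipeline the student quantiles would come from the trainable adapter, and the loss would be minimized by gradient descent. The datastore search happens only in this offline stage.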
Inference and Performance
During inference, TS-Memory fuses memory and backbone predictions with constant-time overhead, enabling retrieval-free deployment. Experiments across diverse TSFMs and benchmarks demonstrate consistent improvements in both point and probabilistic forecasting over representative adaptation methods. The paper reports that TS-Memory achieves efficiency comparable to the frozen backbone, meaning the added overhead is minimal.
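As a rough illustration of retrieval-free fusion, the sketch below blends a backbone quantile forecast with a memory-adapter forecast through a scalar gate. The convex-combination form, the `fuse_forecasts` name, and the fixed gate value are assumptions for illustration; the source does not specify the exact fusion rule.

```python
def fuse_forecasts(backbone, memory, gate):
    """Blend backbone and memory-adapter quantile forecasts.

    The cost depends only on the forecast length, not on the size of any
    historical datastore, because nothing is searched at inference time.
    """
    if len(backbone) != len(memory):
        raise ValueError("forecasts must have the same length")
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    # gate = 0 returns the frozen backbone forecast unchanged;
    # gate = 1 returns the memory adapter's forecast.
    return [(1.0 - gate) * b + gate * m for b, m in zip(backbone, memory)]


# Hypothetical quantile forecasts (10th, 50th, 90th percentiles).
fused = fuse_forecasts([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], gate=0.25)
print(fused)
```

Because both inputs are monotone in the quantile level and share one gate, the blended quantiles stay ordered. In this sketch, a gate of zero recovers the frozen backbone exactly, which is one way a plug-and-play adapter could avoid degrading the base model.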
Implications for Enterprise Forecasting
For enterprise technology decision-makers, TS-Memory offers a practical way to adapt pre-trained time series models to specific operational domains, such as demand forecasting, inventory planning, or energy load prediction, without fine-tuning the backbone or paying the latency of a retrieval system at inference time. Because the adapter is plug-and-play, it can be integrated with existing TSFMs, potentially reducing model maintenance overhead while improving forecast accuracy. As foundation models become more prevalent in supply chain and logistics analytics, approaches like TS-Memory could enable faster, more reliable deployment across diverse use cases.