iGEN

New Study Challenges Prior Claims on Scaling Context Length in Imitation Learning

Researchers evaluated diffusion policies for robotic imitation learning across varying context lengths, challenging prior claims that long-context scaling is fragile. They propose a training algorithm that jointly trains policies at multiple context lengths, reducing sample complexity.

iGEN Editorial
June 17, 2026
A study published on arXiv presents the first detailed investigation into how context length affects imitation learning for robotic manipulation. The researchers benchmarked policy performance as context length was incrementally increased from short to long, across tasks with varying local stability and memory requirements, and in multiple data regimes.

Context Length in Imitation Learning

Imitation learning enables dexterous robotic manipulation from RGB observations, but policies typically condition robot actions on only a short history. This limits performance on tasks requiring memory, often causing repeated execution of failing motions. The study seeks to address this by systematically evaluating the impact of longer context lengths.
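To make "context length" concrete, the sketch below builds the observation history a policy would be conditioned on at a given timestep. It is illustrative only: the function name `history_window` and the choice to pad early timesteps by repeating the first observation are assumptions, not details taken from the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def history_window(observations: Sequence[T], t: int, context_length: int) -> List[T]:
    """Return the `context_length` most recent observations ending at step t.

    Steps before the start of the episode are padded by repeating the
    first observation (an assumed convention, not the paper's).
    """
    if context_length < 1:
        raise ValueError("context_length must be at least 1")
    if not 0 <= t < len(observations):
        raise IndexError("t is outside the episode")
    start = t - context_length + 1
    # Number of missing steps before the episode began.
    pad = [observations[0]] * max(0, -start)
    return pad + list(observations[max(0, start): t + 1])


if __name__ == "__main__":
    episode = ["o0", "o1", "o2", "o3"]
    print(history_window(episode, t=3, context_length=2))
    print(history_window(episode, t=1, context_length=4))
```

A short-context policy would call this with a small `context_length`; the study's question is what happens as that value grows.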

Naively scaling context length is not as brittle as advertised in literature.

According to the paper, with an appropriate conditioning method and denoising backbone—specifically UNet+Cross-Attention—single-task policies achieve high success rates on many tasks even with naive scaling of context length. This finding contradicts earlier assumptions about the fragility of long-context models.

Key Findings and Benchmarking

The benchmark spans tasks that differ in local stability and memory demands, and it covers multiple data regimes. Across that range, single-task policies using UNet+Cross-Attention kept high success rates on many tasks as the context window grew. The study is described as the first to examine context length in imitation learning at this level of detail.

  • Tasks with high memory requirements benefit most from longer contexts.
  • UNet+Cross-Attention proves robust as the denoising backbone for long-context policies.
  • The data regime—size and diversity of training data—influences the effectiveness of scaling.

Proposed Training Algorithm

To address the sample complexity of long-context learning, the authors propose an algorithm that jointly trains policies at multiple context lengths. This approach reduces the number of demonstrations needed to learn effective long-context behaviors, making training more efficient.

  • Joint training across short and long contexts improves generalization.
  • The algorithm reduces sample complexity compared to training only at the longest target length.
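The article does not describe how the joint training is implemented, so the following is only a minimal sketch of one plausible reading: each training example is assigned a context length drawn from a fixed set, and history older than that length is masked out. The helper names (`mask_to_context`, `sample_joint_batch`), the uniform sampling, and the zero-masking are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def mask_to_context(history: Sequence[float], context_length: int) -> Tuple[List[float], List[int]]:
    """Keep the most recent `context_length` entries and zero out older ones.

    Returns the masked history and the 0/1 mask that produced it.
    """
    n = len(history)
    keep = min(context_length, n)
    mask = [0] * (n - keep) + [1] * keep
    masked = [h * m for h, m in zip(history, mask)]
    return masked, mask


def sample_joint_batch(
    histories: Sequence[Sequence[float]],
    context_lengths: Sequence[int],
    rng: random.Random,
) -> List[Tuple[int, List[float], List[int]]]:
    """Assign each example a context length (uniformly, by assumption) and mask it."""
    batch = []
    for history in histories:
        k = rng.choice(list(context_lengths))
        masked, mask = mask_to_context(history, k)
        batch.append((k, masked, mask))
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[1.0, 2.0, 3.0, 4.0] for _ in range(4)]
    for k, masked, mask in sample_joint_batch(data, [1, 2, 4], rng):
        print(k, masked, mask)
```

In a real training loop the masked histories and masks would be fed to a single policy network, so that one set of weights sees short and long contexts during the same run.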

Re-evaluating Prior Solutions

Finally, the researchers use the benchmark results and the new training algorithm to re-evaluate previously proposed solutions for long-context imitation learning. The re-evaluation suggests that some problems previously attributed to context length may instead stem from other factors, such as unsuitable architectures or data inefficiency.

Implications for Robotics and AI

The study provides practical guidance for building robotic systems that require memory, such as sequential manipulation tasks. Enterprise technology leaders considering AI for robotics should note that scaling context length is more feasible than previously thought, especially when using UNet+Cross-Attention backbones and multi-context training. The findings may also generalize to other domains where imitation learning with long-range dependencies is needed.

The authors of the study are Abhinav Agarwal, Adam Wei, Taylan Kargin, Michael Zeng, Cole Becker, Arif Kerem Dayi, Pablo Parrilo, Asuman Ozdaglar, and Russ Tedrake.

