Physical AI requires world models that go beyond passive visual generation to become foundational infrastructure. According to a new paper on arXiv, existing models struggle to acquire world knowledge from heterogeneous experience, maintain persistent states over long horizons, and execute efficiently under real-world constraints. The Kairos research team proposes a solution: a native world model stack called Kairos that integrates learning, maintenance, and deployment into a cohesive operational foundation for self-evolving physical intelligence.
Native Pre-training Paradigm
Kairos learns the world by pioneering a Native Pre-training Paradigm governed by a Cross-Embodiment Data Curriculum. As the paper describes, this curriculum organizes open-world videos, human behavioral data, and robot interactions into a progressive developmental pathway. This approach allows the model to acquire diverse knowledge from multiple sources of experience, moving beyond single-domain training.
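One way to picture a progressive curriculum over these three sources is a staged sampling schedule. The sketch below is illustrative only: the source names, stage boundaries, and mixture weights are assumptions made for this example, not values from the paper, which this summary does not quote.

```python
import random

# Hypothetical data sources, ordered from broadest to most embodiment-specific.
SOURCES = ("open_world_video", "human_behavior", "robot_interaction")

# Hypothetical stages: (end of stage as a fraction of training, weights over SOURCES).
# These numbers are made up for illustration; no schedule is given in the summary.
STAGES = [
    (0.4, (0.8, 0.2, 0.0)),  # early: mostly open-world video
    (0.8, (0.4, 0.4, 0.2)),  # middle: human behavior gains weight
    (1.0, (0.2, 0.3, 0.5)),  # late: robot interaction dominates
]


def mixture_weights(progress: float) -> tuple[float, float, float]:
    """Return sampling weights over SOURCES at a training progress in [0, 1]."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    for stage_end, weights in STAGES:
        if progress <= stage_end:
            return weights
    return STAGES[-1][1]


def sample_source(progress: float, rng: random.Random) -> str:
    """Draw the data source for the next training example."""
    return rng.choices(SOURCES, weights=mixture_weights(progress), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for progress in (0.1, 0.5, 0.9):
        print(progress, mixture_weights(progress), sample_source(progress, rng))
```

A step schedule is the simplest choice here; a real curriculum could equally interpolate weights smoothly between stages.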
Native Unified Architecture
To maintain the world, Kairos employs a Native Unified Architecture that unifies world understanding, generation, and prediction. Central to this architecture is Hybrid Linear Temporal Attention, a mechanism that combines three attention modes:
- Sliding-window attention to capture local dynamics
- Dilated sliding windows to capture mid-range dependencies
- Gated linear attention to maintain persistent global memory
According to the paper, the team derives formal theoretical bounds showing that this temporal factorization limits error accumulation, which the authors present as a mathematical guarantee on state propagation across extended horizons.
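The three attention modes listed above can be illustrated with a minimal, dependency-free sketch. It uses generic textbook forms of each mechanism, not the Kairos implementation: the function names, the scalar gate, and the parameters `window` and `dilation` are all assumptions made for this example.

```python
def sliding_window_mask(length, window):
    """Causal mask: query t sees keys s with 0 <= t - s < window (local dynamics)."""
    return [[0 <= t - s < window for s in range(length)] for t in range(length)]


def dilated_window_mask(length, window, dilation):
    """Causal mask: query t sees keys at t, t - dilation, t - 2*dilation, ...
    up to `window` keys in total (mid-range dependencies)."""
    return [
        [t >= s and (t - s) % dilation == 0 and (t - s) // dilation < window
         for s in range(length)]
        for t in range(length)
    ]


def gated_linear_attention(queries, keys, values, gates):
    """Recurrent linear attention with a scalar decay gate per step (simplified).

    state_t = gate_t * state_{t-1} + outer(key_t, value_t)
    out_t   = query_t @ state_t
    The state has a fixed size, so memory does not grow with sequence length.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v, g in zip(queries, keys, values, gates):
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] = g * state[i][j] + k[i] * v[j]
        outputs.append([sum(q[i] * state[i][j] for i in range(d_k))
                        for j in range(d_v)])
    return outputs


if __name__ == "__main__":
    print(sliding_window_mask(4, 2)[3])     # keys visible to the last query
    print(dilated_window_mask(7, 3, 2)[6])  # every second key, three in total
    ones = [[1.0], [1.0]]
    print(gated_linear_attention(ones, ones, ones, gates=[0.5, 0.5]))
```

In this simplified picture, the two masks restrict ordinary attention to a bounded number of keys per query, while the gated recurrence carries a compressed summary of everything earlier. How Kairos actually combines the three branches is not specified in this summary.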
| Component | Core mechanism | What it does |
|---|---|---|
| Native Pre-training Paradigm | Cross-Embodiment Data Curriculum | Organizes videos, human data, robot interactions |
| Native Unified Architecture | Hybrid Linear Temporal Attention | Sliding-window + dilated + gated linear attention |
| Deployment-Aware System Co-Design | Low-latency rollout generation | Supports server and consumer-grade hardware |
Deployment-Aware System Co-Design
Kairos runs the world by incorporating a Deployment-Aware System Co-Design to support low-latency rollout generation on both server and consumer-grade hardware. This enables real-world observation-action-feedback loops, making Kairos practical for physically embodied systems that must respond in real time.
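To make the observation-action-feedback loop concrete, the sketch below times each rollout against a latency budget and falls back to a no-op when the budget is missed. Everything here is hypothetical scaffolding for illustration: `run_loop`, its callback parameters, and the fallback policy are assumptions, not part of any published Kairos interface.

```python
import time
from typing import Callable, List, Optional, Sequence

Action = str  # placeholder action type for this sketch


def run_loop(
    observe: Callable[[], object],
    rollout: Callable[[object], Sequence[Action]],
    act: Callable[[Optional[Action]], None],
    steps: int,
    budget_s: float,
) -> List[float]:
    """Run `steps` control iterations and return the rollout latency of each.

    `rollout` stands in for a world-model call that returns a planned action
    sequence. If it exceeds `budget_s` seconds or returns an empty plan,
    the loop passes None to `act` as a fallback signal.
    """
    latencies: List[float] = []
    for _ in range(steps):
        observation = observe()
        start = time.perf_counter()
        plan = rollout(observation)
        latency = time.perf_counter() - start
        latencies.append(latency)
        on_time = latency <= budget_s and len(plan) > 0
        act(plan[0] if on_time else None)
    return latencies


if __name__ == "__main__":
    taken: List[Optional[Action]] = []
    latencies = run_loop(
        observe=lambda: {"frame": 0},
        rollout=lambda obs: ["forward", "forward", "stop"],
        act=taken.append,
        steps=3,
        budget_s=0.05,  # hypothetical 50 ms budget
    )
    print(taken, [round(x, 6) for x in latencies])
```

Executing only the first planned action and replanning on the next observation is a common receding-horizon choice; it is used here for simplicity, not because the paper prescribes it.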
Benchmark Performance
The paper reports experiments on embodied world-model, long-horizon, and action-policy benchmarks. According to the results, Kairos achieves top-level performance while offering a strong efficiency-capability trade-off, although the paper does not disclose specific numerical metrics. The authors present these outcomes as evidence that Kairos can serve as an operational foundation for future self-evolving physical intelligence.
For enterprise technology decision-makers evaluating Physical AI platforms, Kairos represents a shift toward integrated world models that can handle heterogeneous data, long-term state, and real-time constraints. These capabilities are critical for applications such as autonomous robotics, manufacturing, and supply chain automation. The stack's ability to run on consumer-grade hardware may lower barriers to deployment, while its reported theoretical bounds on error accumulation speak to reliability concerns for long-horizon tasks. As Physical AI moves from lab to field, native world model stacks like Kairos could become essential infrastructure.