RealityBridge: New AI Framework Edits 3D Driving Simulations to Close the Sim-to-Real Gap

RealityBridge is a structure-preserving framework that restores videos rendered from edited 3D Gaussian Splatting driving simulations, bringing them closer to real-world video quality. It uses multimodal controls and autoregressive training to reduce artifacts, harmonize illumination, and keep long sequences temporally consistent, and the authors report that it outperforms existing methods on driving datasets.

iGEN Editorial

June 16, 2026

Autonomous driving development hinges on the ability to simulate rare, dangerous scenarios, known as long-tail hazards, at scale. Collecting real-world footage of such events is both hazardous and costly. Editable 3D Gaussian Splatting (3DGS) offers a way to reconstruct real driving scenes and then apply controlled edits, but the resulting videos suffer from a significant Sim-to-Real gap: rendering artifacts, degraded foreground assets, inconsistent lighting, and temporal flickering. According to a paper on arXiv, existing restoration and video generation methods fail to jointly repair these 3DGS-specific issues, improve visual realism, and maintain temporal coherence. To address this, the authors propose RealityBridge, a structure-preserving and asset-aware Sim-to-Real framework for edited 3DGS driving videos.

The Problem: Sim-to-Real Gap in Driving Simulations

Long-tail hazardous scenarios are essential for safety-oriented autonomous driving, yet they are difficult to collect and reproduce at scale, the paper reports. Editable 3DGS simulation is a promising alternative: it reconstructs real scenes and allows controllable editing. However, the rendered videos from edited 3DGS contain specific artifacts that degrade realism. Existing video restoration and generation methods are insufficient because they cannot simultaneously address 3DGS-specific artifacts, improve overall visual quality, and ensure frame-to-frame consistency.

RealityBridge: Multimodal Controls and Adaptive Allocation

RealityBridge uses multimodal controls to guide the restoration process. According to the paper, these controls include rendered videos, foreground masks, edge maps, and semantic masks. A lightweight GateNet adaptively allocates these conditions across backbone layers, so that each layer draws on the control signals most relevant to it. This design allows the framework to preserve scene structure while improving asset quality and illumination consistency.
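To make the idea of allocating conditions across layers concrete, here is a minimal sketch in plain Python. It is not the paper's GateNet, whose architecture is not described here. It assumes a simple softmax gate: each backbone layer has one logit per condition, and the layer receives a weighted mix of the condition features. The names `CONDITIONS`, `softmax`, and `gate_conditions`, and the toy feature vectors, are hypothetical.

```python
import math

# The four control signals named in the article; the feature vectors
# used with them below are toy stand-ins, not real encodings.
CONDITIONS = ["rendered_video", "foreground_mask", "edge_map", "semantic_mask"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_conditions(layer_logits, condition_features):
    """Return one mixed feature vector per backbone layer.

    layer_logits: list with one entry per layer; each entry holds one
        logit per condition (assumed here to come from a small gating
        network, which this sketch does not implement).
    condition_features: dict mapping condition name -> feature vector.
    """
    dim = len(condition_features[CONDITIONS[0]])
    mixed = []
    for logits in layer_logits:
        weights = softmax(logits)
        vec = [0.0] * dim
        for weight, name in zip(weights, CONDITIONS):
            for i, value in enumerate(condition_features[name]):
                vec[i] += weight * value
        mixed.append(vec)
    return mixed


features = {
    "rendered_video": [1.0, 0.0],
    "foreground_mask": [0.0, 1.0],
    "edge_map": [1.0, 1.0],
    "semantic_mask": [0.0, 0.0],
}
# Layer 0 weighs all conditions equally; layer 1 strongly prefers edges.
per_layer = gate_conditions([[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 50.0, 0.0]], features)
```

In this toy setup the first layer receives the plain average of the four conditions, while the second is dominated by the edge map. A learned gate would produce such logits from data rather than taking them as constants.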

Technical Approach: Autoregressive Training and Reward-Guided Post-Training

The authors constructed targeted training data and introduced autoregressive long-video training combined with reward-guided post-training. According to the paper, this two-step process improves restoration quality and temporal stability while reducing hallucination, where the model invents incorrect details. Autoregressive training helps the model maintain consistency across long video sequences, a critical requirement for driving simulations. Reward-guided post-training further refines outputs by optimizing for perceptual quality metrics.
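The autoregressive idea can be illustrated with a small sketch of a chunk-by-chunk rollout, in which each chunk of a long video is restored while conditioned on the last few frames already restored. This shows only the general pattern, not the paper's training procedure, which is not detailed here. The functions `restore_long_video` and `toy_restore_chunk`, the parameters `chunk_size` and `context_size`, and the use of single floats as stand-ins for frames are all assumptions made for illustration.

```python
def restore_long_video(frames, restore_chunk, chunk_size=8, context_size=2):
    """Restore a long sequence chunk by chunk.

    Each chunk is passed to `restore_chunk` together with the most
    recent `context_size` restored frames, so later chunks stay
    consistent with earlier output.
    """
    restored = []
    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]
        # Guard against restored[-0:], which would return the whole list.
        context = restored[-context_size:] if context_size > 0 else []
        restored.extend(restore_chunk(chunk, context))
    return restored


def toy_restore_chunk(chunk, context):
    """Hypothetical stand-in for a restoration model.

    Blends each frame with the previously restored frame, which damps
    frame-to-frame flicker in this toy setting.
    """
    out = []
    prev = context[-1] if context else None
    for frame in chunk:
        value = frame if prev is None else 0.5 * frame + 0.5 * prev
        out.append(value)
        prev = value
    return out


flickering = [0.0, 1.0, 0.0, 1.0]
with_context = restore_long_video(flickering, toy_restore_chunk, chunk_size=2, context_size=1)
without_context = restore_long_video(flickering, toy_restore_chunk, chunk_size=2, context_size=0)
```

With context carried across the chunk boundary, the second chunk continues smoothly from the first. Without it, each chunk restarts from scratch and the flicker reappears at the boundary, which is the failure mode that long-sequence conditioning is meant to avoid.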

Performance and Results

Extensive experiments were conducted on both internal and public driving datasets. RealityBridge outperformed existing methods in three key areas:

Area	RealityBridge vs. existing methods (as reported)
Artifact removal	Superior removal of rendering artifacts
Illumination harmonization	More consistent lighting that matches real-world conditions
Long-sequence temporal consistency	Reduced flickering and better frame-to-frame coherence

These comparisons are qualitative: the paper's abstract claims superior results in all three areas but does not give specific numerical metrics.

Implications for Autonomous Vehicle Development

For enterprise technology leaders evaluating autonomous driving systems, whether for logistics fleets or passenger vehicles, the ability to generate realistic, editable driving simulations is a significant advantage. RealityBridge addresses a key bottleneck: generating high-fidelity video of rare events without requiring dangerous real-world data collection. By narrowing the Sim-to-Real gap, it could enable more robust validation of perception and planning algorithms. The framework's use of multimodal controls and lightweight neural components suggests it could be integrated into existing simulation pipelines with manageable computational overhead, although the abstract does not report compute costs.

Feature	Benefit
Multimodal controls (masks, edges, semantics)	Provide structural guidance to preserve scene layout
GateNet for adaptive condition allocation	Routes each control signal to the backbone layers where it is most useful
Autoregressive long-video training	Maintains temporal consistency over extended sequences
Reward-guided post-training	Reduces hallucination and improves perceptual quality

While the paper focuses on driving datasets, the underlying approach of editing reconstructed 3DGS scenes and then restoring realism in the rendered video has potential applications beyond autonomous driving, including robotics simulation and virtual training environments.
