Physics-grounded video generation requires controllable 3D object dynamics that remain physically consistent under contact, deformation, and external forcing. According to a paper on arXiv, existing trajectory-based methods often model isolated physical effects, making it difficult to compose conservative and non-conservative dynamics in contact-rich 3D scenes. The researchers present NEXUS, a neural energy-field framework designed to address this challenge.
The Challenge of Contact-Rich Dynamics
Contact-rich object dynamics, in which objects collide, deform, or are pushed, are notoriously difficult to simulate accurately over long time horizons. Traditional methods may handle conservative forces (like gravity), which preserve total mechanical energy, or non-conservative effects (like damping), which remove it, but struggle to combine the two. As the paper notes, modeling these effects in isolation limits the ability to compose multiple dynamics in a single scene.
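To make the conservative versus non-conservative distinction concrete, consider a textbook damped mass on a spring. This is a standard illustration, not an example taken from the paper; the symbols m (mass), k (stiffness), and c (damping coefficient) are assumptions introduced here.

```latex
% Equation of motion: spring force (conservative) plus damping force (non-conservative)
m\ddot{x} = -kx - c\dot{x}

% Mechanical energy and its rate of change along a trajectory
E = \tfrac{1}{2} m \dot{x}^{2} + \tfrac{1}{2} k x^{2},
\qquad
\frac{dE}{dt} = \dot{x}\,(m\ddot{x} + kx) = -c\,\dot{x}^{2} \le 0
```

With c = 0 the energy is constant; with c > 0 it can only decrease. A model that composes both kinds of effect has to reproduce both behaviors at once.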
How NEXUS Works
NEXUS represents each object as a structural graph and constructs dynamic object-object and object-environment contact graphs. According to the paper, inspired by Hamiltonian Neural Networks, NEXUS formulates motion through scalar energy and dissipation terms rather than directly predicting states or accelerations. Conservative effects, including gravity and elastic deformation, are composed as additive energy terms, while non-conservative effects such as damping and impact-induced energy loss are modeled with learned Rayleigh-style dissipation. Forces are derived by differentiating the energy and dissipation functions and rolled out with a multi-substep semi-implicit integrator.
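The energy-based formulation can be illustrated with a toy example. The sketch below is not the NEXUS implementation: it uses hand-written energy terms in place of learned neural fields, a fixed damping coefficient in place of learned Rayleigh-style dissipation, finite differences in place of automatic differentiation, and it omits contact graphs and impact losses entirely. All names and constants (`potential_energy`, `step`, `rollout`, `STIFFNESS`, and so on) are hypothetical. It shows only the general pattern: sum scalar energy terms, differentiate to obtain forces, and advance with a semi-implicit (symplectic Euler) update split into substeps.

```python
from typing import Callable, List, Tuple

Vec = List[float]

# Illustrative constants (assumptions, not values from the paper).
MASS, GRAVITY, STIFFNESS, REST = 1.0, 9.81, 50.0, 1.0


def grad(f: Callable[[Vec], float], x: Vec, eps: float = 1e-6) -> Vec:
    """Central finite-difference gradient; a stand-in for automatic differentiation."""
    out = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        down = x[:i] + [x[i] - eps] + x[i + 1:]
        out.append((f(up) - f(down)) / (2 * eps))
    return out


def gravity_energy(q: Vec) -> float:
    return sum(MASS * GRAVITY * z for z in q)


def elastic_energy(q: Vec) -> float:
    # Two masses hanging in a chain from a fixed anchor at z = 0.
    return 0.5 * STIFFNESS * ((0.0 - q[0] - REST) ** 2 + (q[0] - q[1] - REST) ** 2)


def potential_energy(q: Vec) -> float:
    # Conservative effects compose as a sum of scalar energy terms.
    return gravity_energy(q) + elastic_energy(q)


def total_energy(q: Vec, v: Vec) -> float:
    return potential_energy(q) + 0.5 * MASS * sum(vi * vi for vi in v)


def step(q: Vec, v: Vec, dt: float, damping: float, substeps: int = 4) -> Tuple[Vec, Vec]:
    """One time step of semi-implicit Euler, split into several substeps."""

    def dissipation(vel: Vec) -> float:
        # Rayleigh dissipation function with a fixed (not learned) coefficient.
        return 0.5 * damping * sum(vi * vi for vi in vel)

    h = dt / substeps
    for _ in range(substeps):
        d_energy = grad(potential_energy, q)   # dE/dq
        d_dissip = grad(dissipation, v)        # dD/dv
        # Force = -dE/dq - dD/dv; update velocity first, then position.
        v = [vi - h * (ge + gd) / MASS for vi, ge, gd in zip(v, d_energy, d_dissip)]
        q = [qi + h * vi for qi, vi in zip(q, v)]
    return q, v


def rollout(q: Vec, v: Vec, damping: float, steps: int, dt: float = 0.01) -> Tuple[Vec, Vec]:
    for _ in range(steps):
        q, v = step(q, v, dt, damping)
    return q, v
```

Calling `rollout([-1.0, -2.0], [0.0, 0.0], damping=0.0, steps=1000)` keeps `total_energy` close to its initial value, while a positive `damping` makes it decay toward the resting configuration. Adding another conservative effect in this sketch means adding another term to `potential_energy`, which mirrors the additive composition the paper describes.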
Performance Benchmarks
The paper reports that, across controlled trajectory benchmarks, NEXUS improves long-horizon accuracy over representative learned and physics-structured dynamics baselines under varying mechanical properties and physical-effect compositions. The source does not detail the specific metrics, though it describes the improvement as significant.
Application to Video Generation
According to the paper, NEXUS trajectories provide effective guidance for contact-rich video generation, improving physical plausibility while maintaining competitive visual quality. This suggests potential for generating more realistic simulations in computer graphics and robotics.