Artificial Intelligence #machine learning#self-supervised learning
MoFore: A New Self-Supervised Framework Learns Video Representations by Forecasting Future Latent Embeddings
A new self-supervised video representation learning framework called MoFore (Momentum-Guided Semantic Forecasting) is introduced by researcher Xu Qinwu. Instead of reconstructing masked pixels or aligning contrastive pairs, MoFore learns by forecasting future latent embeddings from temporally distant clips. Experiments on the UCF101 dataset show strong temporal stability and emergent category-level structure without action labels.
Jun 16, 2026 1 source