Artificial Intelligence #multimodal #embedding
MMLongEmbed Benchmark Reveals Limitations in Long-Context Multimodal Embedding Models
MMLongEmbed is the first comprehensive benchmark for evaluating multimodal embedding models (MEMs) in long-context scenarios. It comprises four retrieval tasks spanning the text, document, and video modalities. The evaluation finds that current MEMs rely heavily on superficial feature matching and struggle to capture deep semantic and structural dependencies. Their performance degrades systematically depending on context length and on where the key information is placed within the context.
Jun 16, 2026 1 source