Artificial Intelligence #speech #quality assessment
NVMOS: Novel AI Model Predicts Perceptual Quality of Non-Verbal Vocalizations in Speech
A new paper on arXiv introduces NVMOS, described as the first model purpose-built to assess the perceptual quality of non-verbal vocalizations (NVs) in speech, such as laughter, sighs, and coughs. The model was trained on a newly constructed NV-MOS dataset with expert ratings and is reported to reach expert-level agreement with human Mean Opinion Scores (MOS). Multimodal LLMs such as Gemini, by contrast, gave clearly inconsistent assessments when tested, highlighting the need for specialized NV quality assessment.
Jun 16, 2026