
Cough Regression Benchmark Reveals Trade-Offs in Respiratory Acoustic Foundation Models

A new benchmark from researchers at NC State evaluates five respiratory acoustic foundation models on cough regression tasks—predicting age, BMI, and disease probability from cough audio. The study reveals that smaller MLP heads often outperform linear probes, but full-MLP heads overfit on small clinical data. HeAR and M2D+Resp achieve near-full performance with only 50 samples, while OPERA models require 400. Cross-dataset transfer is asymmetric, with large diverse datasets generalizing better to small clinical populations.

iGEN Editorial
June 16, 2026
Respiratory acoustic foundation models (FMs) have demonstrated strong performance in cough classification, that is, determining whether a cough indicates disease. However, their ability to predict continuous health quantities from cough audio, such as age, BMI, or disease probability, remains largely unexplored. This regression capability has clinical value in settings where physical measurements are unavailable, because it enables passive health monitoring. A new preprint by researchers including Mayur Sanap, Prasanna Desikan, and Edgar Lobaton introduces the first multi-model, multi-target cough regression benchmark to evaluate these models.

The Cough Regression Benchmark

The benchmark assesses five foundation models—OPERA-CT, OPERA-CE, OPERA-GT, HeAR, and M2D+Resp—across six targets (age, BMI, disease probability on multiple datasets) under subject-disjoint protocols. The models are tested on three datasets: Coswara, CIDRZ, and CoughVID. Three types of regression heads are compared: linear probing, a small multi-layer perceptron (MLP-small), and a full MLP.
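A subject-disjoint protocol means that no person contributes recordings to both the training and the test split. The article does not describe how the authors implement it; the following is a minimal sketch under stated assumptions. The record layout `(subject_id, features, target)`, the function name `subject_disjoint_split`, and the toy values are all hypothetical.

```python
import random
from collections import defaultdict


def subject_disjoint_split(records, test_fraction=0.2, seed=0):
    """Split records so every subject lands entirely in train or in test.

    `records` is a list of (subject_id, features, target) tuples. This layout
    is an assumption made for the sketch, not the benchmark's data format.
    """
    by_subject = defaultdict(list)
    for record in records:
        by_subject[record[0]].append(record)

    # Sort before shuffling so the split is reproducible for a given seed.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [r for s in subjects if s not in test_subjects for r in by_subject[s]]
    test = [r for s in subjects if s in test_subjects for r in by_subject[s]]
    return train, test


# Toy data (made up): 10 subjects with 3 cough recordings each.
records = [
    (f"subj{i}", [float(i), float(j)], 20.0 + i)
    for i in range(10)
    for j in range(3)
]
train, test = subject_disjoint_split(records)
train_ids = {r[0] for r in train}
test_ids = {r[0] for r in test}
print(len(train), len(test), train_ids.isdisjoint(test_ids))
```

Splitting by subject rather than by recording matters because several coughs from one person share speaker characteristics, so a per-recording split could let a regression head appear to generalize when it has partly memorized individuals.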

According to the arXiv preprint, MLP-small beats the mean-predictor baseline on all tasks and outperforms linear probing in 23 of 30 model × task cases. The full MLP, by contrast, overfits on small clinical data but recovers on larger datasets, revealing a trade-off between dataset size and head capacity.
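For readers unfamiliar with the mean-predictor baseline, the sketch below shows what beating it means: the baseline always predicts the training-set mean of the target, and a model is useful only if its mean absolute error (MAE) is lower. The function names and the numbers are illustrative assumptions, not values from the preprint.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_predictor_mae(y_train, y_test):
    """MAE obtained by always predicting the training-set mean."""
    mean = sum(y_train) / len(y_train)
    return mean_absolute_error(y_test, [mean] * len(y_test))


# Made-up ages in years, for illustration only.
y_train = [20.0, 30.0, 40.0, 50.0]   # training mean is 35.0
y_test = [25.0, 45.0]
model_pred = [28.0, 41.0]            # hypothetical regression-head output

baseline = mean_predictor_mae(y_train, y_test)    # (10 + 10) / 2 = 10.0
model = mean_absolute_error(y_test, model_pred)   # (3 + 4) / 2 = 3.5
beats_baseline = model < baseline
print(baseline, model, beats_baseline)
```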

Model Performance by Target

| Model | Best Age MAE (Coswara) | Best Age MAE (CIDRZ) | Key Observation |
|---|---|---|---|
| HeAR | 9.12 yr | Excluded due to pretraining overlap | Leads within-dataset age regression on Coswara |
| OPERA-GT | (favored over OPERA-CT) | Margin within seed variance | Generative pretraining advantage from breath to cough |
| M2D+Resp | Near-full performance at N=50 | Similar | Data-efficient on small samples |

According to the authors, HeAR leads within-dataset age regression on Coswara with a mean absolute error (MAE) of 9.12 years. However, HeAR's results on CIDRZ are excluded from headline claims due to possible overlap between HeAR's pretraining data and CIDRZ. OPERA-GT is favored over OPERA-CT on age in all three datasets, with the CIDRZ margin within seed variance, extending a generative-pretraining advantage from breath analysis to cough.

Data Efficiency Across Models

Data efficiency varies significantly. HeAR and M2D+Resp reach near-full performance with only N=50 samples, while OPERA models require N=400 samples to achieve comparable results. This makes HeAR and M2D+Resp particularly attractive for deployment in low-data scenarios, such as emerging outbreak monitoring in under-resourced regions.
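The article does not define "near-full performance" numerically. As a sketch, one reasonable reading is the smallest training size whose MAE falls within a relative tolerance of the MAE at the largest training size. The function `samples_to_near_full`, the 5% tolerance, and both learning curves below are assumptions for illustration; the curves are invented to mirror the qualitative pattern described, not taken from the preprint.

```python
def samples_to_near_full(mae_by_n, tolerance=0.05):
    """Smallest N whose MAE is within `tolerance` (relative) of the MAE
    measured at the largest N. Returns None for an empty curve."""
    if not mae_by_n:
        return None
    full_n = max(mae_by_n)
    threshold = mae_by_n[full_n] * (1.0 + tolerance)
    for n in sorted(mae_by_n):
        if mae_by_n[n] <= threshold:
            return n
    return full_n  # fallback; the largest N always meets its own threshold


# Invented learning curves: training size -> MAE in years.
fast_curve = {25: 12.0, 50: 10.2, 100: 10.1, 200: 10.0, 400: 10.0, 800: 10.0}
slow_curve = {25: 15.0, 50: 13.5, 100: 12.0, 200: 11.0, 400: 10.3, 800: 10.0}

fast_n = samples_to_near_full(fast_curve)
slow_n = samples_to_near_full(slow_curve)
print(fast_n, slow_n)
```

With these invented curves, the first model crosses the threshold at N=50 and the second at N=400. A tighter or looser tolerance would move both numbers, which is why the definition needs to be stated when comparing models.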

Asymmetric Cross-Dataset Transfer

Cross-dataset transfer is strongly asymmetric. The study reports that large, diverse data generalizes to small clinical populations (e.g., CoughVID to CIDRZ yields a bias of -0.17 years), but transfer in the opposite direction fails (CIDRZ to Coswara yields a bias of +2.43 years, a 26.6% increase). This highlights the importance of using large, diverse training datasets when building regression models for cough audio.
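Bias here is a signed quantity, unlike MAE. The sketch below assumes the common convention of prediction minus ground truth, so a positive bias means the model over-predicts on average. The article does not state the sign convention used in the preprint, and the values are made up.

```python
def bias(y_true, y_pred):
    """Mean signed error (prediction minus truth); positive means the
    model over-predicts on average. The sign convention is an assumption."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up ages in years.
y_true = [30.0, 40.0, 50.0]
shifted = [33.0, 42.0, 52.5]     # consistently too high
balanced = [27.0, 40.0, 53.0]    # errors cancel out

shifted_bias = bias(y_true, shifted)                  # (3 + 2 + 2.5) / 3 = 2.5
balanced_bias = bias(y_true, balanced)                # (-3 + 0 + 3) / 3 = 0.0
balanced_mae = mean_absolute_error(y_true, balanced)  # (3 + 0 + 3) / 3 = 2.0
print(shifted_bias, balanced_bias, balanced_mae)

# Arithmetic check only: 2.43 / 9.12 is about 0.266. Whether the reported
# 26.6% is measured against the 9.12-year MAE is a guess, not a sourced fact.
ratio = 2.43 / 9.12
```

A bias near zero does not by itself imply accurate predictions, as the second example shows, so bias and MAE are best read together. The article does not say what the 26.6% figure is relative to.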

For enterprise technology decision-makers, these findings have practical implications. When deploying respiratory acoustic AI in clinical or remote monitoring systems, the right choice of foundation model and regression head depends on dataset size and target population. HeAR and M2D+Resp offer data efficiency for small labeled datasets, while OPERA models need larger ones (N=400 in the benchmark) to reach comparable performance. The asymmetric transfer results underscore the need to match training data to the deployment population.

