Topic
benchmarks
Research Finds Anomalies in Multivariate Time Series Benchmarks Are Mostly Univariate
A study by Pinet, Cumin, Berlemont, and Vaufreydaz of eight public benchmarks for multivariate time series anomaly detection (MTSAD) finds that labeled anomalies are overwhelmingly univariate: no cross-channel rupture occurs without an accompanying univariate deviation. The paper's diagnostic framework and synthetic-data experiments indicate that current benchmarks do not justify cross-channel modeling, since channel-dependent detectors offer no measurable gain over channel-independent ones on these datasets. The authors call for more structurally diverse evaluation sets.
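To make the univariate versus cross-channel distinction concrete, here is a minimal toy sketch. It is not the paper's diagnostic framework; the function names, the z-score threshold of 3.0, and the synthetic two-channel data are all assumptions chosen for illustration. A channel-independent detector scores each channel on its own, while a simple channel-dependent check looks at the difference between two normally identical channels.

```python
import statistics

def zscores(values):
    """Absolute z-score of every point in one sequence."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against flat sequences
    return [abs(v - mean) / std for v in values]

def channel_independent_flags(channels, threshold=3.0):
    """Flag a time step if any single channel deviates on its own (hypothetical helper)."""
    per_channel = [zscores(ch) for ch in channels]
    n_steps = len(channels[0])
    return [any(z[t] > threshold for z in per_channel) for t in range(n_steps)]

def pairwise_difference_flags(a, b, threshold=3.0):
    """Flag a time step if the relationship a - b deviates (hypothetical helper)."""
    diff = [x - y for x, y in zip(a, b)]
    return [z > threshold for z in zscores(diff)]

# Toy data: two channels that normally carry the same alternating signal.
base = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]

# Case 1: a univariate anomaly, a spike in one channel.
spike = list(base)
spike[7] = 12.0

# Case 2: a purely cross-channel rupture. The second channel takes a value
# that is normal in isolation (+1) but breaks its agreement with the first.
broken = list(base)
broken[7] = 1.0

print(channel_independent_flags([spike, base])[7])    # spike is visible per channel
print(any(channel_independent_flags([base, broken]))) # rupture is invisible per channel
print(pairwise_difference_flags(base, broken)[7])     # rupture is visible across channels
```

In this toy setup, only the second case requires looking across channels. The study's claim is that anomalies of that second kind, with no accompanying univariate deviation, do not appear in the eight benchmarks it examined.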
LLM Tutor Benchmarks Ignore Students Who Bypass Scaffolding, Study Finds
A study introduces two metrics, Chatbot Scaffolding and Student Uptake, and applies them to 9,490 chats drawn from benchmarks and real-world deployments. It finds that real-world students often bypass pedagogical scaffolding, revealing a mismatch between lab evaluations and actual usage.