
Smooth-Basis Models Challenge Tree Ensembles in Tabular Regression Benchmark

A new study from Gerber, Luciano, Lloyd, and Huw benchmarks smooth-basis models (Chebyshev polynomial regressor, anisotropic RBF network, and a hybrid) against tree ensembles and a transformer on 55 tabular regression datasets. The transformer ranks first in accuracy but requires GPUs, while among CPU-viable models, smooth models and tree ensembles are statistically tied, with smooth models showing tighter generalization gaps.

iGEN Editorial
June 17, 2026

Tree ensembles have long dominated tabular regression, but a new study revisits smooth-basis models, namely Chebyshev polynomial regressors and radial basis function (RBF) networks, and finds they can match tree ensembles on accuracy while tending to show tighter generalization gaps. The research, conducted by Gerber, Luciano, Lloyd, and Huw and released as a preprint on arXiv, benchmarks these models across 55 regression datasets organized by application domain.

The researchers developed three smooth-basis models: an anisotropic RBF network with data-driven centre placement and gradient-based width optimization, a ridge-regularized Chebyshev polynomial regressor, and a hybrid Chebyshev model tree. All three models are released as scikit-learn-compatible packages. These were benchmarked against tree ensembles, a pre-trained transformer, and standard baselines, with evaluation covering accuracy and generalization behaviour.
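To make the idea of a ridge-regularized Chebyshev regressor concrete, here is a minimal sketch. It is not the authors' package: the additive per-feature basis, the helper names `chebyshev_features` and `fit_ridge`, the synthetic data, and the values of `degree` and `alpha` are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def chebyshev_features(X, degree, lo, hi):
    """Per-feature Chebyshev basis T_1..T_degree (assumed additive design).

    Each column is rescaled to [-1, 1], the interval on which Chebyshev
    polynomials are well behaved.
    """
    span = np.where(hi > lo, hi - lo, 1.0)
    Z = np.clip(2.0 * (X - lo) / span - 1.0, -1.0, 1.0)
    # chebvander returns T_0..T_degree; drop the constant T_0 per feature.
    blocks = [C.chebvander(Z[:, j], degree)[:, 1:] for j in range(Z.shape[1])]
    return np.hstack(blocks)

def fit_ridge(Phi, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    mu_phi, mu_y = Phi.mean(axis=0), y.mean()
    Pc = Phi - mu_phi
    A = Pc.T @ Pc + alpha * np.eye(Pc.shape[1])
    w = np.linalg.solve(A, Pc.T @ (y - mu_y))
    return w, mu_y - mu_phi @ w

# Synthetic smooth target, purely illustrative.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(400, 2))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.05, size=400)

lo, hi = X.min(axis=0), X.max(axis=0)
Phi = chebyshev_features(X, degree=8, lo=lo, hi=hi)
w, b = fit_ridge(Phi, y, alpha=1e-3)

pred = Phi @ w + b
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"in-sample R^2: {r2:.3f}")
```

The ridge penalty is what keeps higher-degree fits from oscillating; a real application would choose `degree` and `alpha` by cross-validation rather than fixing them as done here.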

Key Findings

The transformer ranked first on accuracy across a majority of datasets, according to the study. However, its GPU dependence, inference latency, and dataset-size limits constrain deployment in CPU-based settings common across applied science and industry. Among CPU-viable models, smooth models and tree ensembles were statistically tied on accuracy, but the former tended to exhibit tighter generalization gaps.
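The article does not give the study's exact definition of the generalization gap. A common reading, assumed here, is the test error minus the training error. The sketch below computes that quantity for two Chebyshev fits of different degree on synthetic data; the function names, the data, and the degrees are illustrative assumptions, not the paper's protocol.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def generalization_gap(y_train, pred_train, y_test, pred_test):
    """Test RMSE minus train RMSE (assumed definition of the gap)."""
    return rmse(y_test, pred_test) - rmse(y_train, pred_train)

# Synthetic 1-D problem: small training set, larger held-out set.
rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, 40)
x_test = rng.uniform(-1.0, 1.0, 400)
y_train = np.tanh(3.0 * x_train) + rng.normal(scale=0.1, size=40)
y_test = np.tanh(3.0 * x_test) + rng.normal(scale=0.1, size=400)

gaps = {}
for degree in (3, 15):
    coef = C.chebfit(x_train, y_train, degree)
    gaps[degree] = generalization_gap(
        y_train, C.chebval(x_train, coef),
        y_test, C.chebval(x_test, coef),
    )
    print(f"degree {degree:2d}: gap = {gaps[degree]:+.4f}")
```

A gap near zero means the model performs about as well on unseen data as on the data it was fitted to; a large positive gap is the usual sign of overfitting.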

Smooth-basis models are well established in numerical analysis. Their continuously differentiable prediction surfaces suit surrogate optimization, sensitivity analysis, and other settings where the response varies gradually with the inputs.
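Differentiability matters in practice because the gradient of the prediction with respect to the inputs is available in closed form. The sketch below assumes a Gaussian RBF with one width per centre and per input dimension, which is one common meaning of "anisotropic"; the article does not specify the paper's kernel. The centres, widths, and weights are random placeholders rather than fitted values.

```python
import numpy as np

def rbf_predict(x, centres, widths, weights):
    """f(x) = sum_k w_k * exp(-0.5 * sum_j ((x_j - c_kj) / s_kj)^2)."""
    z = (x - centres) / widths                    # shape (K, d)
    phi = np.exp(-0.5 * np.sum(z ** 2, axis=1))   # shape (K,)
    return float(weights @ phi)

def rbf_gradient(x, centres, widths, weights):
    """Analytic gradient of rbf_predict with respect to x, shape (d,)."""
    z = (x - centres) / widths
    phi = np.exp(-0.5 * np.sum(z ** 2, axis=1))
    # d phi_k / d x_j = -phi_k * (x_j - c_kj) / s_kj^2
    return -(weights * phi) @ (z / widths)

# Placeholder parameters: 5 centres in 3 dimensions.
rng = np.random.default_rng(2)
centres = rng.normal(size=(5, 3))
widths = rng.uniform(0.5, 2.0, size=(5, 3))   # per-centre, per-dimension widths
weights = rng.normal(size=5)
x0 = rng.normal(size=3)

grad = rbf_gradient(x0, centres, widths, weights)

# Central finite differences as an independent check of the analytic gradient.
h = 1e-6
fd = np.array([
    (rbf_predict(x0 + h * e, centres, widths, weights)
     - rbf_predict(x0 - h * e, centres, widths, weights)) / (2 * h)
    for e in np.eye(3)
])
print("analytic:", grad)
print("finite-difference:", fd)
```

A tree ensemble's prediction is piecewise constant, so its gradient is zero almost everywhere and undefined at split boundaries. This is why smooth surrogates are often preferred for gradient-based optimization and sensitivity analysis.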

The paper recommends routinely including smooth-basis models in the candidate pool, particularly when downstream use benefits from tighter generalization and gradually varying predictions.

Model Comparison

Model Type          | Accuracy Rank              | GPU Required | Generalization Gap | Deployment Suitability
Transformer         | 1st (majority of datasets) | Yes          | Not reported       | Limited (GPU-dependent)
Tree ensembles      | Tied (CPU models)          | No           | Wider              | CPU-friendly
Smooth-basis models | Tied (CPU models)          | No           | Tighter            | CPU-friendly

Implications for Practitioners

The results suggest that data scientists should consider smooth-basis models as viable alternatives to tree ensembles, especially where prediction smoothness and generalization are critical. The availability of scikit-learn-compatible packages lowers the barrier to adoption. The findings are particularly relevant for settings that rely on CPU-based inference, which the study describes as common across applied science and industry.

The research team did not disclose specific funding sources or affiliations beyond the arXiv submission. The paper is available under the identifier arXiv:2602.22422.

