
Topic

deep learning

25 stories
Study on Pedestrian Attribute Recognition Identifies Sparsity Wall and Optimizes Edge Deployment
Technology | Artificial Intelligence | #pedestrian attribute recognition #optimization dynamics

A new study on pedestrian attribute recognition (PAR) addresses extreme class imbalance in large-scale datasets. Researchers identified the "majority negative class cheating trap" and proposed a calibrated Multi-Label Focal Loss configuration. They also defined the "Sparsity Wall," a boundary where global loss reweighting fails, requiring instance-level intervention.

Jun 16, 2026 1 source
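
The pedestrian attribute recognition summary above mentions a calibrated Multi-Label Focal Loss. As a rough illustration only, the standard binary focal loss applied independently to each label might look like the sketch below. The function name and the `gamma` and `alpha` defaults are assumptions chosen for illustration; they are not the calibrated configuration from the study, which the summary does not specify.

```python
import math

def multilabel_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss over the labels of one sample.

    probs   -- predicted probability for each attribute (0..1)
    targets -- ground-truth 0/1 value for each attribute
    gamma, alpha -- illustrative defaults, NOT the study's calibrated values
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # p_t is the probability the model assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # alpha weights positives; (1 - alpha) weights negatives.
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of well-classified labels.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(probs)

# A confidently correct negative contributes far less than a missed positive.
easy_negative = multilabel_focal_loss([0.01], [0])
missed_positive = multilabel_focal_loss([0.01], [1])
print(easy_negative, missed_positive)
```

The down-weighting of easy negatives is what keeps a model from scoring well by predicting "absent" for every rare attribute.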
MoFore: A New Self-Supervised Framework Learns Video Representations by Forecasting Future Latent Embeddings
Technology | Artificial Intelligence | #machine learning #self-supervised learning

Researcher Xu Qinwu introduces MoFore (Momentum-Guided Semantic Forecasting), a self-supervised framework for video representation learning. Instead of reconstructing masked pixels or aligning contrastive pairs, MoFore learns by forecasting future latent embeddings from temporally distant clips. Experiments on the UCF101 dataset show strong temporal stability and emergent category-level structure without action labels.

Jun 16, 2026 1 source
LLM-Encoded Knowledge Guides Federated Graph Recommendation to Improve Accuracy
Technology | Artificial Intelligence | #federated learning #graph recommendation

Researchers propose a federated graph recommendation framework that leverages LLM-encoded semantic knowledge to guide cross-client structural aggregation, addressing the challenge of non-IID client data. The method consistently outperforms existing federated graph baselines on standard benchmarks.

Jun 16, 2026 1 source
EcoBin Neural Network Cuts Waste Sorting Errors by Detecting Contamination in Recyclables
Technology | Artificial Intelligence | #waste classification #deep learning

EcoBin is a two-stage deep convolutional neural network that classifies household waste and explicitly accounts for contamination. The first stage achieves 87.42% test accuracy and 96.13% pathway-adjusted accuracy, while the contamination stage distinguishes clean from contaminated items with a 0.99 ROC-AUC. On contaminated recyclables, the full pipeline correctly routes 24 of 25 items, a significant improvement over the base classifier alone.

Jun 16, 2026 1 source
AI and Deep Learning Transform Cattle Identification for Livestock Supply Chain Security
Technology | Artificial Intelligence | #machine learning #deep learning

A systematic review of machine learning and deep learning techniques for cattle identification reveals that deep learning methods like CNNs, ResNets, and YOLO outperform classical approaches in detection and recognition tasks. Key features include muzzle prints and coat patterns, while challenges remain in dataset availability and real-time processing.

Jun 16, 2026 1 source
New Hardware-Aware Neural Architecture Search Runs on Embedded Devices with Under 512MB RAM
Technology | Artificial Intelligence | #hardware-aware #neural architecture search

Researchers propose a hardware-aware neural architecture search (HW NAS) method that runs on embedded devices with under 512MB of RAM. It produces tiny convolutional neural networks for low-end microcontrollers, enabling on-device AI without cloud dependence. The approach achieves state-of-the-art results on the Visual Wake Word dataset.

Jun 16, 2026 1 source
EyeMVP AI Model Enhances Retinal Screening by Learning OCT Insights from Fundus Photos
Technology | Artificial Intelligence | #artificial intelligence #computer vision

Researchers developed EyeMVP, a cross-modal retinal foundation model that enriches color fundus photography (CFP) with depth-resolved information from optical coherence tomography (OCT). Pretrained on 674,893 paired images from 112,642 patients across eight Chinese hospitals, EyeMVP outperforms leading models on 16 downstream tasks including macular edema detection (AUROC 0.948 vs 0.852) and myopic macular schisis (0.825).

Jun 16, 2026 1 source
New Rational Sparse Autoencoder Improves AI Interpretability with Trainable Activation Function
Technology | Artificial Intelligence | #machine learning #autoencoder

Researchers introduce the Rational Sparse Autoencoder (RSAE), which replaces fixed encoder nonlinearities with a trainable rational function. Across three language models and three baseline activation families, RSAE strictly improves reconstruction and downstream-behaviour metrics while preserving feature-level interpretability, adding only a few scalar parameters per autoencoder.

Jun 16, 2026 1 source
Cortical Geometry and Wiring Serve as Powerful Inductive Biases for Recurrent Neural Networks
Technology | Artificial Intelligence | #artificial intelligence #neural networks

A new study leveraging the MICrONS functional connectomics dataset demonstrates that recurrent neural networks initialized with cortical geometry, wiring, and functional relationships consistently outperform baseline and partially constrained models across three decision-making tasks, achieving lower entropy and modular organization.

Jun 16, 2026 1 source
Scribby Multi-Level LLM Framework Promises Fine-Grained Semantic Analysis of Long-Form Video
Technology | Artificial Intelligence | #llm #video analysis

Researchers propose Scribby, an LLM-based framework for semantic video analysis that balances macro-level comprehension with micro-level semantic indexing. The approach analyzes full transcripts and individual sentences, then groups sentences by semantic similarity using an LLM as a judge, enabling a more detailed understanding of video structure and thematic progression.

Jun 16, 2026 1 source
New Research Advances Emotional Speech Synthesis with Latent Representations and FastSpeech 2
Technology | Artificial Intelligence | #emotional speech synthesis #latent representations

Researchers have published an empirical study on arXiv detailing a method for emotional speech synthesis by integrating speaker embedding and a prosody bottleneck into the FastSpeech 2 architecture. The approach addresses two sub-tasks: generating emotional speech for a single speaker and transferring speaking styles from another speaker while retaining target speaker identity. The work was submitted to the VLSP 2022 competition.

Jun 16, 2026 1 source
TimeVista: Researchers Use Vision-Language Models as Judges for Time Series Forecasting Evaluation
Technology | Artificial Intelligence | #time series forecasting #vision-language models

Researchers propose using vision-language models (VLMs) as judges for time series forecasting, addressing limitations of traditional point-wise metrics. They introduce TimeVista, a benchmark of 5,563 samples, and show that VLMs achieve significantly higher consistency with human preferences than conventional metrics. The study also assesses Time Series Foundation Models.

Jun 16, 2026 1 source
Steady-Forcing: New AI Framework Balances Spatial Persistence and Motion in Long-Horizon Nature Video Generation
Technology | Artificial Intelligence | #video diffusion #nature video

A team of researchers has introduced Steady-Forcing, a framework designed to address the stability-motion trade-off in long-horizon nature video generation. The method combines a persistent visual anchor, motion memory, and distillation from a large teacher model to maintain background identity while sustaining fluid dynamics over multi-minute rollouts.

Jun 16, 2026 1 source
Chaos-Informed Wave Interference Model Boosts Cross-City Traffic Forecasting with Less Data
Technology | Artificial Intelligence | #artificial intelligence #traffic forecasting

A research paper introduces CIWI-CKT, a chaos-informed wave interference feature fusion framework with cross-city knowledge transfer for traffic flow forecasting. The model addresses data scarcity and chaotic traffic dynamics, significantly outperforming existing methods on four real-world datasets while requiring less training data.

Jun 16, 2026 2 sources
DH-V2: Geometry-Based Sampler Achieves 1,433x Compression for Edge Perception
Technology | Artificial Intelligence | #computer vision #visual sampler

Researchers present Double-Helix Vision (DH-V2), a geometry-based visual sampler that compresses 2D images into compact 1D signals using golden-ratio-inspired spiral trajectories. At 4K resolution, it achieves a 1,433x compression ratio while running in 0.52ms on CPU-only hardware, and includes a JSON-serializable Robotics API for bandwidth-constrained perception.

Jun 16, 2026 1 source
New Sub-Semantic Image Segmentation Method DETECTURE Introduced by Researchers, Outperforms Baselines
Technology | Artificial Intelligence | #sub-semantic #image segmentation

Researchers propose a new category of image segmentation called sub-semantic, which uses language to partition images into stable appearance patterns rather than whole objects. They introduce DETECTURE, a method that couples a vision-language model with SAM 3 to overcome three failure modes, and create a new dataset called TextureADE derived from ADE20K. DETECTURE achieves the strongest performance on several datasets compared to baselines.

Jun 16, 2026 1 source
VigilFormer: Deformable Attention for Video Anomaly Detection with Causal Risk Inference
Technology | Artificial Intelligence | #video anomaly detection #deformable attention

A new AI framework, VigilFormer, uses deformable attention and causal inference to detect anomalies in surveillance video at 41.5 FPS, outperforming prior methods on three benchmarks.

Jun 16, 2026 1 source
Multiple Descents in Deep Learning Linked to Order-Chaos Transitions in LSTM Networks, New Research Shows
Technology | Artificial Intelligence | #deep learning #lstm

Researchers have observed a 'multiple-descent' phenomenon in LSTM networks, where test performance cycles through ups and downs after overtraining. Asymptotic stability analysis reveals that these cycles are linked to order-chaos phase transitions, with the optimal training step occurring at the first transition from order to chaos, where the 'edge of chaos' is widest.

Jun 16, 2026 1 source
Deep Learning Automates Doppler Angle Estimation in Ultrasound, Reducing Measurement Errors
Technology | Artificial Intelligence | #deep learning #ultrasound

A deep learning approach developed using 2,100 carotid ultrasound images can automatically estimate the Doppler angle, reducing measurement error. The best model achieved a mean absolute error below the clinical threshold, potentially improving blood velocity measurements.

Jun 16, 2026 1 source
New Method Reduces Object Hallucinations in Large Vision-Language Models by Over 35%
Technology | Artificial Intelligence | #artificial intelligence #computer vision

A research paper introduces Attention Imbalance Rectification (AIR), a decoding-time intervention that reduces object hallucination rates in large vision-language models by up to 35.1%. The method addresses attention imbalances across and within modalities, enhancing model reliability for applications like autonomous driving and medical image analysis.

Jun 16, 2026 1 source
PACT Hybrid Architecture Combines Small Language Model Planning with Reinforcement Learning for Enhanced Decision-Making
Technology | Artificial Intelligence | #artificial intelligence #language models

Researchers propose Plan, Align, Commit, Think (PACT), a hybrid architecture that couples a fast reactive reinforcement learning policy with a slow deliberative small language model (SLM) planner. The SLM asynchronously generates and validates action plans, which are executed directly once verified as safe through simulation. Evaluated on three FrozenLake configurations, PACT outperformed all baselines using a 2B-parameter SLM backbone, demonstrating that deliberative planning and reactive execution complement each other.

Jun 16, 2026 1 source
Expert Tying Reduces Memory Footprint of Mixture-of-Experts LLMs by Nearly Half
Technology | Artificial Intelligence | #tied expert layers #mixture-of-experts

A new arXiv paper from Jaggi proposes Expert Tying, an architectural modification for Mixture-of-Experts LLMs that shares expert parameters across consecutive transformer layers. Pretraining experiments show memory footprint reduction by almost 2x with virtually no degradation in perplexity or downstream quality, evaluated on OLMoE, Qwen3, and DeepSeek-style architectures.

Jun 16, 2026 1 source
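
To see why sharing expert parameters across consecutive layers could yield "almost 2x" rather than exactly 2x, a back-of-the-envelope parameter count helps. The sketch below assumes experts are tied in groups of two adjacent layers while attention and other non-expert weights stay per-layer. Every number and name here is hypothetical and chosen for illustration, not taken from the paper.

```python
def moe_param_count(num_layers, experts_per_layer, params_per_expert,
                    other_params_per_layer, tie_group=1):
    """Count parameters when expert weights are shared within groups of
    `tie_group` consecutive layers (tie_group=1 means no sharing)."""
    # Number of distinct expert sets that must be stored (ceiling division).
    distinct_expert_sets = -(-num_layers // tie_group)
    expert_params = distinct_expert_sets * experts_per_layer * params_per_expert
    # Non-expert weights (attention, norms, router) are assumed untied.
    other_params = num_layers * other_params_per_layer
    return expert_params + other_params

# Hypothetical model: 16 layers, 64 experts of 1M parameters each,
# plus 2M non-expert parameters per layer.
baseline = moe_param_count(16, 64, 1_000_000, 2_000_000, tie_group=1)
tied = moe_param_count(16, 64, 1_000_000, 2_000_000, tie_group=2)
ratio = baseline / tied
print(baseline, tied, round(ratio, 3))
```

Under these assumed sizes the expert weights halve but the untied remainder does not, so the overall reduction lands just under 2x.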
Tool-IQA: Augmenting Image Quality Assessment with Simple Tools to Improve VLM-Based Scoring
Technology | Artificial Intelligence | #image quality assessment #computer vision

Researchers propose Tool-IQA, a method that enhances Vision-Language Models (VLMs) for image quality assessment by adding two tools, a Magnifier and a Gamma Corrector. This shifts the process from static one-shot scoring to a tool-augmented workflow, achieving a PLCC of 0.854 on the CLIVE dataset and outperforming existing state-of-the-art models.

Jun 16, 2026 1 source
Unifying Acoustic Features and Text with Multimodal LLMs for Neurodegenerative Disease Staging
Technology | Artificial Intelligence | #multimodal #llms

Researchers propose NeurMLLM, a multimodal generative framework that integrates acoustic features and text using a large language model for neurodegenerative disease staging. Evaluated on the Bridge2AI-Voice dataset, it outperforms classical machine learning and existing LLM-based methods for Alzheimer's and Parkinson's disease staging.

Jun 16, 2026 1 source
How Multi-Label Classification and Generative AI Scale User Feedback Analysis
Technology | Artificial Intelligence | #ai #machine learning

A research paper on arXiv details how a major software company used supervised machine learning for multi-label topic classification and generative AI for summarization to efficiently process large volumes of user feedback. The study found that sentiment analysis alone does not reliably indicate user satisfaction, emphasizing the need for explicit satisfaction surveys.

Jun 16, 2026 1 source