
Study Finds Hybrid CNN-Clay Model Improves Landslide Detection Accuracy Over Baseline

A study evaluates Clay v1.5, a Geospatial Foundation Model, for pixel-level landslide segmentation on the Landslide4Sense benchmark. The hybrid U-Net + Clay model with two-stage LoRA achieves a test F1 of 64.5%, outperforming both the Clay-only backbone and a standard U-Net baseline.

iGEN Editorial
June 17, 2026

Rapid post-event landslide mapping is critical for disaster response but remains difficult because landslide pixels are extremely rare relative to background. According to a study posted on arXiv by researcher Huong Binh Vu, a hybrid approach combining a Convolutional Neural Network (CNN) with a Geospatial Foundation Model (GFM) improves pixel-level landslide detection over both a standard U-Net and the foundation model used alone.

The research evaluates Clay v1.5, a Geospatial Foundation Model, on the Landslide4Sense (L4S) benchmark dataset. This dataset contains 3,799 training chips with 14 Sentinel-2 and terrain bands, and approximately 2% of pixels represent positive landslide samples.
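To see why a roughly 2% positive rate makes this task hard to score, consider the toy sketch below. It is not taken from the paper: the 1,000-pixel mask and both sets of predictions are made up for illustration. It shows that plain accuracy can look excellent while the F1 score for the landslide class, the metric the study reports, tells a very different story.

```python
def f1_score(pred, truth):
    """Pixel-level F1 for the positive (landslide) class over flat 0/1 lists."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0  # no true positives: F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(pred, truth):
    """Fraction of pixels labeled correctly, regardless of class."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)


# Hypothetical mask: 1,000 pixels, 2% of them landslide.
truth = [1] * 20 + [0] * 980

# Predictor A labels everything as background.
all_background = [0] * 1000

# Predictor B finds half of the landslide pixels and raises 10 false alarms.
partial_detector = [1] * 10 + [0] * 10 + [1] * 10 + [0] * 970

print(accuracy(all_background, truth), f1_score(all_background, truth))      # 0.98 0.0
print(accuracy(partial_detector, truth), f1_score(partial_detector, truth))  # 0.98 0.5
```

Both predictors reach 98% accuracy, yet only the second one detects any landslides at all, which is why F1 is the more informative number on a dataset this imbalanced.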

Methodology

The study compares three strategies: using Clay as the primary encoder with multi-scale residual terrain fusion, a U-Net backbone augmented with Clay semantic context at the bottleneck, and a standard U-Net baseline. The hybrid U-Net + Clay model employs two-stage Low-Rank Adaptation (LoRA).
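For readers unfamiliar with LoRA, the sketch below illustrates the general idea in plain Python: a frozen weight matrix W is adapted by adding a scaled low-rank product B·A, and only A and B are trained. The layer size, rank, and scaling factor are hypothetical, and the article does not describe how the paper's two stages are configured, so this shows a single generic LoRA update rather than the study's setup.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A. W stays frozen; only A and B train."""
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out x rank) @ (rank x d_in) -> d_out x d_in
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Hypothetical sizes, chosen only to keep the example small.
d_out, d_in, rank, alpha = 8, 8, 2, 4
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen weights
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # small random init
b = [[0.0] * rank for _ in range(d_out)]                              # zero init

adapted = lora_weight(w, a, b, alpha, rank)

full_params = d_out * d_in            # parameters updated by full fine-tuning
lora_params = rank * (d_in + d_out)   # parameters updated by LoRA

print(adapted == w)              # True: with B at zero, the adapter starts as a no-op
print(full_params, lora_params)  # 64 32
```

Initializing B to zero, as in the original LoRA formulation, means the adapted model starts out identical to the pretrained one. The parameter saving is modest at this toy size but grows quickly for the large layers of a foundation model, where the rank is far smaller than the layer width.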

Results

The hybrid U-Net + Clay model achieved the best test F1 score, 64.5% ± 1.8% over three seeds, ahead of the U-Net baseline (59.9%) and the Clay-only backbone (55.2% ± 3.6%). Used as a standalone encoder, Clay underperformed the U-Net, which the study attributes to the absence of multi-scale skip connections. Its pretrained representations nonetheless improved performance consistently when injected as auxiliary context.

Model                         Test F1 score
U-Net baseline                59.9%
Clay-only backbone            55.2% ± 3.6%
Hybrid U-Net + Clay (LoRA)    64.5% ± 1.8%
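As a reading aid for these figures, the sketch below shows one common way a "mean ± standard deviation over three seeds" summary is produced, and works out the gaps between the reported mean scores. The per-seed values are invented for illustration and are not the paper's runs. The article also does not say whether the study uses the sample or population standard deviation, so the sample version here is an assumption.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    return mean(scores), stdev(scores)


# Hypothetical per-seed test F1 scores (percent); NOT taken from the paper.
hypothetical_seed_f1 = [61.0, 63.0, 65.0]
m, s = summarize(hypothetical_seed_f1)
print(f"{m:.1f}% ± {s:.1f}%")  # 63.0% ± 2.0%

# Mean test F1 scores as reported in the article (percent).
reported = {
    "U-Net baseline": 59.9,
    "Clay-only backbone": 55.2,
    "Hybrid U-Net + Clay (LoRA)": 64.5,
}

# Gap between the hybrid model and each alternative, in percentage points.
hybrid = reported["Hybrid U-Net + Clay (LoRA)"]
gaps = {name: round(hybrid - f1, 1) for name, f1 in reported.items() if f1 != hybrid}
print(gaps)  # {'U-Net baseline': 4.6, 'Clay-only backbone': 9.3}
```

On the reported means, the hybrid model leads the U-Net baseline by 4.6 percentage points and the Clay-only backbone by 9.3.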

Implications

The findings suggest that Geospatial Foundation Models are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them. This hybrid approach offers a path to more reliable automated landslide mapping, which is essential for rapid disaster response. The paper's code and data are available through arXiv.

