
RAMS: Resource-Adaptive Model Switching for Embedded Edge Perception Under Load

Researchers present RAMS, a runtime controller that monitors device pressure and dynamically selects among three YOLOv8 tiers on embedded hardware, achieving up to 5.6x faster inference than a fixed medium model while retaining 74% of its accuracy. The system introduces a detection-conditioned switching policy and a new scalar metric, SWAS, for offline policy comparison.

iGEN Editorial
June 16, 2026

Edge object detection on embedded hardware faces a fundamental trade-off: inference latency must stay low under fluctuating resource pressure without sacrificing detection quality. In a paper posted to arXiv, Kushal Khemani, Evan Leri, George Xu, and Amit Hod propose RAMS (Resource-Adaptive and Detection-Conditioned Model Switching), a lightweight runtime controller that dynamically selects among three resident YOLOv8 tiers without model-reload latency. The approach targets deployments where CPU, memory, or power constraints change unpredictably, from robotics to autonomous vehicles.

How RAMS Works

RAMS monitors device pressure (e.g., CPU load, memory usage) and calibrates switching thresholds based on idle behaviour. It manages three YOLOv8 models: NANO (320×320 px), SMALL (416×416 px), and MEDIUM (640×640 px). Five switching policies are defined, including two detection-conditioned variants that prevent aggressive model downgrades when recent vulnerable-road-user (VRU) detections have occurred. The same controller equations operate across a 37× latency range, as demonstrated on Raspberry Pi 5, x86 laptops, and Jetson Orin with ONNX and TensorRT runtimes.
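As a rough illustration of the switching idea described above, the following Python sketch combines a threshold rule on device pressure with a detection-conditioned hold that blocks downgrades shortly after a VRU detection. It is not the authors' controller: the `Controller` class, the threshold values, and the `hold_frames` window are all assumptions made for this example, and the article does not state the actual equations or policy parameters.

```python
from dataclasses import dataclass

# Tiers ordered from cheapest to most accurate, as named in the article.
TIERS = ["NANO", "SMALL", "MEDIUM"]


@dataclass
class Controller:
    # All numeric values below are hypothetical, chosen only for illustration.
    low: float = 0.5        # pressure below this: MEDIUM is considered affordable
    high: float = 0.8       # pressure at or above this: fall back to NANO
    hold_frames: int = 10   # frames during which a VRU detection blocks downgrades
    tier: int = 2           # index into TIERS; start at MEDIUM
    frames_since_vru: int = 10**9  # "no recent VRU" at start-up

    def threshold_tier(self, pressure: float) -> int:
        """Threshold-only rule: map normalized pressure (0..1) to a tier index."""
        if pressure >= self.high:
            return 0
        if pressure >= self.low:
            return 1
        return 2

    def step(self, pressure: float, vru_detected: bool) -> str:
        """Pick the tier for the next frame from pressure and the latest detections."""
        self.frames_since_vru = 0 if vru_detected else self.frames_since_vru + 1
        target = self.threshold_tier(pressure)
        # Detection-conditioned part: refuse to downgrade while a VRU was seen recently.
        if target < self.tier and self.frames_since_vru < self.hold_frames:
            target = self.tier
        self.tier = target
        return TIERS[self.tier]


if __name__ == "__main__":
    ctrl = Controller(hold_frames=2)
    trace = [(0.2, False), (0.9, True), (0.9, False), (0.9, False), (0.6, False)]
    for pressure, vru in trace:
        print(pressure, vru, ctrl.step(pressure, vru))
```

In this sketch, upgrades are always allowed and only downgrades are gated, which mirrors the article's description of preventing aggressive downgrades after recent VRU detections. Because all three models stay resident, switching is just a change of index rather than a model load.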

The VRU-Weighted Accuracy Score (SWAS)

To compare policies offline without ground-truth annotations, the team introduced the VRU-Weighted Accuracy Score (SWAS), a scalar metric that weights detection accuracy by the presence of vulnerable road users. Because a score derived from the detector's own outputs is partly circular, the authors also report an oracle-bounded variant of SWAS that separates this circularity from genuine tier-retention benefit. Under heavy load on Jetson Orin with TensorRT, detection-conditioned switching improved SWAS by 25.4% (oracle scoring) and 47.3% (detector-derived scoring) relative to threshold-only policies.
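The article does not give the SWAS formula, so the sketch below only illustrates the general idea of a VRU-weighted scalar: frames where a VRU is present count more heavily than frames where none is. The function name `weighted_score`, the per-frame proxy accuracies, and the `vru_weight` value are hypothetical and should not be read as the paper's definition.

```python
def weighted_score(proxy_accuracy, vru_present, vru_weight=3.0):
    """Weighted mean of per-frame proxy accuracy (illustrative, not the paper's SWAS).

    proxy_accuracy: per-frame accuracy proxy in [0, 1] for the tier that ran.
    vru_present:    per-frame flag, True if a VRU is considered present.
    vru_weight:     hypothetical weight for VRU-positive frames (others get 1.0).
    """
    if len(proxy_accuracy) != len(vru_present):
        raise ValueError("inputs must have the same length")
    weights = [vru_weight if v else 1.0 for v in vru_present]
    total = sum(weights)
    if total == 0:
        return 0.0  # empty trace
    return sum(w * a for w, a in zip(weights, proxy_accuracy)) / total


if __name__ == "__main__":
    vru = [False, True]
    # A policy that keeps a stronger tier during the VRU-positive frame...
    holds_tier = weighted_score([0.5, 1.0], vru)
    # ...versus one that stays on the weakest tier throughout.
    drops_tier = weighted_score([0.5, 0.5], vru)
    print(holds_tier, drops_tier)
```

Under this kind of weighting, a policy that retains a stronger tier during VRU-positive windows scores higher than one with the same behaviour elsewhere, which is the effect the detection-conditioned policies are meant to capture.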

Performance Benchmarks

Policy | Mean latency (ms) | Speedup vs fixed-MEDIUM | Proxy accuracy retained | SWAS improvement (detector-derived)
Threshold-only | Not reported | - | - | Baseline
safety2 (detection-conditioned) | 3.41 | 5.6× | 74% | +47.3%

On the Jetson Orin under heavy load, the safety2 policy achieved a mean latency of 3.41 ms, 5.6× faster than fixed-MEDIUM inference, while retaining 74% of its proxy accuracy through near-NANO operation with selective SMALL and MEDIUM locks during VRU-positive windows. Live evaluation on the KITTI dataset reported per-tier VRU recall rates of 24.2% for NANO, 41.2% for SMALL, and 59.0% for MEDIUM, indicating that reactive overrides are fundamentally limited by the baseline detector’s recall.
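The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the 5.6× speedup is a ratio of mean latencies measured under the same load, which the article implies but does not state, so the derived fixed-MEDIUM latency is an estimate, not a reported number. It also computes the share of VRUs each tier misses on KITTI, which bounds what any reactive, detection-triggered override can achieve.

```python
# Figures quoted in the article.
safety2_ms = 3.41   # mean latency of the safety2 policy (Jetson Orin, heavy load)
speedup = 5.6       # reported speedup relative to fixed-MEDIUM
vru_recall = {"NANO": 0.242, "SMALL": 0.412, "MEDIUM": 0.590}  # KITTI, per tier

# Assumption: speedup = fixed-MEDIUM mean latency / safety2 mean latency.
implied_medium_ms = safety2_ms * speedup

# Fraction of VRUs a tier never detects; an override triggered by detections
# cannot react to these.
missed = {tier: 1.0 - recall for tier, recall in vru_recall.items()}

if __name__ == "__main__":
    print(f"implied fixed-MEDIUM latency: {implied_medium_ms:.1f} ms")
    for tier, frac in missed.items():
        print(f"{tier}: misses {frac:.1%} of VRUs")
```

Under that assumption, fixed-MEDIUM inference would take roughly 19 ms per frame. The miss rates show why near-NANO operation limits reactive overrides: a tier that misses about three quarters of VRUs will rarely trigger the lock that would upgrade it.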

Implications for Embedded Perception

The ability to switch models at runtime without reload latency and to condition decisions on prior detections offers a practical path for embedded vision systems that must operate under uncertain resource budgets. While the current work focuses on autonomous driving scenarios (VRU detection), the same architecture applies to any edge perception task where latency and accuracy must be balanced dynamically. Future work could extend the policy set or integrate real-time resource forecasting.

