Topic

latent safety awareness

1 story

Adaptive and Explicit safe: Triggering Latent Safety Awareness in Large Reasoning Models

Artificial Intelligence #artificial intelligence #ai safety

A new method called Safe Trigger leverages the latent safety awareness of Large Reasoning Models to improve safety alignment without external data. Using Supervised Fine-Tuning and Direct Preference Optimization, the approach reduces Attack Success Rate on harmful and jailbreak benchmarks while preserving general performance.

Jun 16, 2026 1 source