Topic

osguard

1 story

New OSGuard Benchmark Evaluates Safety of Computer-Use Agents for Enterprise AI Deployment

Artificial Intelligence #ai safety #benchmark

Researchers introduce OSGuard, a benchmark suite for evaluating the safety of computer-use agents. It includes action-level guardrail decisions and a risk-augmented execution suite for detecting unsafe completions, that is, runs that satisfy the nominal task objective while still acting unsafely. Early tests show that current multimodal guardrails perform well on isolated action judgments but leave gaps in end-to-end safety.

Jun 16, 2026 1 source