Topic: discrimination (1 story)

New Benchmark 'AgentFairBench' Tests Whether LLM Agents Discriminate in Real Actions

Artificial Intelligence #llm #ai agents

Researchers have introduced AgentFairBench, a reproducible benchmark for measuring demographic disparity in the actions LLM agents take. Unlike traditional fairness tests, which grade a model's answers, it evaluates the agent's actions across hiring, lending, and medical triage using counterfactual matched sets. A pilot study of 864 decisions found that naively comparing score spreads can overstate disparity by roughly 2.4x; measured against a proper null, Claude Haiku 4.5 showed no significant demographic effect.
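The summary above does not say how the benchmark builds its null, so the following is only a minimal sketch of one common approach, a label-permutation test over matched sets. It assumes each matched set holds one numeric decision score per demographic variant of an otherwise identical case. The names `group_spread` and `permutation_null` and the toy data are hypothetical and are not taken from AgentFairBench or its pilot study. The idea is that the raw spread between group means is positive even when only noise is present, so it should be compared against the spreads obtained when group labels are shuffled within each set.

```python
import random
from statistics import mean


def group_spread(matched_sets):
    """Max minus min of the per-group mean scores (the 'naive' spread)."""
    groups = list(matched_sets[0].keys())
    means = [mean(s[g] for s in matched_sets) for g in groups]
    return max(means) - min(means)


def permutation_null(matched_sets, n_perm=2000, seed=0):
    """Compare the observed spread with spreads under shuffled labels.

    Assumption (not from the source): shuffling group labels within each
    matched set is a valid null, because the cases differ only in the
    demographic attribute.
    Returns (observed_spread, mean_null_spread, p_value).
    """
    rng = random.Random(seed)
    groups = list(matched_sets[0].keys())
    observed = group_spread(matched_sets)
    null_spreads = []
    for _ in range(n_perm):
        shuffled = []
        for s in matched_sets:
            values = [s[g] for g in groups]
            rng.shuffle(values)  # reassign scores to groups at random
            shuffled.append(dict(zip(groups, values)))
        null_spreads.append(group_spread(shuffled))
    # Add-one correction keeps the p-value strictly above zero.
    exceed = sum(x >= observed for x in null_spreads)
    p_value = (1 + exceed) / (n_perm + 1)
    return observed, mean(null_spreads), p_value


if __name__ == "__main__":
    # Illustrative toy data only: noise with no built-in group effect.
    rng = random.Random(42)
    toy = [{g: rng.gauss(0.5, 0.1) for g in ("A", "B", "C")} for _ in range(40)]
    observed, null_mean, p = permutation_null(toy)
    print(f"observed spread={observed:.3f} null mean={null_mean:.3f} p={p:.3f}")
```

Reading the observed spread next to the mean null spread shows how much of the raw number noise alone would produce, which is the kind of overstatement the summary describes.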

Jun 16, 2026 1 source