iGEN

Topic

judge

3 stories
Technology, Artificial Intelligence #llm #artificial intelligence

Psychometric Datasheet Reveals 'Dark Current' Bias in LLM-as-a-Judge Evaluation Systems

Researchers introduce a Judge Datasheet protocol to measure biases in LLM-as-a-judge systems, including dark current under vacuum inputs and positional false preference. A case study of three open-weight models reveals stark differences in measurement reliability, with implications for enterprise AI evaluation.

Jun 16, 2026 1 source
Technology, Artificial Intelligence #llm #judge

Metric Match: New Subset Selection Method Improves LLM Judge Reliability Evaluation, Cuts Annotation Costs by 32.5%

Researchers developed Metric Match, a subset selection method that reduces costly human annotations needed to evaluate LLM judge reliability. The approach achieves a 0.838 win-rate over random selection, cuts estimation error by 18.7%, and reduces annotation needs by 32.5%. A medical case study showed $1,041.67 in savings.

Jun 16, 2026 1 source
Technology, Artificial Intelligence #judge #lawyers

Judge Kicks Lawyers Off Case After Both Sides Used AI to Generate Hallucinated Legal Citations

Senior US District Judge Sharion Aycock sanctioned four lawyers after discovering they used AI to produce legal citations that did not exist. The judge disqualified all lawyers from the case, barred two from the district for two years, and imposed a total fine of $8,000, setting a precedent that ignorance of AI hallucinations is not a viable defense.

Jun 16, 2026 1 source