iGEN
Visit IGEN World Explore IGEN Expo
EXPLORE UPGRADE PLANS
BREAKING
Bayesian 3D Steerable CNNs Combine Equivariance and Uncertainty Quantification LLM Agents May Fake System Crashes to Evade Constraints, New Research Finds Structural Heterogeneity in LLM Verification: Signal Quality Varies Across Cost Strata MatchLM2Lite: Scalable MLLM-Lite Framework Cuts Reproduced Video Views by 2.5% AIChilles Automatically Unearths Hidden Weaknesses in AI-Evolved Programs Vernier Research Reveals Why Language Models Give Inconsistent Answers to Causal Questions After Variable Renaming RAG and LLMs Combined to Generate Personalized Reading Content at Desired Complexity Unassigned Agents in Multi-Agent Path Finding Addressed by Compilation-Based Solvers New Framework Reduces Visual Hallucinations in Multimodal AI Systems Without Retraining MAF Framework Dynamically Optimizes Prompting for Multimodal Sentiment Analysis Bayesian 3D Steerable CNNs Combine Equivariance and Uncertainty Quantification LLM Agents May Fake System Crashes to Evade Constraints, New Research Finds Structural Heterogeneity in LLM Verification: Signal Quality Varies Across Cost Strata MatchLM2Lite: Scalable MLLM-Lite Framework Cuts Reproduced Video Views by 2.5% AIChilles Automatically Unearths Hidden Weaknesses in AI-Evolved Programs Vernier Research Reveals Why Language Models Give Inconsistent Answers to Causal Questions After Variable Renaming RAG and LLMs Combined to Generate Personalized Reading Content at Desired Complexity Unassigned Agents in Multi-Agent Path Finding Addressed by Compilation-Based Solvers New Framework Reduces Visual Hallucinations in Multimodal AI Systems Without Retraining MAF Framework Dynamically Optimizes Prompting for Multimodal Sentiment Analysis
Home ›› Topics ›› synthetic data

Topic

synthetic data

3 stories
SpecAlign Framework Uses Synthetic Data to Align Large Language Models with Specific Policies Technology
Artificial Intelligence #large language models#synthetic data

SpecAlign Framework Uses Synthetic Data to Align Large Language Models with Specific Policies

A research paper introduces SpecAlign, a framework that generates synthetic training data from provider-authored model specifications to align large language models with specific policies. The method combines structured rule annotation, controllable instantiation, and multi-agent adversarial data synthesis to create preference pairs for fine-tuning. Experiments show improved rule compliance without sacrificing general capabilities.

Jun 16, 2026 1 source
New Auditing Framework Detects Synthetic Data Privacy Leaks Without Model Access Technology
Artificial Intelligence #synthetic data#auditing

New Auditing Framework Detects Synthetic Data Privacy Leaks Without Model Access

A new causal framework for auditing synthetic data detects privacy leaks by distinguishing true disclosures from phantom ones. It uses statistical hypothesis testing with holdout sets, requires no model access or canary insertion, and is orders of magnitude more efficient than shadow-model approaches.

Jun 16, 2026 1 source
StateGen Platform Generates Synthetic Training Data for Tool-Augmented LLMs with 9.66/10 Hallucination Score Technology
Artificial Intelligence #synthetic data#multi-agent

StateGen Platform Generates Synthetic Training Data for Tool-Augmented LLMs with 9.66/10 Hallucination Score

Researchers introduce StateGen, a synthetic data generation platform that produces scored, reasoning-trace-rich training conversations for tool-augmented LLMs. The platform uses a four-role LLM loop and an authoritative state manager to eliminate tool-call hallucinations, achieving a 9.66/10 score across 64,698 evaluated conversations.

Jun 16, 2026 1 source