Topic

diffusion models

11 stories

Artificial Intelligence #ai #artificial intelligence

Repurposing a Speech Classifier for Guided Diffusion-Based Speech Generation

Researchers Makarov and Gerkmann propose a method to repurpose a conventionally trained speech classifier as the backbone for diffusion-based speech generation. By attaching a lightweight subnetwork and training only that subnetwork under a Denoising Score Matching objective, they achieve high-quality speech synthesis with a smaller memory footprint and lower computational cost than traditional classifier guidance, which requires two separately trained models.

Jun 21, 2026 1 source

MakeupMirror Model Boosts Facial Attribute Preservation in Diffusion-Based Makeup Transfer

Technology

Artificial Intelligence #makeup transfer #diffusion models

Researchers propose MakeupMirror, a diffusion-based makeup transfer model that preserves facial identity and skin tone better than previous solutions. It achieves 60% higher facial recognition similarity, 50% lower skin tone difference, and 0.7s latency, with 94% expert acceptance, advancing virtual try-on for e-commerce.

Jun 21, 2026 1 source

New Research Provides Conditional Diffusion Guidance Under Hard Constraints for AI

Technology

Artificial Intelligence #conditional diffusion #diffusion models

A research paper proposes a framework for conditional generation in diffusion models under hard constraints, using Doob's h-transform and martingale-based learning algorithms. The method guarantees constraint satisfaction with probability one, targeting safety-critical applications and rare-event simulation.

Jun 21, 2026 1 source

PerceptionDLM: Multimodal Diffusion Model Achieves Parallel Region Perception

Technology

Artificial Intelligence #artificial intelligence #computer vision

Researchers propose PerceptionDLM, a multimodal diffusion language model optimized for parallel region perception. Built on the state-of-the-art baseline PerceptionDLM-Base, it uses efficient prompting and structured attention masking to generate descriptions for multiple masked regions simultaneously, significantly improving inference efficiency. The team also introduces the ParaDLC-Bench benchmark to evaluate parallelism in visual perception.

Jun 20, 2026 1 source

New Diffusion Model Learns Permutation Distributions with Softer, More Tractable Trajectories

Technology

Artificial Intelligence #machine learning #diffusion models

Researchers propose Soft-Rank Diffusion, a discrete diffusion framework that learns probability distributions over permutations more effectively than prior shuffle-based methods. By replacing abrupt shuffle corruption with a structured soft-rank forward process and introducing contextualized generalized Plackett-Luce denoisers, the method achieves consistent gains on sorting and combinatorial optimization tasks, especially for long sequences.

Jun 16, 2026 1 source

Gen-VCoT: New Framework Generates RGB Images as Visual Chain-of-Thought Intermediates for Multimodal AI Reasoning

Technology

Artificial Intelligence #generative ai #visual reasoning

Researchers propose Gen-VCoT, a framework that generates RGB images as visual chain-of-thought intermediates, improving spatial reasoning by 25% and depth reasoning by 50% over baseline MLLMs, though text-based CoT remains superior for simple factual queries.

Jun 16, 2026 1 source

Divide-and-Denoise: Game-Theoretic Method Ensures Fair Composition of Diffusion Models

Technology

Artificial Intelligence #game-theory #diffusion-models

Researchers propose Divide-and-Denoise, a game-theoretic method for composing multiple pre-trained diffusion models fairly. At each timestep, an allocation divides the noisy sample into regions, maximizing utility under fairness constraints. The method outperforms baselines on the GenEval benchmark, resolving common failures like missing objects and mismatched attributes.

Jun 16, 2026 1 source

Trust-Region Diffusion Policies Enable Expressive AI for Complex Control Tasks

Technology

Artificial Intelligence #ai #reinforcement learning

Researchers introduce Trust-Region Diffusion Policies (TruDi), a method that enables diffusion models to be used in massively parallel on-policy reinforcement learning. By enforcing a KL-divergence constraint over the entire diffusion trajectory, TruDi achieves stable training and outperforms strong baselines across 73 diverse tasks, showing particular gains on challenging humanoid control problems.

Jun 16, 2026 1 source

Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models

Technology

Artificial Intelligence #artificial intelligence #language models

Masked Diffusion Language Models (MDLMs) have emerged as a distinct paradigm for sequence generation, but combining their knowledge is an underexplored problem. Researchers introduce TIE (Trajectory-based Iterative Ensembling), a framework that tracks confidence dynamics over answer-relevant positions to relay decoding trajectories between models, achieving strong performance on diverse reasoning tasks.

Jun 16, 2026 1 source

First Wasserstein-2 Convergence Proof for Decentralized Diffusion Models with ODE Samplers

Technology

Artificial Intelligence #wasserstein convergence #ode-based samplers

A team of researchers has proven the first convergence guarantee in Wasserstein-2 distance for ODE-based samplers in decentralized diffusion models. The work supplies the missing theoretical foundation for decentralized generative architectures that replace a single global velocity field with multiple local experts and a routing mechanism. The result shows that the distribution converges at rate O(N^{-1/2}+ε), paving the way for privacy-scalable AI deployments.

Jun 16, 2026 1 source

DifFRACT Brings Circuit Tracing to Diffusion Transformers for Better AI Interpretability

Technology

Artificial Intelligence #diffusion models #ai

Researchers introduce DifFRACT, a method for mechanistic interpretability of multimodal diffusion transformers. By training timestep-conditioned transcoders on FLUX.1[schnell], they achieve exact feature-to-feature attribution and recover compact circuits, outperforming sparse autoencoders in precision.

Jun 16, 2026 1 source