As visual autoregressive (VAR) models rapidly advance high-fidelity text-to-image synthesis, ensuring the safety alignment of generated content has become a critical challenge. Existing concept erasure techniques were designed for the homogeneous denoising steps of diffusion models, and they fail when applied to VAR models, causing catastrophic semantic collapse and severe visual artifacts. To address this, researchers have introduced SACE, a scale-aware concept erasure framework that they describe as the first of its kind for VAR architectures.
The SACE framework is grounded in the Semantic Singularity Axiom, which posits that any target semantic concept embedded within a prompt is definitively locked at Scale-0, the first scale of the VAR generation process. The axiom was validated through a novel method called Incremental Semantic Saliency Analysis (ISSA), which allows the community to transparently inspect the coarse-to-fine semantic injection process. By identifying where the concept is locked, the researchers can confine interventions to that initial scale.
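To make the idea of locating where a concept is locked more concrete, here is a minimal toy sketch in Python. It is not the ISSA method itself, whose details are not given here. It assumes a hypothetical measurement: a concept score (for example, from some classifier) computed on the partial image after each scale is added. The function names, the 0.9 threshold, and the example scores are all illustrative assumptions.

```python
def incremental_saliency(cumulative_scores):
    """Per-scale gain in a (hypothetical) concept score.

    cumulative_scores[k] is assumed to be the concept score measured on the
    partial output after scales 0..k have been generated.
    """
    gains = []
    previous = 0.0
    for score in cumulative_scores:
        gains.append(score - previous)
        previous = score
    return gains


def locking_scale(cumulative_scores, fraction=0.9):
    """Index of the first scale whose cumulative score reaches
    `fraction` of the final score. Assumes a non-empty list."""
    final = cumulative_scores[-1]
    for index, score in enumerate(cumulative_scores):
        if score >= fraction * final:
            return index
    return len(cumulative_scores) - 1


# Made-up scores for illustration only: most of the concept appears at scale 0.
toy_scores = [0.92, 0.94, 0.95, 0.95]
print(incremental_saliency(toy_scores))
print(locking_scale(toy_scores))
```

With scores shaped like the toy list, nearly all of the gain lands on the first entry, which is the kind of pattern the Semantic Singularity Axiom describes for Scale-0.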
SACE couples an Entropy-Regularized Erasure Objective, which prevents high-entropy sampling degeneration, with a restorative preservation loss that anchors the integrity of entangled benign priors. Because the intervention is confined to the first scale, later scales are left undisturbed, which preserves image quality while still achieving effective erasure.
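One way such a three-part objective could be composed is sketched below in Python with NumPy. This is an illustration under stated assumptions, not the published SACE loss: the exact formulas are not given here. The sketch assumes the erasure term is a KL divergence that pulls the edited model's Scale-0 token distribution for the target prompt toward a frozen model's distribution for a neutral anchor prompt, that the entropy term is a simple penalty on the edited distribution's entropy, and that the preservation term is a KL divergence on benign prompts. The function `scale0_loss`, its arguments, and the weights are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last (vocabulary) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def mean_kl(p, q, eps=1e-12):
    # KL(p || q), averaged over token positions.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


def mean_entropy(p, eps=1e-12):
    # Shannon entropy, averaged over token positions.
    return float(np.mean(-np.sum(p * np.log(p + eps), axis=-1)))


def scale0_loss(edited_target_logits, frozen_anchor_logits,
                edited_benign_logits, frozen_benign_logits,
                entropy_weight=0.1, preserve_weight=1.0):
    """Hypothetical Scale-0 objective; all inputs have shape (tokens, vocab)."""
    p_target = softmax(edited_target_logits)   # edited model, target prompt
    p_anchor = softmax(frozen_anchor_logits)   # frozen model, neutral prompt
    p_benign = softmax(edited_benign_logits)   # edited model, benign prompt
    p_benign_ref = softmax(frozen_benign_logits)  # frozen model, benign prompt

    erase = mean_kl(p_anchor, p_target)          # steer target toward anchor
    entropy_penalty = mean_entropy(p_target)     # discourage flat distributions
    preserve = mean_kl(p_benign_ref, p_benign)   # keep benign behavior fixed
    return erase + entropy_weight * entropy_penalty + preserve_weight * preserve
```

Only Scale-0 logits enter the loss in this sketch, which mirrors the confinement to the first scale described above; later scales would simply not contribute any gradient.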
Extensive experiments demonstrate that SACE achieves precise, targeted concept erasure across various domains with minimal training overhead. The approach is designed to resolve critical safety vulnerabilities inherent in emerging VAR models, offering a timely solution for content safety in generative AI.
| Key Component | Description |
|---|---|
| Semantic Singularity Axiom | Posits target semantic concept is locked at Scale-0 of VAR generation |
| Incremental Semantic Saliency Analysis (ISSA) | Validates the axiom and enables inspection of semantic injection |
| SACE Framework | First scale-aware concept erasure framework for VAR models |
| Entropy-Regularized Erasure Objective | Prevents high-entropy sampling degeneration |
| Restorative Preservation Loss | Anchors the integrity of entangled benign priors |
This research addresses a foundational challenge in AI safety: adapting concept erasure techniques built for diffusion models to the different architecture of VAR models. The authors state that the code is publicly available via the project's arXiv page. For enterprise technology leaders leveraging generative AI, such advances are crucial for ensuring that image generation systems can safely filter or remove unwanted concepts without degrading performance or introducing artifacts.