New Orthogonal Projection Method Reduces Hallucinations in Vision-Language AI Explanations

Researchers propose Orthogonal Semantic Projection (OSP), a geometric intervention that reduces semantic hallucination in Vision-Language Model explanations. The method orthogonalizes query vectors against distractor concepts, improving attribution fidelity for safety-critical AI applications.

iGEN Editorial

June 16, 2026

As Vision-Language Models become integral to safety-critical enterprise systems, ensuring their explanations are trustworthy is paramount. A persistent issue known as semantic hallucination—where attribution maps incorrectly highlight image regions based on misleading text prompts—undermines the reliability of explainable AI. A new research paper provides a formal mathematical analysis and introduces a solution called Orthogonal Semantic Projection (OSP) to address this fundamental flaw.

According to the paper "Disentangling Hallucinations: Orthogonal Semantic Projection for Robust Interpretability," published on arXiv, semantic hallucination is not an isolated artifact but a consequence of Linear Semantic Leakage in high-dimensional embedding spaces. The authors, Emirhan Bilgiç, Baptiste Caramiaux, Zhi Yan, and Gianni Franchi, demonstrate that the problem spans multiple architectures and current explainable AI (XAI) methods.

The Problem: Semantic Hallucination in AI Attributions

When a Vision-Language Model processes an image and a text prompt, attribution maps are generated to highlight which parts of the image influenced the model's output. However, even with incorrect text descriptions—for example, prompting "cat" when the image contains a dog—the attribution maps still highlight prominent regions, misleading users about the model's reasoning. This phenomenon, termed semantic hallucination, directly threatens trust in AI systems deployed in areas such as logistics automation, medical imaging, or autonomous navigation.

The researchers establish that semantic hallucination arises from Linear Semantic Leakage, a pervasive property of high-dimensional embedding spaces where shared features between concepts cause overlapping attributions. They prove this mathematically, showing it is not a bug fixable by architecture tweaks alone.
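The leakage effect can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a hypothetical three-dimensional embedding space in which one axis carries features shared by "dog" and "cat" and the other two carry concept-specific features, and it assumes attribution is a plain dot product between a patch embedding and the text query. All names and vectors are illustrative.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

# Hypothetical toy space (an assumption, not the paper's setup):
# axis 0 = features shared by both animals, axis 1 = dog-only, axis 2 = cat-only.
dog_text = unit(np.array([1.0, 1.0, 0.0]))
cat_text = unit(np.array([1.0, 0.0, 1.0]))

# Embedding of an image patch that shows a dog.
dog_patch = unit(np.array([1.0, 1.0, 0.0]))

def linear_attribution(patch, query):
    """Assumed linear attribution score: inner product of patch and query."""
    return float(patch @ query)

score_correct = linear_attribution(dog_patch, dog_text)  # prompt matches the image
score_wrong = linear_attribution(dog_patch, cat_text)    # prompt names an absent concept

print(f"dog prompt: {score_correct:.2f}, cat prompt: {score_wrong:.2f}")
```

In this toy setting the wrong prompt still scores half as high as the correct one, purely because of the shared axis. That residual score is the kind of overlap the article describes as leakage.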

Theoretical Framework: Linear Semantic Attribution

To tackle this, the authors propose a unified theoretical framework called Linear Semantic Attribution (LSA). LSA generalizes across discriminative XAI methods, providing a common mathematical foundation to analyze how prompts influence attribution maps. This framework reveals that standard methods inadvertently encode distractor information from incorrect prompts, leading to false positive visual highlights.

Aspect	Traditional XAI Methods	OSP-Enhanced Method
Reaction to incorrect prompt	Highlights prominent regions (hallucination)	Minimizes response to shared features
Handling of distractor concepts	No orthogonalization	Orthogonalizes query vector against distractors
Fidelity for correct prompts	Varies	Preserved or improved
Mathematical basis	Heuristic or black-box	Derived from Linear Semantic Leakage analysis

OSP: A Geometric Intervention

The core contribution is Orthogonal Semantic Projection (OSP), a geometric intervention that utilizes the residual property of Orthogonal Matching Pursuit (OMP). OSP disentangles unique semantic signals from shared concepts by orthogonalizing the query vector against distractor concept embeddings. The researchers prove theoretically and demonstrate empirically that OSP minimizes hallucination by rendering the attribution model "blind" to features shared between the correct and incorrect concepts, while preserving fidelity when the prompt is correct.
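The orthogonalization step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released code: the function name `orthogonalize_query`, the toy three-dimensional embeddings, and the choice of distractor set are all hypothetical. The query is replaced by its least-squares residual against the distractor embeddings, which is the same residual that an Orthogonal Matching Pursuit step leaves after fitting the selected atoms.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def orthogonalize_query(query, distractors):
    """Return the part of `query` orthogonal to the span of `distractors`.

    `distractors` is a sequence of embedding vectors (hypothetical input format).
    """
    D = np.atleast_2d(np.asarray(distractors, dtype=float)).T  # shape (dim, k)
    coeffs, *_ = np.linalg.lstsq(D, query, rcond=None)          # best fit of query by distractors
    return query - D @ coeffs                                   # least-squares residual

# Toy space (assumption): axis 0 = shared features, axis 1 = dog-only, axis 2 = cat-only.
dog_text = unit(np.array([1.0, 1.0, 0.0]))
cat_text = unit(np.array([1.0, 0.0, 1.0]))
dog_patch = unit(np.array([1.0, 1.0, 0.0]))  # image patch that shows a dog

# Wrong prompt "cat", with "dog" treated as the distractor concept.
cat_residual = orthogonalize_query(cat_text, [dog_text])
hallucinated = float(dog_patch @ cat_residual)

# Correct prompt "dog", with "cat" treated as the distractor concept.
dog_residual = orthogonalize_query(dog_text, [cat_text])
correct = float(dog_patch @ dog_residual)

print(f"wrong prompt after projection: {hallucinated:.2f}")
print(f"correct prompt after projection: {correct:.2f}")
```

In this toy case the wrong prompt's score on the dog patch drops to zero, while the correct prompt keeps a clearly positive score (0.75, down from 1.0 without projection). How distractor concepts are selected in practice is not described in this article, so the single-distractor setup here should be read as a simplification.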

In a safety-critical setting, such as a warehouse robot that uses vision-language reasoning, this has a concrete effect. If the robot receives the command "pick pallet A" when only pallet B is in view, OSP is designed to keep the attribution map from highlighting pallet B merely because the two pallets share visual features.

Implications for Enterprise AI Trustworthiness

In industries like supply chain and logistics, where AI models increasingly interpret visual data alongside natural language instructions, the reliability of explanations directly impacts operational decisions. A hallucinated attribution could lead to incorrect route planning, misidentified packages, or faulty safety alerts. By grounding XAI in a rigorous mathematical framework and providing a practical intervention like OSP, this research offers enterprises a path toward more robust and interpretable AI systems.

While the paper focuses on Vision-Language Models, the principle of orthogonalizing query vectors in high-dimensional spaces could extend to other multimodal AI used in trade documentation automation or customs image analysis. The researchers have made their code available, enabling adoption and further testing by the AI community.
