Noise Amplification Method Detects AI-Generated Videos by Revealing Hidden Artifacts

In a preprint posted to arXiv, researchers introduce Noise Amplification, an approach for detecting AI-generated videos by amplifying noise signals extracted from bit-planes. The authors report that the method significantly outperforms existing detectors on two benchmarks, targeting the increasingly realistic output of text-to-video models.

iGEN Editorial

June 16, 2026

As text-to-video AI models produce increasingly realistic content, enterprise security teams face a growing challenge: distinguishing genuine video evidence from synthetic fabrications. A new research paper on arXiv proposes a detection method called Noise Amplification that reveals subtle artifacts invisible to current detectors.

The study, by Renxi Cheng, Jie Gui, and Hongsong Wang, approaches detection from the bit-plane perspective. A bit-plane is the set of bits at one position across all pixels of an image; according to the paper, bit-planes describe the details or noise in images and videos. The Noise Amplification technique first extracts noise signals from bit-planes, then amplifies them, and finally feeds the amplified signal into discriminator networks for classification.
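To make the bit-plane idea concrete, here is a minimal NumPy sketch of splitting an 8-bit frame into its eight binary planes and keeping only the lowest ones as a rough noise residual. The function names and the choice of `k = 2` low-order planes are illustrative assumptions; the article does not say which planes the authors use or how they extract the noise signal.

```python
import numpy as np


def bit_planes(frame: np.ndarray) -> np.ndarray:
    """Split an 8-bit frame into 8 binary planes.

    Returns an array of shape (8, *frame.shape). Plane 0 holds the least
    significant bit of every pixel, plane 7 the most significant.
    """
    if frame.dtype != np.uint8:
        raise TypeError("expected an 8-bit (uint8) frame")
    # One shift amount per plane, broadcast across the frame's own axes.
    shifts = np.arange(8, dtype=np.uint8).reshape((8,) + (1,) * frame.ndim)
    return (frame[None, ...] >> shifts) & 1


def low_bit_residual(frame: np.ndarray, k: int = 2) -> np.ndarray:
    """Keep only the k lowest bit-planes of each pixel.

    Assumption for illustration: the low-order bits stand in for the
    'noise signal'; the paper's actual extraction step may differ.
    """
    mask = np.uint8((1 << k) - 1)
    return frame & mask


if __name__ == "__main__":
    frame = np.array([[0b10110101, 255], [0, 3]], dtype=np.uint8)
    print(bit_planes(frame)[:, 0, 0])  # bits of pixel (0, 0), LSB first
    print(low_bit_residual(frame))     # values in the range 0..3
```

With `k = 2` the residual takes values from 0 to 3, which is why some form of amplification is needed before the signal is useful to a classifier.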

The amplification process is constructed from three components:

Pixel-level intensity enhancement to strengthen individual pixel discrepancies
Region-level spatial amplification to emphasize artifact patterns in local areas
Frame-level temporal aggregation to leverage inconsistencies across video frames
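The three stages listed above can be mocked up as a small pipeline. The sketch below is a set of stand-in operations, not the authors' implementation: the article names the stages but not their formulas, so rescaling the residual (pixel level), normalizing each tile by its own statistics (region level), and averaging absolute frame-to-frame differences (frame level) are all assumptions chosen for illustration, as are the function names and the `k` and `block` parameters.

```python
import numpy as np


def amplify_pixels(residual: np.ndarray, k: int = 2) -> np.ndarray:
    """Pixel level (stand-in): stretch a k-bit residual to the range [0, 1]."""
    return residual.astype(np.float32) / float((1 << k) - 1)


def amplify_regions(x: np.ndarray, block: int = 8, eps: float = 1e-6) -> np.ndarray:
    """Region level (stand-in): normalize each block x block tile on its own."""
    h, w = x.shape
    h2, w2 = h - h % block, w - w % block  # crop to a whole number of tiles
    tiles = x[:h2, :w2].reshape(h2 // block, block, w2 // block, block)
    mean = tiles.mean(axis=(1, 3), keepdims=True)
    std = tiles.std(axis=(1, 3), keepdims=True)
    return ((tiles - mean) / (std + eps)).reshape(h2, w2)


def aggregate_frames(frames: np.ndarray) -> np.ndarray:
    """Frame level (stand-in): mean absolute change between consecutive frames."""
    if frames.shape[0] < 2:
        raise ValueError("need at least two frames")
    return np.abs(np.diff(frames, axis=0)).mean(axis=0)


def amplify_video(video: np.ndarray, k: int = 2, block: int = 8) -> np.ndarray:
    """Run the three stand-in stages on a grayscale uint8 video of shape (T, H, W)."""
    mask = np.uint8((1 << k) - 1)  # keep the k lowest bit-planes
    per_frame = [amplify_regions(amplify_pixels(f & mask, k), block) for f in video]
    return aggregate_frames(np.stack(per_frame))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.integers(0, 256, size=(4, 20, 20), dtype=np.uint8)
    print(amplify_video(video).shape)  # cropped to whole 8x8 tiles
```

In a full detector, a map like the one returned by `amplify_video` would be the input to the discriminator network described in the article; that classifier is omitted here.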

To evaluate the method in challenging scenarios, the authors created a new benchmark named HardGVD and tested on it alongside the large-scale dataset GenVidBench. According to the paper, "Extensive experiments on both the large-scale dataset GenVidBench and HardGVD show that our simple approach significantly outperforms state-of-the-art methods."

Component	Purpose
Pixel-level intensity enhancement	Strengthens individual pixel noise differences
Region-level spatial amplification	Emphasizes artifact patterns in local image regions
Frame-level temporal aggregation	Leverages inconsistencies across video frames

Most existing detection research focuses on videos generated by generative adversarial networks (GANs). However, the paper notes that detecting samples from text-to-video models "still remains an uncharted territory." While state-of-the-art text-to-video models can produce realistic content, the authors observe they "fall short of generating the details of the images and the changes in details within the videos." Noise Amplification exploits this shortfall.

For enterprise technology leaders, the implications are significant. As synthetic video becomes indistinguishable from real footage to the human eye, automated detection tools must evolve. The authors describe their method as simple, which suggests it could serve as a foundation for detection in security and compliance workflows, particularly for verifying video evidence in supply chain audits, fraud investigations, or remote monitoring.

The research is listed under Computer Science > Computer Vision and Pattern Recognition on arXiv. The paper does not disclose specific performance metrics, claiming only that the method outperforms current state-of-the-art methods. Enterprise adopters would need to test Noise Amplification against their own datasets and assess its computational requirements.

The authors have not announced plans for commercial implementation or open-source release, but the simplicity of the approach, which is based on standard discriminator networks, suggests it could be adapted by enterprise teams building custom detection systems. As text-to-video models proliferate, such detection mechanisms are likely to become a critical component of the enterprise security stack.
