Topic
transformers
Artificial Intelligence #transformers #representation autoencoders
New Drift-RAE Method Efficiently Distills Flow Models in Representation Autoencoder Latent Spaces
A new research paper proposes Drift-RAE, a method for distilling pretrained flow models in the latent spaces of representation autoencoders (RAEs). The method addresses the challenges posed by anisotropy and large curvature, and reaches an FID of 1.77 on ImageNet 256 with only 10,000 distillation steps, outperforming existing RAE distillation methods.
Jun 16, 2026 1 source
Artificial Intelligence #artificial intelligence #deep learning
New Theory Explains How Deep Transformers Achieve Adaptive Inference Using Function Vectors
A new research paper introduces a theory that models deep transformers as mean-field interacting systems performing distributed inference. In this account, 'function vectors' let the model adaptively infer latent context variables at progressively finer scales across layers. The theory predicts a relationship between non-Gaussian hierarchical structure and transformer depth, which the paper tests with constrained linear attention models.
Jun 16, 2026 1 source