Topic
distillation
New Drift-RAE Method Distills Transformers Efficiently Using Representation Autoencoders
A new research paper proposes Drift-RAE, a method for distilling pretrained flow models in representation autoencoder (RAE) latent spaces. It addresses the challenges posed by anisotropy and large curvature, reaching an FID of 1.77 on ImageNet 256×256 with only 10,000 distillation steps and outperforming existing RAE distillation methods.
Open-SWE-Traces: 207K Multilingual Trajectories Set New Standard for Autonomous Software Engineering Agents
Researchers have released Open-SWE-Traces, a dataset of 207,489 software engineering agent trajectories spanning nine programming languages, sourced from 20,000 real-world pull requests. Models fine-tuned on this data achieve state-of-the-art resolve rates on multiple SWE-bench benchmarks.
Mutual Distillation of Dual Foundation Models Achieves State-of-the-Art PET/CT Segmentation with Only 5 Labeled Cases
Researchers propose MuDuo, a mutual distillation framework that uses two foundation models (SAM-Med3D for CT, SegAnyPET for PET) to distill knowledge into a lightweight student network for semi-supervised PET/CT segmentation. The approach achieves state-of-the-art performance on the AutoPET dataset with only 5 labeled cases, requires no manual prompts, and makes fuller use of unlabeled data.