Artificial Intelligence #vision-language-action #robotics
FineVLA Framework Improves Robot Instruction Following, Raising Real-World Dual-Arm Manipulation Success to 62.7%
Researchers introduce FineVLA, an open framework for fine-grained instruction alignment in vision-language-action (VLA) robot policies. The framework comprises a dataset of 47,159 human-verified trajectories, a benchmark with 500 videos and 11,631 atomic facts, and a steerable policy that raises real-world dual-arm manipulation success from 49.9% (raw-only) to 62.7%, a gain of 12.8 percentage points.
Jun 16, 2026 2 sources
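
The two success rates in the summary can be read as either an absolute or a relative improvement, and the two readings give different numbers. The sketch below works through that arithmetic using only the 49.9% and 62.7% figures reported above. The helper name `success_gain` is hypothetical and is not part of FineVLA.

```python
def success_gain(baseline_pct: float, improved_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = improved_pct - baseline_pct      # difference between the two rates
    relative = absolute / baseline_pct * 100.0  # gain as a share of the baseline
    return absolute, relative


# Figures taken from the summary: raw-only baseline vs. the steerable policy.
absolute, relative = success_gain(49.9, 62.7)
print(f"{absolute:.1f} percentage points, {relative:.1f}% relative")
# prints "12.8 percentage points, 25.7% relative"
```

Under this reading, 62.7% is the final success rate, not the size of the improvement.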