Revolutionizing Robot Training: Google’s Groundbreaking Methods Utilizing Video and Extensive Language Models

Google’s DeepMind Robotics team is one of a number of groups exploring the space’s potential. The newly announced AutoRT is designed to harness large foundation models to a number of different ends. In a standard example given by the DeepMind team, the system begins by leveraging a Visual Language Model (VLM) for better situational awareness. A large language model, meanwhile, suggests tasks that the hardware, including its end effector, can accomplish. Many researchers see LLMs as the key to unlocking robots that genuinely understand natural language commands, reducing the need to hard-code individual skills.
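The two-stage flow described above can be sketched as a simple pipeline. This is only an illustrative outline of the idea, not DeepMind's actual implementation: the function names, the stub return values, and the `effector` parameter are all hypothetical stand-ins for real VLM and LLM calls.

```python
def describe_scene(camera_image: str) -> str:
    """Hypothetical stand-in for a Visual Language Model (VLM):
    turns a camera observation into a text description of the scene."""
    return f"a table holding {camera_image}"

def propose_tasks(scene: str, effector: str) -> list[str]:
    """Hypothetical stand-in for an LLM: suggests tasks the robot's
    hardware, including its end effector, could plausibly attempt."""
    return [f"use the {effector} to pick up an item from {scene}"]

def autort_step(camera_image: str, effector: str = "gripper") -> list[str]:
    # Stage 1: VLM provides situational awareness.
    scene = describe_scene(camera_image)
    # Stage 2: LLM proposes candidate tasks for the hardware.
    return propose_tasks(scene, effector)

tasks = autort_step("a sponge and a cup")
```

In a real deployment each stub would be replaced by a model call, and the proposed tasks would be filtered for safety and feasibility before execution.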