Leveraging Large Language Models for Interacting with Agility’s Humanoid Robots



As I reflect upon the past year, I am amazed at how much time I have spent discussing the potential impact of generative AI and large language models on the field of robotics. It has become increasingly apparent that these technologies hold the key to transforming the way robots communicate, learn, perceive their surroundings, and are programmed.

In light of this, numerous top universities, research labs, and companies are actively exploring how best to use these artificial intelligence platforms. Among them is Agility, a well-funded startup based in Oregon, which has been experimenting with the technology for some time now using its bipedal robot, Digit.

Today, the company showcased some of that work in a short video shared on its social media channels.

“[W]e were curious to see what can be achieved by integrating this technology into Digit, a physical embodiment of artificial intelligence,” the company explains. “We created a demo space with a series of numbered towers of several heights, as well as three boxes with multiple defining characteristics. Digit was given information about this environment, but was not given any specific information about its tasks, just natural language commands of varying complexity to see if it could execute them.”

The video demonstrates Digit’s capabilities as it is instructed to pick up a box the color of “Darth Vader’s lightsaber” and move it to the tallest tower. Although the process appears slow and deliberate in this early-stage demo, the robot successfully completes the task as described.
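Agility hasn’t published implementation details, but the pattern it describes (serialize the environment into the prompt, give the model a natural-language command, and have it return a structured plan for the robot to execute) can be sketched roughly. Everything below is a hypothetical illustration: the environment encoding, the plan format, and the `stub_llm` function standing in for a real model call are all my assumptions, not Agility’s actual system.

```python
import json

# Hypothetical scene description, loosely modeled on the demo Agility
# describes: numbered towers of different heights and boxes with
# distinguishing colors. None of this reflects Agility's real interfaces.
ENVIRONMENT = {
    "towers": {"tower_1": {"height_m": 0.5}, "tower_2": {"height_m": 1.4}},
    "boxes": {"box_a": {"color": "red"}, "box_b": {"color": "green"}},
}


def build_prompt(command: str) -> str:
    """Serialize the environment and the user's command into one LLM prompt."""
    return (
        "You control a bipedal robot. The environment is:\n"
        + json.dumps(ENVIRONMENT, indent=2)
        + "\nReply with a JSON list of steps, each of the form "
        '{"action": "pick" or "place", "object": ..., "target": ...}.\n'
        + f"Command: {command}"
    )


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call, returning a canned plan for the demo
    command (Darth Vader's lightsaber is red, and tower_2 is tallest)."""
    return json.dumps(
        [
            {"action": "pick", "object": "box_a", "target": None},
            {"action": "place", "object": "box_a", "target": "tower_2"},
        ]
    )


def execute(command: str) -> list:
    """Ask the (stubbed) model for a plan and dispatch each step.

    A real system would hand these steps to motion-planning primitives
    rather than just logging them."""
    plan = json.loads(stub_llm(build_prompt(command)))
    return [(s["action"], s["object"], s["target"]) for s in plan]
```

With the stub in place, `execute("move the red box to the tallest tower")` returns `[("pick", "box_a", None), ("place", "box_a", "tower_2")]`; the point is only that the model never sees task-specific code, just the scene and the command.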

Agility proudly states, “Our innovation team developed this interactive demo to show how LLMs could make our robots more versatile and faster to deploy. The demo enables people to talk to Digit in natural language and ask it to do tasks, giving a glimpse at the future.”

If you want to receive the top news in robotics every week, don’t forget to sign up for Actuator.

Natural language communication has been identified as a key potential application for this technology, along with the ability to program systems through low- and no-code technologies.

During my panel at Disrupt, Gill Pratt of the Toyota Research Institute shared how they are utilizing generative AI to accelerate robotic learning:

“We have figured out how to do something, which is use modern generative AI techniques that enable human demonstration of both position and force to essentially teach a robot from just a handful of examples. The code is not changed at all. What this is based on is something called diffusion policy. It’s work that we did in collaboration with Columbia and MIT. We’ve taught 60 different skills so far.”
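TRI hasn’t released the code behind this, but the core idea of a diffusion policy (sample an action by iteratively denoising random noise, conditioned on the robot’s observation) can be caricatured in a few lines. In the sketch below the denoiser is a hand-written closed form rather than a network trained on human demonstrations, and every name and number is an illustrative assumption, not TRI’s method.

```python
import math
import random

# Illustrative noise schedule; a real diffusion policy would use a tuned one.
BETAS = [0.1] * 40


def denoise_step(a_t: float, t: int, target_mean: float) -> float:
    """One reverse-diffusion step nudging a noisy action toward the target.

    In a trained diffusion policy, target_mean would be predicted by a
    network fit to human demonstrations; here it is given in closed form."""
    beta = BETAS[t]
    a_prev = a_t + beta * (target_mean - a_t)
    if t > 0:  # re-inject a little noise on all but the final step
        a_prev += math.sqrt(beta) * 0.1 * random.gauss(0.0, 1.0)
    return a_prev


def sample_action(observation: float, steps: int = 40) -> float:
    """Draw an action for this observation by denoising pure noise."""
    # Hypothetical conditioning: the demonstrated behavior is to output
    # half the observed value (e.g. reach halfway toward an object).
    target_mean = 0.5 * observation
    a = random.gauss(0.0, 1.0)  # start from pure noise
    for t in reversed(range(steps)):
        a = denoise_step(a, t, target_mean)
    return a
```

Averaged over many samples, `sample_action(2.0)` concentrates near the demonstrated action `1.0`. The appeal Pratt describes is that the same sampling loop works for any skill; only the learned denoiser changes, not the surrounding code.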

Daniela Rus of MIT CSAIL also recently revealed, “It turns out that generative AI can be quite powerful for solving even motion planning problems. You can get much faster solutions and much more fluid and human-like solutions for control than with model predictive solutions. I think that’s very powerful, because the robots of the future will be much less roboticized. They will be much more fluid and human-like in their motions.”

The potential applications of these advancements are vast and thrilling – and Digit, an advanced commercially available robotic system currently piloted at Amazon fulfillment centers and other real-world locations, seems to be an ideal candidate for this technology. If robots are to effectively collaborate with humans, they must learn to listen to them.

Zara Khan

Zara Khan is a seasoned investigative journalist with a focus on social justice issues. She has won numerous awards for her groundbreaking reporting and has a reputation for fearlessly exposing wrongdoing.
