MIT Technology Review March 18, 2025
Google is only the latest to fuse large language models with robots. The trend has big implications.
Last Wednesday, Google made a somewhat surprising announcement. It launched a version of its AI model, Gemini, that can do things not just in the digital realm of chatbots and internet search but out here in the physical world, via robots.
Gemini Robotics fuses the power of large language models with spatial reasoning, allowing you to tell a robotic arm to do something like “put the grapes in the clear glass bowl.” These commands get filtered by the LLM, which identifies intentions from what you’re saying and then breaks them down into commands that the robot can carry out. For more details...