Date: 2025-02-21

Multiple “firsts” in humanoid robotics

In the not-too-distant future, you might simply say to your robot, “Put the milk in the fridge,” and watch as it does exactly that, with no coding and no complicated setup. No muss, no fuss. That vision is inching closer to reality thanks to Figure AI’s new Helix system, billed as “a generalist Vision-Language-Action (VLA) model that unifies perception, language understanding, and learned control to overcome multiple longstanding challenges in robotics.”

According to Figure AI, Helix is the first VLA to provide full upper-body control at high frequency, managing up to 35 degrees of freedom (DoF) at 200Hz across the wrists, torso, head and individual fingers. It splits the problem into two complementary systems: “System 2” (S2) handles language and scene understanding at around 7–9Hz, while “System 1” (S1) turns that understanding into fast reactive control at the full 200Hz. This split lets Helix handle everything from opening a fridge door to precisely grasping awkwardly shaped objects. Figure AI says a single set of neural network weights covers all of these behaviors, with no per-task fine-tuning.
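
To make the two-speed split concrete, here is a minimal sketch of a slow planner feeding a fast controller. It is not Figure AI's code: the names `slow_system`, `fast_system`, `Latent` and `run` are hypothetical, the 8Hz slow rate is simply an assumed value inside the quoted 7–9Hz range, and both "models" are placeholders that do no real perception or control. It only illustrates the timing relationship.

```python
from dataclasses import dataclass
from typing import List, Optional

S1_HZ = 200  # fast control rate quoted in the article
S2_HZ = 8    # ASSUMPTION: one value inside the quoted 7-9 Hz range
DOF = 35     # upper-body degrees of freedom quoted in the article


@dataclass
class Latent:
    """Hypothetical stand-in for whatever the slow system hands to the fast one."""
    step_created: int
    instruction: str


def slow_system(step: int, instruction: str) -> Latent:
    """Placeholder for S2 (language and scene understanding)."""
    return Latent(step_created=step, instruction=instruction)


def fast_system(latent: Latent, step: int) -> List[float]:
    """Placeholder for S1: returns one dummy command per degree of freedom."""
    return [0.0] * DOF


def run(duration_s: float, instruction: str) -> dict:
    """Simulate the two loops on a single timeline of fast ticks."""
    ticks_per_s2 = S1_HZ // S2_HZ  # 25 fast ticks per slow update
    total_ticks = int(duration_s * S1_HZ)
    latent: Optional[Latent] = None
    s2_updates = 0
    max_staleness = 0
    actions: List[List[float]] = []

    for step in range(total_ticks):
        # The slow system refreshes its output only every 25th tick.
        if step % ticks_per_s2 == 0:
            latent = slow_system(step, instruction)
            s2_updates += 1
        # The fast system acts on every tick, reusing the latest latent.
        max_staleness = max(max_staleness, step - latent.step_created)
        actions.append(fast_system(latent, step))

    return {
        "s1_ticks": len(actions),
        "s2_updates": s2_updates,
        "max_staleness_ticks": max_staleness,
        "action_width": len(actions[0]) if actions else 0,
    }


if __name__ == "__main__":
    print(run(1.0, "Put the milk in the fridge"))
```

Under these assumptions, one simulated second yields 200 fast ticks but only 8 slow updates, so the controller is at times acting on a plan up to 24 ticks (120 ms) old. A real system would presumably run the two loops asynchronously on separate hardware instead of in one loop; the single timeline here is only for readability.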

The home is a notoriously chaotic environment for robots; items are endless in variety, potentially fragile, and rarely where you expect them. Yet Helix reportedly enables robots to pick up virtually any small household object—“thousands of items they have never encountered before,” per Figure AI—by following natural language prompts. The press release highlights a playful example: “Pick up the desert item,” which prompts Helix to identify a toy cactus and then command the robot’s arms and fingers to grasp it.

Multi-robot collaboration and commercial readiness

A notable aspect of Helix is multi-robot collaboration. It’s reportedly the first VLA to operate simultaneously on two robots, allowing them to pass unfamiliar groceries like bags of cookies back and forth, coordinate head movements for “eye contact,” and work together on longer tasks—such as reorganizing a shelf. This real-time coordination still runs on low-power embedded GPUs, reinforcing Helix’s “commercial-ready” design. Figure AI notes that this entire system was trained with only around 500 hours of high-quality teleoperated data—far less than many comparable robotics datasets.

While Helix can already “pick up anything” without separate fine-tuning or hand-coded scripts, Figure AI sees this as just the beginning. The company aims to scale Helix “1,000x and beyond,” enabling humanoid robots to handle more complex tasks, such as folding laundry, cooking, or setting a table, simply by being asked in plain language.

Why it matters: If Helix delivers on its promise, we may finally see general-purpose household robots that can handle the unpredictable realities of daily life. This, in turn, could spur further R&D into robot collaboration, sensor fusion and safe human-robot interaction. The possibilities are considerable: in the coming years, humanoid robots could show up everywhere from homes to research labs.