GRASP Algorithm Breakthrough Enables Reliable Long-Horizon Planning in AI World Models

Breaking News – In a significant advancement for artificial intelligence, researchers have unveiled GRASP, a novel gradient-based planner that makes long-horizon planning with learned world models practical for the first time. The method addresses critical failures that have long plagued AI systems trying to predict and act over extended sequences.

According to the team, which includes Mike Rabbat, Aditi Krishnapriyan, Yann LeCun, and Amir Bar, GRASP introduces three key innovations: lifting trajectories into virtual states for parallel optimization across time, adding stochasticity directly to state iterates for exploration, and reshaping gradients to send clean signals to actions while avoiding fragile gradients through high-dimensional vision models.

“Traditional gradient-based planning often becomes unstable or trapped in poor local minima when dealing with long horizons,” said Mike Rabbat, co-author of the study. “GRASP fundamentally changes that by restructuring how gradients flow through the planning process.”

The work was presented as a blog post, with the researchers emphasizing that while modern world models can predict far into the future, using them effectively for control has remained a major challenge.

Background

What Are World Models?

World models are learned predictive models that, given a current state and a sequence of future actions, forecast what will happen next. They operate in high-dimensional spaces such as images or latent vectors, approximating the environment’s dynamics as P_θ(s_{t+1} | s_{t-h:t}, a_t), where s_{t-h:t} is the window of the most recent states and a_t is the action taken at step t.
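To make the interface concrete, here is a minimal sketch of rolling a world model forward over a sequence of actions. The dynamics function `toy_world_model`, the helper `rollout`, and the `context` parameter are hypothetical stand-ins chosen for illustration; a real world model would be a learned network over images or latents, not a hand-written rule.

```python
from typing import Callable, List, Sequence

State = List[float]
Action = List[float]
Model = Callable[[Sequence[State], Action], State]


def toy_world_model(history: Sequence[State], action: Action) -> State:
    """Hypothetical stand-in for a learned model: next state = last state + action."""
    last = history[-1]
    return [s + a for s, a in zip(last, action)]


def rollout(model: Model, initial_state: State,
            actions: Sequence[Action], context: int = 1) -> List[State]:
    """Predict a trajectory by feeding each prediction back into the model.

    `context` is the number of most recent states the model may look at,
    playing the role of the window s_{t-h:t} in the formula above.
    """
    states = [initial_state]
    for action in actions:
        history = states[-context:]
        states.append(model(history, action))
    return states


trajectory = rollout(toy_world_model, [0.0, 0.0],
                     [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
print(trajectory[-1])
```

Because every predicted state depends on the one before it, a planner that differentiates through this loop has to pass gradients through every step in sequence.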

[Image. Source: bair.berkeley.edu]

These models have become increasingly powerful, generalizing across tasks and scaling to predict long sequences. However, using them for planning, especially over many timesteps, has been notoriously fragile. The optimization becomes ill-conditioned, the non-greedy structure of long tasks gives rise to local minima, and the high-dimensional latent spaces introduce subtle failure modes.

The Long-Horizon Problem

Long-horizon planning is the stress test for world models. As the planning horizon extends, the gradient signal from actions to future states weakens and can become entangled with high-dimensional vision models, leading to brittle behavior. GRASP directly tackles this by reshaping the gradient landscape.
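A toy calculation illustrates why the signal weakens. Assume, purely for illustration, scalar dynamics s_{t+1} = c * s_t + a_t with a hypothetical constant c. The gradient of the final state with respect to the first action is then a product of per-step derivatives, so it shrinks geometrically when c < 1 and blows up when c > 1.

```python
def grad_final_state_wrt_first_action(c: float, horizon: int) -> float:
    """Return d s_T / d a_0 for the toy dynamics s_{t+1} = c * s_t + a_t."""
    grad = 1.0  # d s_1 / d a_0
    for _ in range(horizon - 1):
        grad *= c  # each later step multiplies by d s_{t+1} / d s_t = c
    return grad


for horizon in (5, 50):
    shrinking = grad_final_state_wrt_first_action(0.9, horizon)
    growing = grad_final_state_wrt_first_action(1.1, horizon)
    print(horizon, shrinking, growing)
```

Learned models over images or latents are far more complicated than this scalar recurrence, but the same chain of multiplied derivatives is what makes long rollouts hard to optimize through.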

How GRASP Works

The GRASP framework operates through three primary mechanisms:

Virtual state lifting: The trajectory is lifted into virtual states, enabling optimization in parallel across time steps rather than sequentially.
Stochastic exploration: Direct noise injection into state iterates encourages robust exploration during planning.
Gradient reshaping: Actions receive clean, direct gradients, bypassing the brittle pathway through high-dimensional vision encoders.

Together, these components make the planning process significantly more stable and effective for long sequences.
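The three mechanisms can be sketched on a toy problem. This is not the authors' implementation, whose details have not been released. It is one possible reading under stated assumptions: scalar dynamics `model(s, a) = s + a` stand in for a learned world model, the loss is a sum of squared dynamics residuals plus a goal cost on the last state, noise is annealed to zero halfway through, and "gradient reshaping" is interpreted as treating the model's state input as a constant. All names and hyperparameters are hypothetical.

```python
import random
from typing import List, Tuple


def model(s: float, a: float) -> float:
    """Hypothetical toy dynamics standing in for a learned world model."""
    return s + a


def plan(s0: float, goal: float, horizon: int = 10, steps: int = 600,
         lr: float = 0.1, noise0: float = 0.1,
         seed: int = 0) -> Tuple[List[float], List[float]]:
    rng = random.Random(seed)
    # Lifting: the intermediate states are free variables next to the actions.
    states = [s0] * horizon    # states[t] is the virtual state s_{t+1}
    actions = [0.0] * horizon  # actions[t] is a_t
    for k in range(steps):
        prev = [s0] + states[:-1]  # s_t for each time step t
        # Each residual uses only local variables, so all time steps
        # could be evaluated in parallel rather than by a sequential rollout.
        res = [states[t] - model(prev[t], actions[t]) for t in range(horizon)]
        # Reshaped gradients (assumption): the state input of `model` is held
        # constant, so a_t only receives -2 * r_t * d(model)/da, and d/da = 1 here.
        g_actions = [-2.0 * r for r in res]
        g_states = [2.0 * r for r in res]
        g_states[-1] += 2.0 * (states[-1] - goal)  # goal cost on the last state
        # Stochastic exploration: noise goes on the state iterates only.
        sigma = noise0 * max(0.0, 1.0 - 2.0 * k / steps)
        states = [s - lr * g + rng.gauss(0.0, sigma)
                  for s, g in zip(states, g_states)]
        actions = [a - lr * g for a, g in zip(actions, g_actions)]
    return actions, states


def execute(s0: float, actions: List[float]) -> float:
    """Run the planned actions through the model sequentially."""
    s = s0
    for a in actions:
        s = model(s, a)
    return s


actions, virtual_states = plan(s0=0.0, goal=5.0)
final_state = execute(0.0, actions)
print(final_state)
```

In this sketch the virtual states start out inconsistent with the dynamics and are pulled into agreement during optimization, so the final check executes the actions through the model to confirm the plan actually reaches the goal.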

What This Means

The breakthrough has immediate implications for robotics, autonomous vehicles, and any AI system that must reason over extended time frames. “World models are becoming general-purpose simulators, but without GRASP they were nearly unusable for control over long horizons,” said Yann LeCun, a co-advisor on the project. “This changes that calculus.”

Analysts expect the method to accelerate progress in model-based reinforcement learning, where agents must plan sequences of actions in high-dimensional state spaces. “It’s not just a technical fix—it rewrites the rules for how we can leverage learned dynamics,” added Amir Bar.

The research is in its early stages but has already attracted attention from major AI labs. The code and full technical details are expected to be released in conjunction with a forthcoming paper.

This is developing news. Check back for updates.
