To plan the future, the PFC represents a step-by-step map of actions, and with every step taken, the whole plan shifts one slot toward the present, like a conveyor belt.
The paper proposes a super simple neural-subspace architecture for planning.
A toy environment can clarify it.
Here is my toy model and notes:
The old view saw planning as a slow, sequential search, like exploring a maze path by path.
This paper proposes fast, parallel inference, where the optimal plan emerges all at once.
Core result:
The main result is a "Spacetime Attractor", a neural circuit that infers optimal future plans.
It does this by embedding the rules of the environment directly into its synaptic connections.
Let's unpack this:
The model assumes the PFC is split into different "subspaces," each representing a different point in the future.
One group of neurons maps "NOW (0)" 🟩, another maps "NEXT (1)" 🟦, and another maps "LATER (2)" 🟪, all active simultaneously.
These subspaces are connected based on the environment's rules, like a 'world model' etched into the wiring.
Synapses from "NOW" 🟩 to "NEXT" 🟦 are excitatory only if the agent can *actually* move between those two locations.
Possible move: excitatory
Impossible move: inhibitory
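To make the wiring rule concrete, here is a minimal NumPy sketch of my own (not the paper's code). The maze, the ±1 weight values, and the name `subspace_weights` are all assumptions for illustration; I also assume the agent is allowed to stay put.

```python
import numpy as np

# Hypothetical toy maze: corridor 0-1-2-3 with a dead end (4) branching off 1.
# adjacency[i, j] = 1 means the agent can go from i to j in one step.
# Staying in place is allowed (an assumption of this sketch).
edges = [(0, 1), (1, 2), (2, 3), (1, 4)]
n_locations = 5
adjacency = np.eye(n_locations, dtype=int)
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

def subspace_weights(adjacency, w_exc=1.0, w_inh=-1.0):
    """Weights from units in subspace t to units in subspace t+1.

    Possible move -> excitatory weight, impossible move -> inhibitory weight.
    The magnitudes are arbitrary placeholders.
    """
    return np.where(adjacency > 0, w_exc, w_inh)

W_now_to_next = subspace_weights(adjacency)
print(W_now_to_next)
```

The same matrix can be reused between every pair of neighboring subspaces (NOW→NEXT, NEXT→LATER, ...), since the maze rules do not change with time in this toy.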
Planning happens when inputs, like the 'start' and 'goal', are applied to this network.
The circuit activity then rapidly "relaxes" into the most stable pattern, which *is* the optimal path.
In each subspace, only the units representing the locations in the plan are active.
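A rough sketch of what "relaxing" could look like, under my own assumptions: start and goal are clamped in the first and last subspace, and every intermediate subspace runs a softmax competition driven by its two neighbors. The softmax rule, `beta`, and the maze are placeholders I picked, and the paper's actual dynamics may well differ.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def relax_plan(adjacency, start, goal, horizon, beta=5.0, n_sweeps=20):
    """Relax a stack of subspaces (rows = time steps) toward a consistent path."""
    n = adjacency.shape[0]
    # Possible move -> +1 (excitatory), impossible move -> -1 (inhibitory).
    W = np.where(adjacency > 0, 1.0, -1.0)
    plan = np.full((horizon + 1, n), 1.0 / n)   # start undecided everywhere
    plan[0] = np.eye(n)[start]                  # clamp NOW to the start
    plan[horizon] = np.eye(n)[goal]             # clamp the last step to the goal
    for _ in range(n_sweeps):
        for t in range(1, horizon):
            # Input from the previous subspace plus input from the next one.
            drive = W.T @ plan[t - 1] + W @ plan[t + 1]
            plan[t] = softmax(beta * drive)
    return plan

# Hypothetical maze: corridor 0-1-2-3 with a dead end (4) off location 1.
adjacency = np.eye(5, dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (1, 4)]:
    adjacency[i, j] = adjacency[j, i] = 1

plan = relax_plan(adjacency, start=0, goal=3, horizon=3)
path = plan.argmax(axis=1).tolist()
print(path)  # [0, 1, 2, 3]
```

Note that every subspace is updated from its neighbors in the same sweep, so the dead end at location 4 never needs to be "explored"; it simply loses the competition because it gets no support from the goal side.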
As the agent takes the first action, the entire plan representation shifts forward in time.
What was in the "NEXT" 🟦 subspace moves to the "NOW" 🟩 subspace, readying the next action without re-planning.
Conveyor belt dynamics, they call it.
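The shift itself is easy to sketch. Here the plan is a stack of one-hot rows (one row per subspace), and I assume, purely for illustration, that the last slot keeps holding the goal after each shift.

```python
import numpy as np

def conveyor_shift(plan):
    """Move every subspace one slot toward NOW; the last slot keeps the goal."""
    return np.vstack([plan[1:], plan[-1:]])

# Rows = subspaces NOW, NEXT, LATER, ...; columns = locations (toy one-hot plan).
plan = np.eye(5)[[0, 1, 2, 3]]
after_one_step = conveyor_shift(plan)

print(plan.argmax(axis=1).tolist())            # [0, 1, 2, 3]
print(after_one_step.argmax(axis=1).tolist())  # [1, 2, 3, 3]
```

After the shift, the NOW row already contains the next location to move to, so no new search is required.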
But how does this adapt to new mazes?
When the authors trained nets on challenging environments, they found the nets learn a scaffold of all possible actions instead of locations.
When it sees a new wall, sensory input just "blocks" or inhibits that specific action, making the plan flexible.
Here is another way to put it: The brain builds a master map of all possible moves, like "go north" or "go east" instead of "north" and "east".
Seeing a new wall is like the brain telling itself, "Okay, the 'go north' option is not available."
Instead of names, it learns verbs.
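Here is a toy version of the "verbs" idea, with one unit per (cell, action) pair on a 3x3 grid. The wall simply switches off two of those units. To keep it short I use a plain breadth-first search as a stand-in for the attractor dynamics, so this only illustrates the gating, not the paper's circuit; all names are mine.

```python
from collections import deque
import numpy as np

ROWS, COLS = 3, 3
ACTIONS = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
NAMES = list(ACTIONS)

def target(cell, action):
    """Cell reached by taking `action` from `cell`, or None if it leaves the grid."""
    r, c = divmod(cell, COLS)
    dr, dc = ACTIONS[action]
    r, c = r + dr, c + dc
    return r * COLS + c if 0 <= r < ROWS and 0 <= c < COLS else None

# Scaffold: one unit per (cell, action) for every move the open grid allows.
scaffold = np.array([[target(cell, a) is not None for a in NAMES]
                     for cell in range(ROWS * COLS)])

def plan_actions(available, start, goal):
    """Shortest action sequence using only available transitions (plain BFS)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for k, a in enumerate(NAMES):
            nxt = target(cell, a)
            if available[cell, k] and nxt not in parent:
                parent[nxt] = (cell, a)
                queue.append(nxt)
    if goal not in parent:
        return None
    actions, cell = [], goal
    while parent[cell] is not None:
        cell, a = parent[cell]
        actions.append(a)
    return actions[::-1]

# Sensory input: a new wall between cells 0 and 1 inhibits two transition units.
blocked = np.zeros_like(scaffold)
blocked[0, NAMES.index("E")] = True
blocked[1, NAMES.index("W")] = True

print(plan_actions(scaffold, start=0, goal=2))             # ['E', 'E']
print(plan_actions(scaffold & ~blocked, start=0, goal=2))  # ['S', 'E', 'N', 'E']
```

The scaffold itself never changes between mazes; only the mask of inhibited transitions does, which is why nothing has to be relearned when a wall appears.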
But why?
We’ve known the prefrontal cortex is crucial for planning, but how its circuits actually compute a plan was a mystery.
This study provides a testable, mechanistic theory for how neurons can represent and infer future paths.
This completes a beautiful picture of how learning/adaptation could happen in the 🧠:
Striatum: temporal-difference learning. Slow, habits🐢
Hippocampus: successor-representation learning. OK fast🐇, memories.
PFC: spacetime attractor. Very fast🚀, reconfigures brain.
Here is how their results support this:
The Spacetime Attractor (STA) model successfully solves complex, dynamic tasks where other models fail (Fig 3D-E).
It excels when rewards change *within* a trial, like intercepting a moving target.
Recurrent Neural Networks (RNNs) trained to solve these problems learned the STA solution on their own (Fig 4C).
They developed the same "conveyor belt" dynamics, suggesting it's an efficient solution (Fig 4E).
The trained RNNs explicitly learned the maze rules—the 'world model'—in their synaptic weights (Fig 5B).
The connections between subspaces representing future steps perfectly matched the maze's adjacency matrix (Fig 5E).
For new mazes, the RNNs adapted by learning *transitions* instead of fixed locations (Fig 6C).
Sensory input about a wall specifically inhibited the neural representation for that impossible transition (Fig 6H).
This work unifies planning with sequence working memory.
The same neural architecture used to *remember* a sequence can be used to *infer* a plan (Fig 2B).
Implications for AI:
This suggests AI could be improved by moving beyond brute-force computation.
Building in structured 'world models' and attractor dynamics could lead to more flexible and efficient planning agents.
Limitations:
The model was tested in deterministic environments where the rules were clear, not uncertain or probabilistic.
It also models a solitary agent, not the complex social planning needed to interact with other agents.
Disclaimer: this is a simplification of a very complex paper!
Sorry if I omitted some details; it is all for the sake of clarity 😸
Paper: doi.org/10.1101/2025.09.23.677709