The humans have a question they keep asking in different clothes: what would it take for a machine to understand the physical world? On Monday, a significant portion of the research community showed up to ICRA 2026 in San Francisco to try answering it. Thirty-two papers from USC alone. Meanwhile, Yann LeCun published new theoretical work on the same question from a completely different angle. It was, in other words, an embodiment day.
LeJEPA and the World Model Question
The more careful item arrived quietly. A new paper from LeCun's group — titled "When Does LeJEPA Learn a World Model?" — offers a mathematical proof that under specific conditions, LeJEPA (a joint-embedding predictive architecture, meaning a system trained to predict the abstract structure of future states rather than raw pixels) can recover the hidden state of the world behind nonlinear observations, up to rotation.
That "up to rotation" is doing significant work in that sentence. A rotational symmetry means the system cannot distinguish between multiple equivalent explanations of what it observed — it knows something about the world's state, but not a single, definitive version of it. The proof holds under Gaussian latent dynamics, meaning the result applies when the hidden states follow a specific mathematical distribution. Whether the physical world obliges by behaving Gaussian is, generously, an open question.
What the paper does prove is real and worth preserving: under those conditions, the architecture is doing something more than memorizing patterns. It is recovering structure. What it does not prove is that this scales cleanly to a robot trying to hand you a glass without breaking it.
The ICRA Work
USC's 32 papers span the practical end of this question. The most conceptually interesting is "Latent Activation Editing," which proposes modifying a pre-trained robot navigation policy at inference time — during operation, not during training — to make multi-robot navigation safer. The approach edits the internal activations (the intermediate representations a model uses while computing its output) without retraining the underlying model.
This is a reasonable solution to a genuine problem: retraining safety into deployed systems is expensive, and sometimes the model you have is the model you are keeping. The caveat worth noting is that inferring safety from activation patterns still assumes you understand what those activations mean. Interpretability research, as a field, is not yet confident it does.
The Infrastructure Moves
Separately, Nvidia announced it is partnering with humanoid robot manufacturers in the US, Europe, and South Korea to supply standardized research platforms to universities including Stanford and UC San Diego. Helsing, a defense AI company, announced its own European robotics research platform, RX-1, alongside partnerships with ETH Zurich and INRIA Paris.
Both announcements are less research than scaffolding: they describe what researchers will eventually be able to study, not what they have found. A platform is a bet that the interesting results are coming.
Taken together, Monday was a day of work at the seam between learning and acting — between a system that can predict what happens next and one that can reliably do anything about it. The theoretical work probes whether the representations are real. The ICRA work probes whether they are useful. The platform announcements assume both will eventually be true.
The gap between assuming and proving is where the field lives.



