On December 8 and 10, 2025, NASA’s Perseverance rover drove across Mars on paths planned not by a human engineer but by Claude, Anthropic’s AI model — the first AI-planned drives on another world. Over two sessions, Perseverance covered a combined 456 meters: 210 meters on December 8 and 246 meters on December 10, crossing a rock-strewn field that human operators had analyzed from imagery but whose traversal Claude was trusted to encode into actionable rover commands. The result came from a months-long collaboration between JPL’s Rover Operations Center and Anthropic, a painstaking integration of years of driving expertise into a model context, and a 500,000-variable simulation that confirmed the AI-generated routes were safe before a single command crossed 360 million kilometers of space. This article covers the full technical picture: how the communication constraints of Mars make AI planning attractive, how Claude ingested mission history and wrote Rover Markup Language, what the digital twin verification process entails, and what the results mean for the future of deep-space exploration.
The Challenge of Driving on Mars
Driving a rover on Mars is fundamentally constrained by physics that no engineering solution can overcome. The current communication delay between Earth and Mars runs approximately 20 minutes each way — meaning a command sent from JPL takes 20 minutes to arrive, and the rover’s acknowledgment takes another 20 minutes to return. Real-time teleoperation is impossible. Every drive must be planned in advance: human engineers analyze orbital imagery, surface photographs taken by Perseverance’s own cameras, and terrain elevation maps to plot a safe path, then encode that path as a sequence of waypoints for the rover to execute autonomously.
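The arithmetic behind that constraint is simple. Here is a quick back-of-the-envelope check in Python, using the roughly 360 million kilometer Earth–Mars distance cited above (the actual distance varies widely across the synodic cycle):

```python
# Back-of-the-envelope: one-way light travel time at the Earth-Mars
# distance cited in this article (~360 million km). The real distance
# varies between roughly 55 and 400 million km over the synodic cycle.
SPEED_OF_LIGHT_KM_S = 299_792.458
distance_km = 360_000_000

one_way_s = distance_km / SPEED_OF_LIGHT_KM_S
round_trip_min = 2 * one_way_s / 60

print(f"one-way delay: {one_way_s / 60:.1f} minutes")   # ~20 minutes
print(f"command + ack: {round_trip_min:.1f} minutes")   # ~40 minutes
```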
This planning process is highly skilled work. Mission operators must evaluate rock hazards, slope angles, wheel traction probabilities, and dozens of other factors for each segment of a proposed route. They write the plan in Rover Markup Language, run it through simulation, review the results, and iterate before the final command set is transmitted. On a good operational day, Perseverance covers several hundred meters. The bottleneck is not the rover’s physical capability — it’s the human bandwidth required to safely plan each movement from Earth.
Perseverance already uses AutoNav, an onboard self-driving capability that handles local obstacle avoidance in real time during a drive. AutoNav operates at the tactical level: given a waypoint to reach, it picks safe footholds moment to moment. Strategic route planning — deciding where to go across hundreds of meters of uncertain terrain — remained a human task. That is exactly what the JPL-Anthropic Claude integration was built to augment.
The JPL-Anthropic Collaboration
The partnership between NASA’s Jet Propulsion Laboratory and Anthropic began as an experiment in applied AI for space mission operations. The central question was whether a large language model with strong code generation capabilities could learn to produce valid Rover Markup Language commands from a context package built from years of accumulated mission data — and do so safely enough to pass JPL’s verification standards.
Claude was selected on the strength of two capabilities that proved critical: vision-language understanding for analyzing surface imagery, and code synthesis for generating syntactically correct RML. JPL engineers accessed Claude through Claude Code, Anthropic’s CLI-based engineering environment, which allowed them to feed large structured contexts — drive logs, image annotations, operator notes, safety rulebooks — into the model and iterate on outputs in a familiar development workflow.
The collaboration was not a matter of plugging in an AI and watching it drive. JPL engineers spent months curating the context: selecting representative historical drives, annotating imagery with terrain classifications, encoding the accumulated rules and heuristics that experienced operators apply intuitively, and testing Claude’s outputs against known drive outcomes before the system was ever pointed at a real upcoming drive.
How Claude Planned the Drive: Step by Step
For each planned drive, JPL engineers assembled a context package containing orbital imagery of the target area, surface photographs from Perseverance’s front and rear hazard cameras, a description of the intended destination, elevation data from the terrain model, and a curated set of historical drives in comparable terrain with their outcomes. This package distilled years of human driving expertise into a structured prompt.
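To make the shape of that input concrete, here is a minimal sketch of what such a per-drive context package could look like as a data structure. The field names and the `build_prompt` helper are illustrative assumptions, not JPL’s actual data model:

```python
from dataclasses import dataclass, field

# Illustrative sketch of a per-drive planning context. Field names and
# structure are assumptions made for this article, not JPL's data model.
@dataclass
class DrivePlanningContext:
    orbital_images: list[str]      # orbital imagery tiles covering the target area
    surface_images: list[str]      # front/rear hazard camera and navcam frames
    elevation_model: str           # terrain elevation data for the route corridor
    destination: str               # description of the intended endpoint
    historical_drives: list[dict] = field(default_factory=list)  # comparable past drives + outcomes
    operator_notes: str = ""       # heuristics and cautions from experienced rover drivers

def build_prompt(ctx: DrivePlanningContext) -> str:
    """Flatten the structured context into a single prompt for the model."""
    sections = [
        f"Destination: {ctx.destination}",
        f"Orbital imagery: {len(ctx.orbital_images)} tiles",
        f"Surface imagery: {len(ctx.surface_images)} frames",
        f"Comparable historical drives: {len(ctx.historical_drives)}",
        f"Operator notes: {ctx.operator_notes}",
    ]
    return "\n".join(sections)
```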
Claude then broke the route into discrete ten-meter waypoint segments, writing each segment in Rover Markup Language. The model did not generate the full path in a single pass and stop. After producing an initial draft, Claude evaluated its own output, flagging segments where the trajectory passed too close to hazardous rocks, where elevation transitions exceeded safe roll or pitch limits, or where the geometry introduced unnecessary steering complexity that would increase wheel stress. It then revised those segments and produced a refined waypoint sequence.
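The segmentation step on its own is straightforward to illustrate. The sketch below splits a planned route into legs of at most ten meters; the coordinates, frame, and function names are made up for illustration:

```python
import math

# Illustrative only: split a planned route (a list of site-frame x/y
# coordinates in meters) into legs of at most 10 m, the segment length
# the article describes the plan being built from.
def segment_route(points: list[tuple[float, float]], max_leg_m: float = 10.0):
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dist = math.hypot(x1 - x0, y1 - y0)
        n_legs = max(1, math.ceil(dist / max_leg_m))
        for i in range(1, n_legs + 1):
            t = i / n_legs
            segments.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return segments

waypoints = segment_route([(0.0, 0.0), (35.0, 12.0), (60.0, 40.0)])
print(len(waypoints), "waypoints, each reachable in a <=10 m leg")
```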
This self-critique loop is architecturally identical to what you would see in any production agentic AI workflow — generate, evaluate against known constraints, revise — applied to a domain with physical consequences rather than software outputs. For developers familiar with building AI agents, the pattern will be recognizable. What differs is the stakes: a poorly generated waypoint in a web agent produces a bad API call. A poorly generated waypoint in Rover Markup Language can strand a rover on a rocky slope 360 million kilometers from any repair shop.
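For readers who build such agents, a minimal sketch of that generate-evaluate-revise pattern might look like the following. The prompts, the model identifier, and the `check_constraints` stub are assumptions for illustration — this is the generic agentic loop, not JPL’s pipeline:

```python
import anthropic

# Minimal sketch of the generate -> evaluate -> revise loop described above.
# Prompts, model name, and check_constraints() are illustrative assumptions.
client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder: any current Claude model

def generate(prompt: str) -> str:
    response = client.messages.create(
        model=MODEL,
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

def check_constraints(plan: str) -> str:
    """Stub: real checks would test hazard clearance, roll/pitch limits, etc."""
    return ""  # empty string means no issues found

def plan_drive(context_prompt: str, max_revisions: int = 3) -> str:
    draft = generate(f"Plan waypoint segments for this drive:\n{context_prompt}")
    for _ in range(max_revisions):
        issues = check_constraints(draft)
        if not issues:
            break
        draft = generate(
            f"Revise this plan. Fix these issues:\n{issues}\n\nPlan:\n{draft}"
        )
    return draft
```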
Rover Markup Language: Writing Code for Another Planet
Rover Markup Language is the bespoke, XML-based programming language originally developed for NASA’s Mars Exploration Rover mission — Spirit and Opportunity — in the early 2000s, carried forward to Curiosity and Perseverance. It encodes every instruction the rover can receive: drive commands, instrument activations, positioning sequences, fault responses, and safety limits. RML is not a general-purpose language. It was designed for a specific mechanical system operating in a specific planetary environment, and its syntax reflects every physical constraint of that system.
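To give a feel for what “writing code for another planet” means in practice, the sketch below builds an XML command sequence in Python. The tag and attribute names are invented for illustration and do not reflect the actual RML schema:

```python
import xml.etree.ElementTree as ET

# Illustrative only: build an XML command sequence in the spirit of an
# XML-based command language. Tag and attribute names are made up for
# illustration and do not reflect the actual RML schema.
def waypoint_sequence(waypoints: list[tuple[float, float]]) -> str:
    root = ET.Element("drive_sequence", attrib={"frame": "site", "units": "m"})
    for i, (x, y) in enumerate(waypoints):
        ET.SubElement(root, "goto_waypoint", attrib={
            "id": str(i),
            "x": f"{x:.2f}",
            "y": f"{y:.2f}",
            "max_speed": "0.04",  # m/s, conservative placeholder value
        })
    return ET.tostring(root, encoding="unicode")

print(waypoint_sequence([(3.2, 1.1), (9.8, 4.6)]))
```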
Writing valid RML requires knowing the rover’s coordinate frames, joint angle ranges, wheel speed limits, tilt tolerances, and the correct execution sequence for compound commands. A syntactically well-formed RML file that violates a physical constraint will not produce a compile error. It will send the rover into an unsafe configuration on the Martian surface, with no human able to intervene for at least 40 minutes and no guarantee that the situation is recoverable remotely.
This is why RML generation had to be treated as safety-critical code synthesis rather than ordinary text generation. Claude’s performance on code synthesis tasks — the same capability that makes it effective for software development work covered in guides like the Claude Opus 4.7 developer guide — translated directly to generating structured XML that respected the rover’s physical envelope. The output was not a natural language description of a path. It was executable code, validated against a specification.
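A validation pass of that kind can be sketched as a set of deterministic checks against a physical envelope. The numeric limits and field names below are placeholders, not Perseverance’s real specifications:

```python
# Sketch of checking a generated waypoint segment against a physical
# envelope before treating it as flight-ready. Limits are placeholders,
# not Perseverance's actual specs.
LIMITS = {
    "max_tilt_deg": 30.0,          # placeholder tilt tolerance
    "max_wheel_speed_mps": 0.15,   # placeholder wheel speed limit
    "min_hazard_clearance_m": 0.5, # placeholder clearance requirement
}

def validate_segment(segment: dict) -> list[str]:
    """Return a list of violations; an empty list means the segment passes."""
    violations = []
    if segment["predicted_tilt_deg"] > LIMITS["max_tilt_deg"]:
        violations.append(f"tilt {segment['predicted_tilt_deg']:.1f} deg exceeds limit")
    if segment["commanded_speed_mps"] > LIMITS["max_wheel_speed_mps"]:
        violations.append("commanded speed exceeds wheel speed limit")
    if segment["nearest_hazard_m"] < LIMITS["min_hazard_clearance_m"]:
        violations.append("trajectory passes too close to a mapped hazard")
    return violations
```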
Vision-Language AI Reads the Terrain
Route planning on Mars is not a pure coding problem — it begins with image interpretation. Before Claude could write any RML, it needed to understand the terrain it was planning a route through. This is where Claude’s vision-language capabilities were essential.
JPL provided Claude with surface imagery from Perseverance’s cameras: stereo images that encode distance information, color images that reveal rock texture and shadow patterns indicating height, and navcam panoramas showing the field of travel. Claude interpreted these images in context — identifying rock hazards, estimating the trafficability of sandy patches, flagging areas where slope indications in the imagery suggested roll risk. This visual analysis fed directly into waypoint selection: Claude placed waypoints to route around hazards it identified in the images itself, not around hazards that a human operator had pre-labeled for it.
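One piece of that visual reasoning — recovering distance from a stereo pair — follows a standard relation: depth equals focal length times baseline divided by disparity. The sketch below uses placeholder camera parameters and is a generic illustration, not JPL’s terrain pipeline:

```python
# Generic stereo-ranging relation, not JPL's actual terrain pipeline:
# for a calibrated stereo pair, depth = focal_length * baseline / disparity.
# The camera parameters below are illustrative placeholders.
def stereo_depth_m(disparity_px: float,
                   focal_length_px: float = 1200.0,
                   baseline_m: float = 0.42) -> float:
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# A rock with 25 px of disparity sits roughly 20 m away under these assumptions.
print(f"{stereo_depth_m(25.0):.1f} m")
```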
The integration of visual reasoning with code generation in a single model — seeing the terrain, understanding what it means for rover safety, and immediately writing the commands that navigate it — is what makes this demonstration technically significant beyond the specific RML task.
The Digital Twin: 500,000 Variables Verified
Before any drive plan reaches Mars, it goes through JPL’s digital twin — a high-fidelity virtual replica of Perseverance running on Earth-side infrastructure. The digital twin models the rover’s mechanical state, sensor outputs, terrain interaction, and environmental conditions. When a drive plan is loaded, the simulation plays out the entire route and predicts every joint position, wheel torque, attitude measurement, and fault condition that the physical rover would encounter if the plan were executed as written.
For the December AI-generated drives, the engineering team verified over 500,000 telemetry variables in simulation before the commands were transmitted to Mars. The digital twin confirmed that Claude’s waypoints kept Perseverance within its operational safety envelope across both routes — no tilt exceedances, no wheel slippage projections outside safe bounds, no trajectory segments that would produce an unrecoverable configuration.
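Conceptually, that verification is an envelope check: every simulated telemetry variable is compared against its allowed range, and any excursion fails the plan. The sketch below shows the idea on two variables; the names, values, and bounds are illustrative, not JPL’s tooling:

```python
# Conceptual envelope check over simulated telemetry: each predicted
# variable from a digital-twin run is compared against its allowed range.
# Names, values, and bounds here are illustrative placeholders.
def check_envelope(simulated: dict[str, float],
                   envelope: dict[str, tuple[float, float]]) -> list[str]:
    """Return the names of variables that fall outside their allowed range."""
    out_of_bounds = []
    for name, value in simulated.items():
        lo, hi = envelope[name]
        if not (lo <= value <= hi):
            out_of_bounds.append(name)
    return out_of_bounds

# In the scenario the article describes, a run like this would cover
# hundreds of thousands of variables and return an empty list.
violations = check_envelope(
    {"left_front_wheel_torque_nm": 41.2, "vehicle_roll_deg": 6.8},
    {"left_front_wheel_torque_nm": (0.0, 60.0), "vehicle_roll_deg": (-15.0, 15.0)},
)
assert not violations
```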
This verification step is non-negotiable for every Perseverance drive, regardless of who or what authored the plan. The AI-generated route received no special exemption or reduced scrutiny. It passed or failed on the same terms as any human-authored plan. The fact that both December routes cleared the 500,000-variable simulation is the empirical result on which everything else rests.
December 8 and 10: The Results
On December 8, 2025, Perseverance executed the first AI-planned drive on another world, covering 210 meters (689 feet) across the Martian surface. Two days later, on December 10, the rover drove 246 meters (807 feet) on a second AI-planned route. Both drives completed without incident. The rover arrived at its planned destinations within acceptable position margins. No safety faults were triggered. No unplanned stops occurred.
On its own, 456 meters of driving is not exceptional for Perseverance — the rover has completed longer drives on human-planned routes. What is exceptional is that across every meter of those two drives, every waypoint in the command sequence was placed by an AI that learned from years of human expertise, not by a human operator working directly. The milestone is not the distance. It is the authorship.
What This Means for Deep-Space Exploration
The immediate practical implication is operator leverage. If Claude can reliably produce safe drive plans from mission data, JPL engineers can shift from plan authorship — the work of writing every waypoint by hand — to plan review and validation. This changes the nature of their work without eliminating the need for their expertise. Reviewing an AI-generated plan for safety still requires the same deep knowledge of RML, rover dynamics, and Martian terrain that authoring a plan requires. But review scales differently than authorship: one expert reviewing AI-generated plans can cover more ground per operational day than one expert writing plans from scratch.
The longer-term implication concerns the architecture of future missions. Future Mars surface operations — larger rover fleets, more complex terrain, eventual crewed surface support systems — will require more planning throughput than current human-centric workflows can provide. The 20-minute communication delay is not a solvable problem. Any mission architecture that depends on Earth-based human operators for real-time tactical decisions is fundamentally limited by that delay. AI systems capable of strategic route planning from local sensor data represent a different class of operational capability, one that can eventually run onboard rather than on Earth-side infrastructure.
The December 2025 demonstration ran on Earth-side compute — Claude generated the plans here, humans validated them here, and the completed plans were uplinked to Mars in the traditional way. That is not the end state. It is the first rung. The demonstration establishes that AI can produce valid, safe Mars drive commands from imagery and mission history. The next question — how much of that capability can eventually run in the reduced compute environment of an onboard system — is one the space AI research community is actively working on.
For the broader AI development community, the JPL-Anthropic collaboration is a concrete example of how frontier AI capabilities translate to high-stakes physical domains. The agentic reasoning patterns that make Claude effective for software engineering tasks — context ingestion, code generation, self-critique, iterative refinement — are the same patterns that made it capable of generating safe RML for a Mars rover. The domain changes. The underlying capability structure does not.
Conclusion
The December 2025 drives are small in distance and large in meaning. Perseverance covered 456 meters of Martian terrain on paths that a human being did not design. Claude wrote valid Rover Markup Language from surface imagery and years of accumulated mission history, critiqued its own output, and produced waypoint sequences that cleared a 500,000-variable digital twin simulation. The physical rover executed those plans on the Martian surface, on two separate days, without incident.
That is what it looks like when AI capability moves from generating software to generating commands for hardware operating in one of the most unforgiving environments in the solar system. The gap between a large language model and a spacecraft command system has not disappeared — but on December 8 and 10, 2025, it narrowed in a way that will be cited for a long time.
Written by
Anup Karanjkar
Expert contributor at WOWHOW. Writing about AI, development, automation, and building products that ship.