Markov Chains: A Love Letter to Robots, AI, & Self-Driving Cars
- Prathamesh Khedekar
June 02, 2025.

Markov Chains are among the most underrated concepts in mathematics. The surprising thing is: if you're building an intelligent system (a robot, an AI model, a self-driving car), you're already relying on them, whether you like it or not. Most people think intelligence comes from complexity. In reality, much of what looks like intelligence comes from making good decisions under uncertainty.
And that, it turns out, is what Markov Chains are quietly very good at.
What Is a Markov Chain, Really?
A Markov Chain is a mathematical model that helps us predict the future, with one key constraint: the future state of an entity depends only on the present state, not the past. To make this concrete, let's look at an example.
Suppose you're deciding which city to move to next. A Markov Chain might say: "You're in Chicago, so there's a 60% chance you'll go to New York next, a 30% chance you'll head to LA, and a 10% chance you'll stay put." It doesn't care where you came from, only where you are now. This is called the Markov property: memorylessness. Mathematically, it's elegant. But the real beauty comes from how it enables modern robots, AI models, and self-driving cars to operate in the real world. Let me explain.
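Here's a minimal sketch of that example in Python. The Chicago probabilities match the ones above; the New York and LA rows are made-up assumptions, added only so the chain has somewhere to go.

```python
import random

# Transition probabilities. The Chicago row matches the example in the text;
# the New York and LA rows are invented so the walk can continue.
transitions = {
    "Chicago":  [("New York", 0.6), ("LA", 0.3), ("Chicago", 0.1)],
    "New York": [("Chicago", 0.5), ("New York", 0.3), ("LA", 0.2)],
    "LA":       [("New York", 0.4), ("Chicago", 0.4), ("LA", 0.2)],
}

def next_city(current: str) -> str:
    """Sample the next city using only the current one (the Markov property)."""
    cities, weights = zip(*transitions[current])
    return random.choices(cities, weights=weights, k=1)[0]

# Simulate a short trajectory starting from Chicago.
city = "Chicago"
for _ in range(5):
    city = next_city(city)
    print(city)
```

Notice that `next_city` never looks at the trajectory so far; the current city is the entire state.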
Our World Is Constantly Evolving — Robots Get It
If you’ve ever built a robot or tested a self-driving car in a controlled environment, you know what happens the moment it hits the real world: everything falls apart, especially in the early stages of development.
Robots need time to digest reality. Floors aren’t perfectly flat. Objects aren’t perfectly shaped. Sensors lie. Actuators fail. Sometimes the AI model running on your office robot thinks a human is standing nearby, only to realize it was detecting a face on a TV running a client’s ad. LiDAR and photons don’t always get along.
Early on, engineers tried to fight this uncertainty with precision: custom solutions tailored to every scenario. That works fine for a factory robot bolted to the floor. But the moment you let a robot loose in the real world, the illusion of certainty breaks down. You can’t plan for every bump in the road.
So instead of pretending the world is predictable, you change your approach. You say: “I don’t know exactly what will happen next, but I can estimate the odds.” That shift — from certainty to probability — is the essence of the Markovian mindset.
Robots formalize this idea as a Markov Decision Process (MDP), a structured way of thinking about decisions under uncertainty. In an MDP, the robot defines a set of possible states (like its current position, speed, or sensor readings), a set of possible actions it can take from each state, and transition probabilities that describe how likely each action is to lead to a particular next state. It also defines a reward function that assigns value, or “success”, to outcomes.
The goal isn’t to perfectly predict the future. Instead, the robot aims to maximize its expected reward over time by choosing the best possible action based on its current state, the available actions, and the likely outcomes. This structured approach is what allows robots to operate effectively, even in uncertain and dynamic environments.
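To make this concrete, here’s a toy MDP solved with value iteration, one standard way to find the policy that maximizes expected reward. The scenario (a robot inching toward a charging dock on a four-cell track) and every number in it are illustrative assumptions, not anything from a real robot.

```python
# A toy MDP: a robot on a four-cell track, with a charging dock at cell 3.
# Actions sometimes slip and leave the robot where it was, which is exactly
# where the transition probabilities come in.
STATES = [0, 1, 2, 3]      # cell positions; cell 3 is the dock (terminal)
ACTIONS = ["left", "right"]
SLIP = 0.2                 # chance an action fails and the robot stays put
GAMMA = 0.9                # discount factor on future reward

def transitions(s, a):
    """(next_state, probability) pairs for taking action a in state s."""
    if s == 3:
        return [(3, 1.0)]  # the dock is absorbing
    target = max(0, s - 1) if a == "left" else min(3, s + 1)
    return [(target, 1.0 - SLIP), (s, SLIP)]

def reward(s, s2):
    return 10.0 if s2 == 3 and s != 3 else -1.0  # -1 per step rewards speed

def q_value(V, s, a):
    """Expected return of taking action a in state s, given value estimates V."""
    return sum(p * (reward(s, s2) + GAMMA * V[s2]) for s2, p in transitions(s, a))

# Value iteration: repeatedly back up expected rewards until values settle.
V = {s: 0.0 for s in STATES}
for _ in range(100):
    V = {s: 0.0 if s == 3 else max(q_value(V, s, a) for a in ACTIONS)
         for s in STATES}

policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a))
          for s in STATES if s != 3}
print(policy)  # every cell should choose "right", i.e. head for the dock
```

The robot never forecasts a single future; it weighs all likely futures and picks the action with the best expected outcome.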
This Markovian mindset isn’t limited to robots. First-generation AI and language models are byproducts of Markovian theory.
Markovian Roots of AI & LLMs
Before modern large language models like ChatGPT, we had much simpler n-gram models, which were basically Markov Chains applied to natural language. They predicted the next word based on the last few words typed by the user. It was a crude approach, but it worked. Your phone’s old autocomplete system was built on that.
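A bigram model, the simplest useful n-gram, fits in a few lines. The training text below is made up, but the mechanism is the real one: count which word follows which, then predict from the current word alone.

```python
from collections import Counter, defaultdict

# A tiny bigram model: an order-1 Markov Chain over words, the same idea
# behind old phone autocomplete. The "training text" is a made-up example.
text = "the car sees the road the car sees the sign the car stops".split()

counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def predict(word: str) -> str:
    """Suggest the most likely next word, given only the current word."""
    return counts[word].most_common(1)[0][0]

print(predict("the"))  # -> "car": it follows "the" most often in the text
print(predict("car"))  # -> "sees"
```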
Even now, under all the transformer layers and attention heads, a modern LLM like ChatGPT still assumes that its current state, the context window, encodes everything it needs to predict the next token. The model doesn’t need to replay the whole conversation from scratch; it just works from what it sees now.
That’s the core of the Markov mindset: Distill the past. Trust the present. Plan probabilistically.
Even AlphaGo, the famous game-playing agent developed by Google DeepMind, doesn’t need to replay the whole game before every move. It assumes the current board provides enough context to evaluate the situation and decide what to do next.
The beauty is: "Every reinforcement learning agent lives in a Markov world."
Self-Driving Cars: Navigating a Probabilistic Road
Nowhere is this idea more essential and obvious than in the world of self-driving cars.
A human might remember that a certain street is full of potholes. A car doesn’t have that kind of memory. It sees only what’s in front of it now through GPS, LiDAR, cameras, and a continuous stream of noisy sensor data.
To figure out where it is on the map, that is, to localize itself, the car often uses a Hidden Markov Model. The core idea is simple: the car can’t directly observe its exact position, but it can gather clues from the environment like lane markings, signs, or landmarks. It updates its belief about where it is based on what it sees now and where it was likely to be a moment ago.
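Here’s a minimal sketch of that update loop. The three-position road, the single landmark, and all the probabilities are invented placeholders; what matters is the predict-then-correct structure at the heart of HMM localization.

```python
# HMM-style localization: the car keeps a belief (a probability distribution)
# over possible road positions, then refreshes it in two steps:
# predict with a motion model, correct with a noisy observation.
POSITIONS = ["A", "B", "C"]

# Motion model: P(next position | current position) after driving forward.
MOTION = {"A": {"A": 0.2, "B": 0.8, "C": 0.0},
          "B": {"A": 0.0, "B": 0.2, "C": 0.8},
          "C": {"A": 0.0, "B": 0.0, "C": 1.0}}

# Observation model: P(seeing a landmark | position). Suppose only B has one.
P_LANDMARK = {"A": 0.1, "B": 0.9, "C": 0.1}

def update(belief, saw_landmark):
    # Predict: push the old belief through the motion model.
    predicted = {p2: sum(belief[p] * MOTION[p][p2] for p in POSITIONS)
                 for p2 in POSITIONS}
    # Correct: weight by how well each position explains what the car saw.
    lik = P_LANDMARK if saw_landmark else {p: 1 - P_LANDMARK[p] for p in POSITIONS}
    unnorm = {p: predicted[p] * lik[p] for p in POSITIONS}
    total = sum(unnorm.values())
    return {p: v / total for p, v in unnorm.items()}

belief = {p: 1 / 3 for p in POSITIONS}  # start with no idea where we are
belief = update(belief, saw_landmark=True)
print(belief)  # the belief should now concentrate on position B
```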
Then, to predict what other drivers will do, such as whether a pedestrian will step off the curb or if a truck will cut into its lane, the car builds small Markov models of human behavior. It doesn't try to understand the pedestrian’s entire psychology. It just needs to know: what are they likely to do next?
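A sketch of what such a behavior model might look like, with states and probabilities invented for illustration:

```python
import numpy as np

# A toy Markov model of pedestrian behavior. A real system would fit these
# transition probabilities from data; these numbers are placeholders.
STATES = ["on_sidewalk", "at_curb", "crossing"]
P = np.array([[0.90, 0.10, 0.00],   # from on_sidewalk
              [0.20, 0.50, 0.30],   # from at_curb
              [0.00, 0.00, 1.00]])  # from crossing (absorbing at this horizon)

# The pedestrian is at the curb right now. How likely are they to be
# crossing within the next two time steps? Chain the transition matrix.
belief = np.array([0.0, 1.0, 0.0])
for _ in range(2):
    belief = belief @ P
print(f"P(crossing within 2 steps) = {belief[2]:.2f}")  # -> 0.45
```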
Based on those predictions, the car then decides its own actions, such as when to brake, steer, or accelerate. It does this by solving an internal Markov Decision Process every few milliseconds, carefully balancing safety, legal requirements, and passenger comfort.
None of this is glamorous. But it works. It’s how these autonomous vehicles drive through a foggy intersection or a construction zone without panicking.
They aren’t confident. They’re probabilistic.
Why It Matters
We like to think of AI, robotics, and self-driving systems as powered by cutting-edge algorithms. And they are.
But their foundation is humbler — and older: Markov Chains.
Markov Chains remind us that even advanced machines live in a world they don’t fully control. Their intelligence isn’t in knowing everything. It’s in navigating uncertainty well.
That’s what makes them powerful. They don’t promise certainty. They promise something better — a way to move forward when certainty is impossible.
Most of the things we call "intelligent" (robots, AI models, self-driving cars) are really just systems doing their best to answer one question: what should I do next?
They don’t remember everything. They trust the present, and act accordingly.
And maybe that’s not just how machines work.
Maybe it’s how we should work, too.
Cheers,
Prathamesh
Disclaimer: This blog is for educational purposes only and does not constitute financial, business, or legal advice. The experiences shared are based on past events. All opinions expressed are those of the author and do not represent the views of any mentioned companies. Readers are solely responsible for conducting their own due diligence and should seek professional legal or financial advice tailored to their specific circumstances. The author and publisher make no representations or warranties regarding the accuracy of the content and expressly disclaim any liability for decisions made or actions taken based on this blog.