
Decoding AI Agents: What They Are, Why They Matter

June 18, 2025.

It’s June 2025, and AI agents are seeing explosive adoption. Expedia has already launched Expedia Trip Match, a beta travel agent that lets travelers plan a trip simply by selecting from curated Instagram Reels, then delegate the entire planning and checkout process to AI agents. Lloyds Banking Group, one of the U.K.’s largest financial services firms, is about to launch a consumer-centric AI agent that will allow customers to delegate banking operations such as checking credit scores, making credit card or mortgage payments, and updating addresses.


Beyond travel and banking, software development is another domain that is quietly, but profoundly, being disrupted by AI agents. Tools like OpenAI’s Codex and Anthropic’s Claude are evolving from mere coding assistants into autonomous software development agents. This means we are entering an era where coding is shifting from a developer-led task to an AI-led workflow.


From travel to banking, healthcare to software development, AI agents are fundamentally transforming how organizations handle work. We’re moving from delegating individual tasks to LLMs to delegating entire programs—ideation, customer research, marketing, building and launching products—directly to AI. This shift is being enabled not by cutting-edge AI models alone, but by AI agents that can think, plan, act, and adapt.

It’s a profound change. And it has massive implications for the career trajectories of millions. You might ask: what exactly are AI agents?


What Are AI Agents?


An AI agent is an autonomous software system designed to achieve a goal by completing a series of tasks that would typically require a human to plan, iterate, reason, and interact with other stakeholders and software systems. In simple terms, it’s a self-directed software worker—capable of taking an end goal as input, creating a plan, executing a series of actions, iterating and optimizing along the way, and interfacing with various tools and systems to deliver results—without requiring any human intervention.


At its core, an AI agent operates in a loop:

  1. Observe – Gather information (e.g., user input, API response, sensor data).

  2. Plan – Decide what to do next using reasoning or learned behavior (LLMs).

  3. Act – Execute actions (e.g., call APIs, write code, send messages).

  4. Update Memory – Store outcomes, retain context, and learn from past steps.


This loop allows agents to act autonomously and iteratively toward a goal. You might ask, where and how are these AI agents being adopted today?
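The loop above can be sketched in code. Here is a minimal, hypothetical sketch; the `Agent` class and its helpers are illustrative stand-ins, not any real framework’s API, and a real agent would call an LLM inside `plan`:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    goal: str
    memory: list = field(default_factory=list)  # stores outcomes of past steps

    def observe(self, environment):
        # 1. Observe: gather information (user input, API responses, etc.)
        return environment.get("observation", "")

    def plan(self, observation):
        # 2. Plan: decide what to do next; a real agent would query an LLM here
        return f"act on: {observation}"

    def act(self, action):
        # 3. Act: execute the action (call an API, write code, send a message)
        return f"result of ({action})"

    def step(self, environment):
        obs = self.observe(environment)
        action = self.plan(obs)
        result = self.act(action)
        # 4. Update memory: retain context and learn from past steps
        self.memory.append((obs, action, result))
        return result

agent = Agent(goal="summarize today's logs")
result = agent.step({"observation": "3 new log files"})
```

Each call to `step` runs one full observe–plan–act–remember cycle; an autonomous agent simply repeats this until its goal is met.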


AI Agent Adoption & Impact - From Customer Support to Research, DevOps, and Beyond


AI agents are seeing widespread adoption across industries. For example, in the customer support space, AI agents are being built to automatically answer user queries, escalate complex cases, and update CRM entries. MavenAGI is one such company that has partnered with OpenAI and recently launched a fully automated customer support agent with human-like capabilities. 


Beyond standalone agents, established customer service platforms such as Intercom, Zendesk, and Forethought are now embedding agent-level intelligence directly into their systems. One striking example: Intercom’s AI agent, built on Anthropic’s Claude, recently achieved a staggering 85% resolution rate for customer service requests across 45+ languages within a predefined evaluation period. That’s a step change in the value these agents are creating for users and organizations.


Similarly, in the research and development space, agents like OpenAI’s Deep Research can search the web, extract key insights, and summarize content with citations—essentially acting as an always-on research assistant.


In the DevOps space, agents are being trained to monitor logs, detect anomalies, auto-scale infrastructure, and even patch errors autonomously. At its DASH 2025 conference, Datadog showcased AI agents capable of handling tasks typically reserved for experienced site reliability engineers (SREs)—from assessing infrastructure alerts to fixing code and triaging cybersecurity issues using its integrated SIEM platform. Similarly, New Relic, another leading observability platform, is moving in the same direction.


Meanwhile, Microsoft has transformed the GitHub Copilot system into a full-fledged AI agent and launched an Azure SRE Agent specifically designed to manage site reliability tasks across the Azure cloud ecosystem. Together, these developments have major implications for the future of SRE and DevOps engineers.


In the personal productivity space, AI agents are being designed to handle everyday tasks like booking flights, canceling subscriptions, and sending reminders. Examples include Rewind’s Memory Agent and Adept’s ACT-1, both serving as intelligent digital concierges for consumers and organizations. You might ask: how do AI agents differ from traditional software systems, machine learning models, and LLMs?


What Sets AI Agents Apart from Traditional Software Systems and LLMs?


Unlike traditional software systems or machine learning models, an AI agent operates as a closed-loop system. What do I mean by that?


If you look at the way we wrote code a few years ago, it was more or less anchored in deterministic principles. In simple terms, it was rule-based: if the user did X, we asked our code to do Y. For every action there was a predetermined response, hardcoded into the program. It was static and deterministic. It didn't learn, reason, or adapt on its own. This meant that for every new use case, we had to add a new rule to the code. You see? This process was neither scalable, nor did it involve any feedback loop that would allow these software systems to learn from the inputs or data fed to them. To address this challenge, we shifted to a more adaptive approach, and this is where we started incorporating machine learning (ML) models into our code.


ML models allowed us to write code that could learn from the data fed to it and optimize itself. We pivoted from rule-based coding to data-driven, self-optimizing coding. These machine learning models map inputs to outputs based on statistical patterns learned from historical data. A classifier that labels emails as spam or not-spam is a classic example. But even then, traditional ML models are fundamentally passive—they don’t generate new data; they can at most help us with classification and prediction tasks.
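The spam example can be made concrete. Here is a toy sketch, not a production model: instead of hand-written rules, the classifier derives word statistics from labeled historical data and uses those patterns to label new emails:

```python
from collections import Counter

# Labeled historical data: the statistical patterns come from here,
# not from hand-coded rules.
training = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting notes for monday", "ham"),
    ("project update and agenda", "ham"),
]

# Count how often each word appears under each label.
counts = {"spam": Counter(), "ham": Counter()}
for text, label in training:
    counts[label].update(text.split())

def classify(text):
    # Score each label by summing the learned word counts; pick the higher score.
    scores = {
        label: sum(counts[label][w] for w in text.split())
        for label in counts
    }
    return max(scores, key=scores.get)

print(classify("free prize money"))       # labeled spam
print(classify("monday project update"))  # labeled ham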


That’s the gap large language models (LLMs) began to fill. An LLM can generate new data—whether it’s text, code, audio, or even video—while keeping context in mind. This is made possible by the transformer architecture developed by researchers at Google, which allows these LLMs to hold and reference contextual information as they complete a task. That said, traditional LLMs were still limited in one way: they could only perform one task at a time. Write a poem. Translate this sentence. Solve this math problem. Generate this SQL query. Each task was independent and isolated. What if we wanted these LLMs to solve a problem that requires planning and executing a series of tasks with an end goal in mind?


What about projects and programs that require multi-step planning and reasoning? What about designing, building, and launching a full-fledged consumer product? Can AI help with that? Absolutely—and that’s where AI agents come into the picture. An AI agent is an autonomous software system capable of taking an end goal as input, creating a plan, and executing a series of tasks, all while interfacing with other tools and systems to deliver results—without requiring any human intervention.


You might ask: how do these AI agents actually interface with other software systems—or even with other AI agents?


AI Agent Interoperability


To enable inter-agent communication, leading players in the industry are stepping up. One of them, Anthropic, has developed and launched a protocol called MCP, short for Model Context Protocol, which standardizes how AI agents and models interface with external tools, data sources, and services. For agent-to-agent communication specifically, Google has introduced the Agent2Agent (A2A) protocol, designed to help agents collaborate, exchange context, and coordinate action. Meanwhile, Microsoft’s NLWeb initiative is pushing the web itself to evolve—making websites and web apps easily discoverable and navigable by AI agents, effectively turning the web into an agent-friendly environment. You might ask: beyond interfacing with each other, what if agents need to access existing APIs and services, like payment gateways, to process payments on behalf of end users?
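The shape of such a protocol can be illustrated. MCP messages follow JSON-RPC 2.0; below is a hedged sketch of what a tool-invocation request might look like on the wire. The tool name and arguments are hypothetical, and this is not a complete MCP client:

```python
import json

# A hedged sketch of an MCP-style tool invocation (JSON-RPC 2.0 envelope).
# The "book_flight" tool and its arguments are hypothetical examples.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "book_flight",
        "arguments": {"from": "SFO", "to": "JFK", "date": "2025-07-01"},
    },
}

# Serialize for transport; an MCP client would send this to an MCP server,
# which executes the tool and returns a JSON-RPC response.
payload = json.dumps(request)
print(payload)
```

The point of a shared envelope like this is that any compliant agent can discover and invoke any compliant tool without bespoke integration code for each pairing.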


You'd be surprised to know that existing APIs and software services are also becoming agent-compatible. A couple of months ago, Mastercard unveiled Agent Pay, Visa launched Intelligent Commerce, and PayPal introduced its PayPal Agent Toolkit. Each of these payment giants is now expanding the scope of its APIs and payment systems by adding an agentic interface to them. By doing so, they are building the infrastructure layer needed to enable agentic commerce, or simply put, agent-enabled financial transactions.


Now, as with any emerging technology, there are benefits and risks. In this case, since we are making software systems autonomous, and these systems will potentially have access to the core systems of our society (healthcare, finance, energy, and others), it's prudent to understand these risks.


Risks and Trade-offs


The very traits that make agents powerful—autonomy, persistence, scale of operation, and ease of access to databases and tools—also introduce real risk. When an agent makes a mistake, it’s not just a bad prediction—it’s a bad action. And actions have consequences.


One of the most immediate risks is hallucination: the AI model generates an answer that sounds correct but is factually wrong. An LLM sitting inside a chatbot might generate a plausible but incorrect sentence. That’s annoying. But an agent that uses that hallucinated information to call an API, send a payment, or delete a record? That’s dangerous. Without strict validation and sandboxing, agents can act on false assumptions—and act fast. Beyond hallucination, there is also the challenge of infinite loops and runaway processes. What does that mean?
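Strict validation before acting can be sketched simply. Here is a hypothetical guard that checks an agent’s proposed action against an allowlist and argument constraints before it ever reaches a real API; the action names and the domain rule are illustrative assumptions:

```python
# Hypothetical allowlist: the only actions this agent may ever execute.
ALLOWED_ACTIONS = {"send_email", "create_ticket"}

def validate(action, args):
    # Reject anything outside the allowlist before it touches a real system.
    if action not in ALLOWED_ACTIONS:
        return False, f"action '{action}' is not permitted"
    # Constrain arguments too: a plausible-looking action can still be unsafe.
    if action == "send_email" and not args.get("to", "").endswith("@example.com"):
        return False, "recipient outside allowed domain"
    return True, "ok"

# A hallucinated destructive action is blocked, not executed.
ok, reason = validate("delete_record", {"id": 42})
print(ok, reason)
```

The design choice here is to fail closed: the agent’s output is treated as an untrusted proposal, and only explicitly permitted actions with well-formed arguments are allowed through.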


An agent without proper guardrails can repeatedly attempt the same task, potentially entering an infinite loop of trial and error. It may also take illogical or inefficient paths toward its goal. In practice, each AI agent needs to be configured with sufficient memory to remember the sequence of actions it took toward a goal and optimize them without falling into a loop. The challenge is that if memory isn’t managed well, the agent might forget what it just did, repeat itself, or spiral into high-risk behaviors—ones that are not only hard to debug but can also bring down an entire data center.
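Guardrails against runaway loops can be as simple as a step budget plus a short memory of recent actions. A minimal sketch, where the `run_step` callable is a stand-in for the agent’s real loop body:

```python
def run_with_guardrails(run_step, max_steps=10):
    # Guardrail 1: a hard step budget so the agent can never run forever.
    # Guardrail 2: a memory of recent actions to detect repetition.
    seen = []
    for step in range(max_steps):
        action = run_step(step)
        if action in seen[-3:]:
            # Same action repeated within the last few steps: likely a loop.
            return f"halted: repeated action '{action}' at step {step}"
        seen.append(action)
        if action == "done":
            return "goal reached"
    return "halted: step budget exhausted"

# A stuck agent that keeps retrying the same failing call is cut off early:
print(run_with_guardrails(lambda step: "retry_api_call"))
```

Production agents layer more on top (timeouts, cost ceilings, human escalation), but the principle is the same: the loop must have an externally enforced exit.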


Over-automation is another common pitfall. Just because a task can be automated using an AI agent doesn’t mean it should be. Many business processes are better handled by deterministic scripts or simple forms, not an AI agent. When the logic is straightforward, adding an agent only adds complexity without adding real value. Not every problem in this world needs to be solved explicitly with an AI agent. 


Then there’s the matter of security and access control. By design, agents are granted access to existing systems and APIs, such as payment gateways, document storage systems, and code repositories. This means they often operate on sensitive data, sometimes at significant scale. If access policies are misconfigured, even a small oversight can lead to serious vulnerabilities, opening the door to data breaches, unauthorized actions, or system-wide disruptions.
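Least-privilege scoping is one mitigation. A hypothetical sketch in which each agent carries an explicit set of granted scopes, and every system call is checked against that set; the agent name and scope strings are invented for illustration:

```python
# Hypothetical scope registry: each agent gets only the access it needs.
AGENT_SCOPES = {
    "support_agent": {"crm:read", "tickets:write"},
}

def authorize(agent, required_scope):
    # Default to an empty scope set: an unknown agent can do nothing.
    granted = AGENT_SCOPES.get(agent, set())
    return required_scope in granted

# The support agent can read the CRM but cannot touch payments:
print(authorize("support_agent", "crm:read"))
print(authorize("support_agent", "payments:write"))
```

Defaulting unknown agents to an empty scope set means a misconfiguration tends to deny access rather than silently grant it.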


And finally, there’s the cost-to-benefit mismatch. While AI makes it easy to spin up a new agent in minutes, production-grade agents that can solve real problems at scale, recover from failure, and interact reliably with other systems require significant engineering effort. 


Beyond engineering, there is the maintenance debt that comes with these agents. Agents are trained on data and policies. As the ecosystem in which these agents operate evolves, they need to be retrained on data and policies that reflect that evolution. So the question we must ask before building these agents is: Is this agent truly going to create value that justifies the time and effort put into building it? Is it going to solve a real problem—at scale?

If we can't answer that question with clarity and depth, it's highly unlikely we will be able to build an agent that solves the core problem under consideration. 


So to summarize, AI agents are autonomous software systems that can unlock tremendous value. But before you start building one, it's important to answer this question: which problems truly warrant an AI agent—and which don’t? The tradeoff between the value they offer and the compute, cost, and complexity involved in building and maintaining them must be carefully weighed. Not every problem needs an agent. And the ones that do are only worth pursuing if approached methodically, with the right guardrails in place.


Now, the next question likely on your mind is: how do you actually build one?

We’ll cover that in the next essay.


Cheers,

Prathamesh



Disclaimer: This blog is for educational purposes only and does not constitute financial, business, or legal advice. The experiences shared are based on past events. All opinions expressed are those of the author and do not represent the views of any mentioned companies. Readers are solely responsible for conducting their own due diligence and should seek professional legal or financial advice tailored to their specific circumstances. The author and publisher make no representations or warranties regarding the accuracy of the content and expressly disclaim any liability for decisions made or actions taken based on this blog.


