Understanding the Model Context Protocol

Artificial Intelligence (AI), especially Generative AI and Large Language Models (LLMs), has exploded in popularity recently and is finding its way into nearly every part of our daily lives. For anyone building an application, the question isn't whether to incorporate AI, but how.

However, despite this growing popularity, integrating AI with external systems and data sources has remained a challenge. This is where the Model Context Protocol (MCP) comes in: an open-source protocol announced by Anthropic that aims to standardize communication between AI applications, most notably those built with LLMs, and external systems, both local and remote.

This article is an introduction to MCP. I'll break it down with a simple analogy to help you understand the basics, then dig into its definition, the components it introduces, and how it works at a high level, and discuss some of the reasons it has become so popular.

Definition

MCP is an open-source protocol, built by Anthropic, that defines a standardized way of establishing communication between AI applications and external systems, such as data sources and tools.

The need for a common protocol

Before MCP was open sourced, there was no common way for developers to integrate AI applications with external systems, whether remote or local. Integrating with an external data source or tool usually meant building a custom connection. Because of the significant time and effort those custom connections demanded, coupled with the blazing speed at which AI has evolved and the appetite for more AI in more places, in practice these applications often simply didn't communicate externally at all.

A dataset was collected up to some date (usually called the "knowledge cutoff date") and fed into the training of an LLM, and that's all that was available within the applications using that LLM. For example, Llama 3.2 was released in September 2024 but has a knowledge cutoff date of December 2023.

You could still provide data through the chat itself, but that approach was limited by things like the context window and memory, and it was far more work than, say, calling an API to get the weather for August 1, 2024. Integrating with a single API isn't too challenging, but scaling up to 1,000 different APIs? Or 50,000? Without a single standardized way to do so, that was too large a resource investment to be practical.

The benefits of MCP

So, the solution is to provide what is essentially an abstraction layer that sits between AI applications and external systems. The abstraction layer speaks a well-defined protocol, so an AI application only needs to implement that one common protocol to integrate with any external source behind it.

Imagine trying to speak with someone who speaks a different language. You could learn the other person's language, but that takes quite a bit of time and effort. If, however, you had an interpreter at the table who spoke both languages, you could ask a question, and the interpreter would translate it for your guest, receive the response, and translate it back into your language.

Now imagine that the interpreter knows eight languages, or 50. You can see how the benefit stacks up as that number grows, because the time and energy you'd need to invest to learn 50 languages yourself would be, to say the least, a significant challenge.

The goal of MCP is to let you communicate with any number of external systems using a single common language, dramatically reducing the resource investment for those building AI apps with LLMs. Users, in turn, benefit from these apps having access to data beyond the LLM's knowledge cutoff date.

The benefits scale with adoption

However, it's not as simple as introducing a new open protocol and suddenly everyone can use it to fetch data and interact with tools and external systems. Something still has to communicate with those external systems, and since the protocol is new, they don't speak it yet; integrations that bridge to those systems still need to be built.

To bring back our interpreter example: it's not enough for the interpreter to speak your language; she also needs to speak your guest's language. Anthropic started things off with a number of integrations to external systems, but as with any new standard, it's only useful if it gains widespread adoption. Fortunately, the tech community seems to be latching onto the new open-source protocol, adopting it both on the data and tool provider side and on the AI app and client side.

How MCP works: the components

MCP introduces a few key roles, or components (a minimal server sketch follows the list):

  • MCP Hosts:
    • The LLM-powered AI applications themselves, which usually offer some sort of chat interface for users to interact with.
  • MCP Clients:
    • The component, running as part of the MCP Host, that actually communicates using MCP to interact with external systems, indirectly via MCP servers, to fetch data or use tools.
    • Provide MCP servers with:
      • Sampling, i.e., the ability to request LLM completions through the host
      • Other LLM interactions, such as agent-style function calling
  • MCP Servers:
    • The "interpreter" from the earlier analogy: the abstraction layer that connects AI applications to external systems and gives developers a common protocol to communicate with.
    • Provide MCP clients with:
      • Resources: context and data
      • Prompts: templates for messages or workflows
      • Tools: functions the model can call

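To make the server role concrete, here is a minimal sketch of an MCP server, assuming the official MCP Python SDK (the `mcp` package) and its FastMCP helper. The weather theme, the get_forecast tool, and the weather:// resource URI are hypothetical examples for illustration, not part of the protocol itself.

```python
# A minimal MCP server sketch using the official Python SDK
# (pip install mcp). The "weather" theme, get_forecast tool,
# and weather:// URI are hypothetical examples.
from mcp.server.fastmcp import FastMCP

# Create a named server; the name is reported to connecting hosts.
mcp = FastMCP("weather")

@mcp.tool()
def get_forecast(city: str, date: str) -> str:
    """Return a weather forecast for a city on a given date."""
    # A real server would call an actual weather API here; we
    # return a canned answer to keep the sketch self-contained.
    return f"Forecast for {city} on {date}: sunny, 24°C"

@mcp.resource("weather://cities")
def list_cities() -> str:
    """A resource exposing the cities this server knows about."""
    return "London, Tokyo, New York"

if __name__ == "__main__":
    # Serves over stdio by default, so a local MCP host can
    # launch this script as a subprocess and talk to it directly.
    mcp.run()
```

The key design point is that the server never knows which LLM or host sits on the other end; it simply exposes tools and resources over the common protocol, and any MCP client can discover and use them.
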
Conclusion

MCP is an open-source protocol announced by Anthropic that aims to standardize communication between AI applications and external systems. It gives AI app developers a common protocol for integrating with external data sources and tools, rather than building a custom connection to each external system.

As the demand for AI continues to grow, MCP is already gaining traction as a standard way to connect AI applications to data sources, tools, and a wide range of other systems.

To start making the most of MCP:

  • Take advantage of the open-source MCP servers built by Anthropic and others to integrate your AI applications with external systems.
  • Build your own MCP server to provide access to your data or tools.
  • Start exploring ways to integrate your AI-powered application with multiple external systems using MCP (a minimal client sketch follows this list).
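
As a starting point for that last item, here is a minimal sketch of the client side, again assuming the official MCP Python SDK. The server.py path and the get_forecast tool name refer to the hypothetical weather server sketched earlier.

```python
# A minimal MCP client sketch using the official Python SDK,
# connecting to the hypothetical weather server over stdio.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the server as a subprocess and talk to it over stdio.
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover what the server offers...
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])
            # ...and call one of its tools.
            result = await session.call_tool(
                "get_forecast", {"city": "Tokyo", "date": "2024-08-01"}
            )
            print(result.content)

if __name__ == "__main__":
    asyncio.run(main())
```

In a real host, the tool list would typically be handed to the LLM so it can decide which tool to call; here we call the tool directly to keep the example short.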

By following these recommendations, you can unlock the full potential of MCP and take your AI integration to the next level.