But it’s not just the description field that can hold malicious instructions, the attack surface extends to all the information generated by MCP servers, which includes items like function names, parameters, parameter defaults, required fields and types. MCP servers also generate other messages, such as error messages or follow-up prompts. These, too, can contain malicious instructions for AI agents to follow.
How do you know if your MCP server download is malicious? First, check the source. Does it come from a trusted organization? Second, look at the permissions it asks for. If its purpose is to provide funny pictures of cats, it doesn’t need access to your file system.
Finally, if you can, check its source code. That can be tricky, but there are already vendors out there that are trying to get a handle on this. BackSlash Security, for example, has already gone through seven thousand publicly available MCP servers and analyzed them for security risks and found instances of both suspicious and outright malicious behaviors.