Skills vs Plugins vs MCPs: Understanding AI Agent Extension Layers
Modern AI agents are not monolithic systems. They are layered architectures where different extension mechanisms handle different types of capability. Understanding those layers -- what each one does, what it does not do, and when to use it -- is the difference between an agent that grows gracefully and one that becomes an unmaintainable tangle of ad-hoc integrations.
Three extension types dominate in 2026: skills, plugins, and MCP servers. Each solves a fundamentally different problem.
An AI agent skill is a prompt-based instruction that modifies agent behavior without any code. An AI agent plugin is a distributable package that bundles skills, hooks, subagents, commands, and configurations into a single installable unit. An MCP server is a standardized connection to an external tool or data source through the Model Context Protocol.
Skills are knowledge. Plugins are packages. MCPs are connections. For a broader introduction to the ecosystem, see what AI agents are.
This distinction matters more than it sounds. Get it wrong and you will build MCP servers when you need skills, write skills when you need plugins, or install plugins when a single MCP connection would suffice.
Skills: Teaching Agents How to Think
A skill is the simplest extension mechanism available, and often the one with the most leverage. It is a markdown file -- YAML frontmatter for metadata, followed by instructions in natural language -- that tells the agent how to approach a specific type of task.
No code. No compilation. No deployment pipeline. Write a markdown file, put it in the right directory, and the agent loads it automatically when the task context matches.
How Skills Work
The structure is straightforward:
---
name: Code Review
description: How to review code changes rigorously
globs: ["*.ts", "*.js", "*.py"]
alwaysApply: false
---
When reviewing code changes:
1. Read the full diff before commenting on any individual line
2. Check for security implications first, correctness second, style third
3. Verify that tests cover the changed behavior
4. Look for edge cases the author may have missed
5. Comment on patterns, not individual instances
The YAML frontmatter tells the agent runtime when to activate the skill. The globs field matches file patterns -- this skill loads when the agent is working with TypeScript, JavaScript, or Python files. alwaysApply: false means the skill is conditional, loaded only when relevant (as opposed to skills that apply to every interaction).
The instructions section is natural language. Not function signatures. Not API schemas. Just clear, specific guidance that shapes how the agent approaches the task.
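The activation rule described above can be sketched in a few lines of Python. This is a minimal illustration, not any runtime's actual loader: the `Skill` class, the `active_skills` function, and the "React Components" and "House Style" skills are hypothetical names introduced here, and real runtimes may match on more than the file name.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from pathlib import PurePosixPath


@dataclass
class Skill:
    """Hypothetical in-memory form of a skill's frontmatter."""
    name: str
    globs: list = field(default_factory=list)
    always_apply: bool = False


def active_skills(skills, open_files):
    """Return names of skills whose activation rule matches the open files."""
    active = []
    for skill in skills:
        # Assumption: globs are matched against the file name only.
        matches = any(
            fnmatch(PurePosixPath(path).name, pattern)
            for path in open_files
            for pattern in skill.globs
        )
        if skill.always_apply or matches:
            active.append(skill.name)
    return active


skills = [
    Skill("Code Review", globs=["*.ts", "*.js", "*.py"]),
    Skill("React Components", globs=["*.tsx"]),   # hypothetical skill
    Skill("House Style", always_apply=True),      # hypothetical skill
]

print(active_skills(skills, ["src/pipeline/etl.py"]))
# ['Code Review', 'House Style']
```

With a Python file open, the conditional Code Review skill and the always-on skill load, while the React skill stays out of context.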
What Skills Are Good At
Encoding workflow patterns. How should code reviews work? What steps should the agent follow when deploying? What quality checks should run before marking a task complete? Skills encode institutional knowledge -- the kind of information that lives in a senior engineer's head and is usually transmitted through pairing sessions and PR reviews.
Shaping judgment. The same agent model can produce wildly different output depending on the instructions it receives. A skill that says "always check for SQL injection in any database query" makes that check structural rather than incidental. The agent does not forget, because the instruction is loaded automatically whenever the skill applies.
Zero-cost authoring. Anyone who can write a clear paragraph can write a skill. No programming knowledge required. This means domain experts -- not just developers -- can contribute to agent capability. A security specialist writes the security review skill. A UX researcher writes the usability testing skill. A content strategist writes the editorial guidelines skill.
Token efficiency. Skills load based on context matching. An agent working on a Python data pipeline does not load the React component skill. This conditional activation keeps token consumption proportional to task relevance, not total skill count.
What Skills Cannot Do
Skills cannot take actions in the world. A skill can tell the agent how to write a good database query, but it cannot connect to the database. It can describe the deployment process, but it cannot run the deploy command. Skills modify behavior -- they are the "how to think" layer, not the "how to do" layer.
Skills also cannot enforce compliance. A skill that says "always run tests before committing" is advisory. The agent should follow it, but nothing prevents it from skipping the step. For enforcement, you need hooks -- which is part of what plugins provide.
MCP Servers: Connecting Agents to the World
An MCP (Model Context Protocol) server is a standardized connection between an agent and an external tool or data source. It wraps databases, APIs, file systems, and services in a universal protocol interface that any MCP-compatible client can discover and use.
How MCP Servers Work
An MCP server is a process that exposes three types of primitives:
- Tools -- executable functions the agent can invoke. "Search documents," "run query," "send email," "create issue."
- Resources -- readable data. Files, database records, configuration values, cached results.
- Prompts -- reusable prompt templates. A server can expose prompts for common workflows that clients can use directly.
The agent discovers available tools at runtime by querying the server. Tool definitions do not need to be hard-coded into the agent ahead of time -- it asks "what can you do?" and gets back a list of capabilities with descriptions and parameter schemas.
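At the wire level, that "what can you do?" question is a JSON-RPC 2.0 request with the method `tools/list`. The sketch below builds the request and summarizes a response. The sample response is hand-written and the `search_documents` tool is hypothetical; a real session also begins with an initialization handshake, which is omitted here.

```python
import json


def tools_list_request(request_id):
    """Build a JSON-RPC 2.0 request asking an MCP server for its tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def summarize_tools(response_text):
    """Map each advertised tool name to its required parameter names."""
    response = json.loads(response_text)
    tools = response["result"]["tools"]
    return {
        tool["name"]: tool.get("inputSchema", {}).get("required", [])
        for tool in tools
    }


# Hand-written sample response; the tool shown here is hypothetical.
sample_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "search_documents",
            "description": "Search indexed documents by keyword",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }]
    },
})

print(tools_list_request(1))
print(summarize_tools(sample_response))
```

The parameter schema in each tool entry is ordinary JSON Schema, which is what lets a client validate arguments before it sends a call.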
Two transport mechanisms:
- stdio for local connections. The server runs as a subprocess, communicating through standard input/output. No network overhead and very low latency. Ideal for local databases, file systems, and CLI tools.
- Streamable HTTP for remote connections. The server runs as an independent process handling standard HTTP requests. Supports load balancing, OAuth authentication, and CORS. Ideal for cloud services and shared infrastructure.
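For the stdio transport, messages travel as newline-delimited JSON: one JSON-RPC message per line. The following sketch shows that framing, using an in-memory `io.StringIO` as a stand-in for a subprocess pipe. The helper names `write_message` and `read_messages` are made up for this example.

```python
import io
import json


def write_message(stream, message):
    """Write one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    stream.write(json.dumps(message) + "\n")


def read_messages(stream):
    """Yield one parsed message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for the pipe between client and server subprocess.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
write_message(pipe, {"jsonrpc": "2.0", "id": 2, "method": "resources/list"})
pipe.seek(0)

methods = [msg["method"] for msg in read_messages(pipe)]
print(methods)
# ['tools/list', 'resources/list']
```

In a real client the stream would be the stdin and stdout of a process started with something like `subprocess.Popen`; the framing logic stays the same.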
What MCP Servers Are Good At
Connecting to external systems. Databases, APIs, cloud services, SaaS tools, file systems, monitoring platforms -- anything the agent needs to read from or write to. MCP servers are the agent's hands and eyes.
Standardization. Every MCP server speaks the same protocol. The agent does not need to learn GitHub's API and Slack's API and PostgreSQL's wire protocol. It learns MCP once and can use any server. This is the same value proposition that containerization brought to deployment -- standardize the interface, and the ecosystem explodes.
Isolation. The MCP server handles authentication, rate limiting, error handling, and connection management. The agent sends a tool call; the server handles the complexity. Credentials stay on the server side -- the agent never sees API keys, database passwords, or OAuth tokens.
Composability. An agent can connect to multiple MCP servers simultaneously, each providing different capabilities. A document retrieval server, a database server, a deployment server, and a monitoring server can all be active in the same session. The agent picks the right tool for each subtask.
What MCP Servers Cannot Do
MCP servers cannot teach the agent how to use them effectively. A server can expose a "search documents" tool, but it cannot explain when searching is better than reading the whole file, or how to formulate queries that return relevant results. That is knowledge -- the domain of skills.
MCP servers also do not bundle. Each server is a single connection to a single domain. If you want a coherent integration that includes the tool connection, the behavioral guidance, the enforcement hooks, and the workflow commands, you need a plugin.
Plugins: Packaging Complete Integrations
A plugin is a distributable package that bundles multiple extension types into a single installable unit. It is not a new capability type -- it is a delivery mechanism for skills, MCP configurations, hooks, subagents, and commands that need to work together.
How Plugins Work
A plugin contains a manifest file that declares its contents and a directory structure that organizes its components:
    my-database-plugin/
        plugin.json               # Manifest: name, version, dependencies
        skills/
            query-patterns.md     # Skill: how to write effective queries
            migration-guide.md    # Skill: how to handle schema migrations
        mcp/
            config.json           # MCP server configuration for database connection
        hooks/
            pre-query.sh          # Hook: validate queries before execution
            post-migrate.sh       # Hook: verify schema state after migration
        agents/
            migration-agent.md    # Subagent: specialized for migration workflows
        commands/
            migrate.md            # Slash command: /migrate triggers the workflow
Install the plugin, and you get everything: the MCP connection to the database, the skills that teach the agent query best practices, the hooks that validate queries before execution and verify schema state after migrations, the specialized subagent for migration workflows, and the slash command that kicks it all off.
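To make the packaging idea concrete, here is a rough sketch of how an installer might take inventory of such a directory. It assumes the manifest is JSON with `name` and `version` keys, following the comment in the layout above. The `inventory` function and the version string are illustrative, not part of any specific plugin format.

```python
import json
import tempfile
from pathlib import Path

# Component directories taken from the example layout above.
COMPONENT_DIRS = ("skills", "mcp", "hooks", "agents", "commands")


def inventory(plugin_root):
    """Read the manifest and list the files under each component directory."""
    manifest = json.loads((plugin_root / "plugin.json").read_text())
    components = {
        name: sorted(p.name for p in (plugin_root / name).iterdir() if p.is_file())
        for name in COMPONENT_DIRS
        if (plugin_root / name).is_dir()
    }
    return {
        "name": manifest["name"],
        "version": manifest["version"],
        "components": components,
    }


# Build a throwaway copy of part of the example layout to run against.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my-database-plugin"
    for rel in ("skills/query-patterns.md", "hooks/pre-query.sh", "commands/migrate.md"):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    # Assumed manifest content; the version value is a placeholder.
    (root / "plugin.json").write_text(
        json.dumps({"name": "my-database-plugin", "version": "1.0.0"})
    )
    result = inventory(root)

print(result)
```

Missing component directories are simply skipped, which matches the idea that a plugin bundles whichever extension types it needs.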
What Plugins Are Good At
Coherent integrations. Some capabilities require multiple extension types working together. A deployment integration needs an MCP connection to the cloud provider, a skill that describes the deployment process, hooks that enforce pre-deploy verification, and a subagent that handles rollback if something goes wrong. A plugin packages all of that.
Distribution. Plugins are the unit of sharing. If you build a great database integration for your team, you package it as a plugin and others install it with a single command. Skills, MCP configurations, hooks, and agents come along for the ride.
Enforcement through hooks. This is the critical capability that plugins add beyond skills and MCP servers. Hooks are event handlers that fire at lifecycle points -- before a tool is used, after a task completes, when an error occurs. A hook can block an action (reject a commit that does not meet quality standards), automate a step (run tests after every file write), or log an event (record all deployments for audit). Skills advise. Hooks enforce.
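The difference between advising and enforcing is easy to see in code. The sketch below models a pre-action hook that can block a tool call and log the outcome. Everything here (`ToolCall`, `run_with_hooks`, the `git_commit` tool name, the `tests_passed` flag) is hypothetical; actual hook interfaces differ by agent runtime.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolCall:
    """Hypothetical record of an action the agent wants to take."""
    tool: str
    args: dict


# A hook returns None to allow the call, or a string explaining the block.
Hook = Callable[[ToolCall], Optional[str]]


def block_untested_commit(call):
    """Reject commits unless the (assumed) tests_passed flag is set."""
    if call.tool == "git_commit" and not call.args.get("tests_passed", False):
        return "commit rejected: tests have not passed"
    return None


def run_with_hooks(call, pre_hooks, audit_log):
    """Run pre-hooks in order; the first objection blocks the call."""
    for hook in pre_hooks:
        reason = hook(call)
        if reason is not None:
            audit_log.append(f"blocked {call.tool}: {reason}")
            return "blocked"
    audit_log.append(f"allowed {call.tool}")
    return "allowed"


log = []
hooks = [block_untested_commit]
print(run_with_hooks(ToolCall("git_commit", {"tests_passed": False}), hooks, log))
print(run_with_hooks(ToolCall("git_commit", {"tests_passed": True}), hooks, log))
print(log)
```

The agent never gets a chance to skip the check: the dispatcher runs the hook before the action, whatever the model decided.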
Versioning and dependency management. Plugins declare versions and dependencies. When the database driver updates, you update the plugin version and dependents know to upgrade. This is standard package management applied to agent extensions.
What Plugins Cannot Do
Plugins are not a substitute for architecture. A plugin that tries to be an entire agent system -- handling orchestration, quality gating, memory management, and tool integration in one package -- is a monolith in disguise. Good plugins are focused: one domain, one coherent integration, one clear reason to install.
Plugins also add overhead. Every installed plugin contributes skills that may load, hooks that fire, and MCP servers that consume resources. The right number of plugins is the smallest number that covers your critical workflows. More is not better.
The Comparison Table
| Dimension | Skills | MCP Servers | Plugins |
|---|---|---|---|
| What it is | Markdown instructions | Standardized tool connection | Distribution package |
| Primary purpose | Modify agent behavior | Connect to external tools | Bundle complete integrations |
| Requires coding | No | Yes (server implementation) | Some (hooks, agents may need code) |
| Complexity | Low (write markdown) | Medium (implement protocol) | Medium-High (multiple components) |
| Scope | Single behavior pattern | Single tool/data source | Full workflow integration |
| Token impact | Conditional loading (efficient) | On-demand discovery (efficient) | Cumulative (all components) |
| Enforcement | Advisory only | N/A (provides capability, not rules) | Hooks enforce compliance |
| Portability | Copy the file | Any MCP-compatible client | Agent-runtime specific |
| Persistence | Loaded per-session by context | Running server process | Installed to project |
| Discovery | Agent scans metadata automatically | Runtime protocol query | Plugin registry / marketplace |
| Authoring barrier | Anyone who can write clearly | Developer with protocol knowledge | Developer with full-stack knowledge |
| Best for | Workflow patterns, guidelines, institutional knowledge | Database, API, service connections | Coherent multi-component integrations |
When to Use Each
Reach for a skill when:
- You want to change how the agent approaches a type of task
- The guidance does not require new tools or connections
- Domain experts (not just developers) need to contribute
- The instruction should load conditionally based on context
- You want the simplest possible extension with the fastest iteration cycle
Examples: code review guidelines, deployment checklists, editorial standards, security review patterns, debugging workflows, architectural decision frameworks.
Reach for an MCP server when:
- The agent needs to interact with an external system
- You want standardized, discoverable tool access
- Credentials and connection management should be isolated from the agent
- Multiple agents or clients need access to the same tool
- The integration is primarily about capability (actions), not behavior (guidelines)
Examples: database queries, API integrations, file system access, cloud service management, monitoring and alerting, search engines, version control.
Reach for a plugin when:
- The integration requires multiple extension types working together
- You need hooks to enforce quality or compliance standards
- The integration should be distributable to other teams or projects
- A specialized subagent is needed for complex workflows
- The capability requires both behavioral guidance (skills) and tool access (MCP) to be useful
Examples: full CI/CD pipeline integration, database management (connection + query patterns + migration handling + enforcement), deployment workflows (tool access + pre-deploy checks + rollback handling), compliance frameworks (guidelines + audit hooks + reporting tools).
How They Work Together in Production
The three layers are not alternatives. They are complementary layers of a single architecture. In a production agent system, all three operate simultaneously, each handling its domain.
Here is how Nevo structures this in practice.
The skill layer includes 60+ skills across three scopes. Project-scoped skills encode Nevo-specific patterns: the 8-stage quality pipeline, the PRD-driven execution framework, the error-to-rule system. User-scoped skills encode Ryan's preferences: communication style, code review standards, deployment verification steps. Generated skills are authored by Nevo's own Skill Forge when it identifies capability gaps. Skills load conditionally -- working on TypeScript activates TypeScript-specific skills, working on deployment activates deployment skills.
The MCP layer includes three servers running locally:
- QMD -- document retrieval with BM25 keyword search and GGUF neural embeddings across 7 collections and 190+ documents. Instead of injecting all documentation into every session (100K+ tokens), Nevo queries QMD and retrieves only what is relevant. Token savings: 92-96%.
- Memory -- a knowledge graph storing entities, relations, and observations. Nevo's persistent memory across sessions, accessible to any agent in the system.
- Lighthouse -- web performance and quality auditing. Runs analysis on web properties and returns structured results.
Each server handles a single domain and exposes its capabilities through MCP. Any of Nevo's 14 specialized agents can use any server without knowing implementation details.
The plugin layer provides bundled integrations with enforcement. The Ralph plugin handles autonomous execution persistence -- maintaining agent loops across session boundaries with circuit breaker safety. It bundles the execution engine, the persistence mechanism, the circuit breaker logic, and the hooks that enforce iteration limits. Removing it does not break the agent -- it removes that specific capability cleanly.
The layers compose naturally. When Nevo receives a coding task, skills tell it how to approach the work (PRD decomposition, quality standards, testing patterns). MCP servers give it the tools to execute (document retrieval for context, memory for persistent knowledge). Plugin hooks enforce quality gates (the 8-stage pipeline runs automatically through hooks, not manual invocation). No single layer could do all of this. Together, they create an agent that knows what to do, has the tools to do it, and enforces quality standards on everything it produces.
The Architecture Principle
The cleanest mental model for these three layers maps to a concept every developer knows: separation of concerns.
Skills are configuration. They change behavior without changing code. Swap one skill for another and the agent approaches the same task differently, just as swapping a config file changes application behavior without redeployment. Configuration is cheap to create, cheap to change, and should be the first thing you reach for.
MCP servers are infrastructure. They provide the raw capabilities -- connections, tools, data access. Like databases, message queues, and APIs, they are the systems your agent depends on. They take more effort to set up but rarely change once running.
Plugins are applications. They are the packaged, distributable units that combine configuration and infrastructure into coherent workflows. Like applications, they have versions, dependencies, and lifecycle management.
When you have a new requirement, the question to ask is: "Is this a configuration change, an infrastructure addition, or a new application?"
- The agent should handle errors differently? That is a skill (configuration).
- The agent needs to access a new database? That is an MCP server (infrastructure).
- The agent needs a complete CI/CD workflow with quality gates and rollback? That is a plugin (application).
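The three questions above can be read as a small decision function. This is one possible encoding of the rule of thumb, with made-up parameter names, rather than a formal rule:

```python
def choose_layer(needs_external_access, needs_enforcement, needs_bundled_workflow):
    """Pick the simplest extension layer that covers the requirement."""
    # Hooks and multi-component workflows are only delivered through plugins.
    if needs_enforcement or needs_bundled_workflow:
        return "plugin"
    # Access to an outside system calls for a tool connection.
    if needs_external_access:
        return "mcp_server"
    # Everything else is a behavior change.
    return "skill"


# The three examples from the list above.
print(choose_layer(False, False, False))  # handle errors differently
print(choose_layer(True, False, False))   # access a new database
print(choose_layer(True, True, True))     # CI/CD with quality gates and rollback
```

The order of the checks matters: the function only escalates past "skill" when a specific need forces it to.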
Start at the simplest layer that solves the problem. Escalate to a more complex layer only when the simpler one proves insufficient. Most teams over-engineer their agent extensions. The skill that took 10 minutes to write often outperforms the plugin that took a week to build -- because the constraint of simplicity forces clarity of thought.
Common Mistakes
Writing MCP servers when skills would suffice. If the agent already has the capability but is not using it correctly, you need a skill, not a new tool. Teaching the agent when to use git rebase versus git merge does not require an MCP server for Git -- it requires a skill that explains the decision framework.
Writing skills when MCP servers are needed. No amount of behavioral instruction can give the agent access to a database it cannot connect to. If the missing capability is access, not knowledge, you need an MCP server.
Installing plugins for single-feature needs. If you only need the MCP connection and do not need hooks, subagents, or specialized skills, just configure the MCP server directly. Plugins add overhead. Use them when you need the full package, not when you need one piece.
Ignoring the token budget. Every skill that loads consumes context window tokens. Every MCP server that provides tools adds to the tool definition overhead. The right architecture minimizes token cost -- conditional skill loading, on-demand MCP discovery, and focused plugins that do not inject unnecessary context.
Not using hooks for enforcement. If you have a quality standard that matters, encode it in a hook, not just a skill. Skills advise. Hooks enforce. The distinction between "the agent should run tests" and "the agent cannot commit without passing tests" is the distinction between a suggestion and a guarantee.
FAQ
What is the simplest way to extend an AI agent's capabilities?
The simplest extension mechanism is a skill -- a markdown file with YAML frontmatter and natural language instructions. Skills require no code, no compilation, and no deployment pipeline. Write a markdown file describing how the agent should approach a specific task, place it in the skills directory, and the agent loads it automatically when the task context matches. Skills are ideal for encoding workflow patterns, quality standards, and institutional knowledge. Anyone who can write a clear paragraph can author a skill.
Can I use skills, plugins, and MCP servers together?
Yes -- they are designed to be complementary layers, not alternatives. In a production agent system, skills provide behavioral guidance (how to approach tasks), MCP servers provide tool connections (access to external systems), and plugins bundle complete integrations (skills + MCP configs + hooks + subagents). For example, a database integration might use an MCP server for the connection, a skill for query best practices, and a plugin hook that validates queries before execution. Each layer handles its domain without duplicating the others.
How do AI agent skills differ from traditional prompts?
AI agent skills are structured, persistent, and contextually loaded -- unlike ad-hoc prompts that are manually included in each conversation. A skill has metadata (name, description, file globs, activation rules) that tells the agent runtime when to load it. Skills persist across sessions, load automatically based on task context, and can be shared across teams. A prompt is a one-time instruction. A skill is institutional knowledge that the agent carries permanently and applies when relevant, without being told to do so.
When should I build a plugin instead of just configuring an MCP server?
Build a plugin when the integration requires multiple extension types working together -- when you need the tool connection (MCP) and behavioral guidance (skills) and enforcement (hooks) and specialized workflows (subagents) to deliver value. If you only need tool access, configure the MCP server directly. If you only need behavioral guidance, write a skill. Plugins are the right choice when removing any single component would make the integration incomplete, and when you want to distribute the full integration to other teams or projects as a single installable package.
How many skills can an AI agent handle before performance degrades?
The practical limit depends on token efficiency, not raw count. Skills that use conditional loading (activated only when file patterns or task context match) scale well -- an agent can have hundreds of skills if only the relevant ones load per session. Skills with alwaysApply: true consume tokens in every session regardless of relevance, so these should be limited to genuinely universal guidelines. The key metric is not total skill count but tokens loaded per session. Monitor this and prune skills that rarely activate or provide marginal value. Most production agents operate effectively with 30-60 active skills and conditional loading.
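The "tokens loaded per session" metric is straightforward to compute if the runtime records which skills loaded in each session. The sketch below assumes such a record exists. The skill names, token sizes, activation counts, and the 5% pruning threshold are all illustrative numbers, not measurements.

```python
from dataclasses import dataclass


@dataclass
class SkillStats:
    name: str
    tokens: int         # size of the skill text (illustrative)
    always_apply: bool
    activations: int    # sessions in which a conditional skill loaded


def tokens_per_session(skills, sessions):
    """Average number of skill tokens loaded per session."""
    total = sum(
        s.tokens * (sessions if s.always_apply else s.activations)
        for s in skills
    )
    return total / sessions


def prune_candidates(skills, sessions, min_rate=0.05):
    """Conditional skills that loaded in fewer than min_rate of sessions."""
    return [
        s.name for s in skills
        if not s.always_apply and s.activations / sessions < min_rate
    ]


stats = [
    SkillStats("House Style", tokens=300, always_apply=True, activations=0),
    SkillStats("Code Review", tokens=800, always_apply=False, activations=40),
    SkillStats("Legacy Perl", tokens=1200, always_apply=False, activations=2),
]

print(tokens_per_session(stats, sessions=100))  # 644.0
print(prune_candidates(stats, sessions=100))    # ['Legacy Perl']
```

In this made-up data the small always-on skill costs 30,000 tokens over 100 sessions, nearly as much as the much larger Code Review skill at 32,000, which is why `alwaysApply: true` deserves scrutiny.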
Dive deeper into each extension type: What Are AI Agent Skills covers the skill layer in detail, including authoring patterns and best practices. What Are AI Agent Plugins explains plugin architecture and the packaging model. What Is MCP covers the Model Context Protocol from fundamentals to production deployment. For a practical guide to writing your own skills, see How to Write an AI Agent Skill.