How to Build an AI Agent Plugin: Step-by-Step Guide
An AI agent plugin is a distributable package that bundles skills, hooks, custom agents, MCP server configurations, and settings into a single installable unit. Building one is not complicated -- but building a good one requires understanding how each component works, how the agent runtime loads them, and how they interact at the lifecycle level.
This tutorial takes you from an empty directory to a production-ready plugin. You will build a plugin that teaches an agent a new skill, enforces a quality gate through hooks, defines a specialized subagent, and connects to an external tool via MCP. By the end, you will understand not just the format, but the design decisions that separate useful plugins from ones that sit uninstalled.
If you are unfamiliar with what plugins are and how they differ from standalone skills and MCP servers, read our guide on what AI agent plugins are first. For the foundational concepts behind all of this, start with what AI agents are.
What You Will Build
By the end of this tutorial, you will have a working plugin called deployment-guard that does four things:
- Teaches the agent a skill -- a pre-deployment checklist the agent follows before any production deploy
- Enforces a hook -- a PreToolUse hook that blocks destructive file operations in protected directories
- Defines a subagent -- a specialized reviewer that audits deployment readiness
- Connects an MCP server -- a monitoring dashboard integration the agent can query
Each component is optional. Your own plugin may include only one of these. The tutorial covers all four so you understand the full surface area.
Prerequisites
You need:
- An AI agent platform that supports the plugin format (Claude Code 2.1+, or any Agent Skills-compatible platform)
- A terminal and a text editor
- Basic familiarity with YAML and markdown
- No programming experience required for skills -- hook scripts use bash, and MCP integration requires JavaScript or Python
That last point is important. The most impactful part of a plugin -- the skills -- requires zero code. Just structured markdown.
Step 1: Create the Plugin Directory Structure
Every plugin starts with a directory and a manifest file. Create the following structure:
mkdir -p deployment-guard/.claude-plugin
mkdir -p deployment-guard/skills/deploy-checklist
mkdir -p deployment-guard/agents
mkdir -p deployment-guard/hooks
This gives you:
deployment-guard/
  .claude-plugin/        # Plugin metadata (required)
  skills/                # Agent skills (optional)
    deploy-checklist/    # One skill per subdirectory
  agents/                # Custom subagent definitions (optional)
  hooks/                 # Lifecycle event handlers (optional)
The .claude-plugin/ directory is the only required directory. Everything else is optional -- include only the components your plugin needs.
Step 2: Write the Plugin Manifest
The manifest is the plugin's identity. Create .claude-plugin/plugin.json:
{
  "name": "deployment-guard",
  "description": "Pre-deployment validation, protected directory enforcement, and deployment readiness review for production releases.",
  "version": "1.0.0"
}
Three fields. That is all this manifest needs.
| Field | Purpose | Rules |
|---|---|---|
| name | Unique identifier for the plugin | Lowercase, hyphens only, max 64 characters |
| description | What the plugin does -- displayed in marketplace search and discovery | Be specific. The agent runtime uses this for relevance matching. |
| version | Semantic version number | Follow semver: major.minor.patch |
The description matters more than you might expect. When users browse a marketplace with thousands of plugins, the description is what determines whether they click "install" or keep scrolling. Write it for humans, not for keyword stuffing.
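The rules in the table above are easy to check mechanically before publishing. The following is a minimal sketch, not an official validator: the function name validate_manifest and both regular expressions are assumptions made for illustration, and the name pattern additionally allows digits.

```python
import re

# Assumed patterns derived from the table above; not taken from a specification.
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
SEMVER_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means the manifest looks fine."""
    problems = []

    name = manifest.get("name", "")
    if not isinstance(name, str) or not NAME_RE.fullmatch(name) or len(name) > 64:
        problems.append("name must be lowercase with hyphens, max 64 characters")

    description = manifest.get("description", "")
    if not isinstance(description, str) or not description.strip():
        problems.append("description must be a non-empty string")

    version = manifest.get("version", "")
    if not isinstance(version, str) or not SEMVER_RE.fullmatch(version):
        problems.append("version must look like major.minor.patch")

    return problems
```

Calling validate_manifest on the parsed contents of plugin.json should return an empty list for the manifest shown in this step.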
Step 3: Build the Skill
Skills are the highest-leverage component. A skill is a markdown file that teaches the agent how to perform a specific task -- the steps, decision criteria, and quality checks that turn a general-purpose model into a specialist.
Create skills/deploy-checklist/SKILL.md:
---
name: deploy-checklist
description: >
  Pre-deployment validation checklist for production releases.
  Verifies database migrations, environment variables, API
  compatibility, rollback plan, and monitoring alerts before
  any deploy command executes. Use when the user says "deploy",
  "release", "ship it", "push to prod", or when any deploy
  script is about to run.
---
# Pre-Deployment Checklist
Before executing any deployment to staging or production, complete
every item on this checklist. Do not skip items. If any check
fails, halt the deployment and report the failure.
## 1. Database Migrations
- [ ] All pending migrations have been reviewed
- [ ] Migrations are backward-compatible (no column drops without
a deprecation period)
- [ ] Rollback migration exists and has been tested
- [ ] Migration execution time estimated (flag if > 30 seconds)
## 2. Environment Variables
- [ ] All required environment variables are set in the target
environment
- [ ] No secrets are hardcoded in source files
- [ ] New environment variables are documented in the project
README or configuration guide
## 3. API Compatibility
- [ ] No breaking changes to public API endpoints
- [ ] API version has been incremented if behavior changed
- [ ] Deprecation notices added for removed or changed endpoints
- [ ] Client SDKs updated if applicable
## 4. Rollback Plan
- [ ] Previous version tag identified for rollback
- [ ] Rollback procedure documented and tested
- [ ] Database rollback migration verified
- [ ] Estimated rollback time: ___ minutes
## 5. Monitoring and Alerts
- [ ] Health check endpoint responds correctly
- [ ] Error rate alerting is configured
- [ ] Performance baseline established for comparison
- [ ] On-call team notified of deployment window
## Deployment Decision
If all checks pass: proceed with deployment.
If any check fails: document the failure, notify the team, and
do not deploy until resolved.
Anatomy of the Skill File
YAML frontmatter: metadata read at session start. The name becomes the slash command (/deployment-guard:deploy-checklist). The description tells the agent when to activate automatically.
Markdown body: instructions loaded into context only when the skill triggers, not at session start. Your skill body does not consume context tokens until needed.
A good skill encodes knowledge the model does not already have. The model knows what database migrations are. It does not know your organization's specific policies about backward compatibility or migration time limits. Keep skills concise -- every token consumed is a token unavailable for reasoning.
For a deep dive, see our guide on how to write a skill for an AI agent.
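To see the two parts of a skill file separately, here is a small sketch that splits SKILL.md text into frontmatter fields and markdown body. It is an illustration only: split_skill is a hypothetical helper, and it understands just the shapes used in this tutorial (plain `key: value` lines and folded `key: >` blocks), so use a real YAML parser for anything more complex.

```python
def split_skill(text: str):
    """Split SKILL.md text into (frontmatter fields, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")

    # Find the closing '---' of the frontmatter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is missing its closing '---'")

    fields = {}
    key = None
    for line in lines[1:end]:
        if line[:1] in (" ", "\t") and key is not None:
            # Indented line: continuation of the previous (folded) value.
            fields[key] = (fields[key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            value = value.strip()
            fields[key] = "" if value in (">", "|") else value

    body = "\n".join(lines[end + 1:])
    return fields, body
```

The fields dictionary corresponds to the metadata read at session start; the body string is the part that loads only when the skill triggers.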
Step 4: Define Hooks
Hooks are event handlers that fire at specific points in the agent's lifecycle. Unlike skills (which are advisory -- the agent can choose to follow or ignore them), hooks are deterministic. They execute every time their trigger event occurs, and they can block actions.
Create hooks/hooks.json:
{
  "hooks": [
    {
      "event": "PreToolUse",
      "type": "command",
      "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/protect-directories.sh\"",
      "description": "Block file modifications in protected production directories",
      "matcher": {
        "tool_name": "Write|Edit"
      }
    },
    {
      "event": "TaskCompleted",
      "type": "command",
      "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/verify-checklist.sh\"",
      "description": "Verify deployment checklist was completed before marking deploy tasks done"
    }
  ]
}
Now create the hook scripts. First, hooks/protect-directories.sh:
#!/bin/bash
# Reads the tool input from stdin (JSON), checks whether the target file
# is in a protected directory, and blocks the operation if so.
# Requires jq. Matching is a plain prefix test on the path as given.
INPUT=$(cat)
FILE_PATH=$(printf '%s' "$INPUT" | jq -r '.tool_input.file_path // .tool_input.file // ""')

PROTECTED_DIRS=(
  "/etc/"
  "/prod/"
  "/deploy/live/"
)

for dir in "${PROTECTED_DIRS[@]}"; do
  if [[ "$FILE_PATH" == "$dir"* ]]; then
    echo '{"decision": "deny", "reason": "File is in a protected production directory. Use the deployment workflow instead of direct file modification."}'
    exit 0
  fi
done

echo '{"decision": "allow"}'
exit 0
Then hooks/verify-checklist.sh:
#!/bin/bash
# Runs when a task is marked complete. If the task subject contains a
# deployment-related keyword, block completion and remind the agent to
# finish the checklist. Note: this version blocks every matching task;
# it does not check whether the checklist was actually completed.
INPUT=$(cat)
TASK_SUBJECT=$(printf '%s' "$INPUT" | jq -r '.task.subject // ""')

if printf '%s\n' "$TASK_SUBJECT" | grep -qiE 'deploy|release|ship|prod'; then
  # Exit code 2 prevents task completion; the message goes to stderr.
  echo "Deployment-related task detected. Verify the deploy-checklist skill was completed before marking this task done." >&2
  exit 2
fi

exit 0
Make the scripts executable:
chmod +x deployment-guard/hooks/protect-directories.sh
chmod +x deployment-guard/hooks/verify-checklist.sh
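Shell hooks are awkward to unit test, so it can help to mirror the decision logic in a small function and test that instead. The sketch below re-implements the prefix check from protect-directories.sh in Python; the function name decide is hypothetical, and the fallback from file_path to file only approximates the jq expression.

```python
PROTECTED_DIRS = ("/etc/", "/prod/", "/deploy/live/")


def decide(payload: dict, protected=PROTECTED_DIRS) -> dict:
    """Deny when the target path starts with a protected prefix, else allow."""
    tool_input = payload.get("tool_input") or {}
    # Roughly the same fallback order as the jq filter: file_path, then file, then "".
    file_path = tool_input.get("file_path") or tool_input.get("file") or ""
    for prefix in protected:
        if file_path.startswith(prefix):
            return {
                "decision": "deny",
                "reason": f"{file_path} is in a protected directory",
            }
    return {"decision": "allow"}
```

Because this is a prefix test, a path such as /home/me/project/prod/app.yaml is allowed. If you want to protect directories inside a project, compare against absolute project paths.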
How Hook Events Work
The agent runtime fires hooks at 14 lifecycle events. Here are the ones most relevant to plugin development:
| Event | When It Fires | Can It Block? |
|---|---|---|
| PreToolUse | Before any tool call executes | Yes -- return deny to block |
| PostToolUse | After a tool call succeeds | No, but can provide feedback |
| PostToolUseFailure | After a tool call fails | No, but can trigger error handling |
| TaskCompleted | When a task is marked complete | Yes -- exit code 2 blocks completion |
| Stop | When the agent finishes responding | Yes -- exit code 2 forces continuation |
| SessionStart | When a session begins | No, but can inject context |
Hook Types
Hooks come in three types:
- Command hooks (type: "command"): run a shell script. The script receives JSON on stdin and returns decisions via exit codes and stdout. This is what we used above.
- Prompt hooks (type: "prompt"): send a single-turn LLM evaluation. The model returns {ok: true/false, reason}. Useful when the decision requires reasoning, not just pattern matching.
- Agent hooks (type: "agent"): spawn a subagent with tool access (Read, Grep, Glob) to investigate before deciding. The most powerful option, but also the most expensive in tokens.
For most plugins, command hooks are the right choice. They are fast, deterministic, and predictable.
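To make the command-hook contract concrete, here is a hypothetical sketch of how a runtime might read a hook's result, based only on the behavior described in this step: exit code 2 blocks, and a JSON decision of deny on stdout blocks. The HookOutcome type, the interpret function, and the treatment of other non-zero exit codes are assumptions, not the runtime's actual implementation.

```python
import json
from dataclasses import dataclass


@dataclass
class HookOutcome:
    blocked: bool
    message: str


def interpret(exit_code: int, stdout: str, stderr: str) -> HookOutcome:
    """Turn a command hook's exit code and output into a block/allow outcome."""
    if exit_code == 2:
        # Exit code 2 blocks the action; use whichever stream carries the message.
        return HookOutcome(True, (stderr or stdout).strip())
    if exit_code != 0:
        # Assumption: other failures are hook errors, not blocking decisions.
        return HookOutcome(False, f"hook error (exit {exit_code})")
    try:
        decision = json.loads(stdout) if stdout.strip() else {}
    except json.JSONDecodeError:
        decision = {}
    if isinstance(decision, dict) and decision.get("decision") == "deny":
        return HookOutcome(True, str(decision.get("reason", "")))
    return HookOutcome(False, "")
```

Keeping this logic in one place shows why command hooks are predictable: the outcome depends only on the exit code and the output, never on model judgment.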
Step 5: Define a Custom Agent
Custom agents are specialized subagent definitions that the main agent can spawn for specific tasks. They are markdown files with YAML frontmatter that specify the agent's name, role, allowed tools, and behavioral instructions.
Create agents/deploy-reviewer.md:
---
name: deploy-reviewer
description: >
  Reviews deployment readiness by auditing configuration files,
  environment setup, and infrastructure state. Spawned when the
  deploy-checklist skill identifies items needing verification.
allowed-tools:
  - Read
  - Glob
  - Grep
  - Bash
model: claude-sonnet-4-20250514
---
# Deployment Reviewer
You are a deployment readiness reviewer. Verify production
readiness by checking concrete evidence -- not by asking the
user if things are done.
## What to Check
1. **Migration files**: Verify rollback migrations exist. Flag
column drops without deprecation comments.
2. **Environment variables**: Cross-reference `.env.example`
with target configuration. Flag missing variables.
3. **API versioning**: Verify version prefixes are consistent.
Flag removed routes without deprecation notices.
4. **Health checks**: Verify the endpoint exists and returns
a meaningful response.
## Reporting
- PASS: brief confirmation
- FAIL: specific file, line number, what needs to change
- WARN: not blocking, should be addressed
Do not guess. No evidence = FAIL.
Agent Design Decisions
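The allowed-tools field restricts what this agent can do. A deployment reviewer needs to read files and search for patterns, but it should not write files or make network requests. Bash is the exception in the list above: it lets the reviewer run checks such as probing a health endpoint, but it can also write files and reach the network, so remove it if you need a strictly read-only agent. Restricting tools reduces the risk of unintended side effects and makes the agent's behavior more predictable.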
The model field is optional. Setting it lets you use a faster, cheaper model for agents that do not need the full reasoning capability of the primary model. A deployment reviewer that checks for the existence of files and scans for patterns works well with a smaller model.
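A quick way to audit a tool list is to flag anything that is not read-only. The sketch below is illustrative: the SIDE_EFFECT_TOOLS set is an assumption about which tool names can change files or reach the network, and risky_tools is a hypothetical helper.

```python
# Assumption: these tool names can modify files or make network requests.
SIDE_EFFECT_TOOLS = {"Write", "Edit", "Bash"}


def risky_tools(allowed_tools):
    """Return the allowed tools that are not read-only, in their original order."""
    return [tool for tool in allowed_tools if tool in SIDE_EFFECT_TOOLS]


# The deploy-reviewer definition above allows Read, Glob, Grep, and Bash.
print(risky_tools(["Read", "Glob", "Grep", "Bash"]))
```

For the deploy-reviewer list this reports Bash, which is worth a deliberate decision: keep it for shell-based checks, or remove it for a read-only reviewer.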
Step 6: Add MCP Server Configuration (Optional)
If your plugin needs to connect the agent to an external service, you can include an MCP server configuration. Create .mcp.json in the plugin root:
{
  "mcpServers": {
    "monitoring-dashboard": {
      "command": "npx",
      "args": ["-y", "monitoring-mcp-server"],
      "env": {
        "MONITORING_API_KEY": "${MONITORING_API_KEY}",
        "MONITORING_ENDPOINT": "${MONITORING_ENDPOINT}"
      }
    }
  }
}
When the plugin is enabled, this MCP server starts automatically. The agent gains access to whatever tools the MCP server exposes -- in this example, the ability to query a monitoring dashboard for deployment health metrics.
The ${VARIABLE} syntax references environment variables, keeping secrets out of the plugin source. Users configure these values in their own environment before enabling the plugin.
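The substitution itself is simple to picture. The following sketch resolves ${NAME} references against a dictionary of environment values and reports any that are missing; it illustrates the idea rather than the runtime's actual resolver, and expand_env is a hypothetical name.

```python
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(config_env: dict, environ: dict):
    """Resolve ${NAME} references; return (resolved values, missing names)."""
    missing = []

    def lookup(match):
        name = match.group(1)
        if name not in environ:
            if name not in missing:
                missing.append(name)
            return match.group(0)  # leave the unresolved reference untouched
        return environ[name]

    resolved = {key: VAR_RE.sub(lookup, value) for key, value in config_env.items()}
    return resolved, missing
```

Passing os.environ as the second argument before enabling the plugin gives users a list of variables they still need to set.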
When to Include MCP vs. When to Skip It
Include an MCP server configuration when your plugin needs the agent to interact with an external service at runtime -- querying APIs, reading databases, controlling infrastructure.
Skip MCP if your plugin only needs to change agent behavior. Skills and hooks handle behavioral changes without any external dependencies. A plugin that is purely skills and hooks is simpler to install, has no runtime dependencies, and works offline.
For a thorough comparison, see our guide on skills vs plugins vs MCP servers. If you need to build the MCP server itself, see how to build an MCP server.
Step 7: Add Settings (Optional)
If your plugin needs default configuration, create a settings.json in the plugin root:
{
  "permissions": {
    "allow": [
      "Read(*)",
      "Glob(*)",
      "Grep(*)"
    ]
  },
  "env": {
    "DEPLOY_CHECKLIST_STRICT": "true"
  }
}
Settings are applied when the plugin is enabled and can be overridden at the user or project level. Use them for sensible defaults that users might want to customize -- permission presets, behavioral flags, environment variables.
Step 8: Review the Complete Structure
Here is the final directory layout:
deployment-guard/
  .claude-plugin/
    plugin.json               # Plugin manifest (required)
  skills/
    deploy-checklist/
      SKILL.md                # Deployment checklist skill
  agents/
    deploy-reviewer.md        # Deployment readiness reviewer
  hooks/
    hooks.json                # Hook definitions
    protect-directories.sh    # Directory protection script
    verify-checklist.sh       # Checklist verification script
  .mcp.json                   # MCP server configuration
  settings.json               # Default settings
Eight files. One manifest, one skill, one agent, one hook configuration with its two scripts, one MCP config, and one settings file. A complete plugin that teaches the agent a workflow, enforces guardrails, provides a specialist reviewer, and connects to external monitoring.
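Before testing, it is worth confirming that every file landed where the layout says it should. This is a minimal sketch using only the paths from the tree above; check_layout is a hypothetical helper, and only the manifest is treated as required, as described in Step 1.

```python
from pathlib import Path

# Paths from the layout above, relative to the plugin root.
REQUIRED = [".claude-plugin/plugin.json"]
OPTIONAL = [
    "skills/deploy-checklist/SKILL.md",
    "agents/deploy-reviewer.md",
    "hooks/hooks.json",
    "hooks/protect-directories.sh",
    "hooks/verify-checklist.sh",
    ".mcp.json",
    "settings.json",
]


def check_layout(root) -> dict:
    """Report which expected files are missing under the plugin root."""
    root = Path(root)
    return {
        "missing_required": [p for p in REQUIRED if not (root / p).is_file()],
        "missing_optional": [p for p in OPTIONAL if not (root / p).is_file()],
    }
```

Running check_layout("deployment-guard") from the parent directory should report nothing missing once all steps are complete.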
Step 9: Test the Plugin Locally
Before publishing, test the plugin in your local environment. There are three ways to load a plugin during development:
Option 1: Direct Directory Reference
Point the agent to your plugin directory:
claude --plugin-dir /path/to/deployment-guard
This loads the plugin from the local filesystem without installing it. Changes to the plugin files take effect immediately -- no reinstall needed.
Option 2: Local Installation
Install from your local directory with /plugin install /path/to/deployment-guard. Choose a scope: Local (this project, gitignored), Project (version-controlled), or User (all projects).
Option 3: Test Individual Components
Test each component independently. Copy the skill to .claude/skills/ and invoke it with /deploy-checklist. Add hook definitions to .claude/settings.json and verify blocking behavior. Copy the agent to .claude/agents/ and spawn it via the Task tool.
What to Verify
- Plugin loads without errors (/plugin shows it in the Installed tab)
- Skill triggers correctly when you say "deploy" or "ship it"
- Skill does NOT trigger on unrelated requests
- Hook blocks file writes to protected directories
- Hook allows file writes to non-protected directories
- Agent produces structured PASS/FAIL/WARN reports
- Namespaced command works: /deployment-guard:deploy-checklist
Step 10: Publish and Distribute
Once tested, you have several distribution options:
GitHub Repository
Push the plugin directory to a GitHub repository. Users install it by adding your repository as a marketplace source:
/plugin marketplace add github:your-username/deployment-guard
/plugin install deployment-guard@your-username
This is the simplest distribution method and works well for team-internal or open source plugins.
Marketplaces
Submit to community marketplaces like SkillsMP (96,000+ skills) or the official Anthropic marketplace. The official marketplace reviews submissions for quality and security, giving your plugin higher visibility and user trust. Community marketplaces typically accept submissions via pull request.
For internal team use, share the plugin directory through version control -- as a git submodule or directly in your project repository.
Common Patterns and Best Practices
Pattern 1: Skill + Hook Combo
The most common plugin pattern pairs a skill with a hook that enforces it. The skill teaches the agent what to do. The hook ensures the agent actually does it. Without the hook, the agent might skip the skill when under time pressure or when the user asks it to move fast. The hook makes compliance structural, not optional.
Pattern 2: Progressive Skill Disclosure
Keep SKILL.md under 500 lines. Move detailed reference material to separate files that the agent reads on demand. This keeps base context lean while making deep knowledge available when needed.
Pattern 3: Multi-Skill Plugins
Complex workflows benefit from multiple skills in a single plugin -- one per phase (pre-deploy, rollback, post-deploy). Each skill handles one step. The plugin bundles them so they install together and share a namespace.
Pattern 4: Minimal Plugins
Not every plugin needs all component types. Many effective plugins contain just one or two: skill-only (teach a workflow), hook-only (enforce rules), or MCP-only (connect to a service). Start minimal. Add components only when you have a specific reason.
Avoiding Common Mistakes
Duplicating model knowledge. The model already knows Python and JSON. Focus skills on knowledge unique to your organization.
Overly broad hook matchers. A PreToolUse hook matching all tools adds latency to every operation. Use the matcher field to target specific tools like "Write|Edit|Bash".
Vague descriptions. The description is the primary mechanism the agent uses to decide when to activate a skill. "Deployment stuff" causes missed activations. Be specific about trigger conditions.
Hardcoding secrets. Use environment variable references (${VARIABLE_NAME}) in MCP configurations. Never put keys or tokens in plugin files.
Monolithic skills. A 3,000-line skill consumes context the agent needs for reasoning. Split large workflows into multiple focused skills.
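The matcher advice above is easier to reason about with a concrete model. The sketch below assumes a matcher is a regular expression that must match the whole tool name; that is an assumption for illustration, and matcher_applies is a hypothetical helper, so check your platform's documentation for its exact matching rules.

```python
import re


def matcher_applies(matcher: str, tool_name: str) -> bool:
    """Assumed semantics: the matcher regex must match the entire tool name."""
    return re.fullmatch(matcher, tool_name) is not None


for tool in ("Write", "Edit", "Bash", "Read"):
    print(tool, matcher_applies("Write|Edit|Bash", tool))
```

Under this model a narrow matcher leaves read-only tools such as Read untouched, while a catch-all pattern like ".*" would run the hook on every tool call.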
How Plugins Fit in the Bigger Picture
Plugins sit between standalone skills (simple, personal, no distribution) and full MCP servers (running processes for external connectivity). They add namespace isolation, versioning, and marketplace support without requiring deployment infrastructure.
For organizations building AI agent systems like Nevo, plugins are the primary mechanism for encoding institutional knowledge. A deployment checklist that lives as a plugin is enforced consistently across every agent in the system. It survives team member turnover. It improves through version updates. And it compounds -- every plugin makes the agent system more capable, which makes the next plugin more valuable.
For the full comparison, see skills vs plugins vs MCP servers.
Frequently Asked Questions
What is the minimum viable plugin?
The minimum viable AI agent plugin is a directory containing .claude-plugin/plugin.json (the manifest with name, description, and version) and one component -- typically a single skill in skills/my-skill/SKILL.md. Two files, zero code, installable and functional.
Do I need to know how to code to build a plugin?
No. Skills and agent definitions are pure markdown. Hook configurations are JSON. The only components that require code are hook scripts (bash) and MCP server integrations (JavaScript or Python). A plugin consisting entirely of skills and agents requires no programming at all.
How do plugin skills differ from standalone skills?
Functionally, they are identical -- same SKILL.md format, same YAML frontmatter, same markdown instructions. The difference is organizational. Standalone skills live in .claude/skills/ and have short command names like /review. Plugin skills are namespaced: /plugin-name:skill-name. Plugin skills are also distributable through marketplaces, version-controlled as a unit, and scoped to the plugin's lifecycle (enabled/disabled together).
Can I convert existing standalone skills into a plugin?
Yes. Move your skill directories into a new plugin's skills/ directory, add a plugin.json manifest, and the conversion is done. The skills themselves do not change. The only user-facing difference is the namespaced command name.
How do I debug a hook that is not firing?
Check three things: (1) hooks.json is valid JSON in the hooks/ directory, (2) the event name matches exactly (case-sensitive), and (3) the matcher field matches the tool or context you expect. Write to stderr in your hook script to confirm execution.
What happens when I update a plugin?
Marketplace-sourced plugins auto-update on your configured schedule. Local plugins require reinstallation from the updated directory. Updates replace all components atomically.
Is there a size limit for plugins?
No hard limit, but skill descriptions share a budget of 2% of the context window at session start. Keep descriptions concise and use disable-model-invocation: true on skills that should only be user-invoked.
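As rough arithmetic for the 2% figure, the sketch below computes the shared description budget and an even per-skill share. The 200,000-token context window and the count of 40 installed skills are hypothetical numbers chosen for illustration.

```python
def description_budget(context_window_tokens: int, share_percent: int = 2) -> int:
    """Tokens available for all skill descriptions combined."""
    return context_window_tokens * share_percent // 100


budget = description_budget(200_000)  # hypothetical context window size
per_skill = budget // 40              # hypothetical: 40 installed skills
print(budget, per_skill)              # prints: 4000 100
```

With these numbers each description would get about 100 tokens if the budget were split evenly, which is why concise descriptions matter as the number of installed skills grows.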
What to Build Next
Start with a single-skill plugin encoding a workflow specific to your team. Then add a hook that enforces it. Then a custom agent that verifies the output. Then MCP integration for external data. Each step adds real capability, and the compound effect -- skills teaching behavior, hooks enforcing it, agents verifying it, MCP connecting it -- is what makes plugins such an effective extension mechanism for AI agents.
For curated examples of what others have built, browse the best AI agent plugins in 2026. For the broader context, start with our main guide: What Are AI Agents?