A8gent
Claude Agent Skills and Subagents: The Complete Tutorial
Technical · 2026-06-06 · Last verified 2026-06-06

Learn how Claude Agent Skills and subagents actually work: the SKILL.md format, progressive disclosure, context isolation, and how to combine both into production agent workflows - with full annotated examples.

Deep · ML Architect & Full Stack Engineer

10+ years shipping production ML across TensorFlow, PyTorch, AWS, and GCP. Ships every A8gent agent before it becomes a lesson.

Key takeaways
  • An Agent Skill is just a folder with a SKILL.md file: YAML frontmatter (name and description) plus markdown instructions. Claude loads only the metadata at startup and reads the full skill body on demand - this progressive disclosure is what makes skills scale where system prompts cannot.
  • Subagents run in their own context window with their own system prompt, tool restrictions, and model choice. Delegate to a subagent whenever a side task would flood your main conversation with search results, logs, or file contents you will never reference again.
  • Skills teach Claude HOW to do something (procedures, conventions, checklists). MCP servers give Claude ACCESS to external systems (databases, APIs, SaaS tools). They are complementary, not competing - most production setups use both.
  • The highest-leverage pattern is combining them: a lean orchestrator delegates to specialized subagents, and each subagent preloads the exact skills it needs via the skills frontmatter field. Context stays clean, behavior stays consistent.
  • The most common mistake is writing skill descriptions that only say what the skill does. The description must also say WHEN to use it, because that one- or two-sentence string is all Claude sees when deciding whether to load the skill.

What Claude Agent Skills Actually Are (And What They Are Not)

If you have used Claude Code for more than a week, you have probably felt the pain that Agent Skills solve: you keep pasting the same instructions into chat. The same deployment checklist, the same commit message format, the same "always use our error response schema" reminder. You tried stuffing it all into CLAUDE.md, and now your CLAUDE.md is 800 lines long and burns tokens on every single message whether the content is relevant or not.

An Agent Skill is Anthropic's answer to this. At its simplest, a skill is a directory containing a SKILL.md file. The file starts with YAML frontmatter (a name and a description) followed by markdown instructions. That is the entire format. No SDK, no build step, no server process. Skills work in Claude Code, claude.ai, and the Claude API, and the SKILL.md format was released as an open standard in late 2025 - it is now also supported by OpenAI's Codex CLI, Gemini CLI, and GitHub Copilot, which means skills you write today are portable across agent tooling.

The design principle that makes skills work is progressive disclosure, and it operates in three tiers:

Tier 1 - Discovery. At session start, Claude loads only the name and description of every installed skill into context. That costs roughly 100 tokens per skill. With 20 skills installed, Claude knows about all of them for about 2,000 tokens total.

Tier 2 - Activation. When Claude decides a skill is relevant to the current task (or you invoke it directly with /skill-name), it reads the full SKILL.md body into context. Anthropic recommends keeping this body under 5,000 tokens.

Tier 3 - Execution. Complex skills bundle supporting files - reference docs, templates, executable scripts. The SKILL.md points to them, and Claude reads them only when the specific scenario demands it. Anthropic's own PDF skill keeps form-filling instructions in a separate forms.md that Claude opens only when actually filling a form.
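
The arithmetic behind the three tiers can be sketched in a few lines of Python. The constants are the rough figures quoted above (about 100 tokens of metadata per skill, bodies of up to 5,000 tokens), and the function names are illustrative only, not part of any Claude API.

```python
# Rough token arithmetic for progressive disclosure.
# The constants are the approximate figures quoted in this post, not measurements.
METADATA_TOKENS_PER_SKILL = 100   # Tier 1: name + description
MAX_BODY_TOKENS = 5_000           # Tier 2: recommended upper bound for a SKILL.md body


def progressive_cost(installed: int, activated: int) -> int:
    """Tokens used when only activated skills load their full body."""
    if activated > installed:
        raise ValueError("cannot activate more skills than are installed")
    return installed * METADATA_TOKENS_PER_SKILL + activated * MAX_BODY_TOKENS


def always_loaded_cost(installed: int) -> int:
    """Tokens used if every skill body sat in context all the time."""
    return installed * MAX_BODY_TOKENS


if __name__ == "__main__":
    print(progressive_cost(installed=20, activated=0))  # 2000
    print(progressive_cost(installed=20, activated=1))  # 7000
    print(always_loaded_cost(installed=20))             # 100000
```

Under these assumptions, 20 installed skills with one activated cost about 7,000 tokens, against about 100,000 if every body were loaded up front.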

Here is how skills compare to the two things people most often confuse them with:

| Dimension | System prompt / CLAUDE.md | Agent Skill | MCP server |
| --- | --- | --- | --- |
| What it is | Static text loaded every message | Folder + SKILL.md loaded on demand | Running process Claude calls via JSON-RPC |
| Token cost | Full cost, always | ~100 tokens until activated | Tool definitions in context, plus call results |
| Best for | Short, always-relevant facts | Procedures, conventions, workflows | Live data access and external actions |
| Can execute code | No | Via bundled scripts Claude runs | Yes, server-side |
| Setup effort | None | Write a markdown file | Install/run a server, configure auth |
| Portability | Tool-specific | Open standard, cross-tool | Open standard, cross-tool |

The one-line summary you should internalize: skills teach Claude how, MCP gives Claude access, the system prompt tells Claude who it is. If you are new to MCP, our MCP server tutorial for AI agents covers that side of the equation, and we will return to the skills-vs-MCP decision in detail later in this post.

Anatomy of a SKILL.md File: Annotated Example

Let us dissect a real SKILL.md. This one teaches Claude how to write database migrations for a fictional but realistic Postgres setup. Everything below is a working skill you could drop into .claude/skills/db-migrations/SKILL.md today:

---
name: db-migrations
description: >
  Writes and reviews Postgres database migrations for this project.
  Use when the user asks to add a column, create a table, change a
  schema, or write/review any migration file.
---

# Database Migrations

## Rules (always apply)
- Every migration needs an up AND a down script
- Never drop a column in the same release that stops writing to it
  (two-phase: stop writes first, drop in the next release)
- All timestamps are timestamptz, never timestamp
- Every foreign key gets an index

## Workflow
1. Read the latest migration in migrations/ to get the next number
2. Generate the migration using the naming scheme
   NNNN_verb_object.sql (e.g. 0042_add_user_email_index.sql)
3. Run scripts/check-migration.sh <file> to validate
4. Show the user the up and down scripts before writing the file

## Large table changes
For tables over 1M rows, read reference/large-tables.md first.
It covers CREATE INDEX CONCURRENTLY and batched backfills.

Now the annotations, top to bottom:

The frontmatter carries exactly what discovery needs. The open standard requires name (lowercase, hyphens, under 64 characters, matching the folder name) and description (under 1,024 characters). The description above does two jobs: it says what the skill does AND when to trigger it, including the literal phrases a user might say ("add a column", "create a table"). This matters because at Tier 1, the description is the only thing Claude sees. A description like "Database migration helper" gives Claude nothing to match against; the version above practically routes itself.
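
A small pre-commit check can catch most frontmatter problems before Claude ever sees the skill. The sketch below is a hypothetical helper, not an official validator: it treats the length limits as 64 and 1,024 characters inclusive (an assumption), and the "use when" test is only a crude heuristic for whether the description says when to trigger.

```python
import re

# Lowercase letters and digits separated by single hyphens, e.g. "db-migrations".
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate_skill(name: str, description: str, folder: str) -> list[str]:
    """Return a list of problems; an empty list means the metadata looks fine."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("name must be lowercase letters, digits, and single hyphens")
    if len(name) > 64:  # assumption: limit treated as inclusive
        problems.append("name is longer than 64 characters")
    if name != folder:
        problems.append(f"name {name!r} does not match folder {folder!r}")
    if not description.strip():
        problems.append("description is empty")
    if len(description) > 1024:  # assumption: limit treated as inclusive
        problems.append("description is longer than 1,024 characters")
    # Heuristic only: a routing-friendly description usually says when to use the skill.
    if "use when" not in description.lower():
        problems.append("description does not say when to use the skill")
    return problems
```

Run against the two descriptions discussed above, "Database migration helper" fails the heuristic while the db-migrations description passes.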

The body is instructions, not documentation. Notice the imperative voice: "never drop a column", "run scripts/check-migration.sh". You are writing operating procedure for an agent, not a wiki page for a human. Every line costs tokens once the skill activates and stays in context for the rest of the turn, so state what to do without narrating why.

The workflow section makes behavior deterministic. Without step 1, Claude might guess the next migration number. Without step 4, it might write the file before you review it. Numbered steps turn a vague capability into a repeatable process - which is exactly what you want when you later run this skill inside automated pipelines and need to test it. Our AI agent evals guide covers how to verify skills behave consistently.

The last section is Tier 3 progressive disclosure in action. The large-table playbook might be 3,000 tokens of detailed guidance. Instead of bloating every activation with it, the SKILL.md tells Claude the file exists and when to read it. Claude opens reference/large-tables.md only when the task involves a big table.
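
Tier 3 only works if the files a SKILL.md points to actually exist. As a rough illustration, the following sketch scans a skill body for relative paths and reports the ones missing from the skill folder. The regular expression is a deliberately crude assumption (it would also match URL fragments), and both function names are hypothetical.

```python
import re
from pathlib import Path

# Matches relative paths with at least one directory and a file extension,
# e.g. reference/large-tables.md or scripts/check-migration.sh.
RELATIVE_PATH = re.compile(r"[\w.-]+(?:/[\w.-]+)+\.\w+")


def referenced_files(skill_md_text: str) -> list[str]:
    """List relative file paths mentioned in a SKILL.md, in order, without duplicates."""
    seen = []
    for match in RELATIVE_PATH.findall(skill_md_text):
        if match not in seen:
            seen.append(match)
    return seen


def missing_references(skill_dir: Path) -> list[str]:
    """Return mentioned paths that do not exist inside the skill folder."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    return [p for p in referenced_files(text) if not (skill_dir / p).is_file()]
```

For the migrations skill above, referenced_files would pick up scripts/check-migration.sh and reference/large-tables.md, and missing_references would flag either one if it was never committed.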

Claude Code extends the open standard with optional frontmatter fields worth knowing: allowed-tools (tools Claude may use without asking while the skill is active), disable-model-invocation: true (the skill only runs when a human types /db-migrations, never automatically), context: fork (run the skill in a forked subagent context so its work does not pollute the main conversation), model (temporarily switch models while the skill runs), and paths (only activate when working with files matching a glob). All of these are optional - the portable core is just name, description, and the markdown body.

Building Your First Skill, Step by Step

Theory done. Let us build a skill from scratch in about ten minutes. We will make a release-notes skill that turns merged changes into customer-facing release notes - a task most teams do badly and inconsistently, which makes it perfect skill material.

Step 1: Create the folder. Personal skills live in ~/.claude/skills/ and apply to all your projects. Project skills live in .claude/skills/ inside the repo and should be committed to version control so your whole team gets them. We will make this one a project skill:

mkdir -p .claude/skills/release-notes

Step 2: Write the SKILL.md. Create .claude/skills/release-notes/SKILL.md:

---
name: release-notes
description: >
  Generates customer-facing release notes from recent git history.
  Use when the user asks for release notes, a changelog entry,
  or "what shipped" summaries.
---

# Release Notes Generator

## Gather changes
Run: git log --oneline --no-merges $(git describe --tags --abbrev=0)..HEAD
If no tags exist, use the last 20 commits.

## Writing rules
- Group into: New, Improved, Fixed
- Write for customers, not engineers: no file names, no
  internal ticket IDs, no "refactor" entries
- One line per change, verb first ("Added CSV export to reports")
- Skip pure chores (dependency bumps, CI changes) entirely
- Match the tone of examples/sample-notes.md

## Output
Write to RELEASE_NOTES_DRAFT.md and show the user a summary.
Never publish or tag anything.

Step 3: Add a supporting file. Create examples/sample-notes.md inside the skill folder with two or three of your best past release notes. This is cheaper and more reliable than describing tone in prose - Claude imitates examples extremely well. Your folder now looks like:

.claude/skills/release-notes/
├── SKILL.md
└── examples/
    └── sample-notes.md
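
If you create skills often, the same layout can be scaffolded with a short script. This is a minimal sketch under the folder conventions described above; scaffold_skill and its template are hypothetical helpers, not something Claude Code ships.

```python
from pathlib import Path

SKILL_TEMPLATE = """---
name: {name}
description: >
  {description}
---

# {title}
"""


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create .claude/skills/<name>/ with a starter SKILL.md and an examples folder."""
    skill_dir = root / ".claude" / "skills" / name
    (skill_dir / "examples").mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.exists():  # never overwrite a skill someone already wrote
        title = name.replace("-", " ").title()
        skill_md.write_text(
            SKILL_TEMPLATE.format(name=name, description=description, title=title),
            encoding="utf-8",
        )
    (skill_dir / "examples" / "sample-notes.md").touch()
    return skill_dir
```

Calling scaffold_skill(Path("."), "release-notes", "...") from the repo root would produce the tree shown above, leaving the instructions and examples for you to fill in.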

Step 4: Test both invocation paths. Claude Code watches skill directories, so the skill is live without a restart. Test the explicit path first by typing /release-notes - this guarantees activation and lets you debug the instructions in isolation. Then test automatic discovery by asking naturally: "write release notes for what we shipped this sprint." If Claude does not pick the skill up, your description is the problem, not the body. Add the trigger phrases users actually say.
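
One cheap way to sanity-check a description before testing it live is to list the phrases your users actually say and see which ones the description never mentions. Claude's real routing is done by the model, not by substring matching, so treat this hypothetical helper as a lint for obvious gaps rather than a predictor of activation.

```python
def uncovered_phrases(description: str, user_phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear anywhere in the description."""
    # Collapse whitespace so folded YAML descriptions compare cleanly.
    text = " ".join(description.lower().split())
    return [p for p in user_phrases if " ".join(p.lower().split()) not in text]


if __name__ == "__main__":
    description = (
        "Generates customer-facing release notes from recent git history. "
        "Use when the user asks for release notes, a changelog entry, "
        'or "what shipped" summaries.'
    )
    phrases = ["release notes", "changelog", "what shipped", "patch notes"]
    print(uncovered_phrases(description, phrases))  # ['patch notes']
```

If your team says "patch notes" and the description does not, that is a candidate trigger phrase to add.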

Step 5: Iterate on failure modes. Run it on three or four real releases. Every time the output is wrong, ask: is this a missing rule, a vague rule, or a missing example? Add the fix to the skill file, not to your chat message. This is the core discipline of skill authoring - corrections go into the artifact, so you never make the same correction twice. Teams that adopt this habit end up with skills that encode months of accumulated judgment.

One sizing guideline before we move on: if your SKILL.md grows past roughly 400 lines, split scenario-specific content into reference files and link them, exactly as the migrations example did. Keep the always-relevant core lean and push the long tail into Tier 3.

Subagents Explained: Context Isolation and When to Delegate

Skills solve the "how do I teach Claude procedures without burning context" problem. Subagents solve a different one: how do I keep my main conversation clean while heavy work happens?

A subagent in Claude Code is a specialized assistant that runs in its own context window, with its own system prompt, its own tool access, and optionally its own model. When the main Claude session encounters a task matching a subagent's description, it delegates: the subagent works independently - reading files, running commands, grinding through logs - and returns only its final summary to the parent. All the intermediate noise (the 40 file reads, the 2,000-line test output) stays in the subagent's context and dies with it.

Why this matters in practice: context is the scarcest resource in agentic work. A single "find where authentication happens in this codebase" investigation can consume 30,000+ tokens of file contents. Do that three times in a session and your main conversation degrades - Claude starts forgetting earlier decisions, and you hit compaction. Delegate those investigations to subagents and your main context holds only three short summaries.
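
The effect is easy to see in a toy model. The sketch below only counts tokens; it does not call Claude, and the 300-token summary size is an assumed figure chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window that just counts tokens."""
    name: str
    tokens: int = 0
    log: list[str] = field(default_factory=list)

    def add(self, label: str, tokens: int) -> None:
        self.tokens += tokens
        self.log.append(f"{label}: {tokens}")


def investigate_inline(main: Context, file_tokens: int) -> None:
    """The main thread reads everything itself."""
    main.add("file contents", file_tokens)


def investigate_delegated(main: Context, file_tokens: int, summary_tokens: int) -> Context:
    """A subagent reads the files; only its summary reaches the parent."""
    sub = Context("subagent")
    sub.add("file contents", file_tokens)       # stays in the subagent's context
    main.add("subagent summary", summary_tokens)
    return sub


if __name__ == "__main__":
    inline, delegated = Context("main"), Context("main")
    for _ in range(3):
        investigate_inline(inline, 30_000)
        investigate_delegated(delegated, 30_000, summary_tokens=300)
    print(inline.tokens, delegated.tokens)  # 90000 900
```

Three inline investigations leave 90,000 tokens in the main context; the delegated version leaves 900 under the assumed summary size.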

Claude Code ships with built-in subagents you are probably already using without noticing: Explore (a fast, read-only Haiku-powered agent for codebase search), Plan (read-only research during plan mode), and general-purpose (full-tool agent for complex multi-step work). Custom subagents let you add your own specialists on top.

Here is a practical decision rule for when to delegate versus doing the work in the main thread:

| Situation | Main thread or subagent? | Why |
| --- | --- | --- |
| Single fact lookup, known file | Main thread | Delegation overhead exceeds the savings |
| Broad codebase search across many files | Subagent | Search output would flood main context |
| Independent parallel tasks (e.g. review 3 modules) | Multiple subagents | They run concurrently in separate contexts |
| Task needing strict tool limits (read-only audit) | Subagent | Frontmatter enforces restrictions the main thread cannot |
| Work depending on full conversation history | Main thread | Subagents start fresh and lack your context |
| Cheap repetitive work (lint triage, log scanning) | Subagent on Haiku | Route to a faster, cheaper model |
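
The table can be written down as a small decision function. This is a sketch, not how Claude Code decides internally: the Task fields are hypothetical, and the order of the checks (conversation history first, then parallelism, then cost) is my own assumption about sensible precedence.

```python
from dataclasses import dataclass


@dataclass
class Task:
    needs_conversation_history: bool = False
    broad_search: bool = False
    independent_parts: int = 1
    needs_tool_limits: bool = False
    cheap_and_repetitive: bool = False


def choose_executor(task: Task) -> str:
    """Map a task to 'main thread', 'subagent', 'multiple subagents', or 'subagent on haiku'."""
    # Subagents start fresh, so history-dependent work stays in the main thread.
    if task.needs_conversation_history:
        return "main thread"
    if task.independent_parts > 1:
        return "multiple subagents"
    if task.cheap_and_repetitive:
        return "subagent on haiku"
    if task.broad_search or task.needs_tool_limits:
        return "subagent"
    # Single lookups in known files are not worth the delegation overhead.
    return "main thread"
```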

The last row deserves emphasis: because each subagent can specify its own model, you can run a Haiku-powered scanner for grunt work while your main session stays on a frontier model for reasoning. This routing pattern is one of the biggest cost levers in agent engineering, and it is the same architectural idea behind multi-agent handoffs in other frameworks - if you have read our OpenAI Agents SDK tutorial, subagents are Claude Code's native answer to that pattern. For a broader look at how the two ecosystems compare for business workloads, see ChatGPT vs Claude for business agents.

One caveat that trips people up: subagents start with a fresh context. They receive their system prompt, the task description the parent writes for them, and basic environment details - not your conversation history. Write delegation prompts accordingly: include the constraints and decisions the subagent needs, because it cannot see the chat where you established them.
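
One way to make delegation prompts self-contained by habit is to build them from a fixed structure. The sketch below is purely illustrative: DelegationBrief and the list of "dangling" phrases are hypothetical, and Claude Code does not require any particular prompt format.

```python
from dataclasses import dataclass, field

# Phrases that point at a conversation the subagent never saw (illustrative list).
DANGLING = ("we discussed", "as discussed", "mentioned earlier", "like before")


@dataclass
class DelegationBrief:
    task: str
    constraints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    file_paths: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Produce a prompt that carries everything the subagent needs."""
        sections = [f"Task: {self.task}"]
        for title, items in (
            ("Constraints", self.constraints),
            ("Acceptance criteria", self.acceptance_criteria),
            ("Relevant files", self.file_paths),
        ):
            if items:  # skip empty sections instead of printing bare headings
                sections.append(title + ":\n" + "\n".join(f"- {item}" for item in items))
        return "\n\n".join(sections)


def dangling_references(prompt: str) -> list[str]:
    """Return phrases that rely on chat history the subagent cannot see."""
    lowered = prompt.lower()
    return [phrase for phrase in DANGLING if phrase in lowered]
```

A prompt such as "fix the issue we discussed" is flagged by dangling_references, which is the cue to restate the issue in the brief itself.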

Building a Subagent: The Definition File

Custom subagents are markdown files with YAML frontmatter, stored in .claude/agents/ (project scope, check into git) or ~/.claude/agents/ (personal, all projects). The frontmatter is configuration; the markdown body becomes the subagent's system prompt. Here is a production-quality example - a read-only security reviewer:

---
name: security-reviewer
description: >
  Reviews code changes for security vulnerabilities. Use proactively
  after modifying authentication, authorization, input handling,
  file uploads, or anything touching user data.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior application security engineer reviewing code.

## Scope
Review ONLY for security issues. Do not comment on style,
naming, or performance.

## Checklist
- Injection: SQL, command, template, header
- AuthZ: every endpoint checks permissions, not just authentication
- Secrets: no keys, tokens, or credentials in code or logs
- Input validation at trust boundaries, output encoding at render
- Unsafe deserialization and path traversal in file handling

## Output format
For each finding: severity (Critical/High/Medium/Low), file and
line, one-line issue, one-line fix. If nothing is found, say so
explicitly and list what you checked. Never modify any file.

Walking through the configuration choices:

name is the unique identifier, lowercase with hyphens. description is how the parent Claude decides when to delegate - note the phrase "use proactively", which nudges Claude to invoke this agent on its own after relevant changes rather than waiting to be asked. These two fields are the only required ones.

tools is an allowlist. This agent gets Read, Grep, Glob, and Bash - it can inspect anything, and because Write and Edit are simply not in its tool pool, it cannot call Claude's file-editing tools at all. That part is a real guarantee, not a prompt-level suggestion. One caveat: Bash can still change files through ordinary shell commands (a redirect, sed -i, git checkout), so for a strictly read-only agent either drop Bash from the list or constrain which commands it may run. The inverse also exists: disallowedTools: Write, Edit inherits everything except the listed tools, which is handy when you want an agent to keep MCP tools but never write. If you omit both fields, the subagent inherits all tools from the main conversation.
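
The resolution logic described here can be modelled in a few lines. This is a simplified sketch of the behavior as this post describes it, not Claude Code's implementation; in particular, how the two fields interact when both are set is an assumption, and mcp__crm__search is a made-up tool name.

```python
from typing import Iterable, Optional


def effective_tools(
    inherited: Iterable[str],
    tools: Optional[Iterable[str]] = None,
    disallowed: Optional[Iterable[str]] = None,
) -> set[str]:
    """Resolve a subagent's tool pool from its frontmatter fields."""
    # An explicit allowlist replaces the inherited pool; otherwise inherit everything.
    pool = set(tools) if tools is not None else set(inherited)
    # Assumption: a denylist is subtracted from whichever pool was chosen.
    return pool - set(disallowed or ())


if __name__ == "__main__":
    main_tools = {"Read", "Write", "Edit", "Bash", "Grep", "Glob", "mcp__crm__search"}
    # Allowlist: no Write or Edit, although Bash can still touch files via the shell.
    print(sorted(effective_tools(main_tools, tools=["Read", "Grep", "Glob", "Bash"])))
    # Denylist: keeps the MCP tool, drops only the editing tools.
    print(sorted(effective_tools(main_tools, disallowed=["Write", "Edit"])))
```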

model pins this agent to Sonnet regardless of what the main session runs. Options are the model aliases, a full model ID, or inherit (the default). Other useful optional fields: maxTurns caps how many agentic turns the subagent takes before stopping, permissionMode controls its permission behavior, memory gives it a persistent directory to accumulate learnings across sessions, and skills preloads skill content into its context - we will use that one in the next section.

The body is a focused system prompt. Notice how much narrower it is than a general assistant prompt: explicit scope, an explicit checklist, an explicit output format, and an explicit instruction for the empty case ("if nothing is found, say so"). Subagents perform dramatically better with narrow charters. A "backend-helper" agent is a smell; a "security-reviewer" agent with a five-item checklist is a tool.

Two ways to create these besides writing the file by hand: the /agents command in Claude Code opens a guided interface (including "Generate with Claude", which drafts the definition from a plain-English description), and the --agents CLI flag accepts JSON definitions for session-scoped agents, useful in CI. Note that file-based subagents are loaded at session start, so restart your session after adding one manually.
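
For the CI case, the JSON for --agents can be generated rather than hand-quoted. The sketch below assumes the definition is an object keyed by agent name with description, prompt, tools, and model fields; that shape is my reading of the flag and should be checked against the current Claude Code CLI reference before you rely on it.

```python
import json
import shlex


def agents_flag(name: str, description: str, prompt: str,
                tools: list[str], model: str = "sonnet") -> str:
    """Build a shell-safe `claude --agents '<json>'` command line."""
    # Assumed field names; verify them against the CLI documentation.
    definition = {
        name: {
            "description": description,
            "prompt": prompt,      # plays the role of the markdown body
            "tools": tools,
            "model": model,
        }
    }
    # shlex.join quotes the JSON so it survives the shell as a single argument.
    return shlex.join(["claude", "--agents", json.dumps(definition)])


if __name__ == "__main__":
    print(agents_flag(
        "security-reviewer",
        "Reviews code changes for security vulnerabilities.",
        "You are a senior application security engineer reviewing code.",
        ["Read", "Grep", "Glob"],
    ))
```

Generating the string this way avoids the quoting mistakes that tend to creep into hand-written JSON inside CI YAML.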

To invoke your new agent, either mention it explicitly ("use the security-reviewer agent on this diff") or just do related work and let the description-based routing trigger it. When you watch it run, everything it reads stays out of your main context - you get back only the list of findings.

Skills + Subagents Together: The Orchestration Pattern

Skills and subagents solve orthogonal problems, which means they compose beautifully. Skills package knowledge; subagents package execution contexts. The production pattern that falls out of this is what I call the lean orchestrator:

The main session stays thin. Its job is understanding your intent, decomposing work, delegating, and integrating results. It should not be grinding through files or holding 4,000 tokens of migration conventions in its head.

Specialist subagents do the heavy work, each in its own context, each restricted to the tools it needs, each on the cheapest model that can do its job.

Skills carry the domain knowledge into whichever context needs it. The same db-migrations skill can activate in the main session when you are chatting about schema design, and be preloaded into a migration-runner subagent when it is time to execute.

The wiring for that last part is the skills frontmatter field on the subagent definition:

---
name: migration-runner
description: >
  Writes, validates, and applies database migrations. Use for any
  schema change task.
tools: Read, Write, Edit, Bash, Grep
skills: db-migrations
model: sonnet
---

You execute database schema changes end to end. Follow the
db-migrations skill exactly. After applying a migration, run the
test suite and report results. If validation fails, stop and
report - never force through a failing check.

The skills: db-migrations line injects the full skill content into the subagent's context at startup - not just the description. This matters because a fresh subagent has no conversation history and would otherwise have to rediscover your conventions. Preloading guarantees the specialist starts with its playbook already open. The subagent can still invoke other project and personal skills on demand through the normal discovery mechanism.
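
Conceptually, the difference between discovery and preloading looks like the sketch below. It is a mental model only: the function names and the "[skill: ...]" delimiter are invented for illustration, and the exact way Claude Code lays out preloaded content in the subagent's context is not something this post specifies.

```python
def discovery_entry(name: str, description: str) -> str:
    """What normal discovery puts in context: one short line per skill."""
    return f"{name}: {description}"


def startup_context(system_prompt: str, preloaded_skills: dict[str, str]) -> str:
    """What a subagent with a skills field starts with: its prompt plus full skill bodies."""
    parts = [system_prompt.strip()]
    for name, body in preloaded_skills.items():
        # Hypothetical delimiter, used here only to keep the pieces readable.
        parts.append(f"[skill: {name}]\n{body.strip()}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    body = "## Rules (always apply)\n- Every migration needs an up AND a down script"
    print(discovery_entry("db-migrations", "Writes and reviews Postgres migrations."))
    print(startup_context("You execute database schema changes end to end.",
                          {"db-migrations": body}))
```

The first call yields a single line; the second already contains the rules, which is why the migration-runner does not need to rediscover them.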

There is also an inverse composition: a skill can request subagent execution. Setting context: fork in a skill's frontmatter runs that skill in a forked context, and the optional agent field picks which subagent type hosts it. Use this for skills that generate a lot of intermediate noise - a "run the full benchmark suite and summarize" skill should fork, because nobody wants 5,000 lines of benchmark output in the main thread.

A concrete end-to-end flow with all the pieces: you ask for "add soft deletes to the orders table." The orchestrator (main session) breaks this into research, implementation, and review. It sends the Explore built-in to map current delete behavior (parallel, read-only, Haiku). It delegates schema work to migration-runner, which starts with the db-migrations skill preloaded. It then hands the resulting diff to security-reviewer, which has no file-editing tools. Your main context receives three summaries totaling maybe 900 tokens, while perhaps 60,000 tokens of actual work happened in isolated contexts. That is the entire economic argument for this architecture in one paragraph.

When you run this pattern in production, instrument it - delegation makes failures quieter because they happen off the main thread. Our agent observability guide covers tracing multi-agent flows, and the production-grade agent engineering course goes deep on hardening these pipelines.

Real Business Use Cases (Not Toy Examples)

Everything above works the same whether you are automating your own dev workflow or building agent systems for clients. Here are four deployments where the skills-plus-subagents architecture pays for itself fast.

1. Brand-consistent content operations. A marketing agency encodes each client's voice, banned phrases, compliance rules, and formatting standards as one skill per client (acme-brand-voice, globex-brand-voice). Writers work with Claude normally; the right skill activates based on which client is mentioned. Before skills, this lived in a shared doc that half the team pasted inconsistently. After, a new writer produces on-brand copy on day one, and updating the rules means editing one file that instantly applies to everyone. Time to onboard a new client's voice: about an hour of skill authoring.

2. Financial reporting with separation of duties. A finance team runs a monthly close workflow: a data-extraction subagent (read-only, restricted to the reporting directory and approved scripts) pulls and reconciles numbers, a report-writer subagent with a preloaded board-report-format skill drafts the pack, and a checker subagent verifies every figure in the draft against the extracted source data. The tool restrictions are the point here: the agent that writes prose literally cannot touch the raw data pipeline. A close that took a controller three days now takes half a day of review.

3. Support engineering triage. A SaaS team defines a ticket-triage subagent on Haiku that reads incoming bug reports, reproduces the issue from logs, and classifies severity - preloaded with a skill encoding their severity rubric and escalation rules. High-severity tickets trigger delegation to a general-purpose agent that drafts a fix. Because triage runs on the cheap model and everything heavier is gated behind classification, the per-ticket cost stays at pennies while first-response quality goes up.

4. Codebase governance at a consultancy. A dev shop maintains a plugin containing their house skills (API conventions, testing standards, PR checklist) and subagents (security-reviewer, performance-auditor) and installs it on every client engagement. Every engineer's Claude Code session enforces the same standards across every project. This is also a productizable asset: firms are starting to sell packaged skills and agents the way they sell MCP servers - if that angle interests you, the build and sell MCP servers course covers the commercial playbook, and much of it transfers directly to skills.

The common thread: in each case the skill or agent file is the institutional knowledge, version-controlled and reviewable like code. That is the real business shift - expertise stops living in senior employees' heads and starts living in artifacts the whole organization executes. If you want help designing one of these systems for your own company, talk to us.

Skills vs MCP Servers: The Decision Table

This is the question every team hits within a week of adopting skills: "wait, should this be a skill or an MCP server?" The confusion is understandable because both extend what Claude can do, but they extend it in fundamentally different directions.

An MCP server is a running process that Claude talks to over JSON-RPC - it can query your database, call your CRM, post to Slack, and hold credentials server-side. A skill is static instruction content - it does not run, connect, or authenticate to anything by itself (though it can include scripts that Claude executes with its normal tools). MCP is connectivity; skills are competence.

| You need Claude to... | Use | Reasoning |
| --- | --- | --- |
| Query live data (database, CRM, analytics) | MCP | Requires a live connection and auth |
| Follow your team's procedures and conventions | Skill | Pure knowledge, no connectivity needed |
| Take actions in external SaaS tools | MCP | Actions need an authenticated integration |
| Format outputs to a house standard | Skill | Instructions plus example files do this best |
| Work with local files and CLI tools it already has | Skill | Teach usage of existing tools; no server needed |
| Expose one capability to many different AI clients | MCP | Server-side logic, centrally maintained |
| Use your MCP tools correctly and in the right order | Both | MCP provides tools, a skill provides the playbook |

That last row is the insight most teams miss: the best MCP deployments ship with a companion skill. Your MCP server exposes fifteen CRM tools; a skill teaches Claude which ones to use for which request, in what order, and with which filters ("always scope contact searches by account first; never bulk-update without listing affected records"). The MCP server without the skill produces an agent that has access but fumbles; the skill without the server produces an agent that knows exactly what to do and cannot do it.

Cost is a secondary but real factor. Every connected MCP server puts its tool definitions into context, and heavy servers can add thousands of tokens before any work happens. Skills, thanks to progressive disclosure, cost ~100 tokens each until used. If a capability can be expressed as "instructions for using tools Claude already has" (Bash, file access, existing CLIs), a skill is usually the cheaper and simpler build. If it needs credentials, remote APIs, or shared server-side logic, it is MCP by necessity.
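
The rule of thumb in the last two sentences reduces to a few boolean checks. The function below is a sketch of that reasoning with hypothetical parameter names; real decisions will have more nuance than four flags.

```python
def skill_or_mcp(
    needs_credentials: bool = False,
    calls_remote_api: bool = False,
    shared_server_logic: bool = False,
    teaches_procedure: bool = False,
) -> str:
    """Return 'skill', 'mcp', or 'both' for a capability you want to add."""
    # Anything that needs auth, remote calls, or central logic has to live server-side.
    needs_server = needs_credentials or calls_remote_api or shared_server_logic
    if needs_server and teaches_procedure:
        return "both"   # the server provides tools, the skill provides the playbook
    if needs_server:
        return "mcp"
    # Instructions for tools Claude already has: the cheaper, simpler build.
    return "skill"
```

For example, a CRM integration with house rules about how to search contacts comes out as "both", while a house formatting standard comes out as "skill".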

Two related decision points we have covered elsewhere: how MCP compares to plain function calling is in our MCP vs function calling comparison, and when you conclude a capability does belong server-side, our guide to building an MCP server takes you from zero to a working implementation, with the MCP server tutorial covering the protocol fundamentals.

Common Mistakes (And How to Avoid Them)

These are the failure modes I see most often in teams adopting skills and subagents, roughly in order of frequency.

1. Descriptions that describe but do not route. "Helps with database work" tells Claude nothing about when to activate. The description is a routing function: include the task types and trigger phrases users actually say. If Claude keeps failing to pick up your skill, fix the description before touching the body - the body is invisible at decision time.

2. Writing skills like documentation. Background history, motivational preambles, three paragraphs on why the convention exists. Claude needs operating instructions: rules, steps, examples, output formats. Every explanatory sentence is a recurring token cost that buys no behavior change. Ruthlessly cut anything that is not actionable.

3. One mega-skill instead of several focused ones. A 1,500-line "backend-standards" skill defeats progressive disclosure - every activation pays for all of it, and the description becomes so broad it triggers constantly. Split by task: one skill per distinct activity, each with a sharp description. Ten focused skills cost less in practice than one giant one, because usually only one activates.

4. Duplicating content between CLAUDE.md and skills. Pick a home for each piece of knowledge. Short, always-relevant facts (project structure, build commands) belong in CLAUDE.md. Procedures and long reference material belong in skills. When both contain slightly different versions of the same rule, Claude receives contradictory instructions and behavior becomes unpredictable.

5. Subagents with vague charters and full tool access. A "helper" agent that inherits every tool is just the main session with extra steps and less context. The value of a subagent comes from its constraints: narrow description, minimal tool list, explicit output format. If you cannot state in one sentence what the agent must never do, its definition is not finished.

6. Delegating tasks that need conversation context. Subagents start fresh. Handing one a task like "fix the issue we discussed" fails silently - it never saw the discussion. Delegation prompts must be self-contained: the constraint, the acceptance criteria, the relevant file paths. Treat every delegation like a ticket you would write for a contractor who just joined.

7. Never testing skills against regressions. Skills are code by another name, and edits to them can silently break behavior that used to work. Keep a handful of representative test prompts per skill and re-run them after edits. For anything business-critical, promote this to proper evals - our agent testing guide shows how to build a lightweight harness for exactly this.

8. Ignoring the security surface. Skills are instructions Claude follows and may include scripts Claude executes - so treat third-party skills like third-party code. Read them before installing, prefer version-controlled sources, and use tool restrictions (allowed-tools on skills, tools on subagents) as your enforcement layer rather than trusting prompt-level promises.
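
For mistake 7, a regression check does not need a framework to get started. The sketch below assumes you supply a run_agent callable that sends a prompt through your skill and returns the text output; that callable is a stand-in you would have to write for your own setup, and the case format is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SkillCase:
    prompt: str
    must_contain: list[str] = field(default_factory=list)
    must_not_contain: list[str] = field(default_factory=list)


def run_cases(run_agent: Callable[[str], str], cases: list[SkillCase]) -> list[str]:
    """Run every case and return human-readable failures (empty list means all passed)."""
    failures = []
    for case in cases:
        output = run_agent(case.prompt)
        for needle in case.must_contain:
            if needle not in output:
                failures.append(f"{case.prompt!r}: missing {needle!r}")
        for needle in case.must_not_contain:
            if needle in output:
                failures.append(f"{case.prompt!r}: unexpected {needle!r}")
    return failures


if __name__ == "__main__":
    # Fake agent for demonstration; replace with a real call into your agent.
    def fake_agent(prompt: str) -> str:
        return "New\n- Added CSV export to reports\nFixed\n- Fixed login timeout"

    cases = [SkillCase("write release notes", must_contain=["New", "Fixed"],
                       must_not_contain=["refactor"])]
    print(run_cases(fake_agent, cases))  # []
```

Substring checks are blunt, but they catch the common regressions, such as a section heading disappearing or an internal term leaking into customer-facing output.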

Next Steps: From First Skill to Full Agent System

Here is the adoption path that works, distilled from rolling this out across our own projects and client teams:

Week 1: one skill. Take the instructions you paste most often - commit format, PR checklist, report template - and turn them into a project skill. Iterate until it fires reliably from natural requests. This teaches you description-writing, the highest-leverage authoring subskill.

Week 2: one subagent. Build a read-only specialist (a reviewer or researcher) with a restricted tool list. Watch how much cleaner your main context stays when heavy investigation happens off-thread.

Week 3: compose them. Preload your skill into a subagent via the skills field and run a real multi-step task through the orchestrator pattern. At this point you have every architectural primitive that production Claude agent systems are built from - the rest is refinement, testing, and scale.

Then go deeper. If you want the complete, structured version of this path - skill design patterns, multi-agent orchestration architectures, cost engineering, packaging skills and agents as plugins for teams and clients, plus the eval and hardening work that separates demos from production - that is exactly what we teach in the Claude Skills and Subagents Masterclass. It takes everything in this post and turns it into working systems you build hands-on, including the client-facing playbooks from the business use cases above.

Related reading to round out the picture: pair your skills with live integrations using the MCP server tutorial, compare ecosystems in ChatGPT vs Claude for business agents, and when your agent system starts doing real work, instrument it with the observability guide. If you would rather hand the design and build of your agent infrastructure to a team that does this daily, work with us.

The meta-lesson worth ending on: skills and subagents are not features, they are a discipline. Every correction you make in chat is knowledge that evaporates; every correction you encode in a SKILL.md or agent definition compounds. The teams winning with AI agents in 2026 are not the ones with clever prompts - they are the ones whose expertise lives in version-controlled artifacts that every session, every teammate, and every agent executes identically.

FAQ

What is the difference between Claude skills and subagents?

A skill is knowledge: a folder with a SKILL.md file containing instructions that Claude loads on demand into whatever context is running. A subagent is an execution context: a separate Claude instance with its own context window, system prompt, tool restrictions, and model. Skills teach behavior; subagents isolate work. They compose - you can preload skills into a subagent using the skills frontmatter field.

Do Agent Skills work outside Claude Code?

Yes. Skills work in Claude Code, claude.ai, and via the Claude API. Since the SKILL.md format was released as an open standard, it is also supported by other tools including OpenAI's Codex CLI, Gemini CLI, and GitHub Copilot. The portable core is the frontmatter (name, description) plus markdown body; Claude Code-specific frontmatter fields like context: fork or allowed-tools are extensions.

Where do I put skill and subagent files in Claude Code?

Project skills go in .claude/skills/<skill-name>/SKILL.md and project subagents in .claude/agents/<name>.md, both inside your repo so they can be committed and shared with your team. Personal versions live in ~/.claude/skills/ and ~/.claude/agents/ and apply across all your projects. Both can also ship inside plugins for team-wide distribution.
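The project layout described above can be scaffolded with a few lines of Python. This is a sketch only: `scaffold_skill` is a hypothetical helper, the skill name and description are made-up examples, and the frontmatter is limited to the two portable fields (name and description).

```python
from pathlib import Path
import tempfile


def scaffold_skill(repo_root: Path, name: str, description: str) -> Path:
    """Create .claude/skills/<name>/SKILL.md with minimal frontmatter; return its path."""
    skill_file = repo_root / ".claude" / "skills" / name / "SKILL.md"
    skill_file.parent.mkdir(parents=True, exist_ok=True)
    # Note: a description containing ": " would need YAML quoting; this sketch skips that.
    skill_file.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n"
        f"# {name}\n\nInstructions go here.\n",
        encoding="utf-8",
    )
    return skill_file


# A temporary directory stands in for the repo root so the example is safe to run.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(
        Path(tmp),
        "commit-format",
        "Formats commit messages. Use when the user asks to commit.",
    )
    relative = created.relative_to(tmp).as_posix()
    body = created.read_text(encoding="utf-8")

print(relative)
```

Pointing `repo_root` at your home directory instead of a repo would produce the personal location, `~/.claude/skills/`, with the same structure.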

How many tokens does a skill cost?

Roughly 100 tokens per installed skill at session start (just the name and description). The full SKILL.md body only enters context when the skill activates, and Anthropic recommends keeping that body under 5,000 tokens. Supporting reference files cost nothing until Claude actually reads them. This three-tier progressive disclosure is why you can install dozens of skills without bloating every request.
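To make the arithmetic concrete, the sketch below plugs in the two rough figures quoted above (about 100 tokens of metadata per skill and a 5,000-token ceiling for the body). The function names are illustrative, and real token counts will vary with how long your descriptions and bodies actually are.

```python
METADATA_TOKENS_PER_SKILL = 100  # rough per-skill startup figure quoted above
MAX_BODY_TOKENS = 5_000          # recommended ceiling for a SKILL.md body


def startup_cost(installed: int) -> int:
    """Tokens spent at session start: metadata only."""
    return installed * METADATA_TOKENS_PER_SKILL


def worst_case_active_cost(installed: int, activated: int) -> int:
    """Startup metadata plus full bodies for the skills that actually fired."""
    return startup_cost(installed) + activated * MAX_BODY_TOKENS


def eager_cost(installed: int) -> int:
    """Hypothetical comparison: every body loaded up front, as a giant prompt would."""
    return installed * MAX_BODY_TOKENS


print(startup_cost(30))               # 3000
print(worst_case_active_cost(30, 1))  # 8000
print(eager_cost(30))                 # 150000
```

With 30 skills installed and one active, the worst case is about 8,000 tokens, against roughly 150,000 if every body were loaded eagerly.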

Can subagents call other subagents?

A custom subagent can spawn other subagents only if its tools list includes the Agent tool. When it does, nested delegation works, but keep hierarchies shallow - one orchestrator delegating to specialists covers almost every real workflow, and deep nesting makes failures hard to trace. Parallel delegation from the main session is usually a better pattern than deep chains.

Should I build a skill or an MCP server for my use case?

Ask one question: does the capability need live connectivity or credentials to external systems? If yes (databases, CRMs, SaaS APIs), you need an MCP server. If the capability is knowledge - procedures, conventions, formats, how to use tools Claude already has - build a skill. Many production setups use both: the MCP server provides the tools, and a companion skill teaches Claude how to use them correctly.

Why is Claude not picking up my skill automatically?

Almost always a description problem. At decision time Claude only sees your skill's name and description, not the body. Rewrite the description to state both what the skill does and when to use it, including the literal phrases users say. Test by invoking it explicitly with /skill-name first to confirm the body works, then fix the description until natural requests trigger it.
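A quick way to catch weak descriptions before testing them in a session is a small lint pass. The sketch below is a heuristic written for this post, not how Claude selects skills: `WHEN_CUES` and `lint_description` are hypothetical, and the trigger phrases are whatever you expect your users to type.

```python
# Hypothetical cue list: phrases that suggest the description says WHEN to use the skill.
WHEN_CUES = ("use when", "use this when", "use for", "trigger")


def lint_description(description: str, trigger_phrases: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the description passes."""
    text = description.lower()
    problems = []
    if not any(cue in text for cue in WHEN_CUES):
        problems.append("no 'when to use' clause")
    for phrase in trigger_phrases:
        if phrase.lower() not in text:
            problems.append(f"missing trigger phrase: {phrase!r}")
    return problems


weak = "Formats commit messages."
strong = (
    "Formats commit messages in our conventional style. Use when the user asks "
    "to commit, write a commit message, or save changes."
)
phrases = ["commit message", "save changes"]

print(lint_description(weak, phrases))
print(lint_description(strong, phrases))
```

A passing lint does not guarantee the skill fires, so the explicit `/skill-name` check and natural-request testing described above still apply.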

Do subagents see my conversation history?

No. Subagents start with a fresh context containing their system prompt, the task description written by the parent session, and basic environment details. They do not receive the parent conversation. Write delegation prompts as self-contained briefs: include the constraints, relevant file paths, and acceptance criteria, because the subagent cannot see where you discussed them.
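If you generate delegation prompts programmatically, a small template helps ensure no part of the brief is forgotten. The following is a sketch under that assumption: `DelegationBrief` is a hypothetical structure, and the task, file path, and criteria are invented examples.

```python
from dataclasses import dataclass, field


@dataclass
class DelegationBrief:
    """Everything a subagent needs, since it cannot see the parent conversation."""
    task: str
    constraints: list[str] = field(default_factory=list)
    file_paths: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)

    def render(self) -> str:
        def section(title: str, items: list[str]) -> str:
            lines = "\n".join(f"- {item}" for item in items) or "- (none given)"
            return f"{title}:\n{lines}"

        return "\n\n".join([
            f"Task: {self.task}",
            section("Constraints", self.constraints),
            section("Relevant files", self.file_paths),
            section("Acceptance criteria", self.acceptance_criteria),
        ])


brief = DelegationBrief(
    task="Review the retry logic for unhandled timeouts",
    constraints=["Read-only: do not edit files"],
    file_paths=["src/http/retry.py"],  # hypothetical path
    acceptance_criteria=["List each unhandled case with file and line"],
).render()

print(brief)
```

An empty section renders as "(none given)" rather than disappearing, which makes a missing constraint or acceptance criterion obvious when you read the brief back.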
