Technical · 2026-05-06 · Last verified 2026-07-09

Build an MCP Server From Scratch (Step-by-Step)

Learn how to build a Model Context Protocol (MCP) server from scratch. This tutorial covers the MCP architecture, tool definition, resource exposure, prompt templates, testing with MCP Inspector, and connecting to Claude Desktop and other AI clients.

Deep · ML Architect & Full Stack Engineer

10+ years shipping production ML across TensorFlow, PyTorch, AWS, and GCP. Ships every A8gent agent before it becomes a lesson. GitHub

Key takeaways

MCP (Model Context Protocol) is a standard protocol that lets AI models interact with external tools and data sources through a server-client architecture, replacing custom API integrations with a universal interface.
An MCP server exposes three primitives: Tools (functions the AI can call), Resources (data the AI can read), and Prompts (reusable prompt templates) - each defined with JSON Schema for type safety.
Building a basic MCP server takes under 100 lines of code using the official Python or TypeScript SDK, making it one of the fastest ways to give AI models access to your custom data and business logic.
MCP Inspector is the essential testing tool - it lets you interactively test your server's tools, resources, and prompts before connecting to any AI client, catching issues early in development.
Production MCP servers should implement authentication, rate limiting, input validation, and comprehensive logging - the protocol handles transport but security is your responsibility.

What Is MCP and Why Should You Build a Server?

The Model Context Protocol (MCP) is an open standard that defines how AI models communicate with external tools and data sources. Think of it as a USB standard for AI - before USB, every device had its own proprietary connector. Before MCP, every AI integration required custom code. MCP provides a universal interface that any AI client (Claude Desktop, Cursor, Windsurf, custom applications) can use to interact with any MCP server, regardless of what the server does internally.

Why build an MCP server instead of a regular API? Because an MCP server is automatically discoverable by AI models. When you connect an MCP server to Claude Desktop, Claude can see all available tools, understand their parameters through JSON Schema definitions, and call them as needed during conversations. You do not need to write a single line of prompt engineering to explain your API to the model - the MCP protocol handles that. The model reads the tool definitions and knows exactly how to use them.

The MCP architecture has three layers. Transport: how the client and server communicate (stdio for local servers, HTTP with SSE for remote servers). Protocol: the JSON-RPC message format that defines requests, responses, and notifications. Capabilities: the three primitives your server can expose - Tools (actions the AI can perform), Resources (data the AI can read), and Prompts (reusable prompt templates). You do not need to implement all three. Most servers start with Tools only and add Resources and Prompts as needed.

Real-world MCP servers power a wide range of use cases. Database access servers let AI models query your PostgreSQL or MongoDB databases with natural language. API integration servers wrap third-party APIs (Slack, GitHub, Jira) so AI can interact with them. File system servers let AI read, search, and modify files in controlled directories. Business logic servers expose your company's specific calculations, rules, and workflows. The official MCP specification is maintained by Anthropic and has been adopted by multiple AI platforms.

In this tutorial, we will build a practical MCP server that provides AI models with access to a project management system. The server will expose tools for creating and updating tasks, resources for reading project status and team workload data, and prompt templates for common project management operations. This example is directly applicable to any business that manages projects and wants AI to interact with their project data. If you have already read our MCP server tutorial for AI agents, this article goes deeper into the implementation details and production patterns.

Prerequisites: Python 3.10+ or Node.js 18+ (we will cover both), and basic familiarity with JSON and either language. No AI/ML experience is required - MCP servers are standard software that happen to be called by AI models. Install the Python SDK with pip install mcp or the TypeScript SDK with npm install @modelcontextprotocol/sdk.

Project Setup and Your First MCP Tool

Let us start by creating the project structure and implementing our first tool. The MCP SDK provides a high-level Server class that handles all the protocol details - you just define your tools, resources, and prompts using decorators. The server handles JSON-RPC message parsing, request routing, error formatting, and response serialization automatically.

The project structure for a Python MCP server is minimal. You need a single Python file for a simple server (we will call it server.py), a pyproject.toml for dependencies, and optionally a configuration file for server metadata. For TypeScript, it is similarly lean: an index.ts file and a package.json. This simplicity is by design - MCP servers should be lightweight and focused.

MCP Server From Scratch - data overview" width="1264" height="695" style="width:100%;height:auto;border-radius:12px;margin:24px 0;" loading="lazy" />

The server initialization involves creating an instance of the Server class with a name and version. Then you define tools using the @server.tool() decorator (Python) or server.setRequestHandler (TypeScript). Each tool needs a name, a description (this is what the AI model reads to understand the tool), and a function that takes validated parameters and returns a result. The SDK automatically generates JSON Schema from your type annotations.

Using the Python SDK's FastMCP interface, the whole server (instantiation plus a tool registered with a decorator) fits in a handful of lines:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("project-manager")

@mcp.tool()
def create_task(title: str, description: str = "", assignee: str = "") -> dict:
    """Create a new task in the project management system."""
    task_id = tasks_store.create(title=title, description=description, assignee=assignee)
    return {"task_id": task_id, "status": "created"}

if __name__ == "__main__":
    mcp.run()

Let us define our first tool: create_task. This tool creates a new task in our project management system. It takes three parameters: title (string, required), description (string, optional), and assignee (string, optional). The function validates the inputs, creates the task in the backend (for this tutorial, we will use an in-memory store; in production, this would be a database call), and returns a confirmation with the new task ID.

The description you write for each tool is critically important - it is the primary way AI models understand what the tool does and when to use it. Write descriptions that are clear, specific, and include edge cases. Bad description: "Creates a task." Good description: "Creates a new task in the project management system. Requires a title (1-200 characters). Optionally accepts a description and an assignee email. Returns the created task ID and status. Use this when the user wants to add a new work item, todo, or action item." The difference between a good and bad tool description is often the difference between an AI model using your tool correctly and misusing it.

Running the server locally uses the stdio transport. You start the server as a subprocess, and the AI client communicates with it through standard input/output. The command to run is simply python server.py (Python) or npx tsx index.ts (TypeScript). The SDK handles the stdio transport setup automatically. You can also run the server with the HTTP+SSE transport for remote access, which we will cover in the deployment section.

A practical tip for development: use extensive logging from the start. Log every incoming request, every tool execution, and every response. MCP servers can be tricky to debug because the AI client controls the conversation flow - you cannot predict which tools will be called in which order. Good logging turns debugging from guesswork into straightforward trace analysis. The SDK provides a built-in logging facility that writes to stderr (separate from the stdout protocol channel), so your debug output never interferes with protocol messages. The official MCP servers repository has reference implementations for common server types that are excellent learning resources.

Building Production-Quality Tools

With the basic structure in place, let us build out a complete set of tools for our project management server. Production-quality tools need proper input validation, error handling, and response formatting. Each tool should follow the same pattern: validate inputs, execute the operation, handle errors gracefully, and return a structured result that the AI can work with.

Our server will have five tools: create_task (which we already defined), update_task, list_tasks, get_task_details, and assign_task. Each tool serves a distinct purpose and has clear parameters. The AI model will choose which tool to call based on the user's request and the tool descriptions.

The update_task tool takes a task ID and the fields to update (title, description, status, priority). A key design decision: should a single "update" tool handle all field changes, or should there be separate tools for each field? Our recommendation is a single update tool with optional parameters for each field. This reduces the total number of tools (fewer tools means better tool selection by the AI) while remaining flexible. The tool should validate that at least one field is being updated and that the task ID exists.

A single flexible update_task tool with optional fields, raising a clear error when the task is missing, looks like this in Python:

from typing import Optional

@mcp.tool()
def update_task(
    task_id: str,
    title: Optional[str] = None,
    status: Optional[str] = None,
    priority: Optional[str] = None,
) -> dict:
    """Update fields on an existing task. Only provided fields are changed."""
    if task_id not in tasks_store:
        raise ValueError("No task with ID " + task_id + " exists.")
    updated = tasks_store.update(task_id, title=title, status=status, priority=priority)
    return {"task_id": task_id, "updated_fields": updated}

The list_tasks tool demonstrates an important pattern: filtering and pagination. In a real project management system, you might have thousands of tasks. Returning all of them is wasteful and overwhelming for the AI model. Instead, the tool accepts filter parameters: status (open, in_progress, done), assignee (email), project (project name), and limit (maximum results, default 20). This lets the AI make targeted queries like "show me all open tasks assigned to Sarah" rather than fetching everything and trying to filter in context.

Error handling in MCP tools follows a specific pattern. The SDK defines error codes for different failure types: invalid parameters, resource not found, internal errors, and permission denied. When a tool encounters an error, it should return an McpError (or in the Python SDK, raise an exception that the framework catches and converts to the appropriate error response). The error message should be human-readable and helpful - remember, the AI model will read this error message and decide how to respond to the user. "Task not found" is less helpful than "No task with ID T-1234 exists. The user may have misremembered the ID - try listing recent tasks to find the correct one."

Response formatting matters more than you might expect. The AI model processes your tool's output as text, so structure your responses for readability. For the list_tasks tool, instead of returning raw JSON, return a formatted summary: "Found 5 tasks matching your criteria: T-101: Design homepage (open, assigned to Sarah), T-102: Write API docs (in progress, assigned to Mike)..." This formatting reduces the cognitive load on the model and produces better final responses to the user.

One pattern that improves tool reliability significantly is confirmation for destructive actions. If you build a delete_task tool, do not delete immediately. Instead, return a confirmation message: "This will permanently delete task T-1234 'Design homepage' and all associated comments. To confirm, call delete_task again with confirm=true." This gives the AI (and the user) a chance to verify before an irreversible action. The two-step confirmation pattern prevents accidental data loss and is especially important when AI models are making autonomous decisions. For a deeper dive into agent tool design, our RAG agent tutorial covers tool design patterns in the context of retrieval systems.

Adding Resources and Prompt Templates

Tools are actions the AI performs. Resources are data the AI reads. The distinction matters because resources are exposed to the AI client upfront (the client can list and read them proactively) while tools are called on-demand during conversations. Think of resources as the "read" side and tools as the "write" side of your MCP server.

MCP Server From Scratch - analysis" width="1264" height="695" style="width:100%;height:auto;border-radius:12px;margin:24px 0;" loading="lazy" />

Resources are identified by URIs and can be static (a fixed configuration file) or dynamic (data that changes with each read). For our project management server, we will expose three resources: project://status (an overview of all projects with task counts by status), project://team-workload (a summary of tasks assigned to each team member), and project://recent-activity (the last 20 task updates across all projects). Each resource returns text content that the AI can incorporate into its context.

Implementing resources uses the @server.resource() decorator. You specify the URI pattern and a function that returns the resource content. Resources can return text (plain text or markdown), JSON (for structured data), or binary content (for images or files, though text is more common for AI consumption). For our project status resource, the function queries the task database, aggregates counts by project and status, and returns a formatted summary.

The project status resource below aggregates task counts and returns a plain-text summary the model can read directly:

@mcp.resource("project://status")
def project_status() -> str:
    """Overview of all projects with task counts by status."""
    projects = tasks_store.aggregate_by_project()
    lines = []
    for project in projects:
        lines.append(project["name"] + ": " + str(project["open"]) + " open, " +
                      str(project["done"]) + " done")
    return "
".join(lines)

Resource templates add a powerful pattern: parameterized URIs. Instead of a fixed project://status resource, you can define project://{project_name}/status which returns status for a specific project. The AI client sees the template and can request any specific project's status. This is cleaner than building a "get project status" tool because the AI can proactively read the resource when building context, without the user explicitly asking.

Prompt templates are the third MCP primitive. They are reusable prompt structures that the AI client can present to users. For project management, useful prompts include: "Sprint Planning" (a template that pulls in the current backlog and asks the AI to help prioritize), "Status Report" (a template that reads project status and team workload resources and generates a formatted report), and "Task Breakdown" (a template that takes a high-level objective and generates sub-tasks). Prompts are defined with @server.prompt() and return a list of messages (system and user) that the client uses to start a conversation.

The practical value of prompts is consistency. Instead of users crafting different prompts for the same operation each time, prompts codify your organization's best practices. The "Status Report" prompt always pulls the same data, formats it the same way, and produces consistent output. This is especially valuable when multiple team members interact with the same MCP server - everyone gets the same quality of AI assistance regardless of how they phrase their request.

A design principle: keep resources read-only and idempotent. A resource read should never modify state. If reading a resource triggers a side effect (like marking notifications as read), move that to a tool instead. This principle is important because AI clients may read resources speculatively or cache them - side effects in resource reads lead to unpredictable behavior. Similarly, resources should always return the current state. Do not implement resources that "remember" previous reads or return different data based on who is reading. Statelessness makes your server predictable and debuggable. For additional patterns in building AI-accessible data systems, our AI agents for small business guide covers practical integration approaches.

Testing With MCP Inspector and Claude Desktop

MCP Inspector is the official testing tool for MCP servers, and it should be your primary development companion. It provides an interactive web UI where you can: see all tools, resources, and prompts your server exposes; call tools with custom parameters and see the results; read resources and verify their content; test prompt templates; and inspect the raw JSON-RPC messages flowing between client and server. Install it with npx @modelcontextprotocol/inspector and point it at your server.

Testing with MCP Inspector follows a systematic approach. First, verify discovery: start your server with the Inspector and check that all tools, resources, and prompts appear in the sidebar. Each tool should show its full JSON Schema, including parameter types, descriptions, and required/optional flags. If a tool is missing or its schema is wrong, fix it before testing anything else - the AI model will see the same schema, and any issues here will cause problems in production.

Second, test the happy path for every tool. Call each tool with valid parameters and verify the response. Check that the response format is clean and readable. Verify that side effects occurred correctly (if create_task returns a task ID, does get_task_details with that ID return the correct task?). Third, test error paths. Call tools with missing required parameters, invalid parameter types, nonexistent IDs, and boundary values. Verify that error messages are helpful and that the server does not crash on bad input.

Fourth, test resources. Read each resource and verify the content is accurate and well-formatted. For dynamic resources, modify the underlying data and re-read to verify the resource reflects changes. For resource templates, test with various parameter values including edge cases (empty strings, very long strings, special characters).

After MCP Inspector testing, connect your server to Claude Desktop for end-to-end testing. Add your server to Claude Desktop's configuration file (located at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS or %APPDATA%/Claude/claude_desktop_config.json on Windows). The configuration specifies the command to run your server and any environment variables it needs. Once configured, restart Claude Desktop and your server's tools will appear in the tools menu.

For a Python server run with a virtual environment's interpreter, the config entry looks like this:

{
  "mcpServers": {
    "project-manager": {
      "command": "python",
      "args": ["/path/to/server.py"],
      "env": {
        "TASKS_DB_URL": "postgres://user:pass@localhost:5432/tasks"
      }
    }
  }
}

Test with natural language queries that exercise your tools indirectly. Instead of "call create_task with title X," say "I need to add a new task for the homepage redesign." Verify that Claude selects the right tool, extracts the right parameters from natural language, handles ambiguity well (what if the user does not specify an assignee?), and formats the response naturally. This end-to-end testing is essential because it reveals tool description issues that are invisible in Inspector testing. If Claude consistently uses the wrong tool for a certain type of request, your tool descriptions need refinement - not your code.

Document your test cases. As you find edge cases and fix issues, add them to a growing test document. This becomes your regression test suite for future changes. Every time you modify a tool definition or handler, re-run the full test suite. MCP servers are particularly vulnerable to regressions because a small change in a tool description can cause the AI model to use the tool differently, and that behavioral change is not captured by unit tests. For a broader view of how MCP servers fit into agent architectures, see our OpenAI Agents SDK tutorial which covers multi-agent tool integration patterns.

Deploying Your MCP Server for Production Use

Local MCP servers using stdio transport are great for development and single-user scenarios. For production - multiple users, remote access, or always-on availability - you need the HTTP+SSE transport and proper infrastructure. This section covers the deployment patterns that make MCP servers production-ready.

The HTTP+SSE (Server-Sent Events) transport replaces stdio with HTTP endpoints. The client sends requests as HTTP POST to your server and receives responses as SSE events. This transport works through firewalls, load balancers, and cloud deployments. The SDK provides built-in HTTP transport support - switching from stdio to HTTP is typically a configuration change, not a code change. In Python, you use mcp.server.sse.SseServerTransport; in TypeScript, SSEServerTransport from the SDK.

For teams standardizing on TypeScript, the same project-manager server built with the official SDK's McpServer class looks like this before swapping in the HTTP transport:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "project-manager", version: "1.0.0" });

server.tool(
  "create_task",
  "Create a new task in the project management system.",
  {
    title: z.string().describe("Task title, 1-200 characters"),
    description: z.string().optional(),
    assignee: z.string().optional().describe("Assignee email"),
  },
  async ({ title, description, assignee }) => {
    const taskId = await tasksStore.create({ title, description, assignee });
    return { content: [{ type: "text", text: JSON.stringify({ taskId }) }] };
  }
);

await server.connect(new StdioServerTransport());

Authentication is your first production concern. The MCP protocol itself does not define authentication - this is intentionally left to the transport layer. For HTTP transport, implement authentication using standard HTTP mechanisms: API keys in headers, OAuth 2.0 tokens, or mutual TLS. The simplest approach for internal tools is an API key verified in middleware. For external-facing servers, implement full OAuth 2.0 with token rotation. Never deploy an MCP server without authentication - it exposes your data and business logic to anyone who can reach the endpoint.

Rate limiting protects your server from abuse and runaway AI agents. An AI model in a loop can generate tool calls much faster than a human user, and without rate limits, a single agent could overwhelm your server or backend services. Implement per-user and per-tool rate limits. A reasonable starting point: 60 tool calls per minute per user, with lower limits for expensive operations (database writes, external API calls). Return a clear error message when rate limited so the AI model can inform the user rather than retrying silently.

Input validation must be defense-in-depth. The SDK validates parameter types against your JSON Schema, but you need application-level validation as well. Sanitize all string inputs to prevent injection attacks (SQL injection if your tools query databases, command injection if they execute system commands). Validate that parameter values are within expected ranges. Check that referenced entities (task IDs, user emails) actually exist before performing operations. Never trust input from an AI model more than you would trust input from a web form - the validation standards should be identical.

Logging and monitoring are non-negotiable for production MCP servers. Log every request and response with timestamps, user identifiers, tool names, parameter values (with sensitive fields redacted), and execution time. Set up alerts for: error rate spikes (indicating a bug or an attack), latency increases (indicating backend issues), and unusual tool call patterns (indicating a misbehaving AI agent). Store logs for at least 90 days for debugging and audit purposes.

For deployment infrastructure, containerize your MCP server with Docker. This gives you consistent environments across development, staging, and production. Deploy behind a reverse proxy (nginx or Caddy) that handles TLS termination, rate limiting, and request logging. For high availability, run multiple server instances behind a load balancer. MCP servers should be stateless (all state lives in the backend database), making horizontal scaling straightforward. A typical production setup runs 2-4 server instances behind an application load balancer, with auto-scaling based on request volume. For the full picture of how MCP servers integrate into AI agent architectures, our complete implementation guide covers the end-to-end deployment process.

Advanced Patterns and Next Steps

With a production MCP server running, several advanced patterns can significantly enhance its capabilities and the AI's effectiveness when using it.

Notification support. MCP supports server-to-client notifications - your server can proactively inform the AI client when something changes. For our project management server, this means notifying the client when a task status changes, a new task is assigned, or a deadline is approaching. The AI can then proactively inform the user instead of waiting to be asked. Implement notifications using the SDK's notification API and configure your backend to trigger them on relevant events.

Sampling support. MCP's sampling capability lets your server request the AI client to generate text. This creates a two-way interaction where the server can leverage the AI model's intelligence during tool execution. For example, a generate_task_breakdown tool could use sampling to have the AI decompose a high-level objective into sub-tasks, then store those sub-tasks in the database. The server orchestrates the operation while the AI provides the intelligence. This pattern is powerful but should be used carefully - it creates recursion (the AI calls a tool that calls the AI) which can lead to infinite loops if not bounded.

Composite tools. Sometimes a useful operation requires multiple steps that should appear as a single tool to the AI. Instead of the AI calling create_task, then assign_task, then set_priority as three separate tool calls, create a composite create_and_assign_task tool that does all three in one call. This reduces the number of round trips (improving latency), reduces token usage (fewer tool call messages), and reduces the chance of partial failures (if the second tool call fails, you have an orphaned task). Design composite tools for operations that the AI frequently chains together.

Caching. Resource reads and tool queries that hit external APIs or databases benefit from caching. Implement a cache layer (Redis or in-memory TTL cache) that stores recent query results. Set TTL values based on data freshness requirements - project status can be cached for 5 minutes, but task details should be cached for 30 seconds or less if real-time accuracy matters. The MCP protocol does not define caching semantics, so this is entirely in your control.

Multi-server composition. In larger organizations, you might have separate MCP servers for different domains: a project management server, a code repository server, a CI/CD server, and a documentation server. AI clients like Claude Desktop can connect to multiple MCP servers simultaneously, giving the AI access to all of them in a single conversation. This creates powerful cross-domain workflows: "Create a task for the bug reported in GitHub issue #456, assign it to whoever owns the related code module, and add a link to the relevant documentation section." Each operation routes to the appropriate MCP server transparently.

For next steps in your MCP journey, explore the official MCP servers repository for production-quality reference implementations. Build servers for your most frequently used internal tools - database access, CRM integration, and analytics dashboards are high-impact starting points. And consider contributing your server to the growing MCP ecosystem if it solves a general problem. The protocol is still young, and the community is actively building the tool ecosystem that will define how AI interacts with the world. For complementary AI agent patterns, our RAG agent tutorial shows how to combine MCP servers with retrieval-augmented generation for knowledge-intensive applications, and our OpenAI Agents SDK tutorial demonstrates how to use MCP tools within a multi-agent architecture.

FAQ

What is an MCP server?

An MCP (Model Context Protocol) server is a program that exposes tools, data, and prompt templates to AI models through a standardized protocol. It allows AI clients like Claude Desktop, Cursor, and custom applications to discover and use your custom tools and data without any custom integration code.

What programming languages can I use to build an MCP server?

The official SDKs support Python and TypeScript/JavaScript. The protocol is language-agnostic (it uses JSON-RPC over stdio or HTTP), so you can implement servers in any language that can handle JSON-RPC. Community SDKs exist for Go, Rust, and other languages, though the Python and TypeScript SDKs are the most mature.

How do I connect an MCP server to Claude Desktop?

Add your server to Claude Desktop's configuration file (claude_desktop_config.json). Specify the command to run your server and any environment variables. Restart Claude Desktop, and your server's tools will appear in the tools menu. The configuration file is at ~/Library/Application Support/Claude/ on macOS or %APPDATA%/Claude/ on Windows.

Is MCP only for Anthropic/Claude?

No. MCP is an open protocol that any AI client can implement. While Anthropic created and maintains the specification, it has been adopted by multiple platforms including Cursor, Windsurf, Cline, and others. Any application that implements the MCP client protocol can connect to your MCP server.

How does MCP compare to OpenAI function calling?

OpenAI function calling defines tools inline within each API call. MCP defines tools in a separate server that can be shared across multiple AI clients and conversations. MCP also adds Resources (read-only data access) and Prompts (reusable templates) which have no equivalent in function calling. MCP is better for reusable, shareable tool ecosystems; function calling is simpler for one-off integrations.

All posts

2026-07-09