Build a custom MCP server for your business in 2026: the practical guide
Why your business needs a Model Context Protocol server in 2026, what to expose, how to build and deploy one with Next.js, and where MCP fits alongside AI agents.
MCP — Model Context Protocol — was barely a year old when it crossed 97 million installs in early 2026. Anthropic shipped the spec in late 2024, OpenAI and Google added support in 2025, and now every serious AI client (Claude Desktop, ChatGPT, Cursor, Windsurf, Gemini Code Assist, the new Vercel Agent) speaks it natively. The result is that in May 2026, businesses are facing the same question they faced about REST APIs in 2008: not whether to expose one, but when. Gartner expects 30% of enterprise SaaS vendors to ship their own MCP server this year, with 60–70% integration cost reduction reported by teams that standardize on it.
This guide is the practical version for founders and engineering leads. We will cover what MCP actually is in one paragraph, why it matters for your business, what to expose, how to build and deploy a production-grade server on Next.js, and the security and governance pieces that the quick-start tutorials skip.
What MCP is, in 60 seconds
MCP is an open standard that lets any AI client connect to any data source or tool in a uniform way. Think of it as USB-C for AI: one cable, every device. Before MCP, every AI tool integration was a one-off — a bespoke API wrapper for Claude, a different one for ChatGPT, another for your internal agent. With MCP, you ship one server and every compliant client can use it.
An MCP server exposes three primitives:
- Tools — actions the AI can take (book an appointment, create a ticket, run a SQL query)
- Resources — data the AI can read (a customer record, a document, a knowledge base article)
- Prompts — pre-built prompt templates with arguments the AI can fill in and execute
Why your business needs one in 2026
Three reasons we now bring this up with every client building anything AI-adjacent:
- Your customers' AI assistants want to do work in your product. A Claude or ChatGPT user asking 'check my last invoice in Acme' should not have to copy-paste between tabs — an MCP server makes Acme a first-class action.
- Internal employees use multiple AI clients. Sales uses ChatGPT, engineering uses Claude Code, support uses Cursor. One MCP server feeds all of them; one integration replaces three.
- AI agents need a standardized substrate. Every agentic workflow you build in 2026 — voice agents, sales agents, support agents — works better when the actions are MCP-shaped, because every framework speaks MCP natively.
What to expose first
Do not try to ship your entire API as MCP on day one. The right starting set for almost every business:
- Search — find a customer, an order, a document by natural-language query
- Read — get the full record once you have an ID
- Create — the most common create flow in your product (new ticket, new lead, new appointment)
- Update status — close ticket, mark order shipped, confirm appointment
- Summarize — a tool that returns a pre-built summary of an entity, ready for the AI to read aloud
Five well-shaped tools beat fifty mediocre ones. AIs choose tools by reading their descriptions; sloppy or overlapping tool descriptions confuse the model and tank your accuracy.
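One way to enforce this before shipping: lint your tool descriptions for the failure modes above. This is a hedged sketch of a pre-deploy check we find useful; the thresholds and heuristics are our own convention, not part of any spec:

```typescript
type ToolDef = { name: string; description: string };

// Flag descriptions that are too short to disambiguate, that never say
// what the tool returns, or that duplicate each other word-for-word.
function lintToolDescriptions(tools: ToolDef[]): string[] {
  const warnings: string[] = [];
  for (const t of tools) {
    if (t.description.length < 40) {
      warnings.push(`${t.name}: description too short to disambiguate`);
    }
    if (!/returns/i.test(t.description)) {
      warnings.push(`${t.name}: does not say what it returns`);
    }
  }
  // Identical descriptions force the model to guess between tools.
  for (let i = 0; i < tools.length; i++) {
    for (let j = i + 1; j < tools.length; j++) {
      if (tools[i].description === tools[j].description) {
        warnings.push(`${tools[i].name}/${tools[j].name}: identical descriptions`);
      }
    }
  }
  return warnings;
}
```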
Pick the right transport: Streamable HTTP
MCP started with stdio (local-only) and SSE. In 2026, Streamable HTTP is the standard for any production server you want a remote AI client to reach. It works through firewalls, supports auth headers, scales horizontally, and is what Vercel, Cloudflare, and Anthropic all recommend now.
A minimal MCP server on Next.js + Vercel
The TypeScript SDK plus the Vercel MCP adapter turns this into a single route handler. Install:
```bash
npm install @modelcontextprotocol/sdk @vercel/mcp-adapter zod
```

Then write the route. This example exposes a CRM with two tools — search customers and create a lead — and one resource for reading a customer record:
```typescript
// app/api/mcp/route.ts
import { createMcpHandler } from "@vercel/mcp-adapter";
import { ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
import { db } from "@/lib/db"; // your own data layer

const handler = createMcpHandler((server) => {
  server.tool(
    "search_customers",
    "Find customers by name, email, or phone. Returns up to 10 matches with IDs.",
    {
      query: z.string().describe("Free-text search term"),
      limit: z.number().int().min(1).max(50).default(10),
    },
    async ({ query, limit }) => {
      const rows = await db.customers.search(query, limit);
      return {
        content: [
          {
            type: "text",
            text: rows
              .map((r) => `${r.id} — ${r.name} <${r.email}>`)
              .join("\n"),
          },
        ],
      };
    },
  );

  server.tool(
    "create_lead",
    "Create a new lead in the CRM. Returns the new lead ID.",
    {
      name: z.string().min(1),
      email: z.string().email(),
      source: z.string().default("ai-assistant"),
      notes: z.string().optional(),
    },
    async (input) => {
      const lead = await db.leads.insert(input);
      return {
        content: [{ type: "text", text: `Created lead ${lead.id}` }],
      };
    },
  );

  // Templated resources use ResourceTemplate, which parses {id} out of
  // the URI and hands it to the callback as a variable.
  server.resource(
    "customer",
    new ResourceTemplate("crm://customer/{id}", { list: undefined }),
    async (uri, { id }) => {
      const customer = await db.customers.findById(id);
      return {
        contents: [
          {
            uri: uri.href,
            mimeType: "application/json",
            text: JSON.stringify(customer, null, 2),
          },
        ],
      };
    },
  );
});

export { handler as GET, handler as POST };
```

Deploy on Vercel. The route runs on Fluid Compute by default, which is exactly what you want — long-lived MCP sessions reuse the function instance across requests, so latency stays low even on the first call after idle.
Authentication — the part that determines whether you go to production
The 2026 MCP spec standardized on OAuth 2.1 with PKCE. Every serious MCP client now supports it. For your server, you have three reasonable options:
- OAuth 2.1 with your existing identity provider (Clerk, Auth0, Descope, WorkOS) — the right choice for B2B SaaS
- Static API keys for single-tenant internal use — fine for a private team server, never for customer-facing
- Sign in with Vercel or GitHub for developer-tool MCPs — quickest path if your audience is engineers
Whichever you pick, scope tokens narrowly. An MCP token that an AI agent uses on the user's behalf should never have more permissions than the user — and ideally fewer. The OWASP-style rule applies: assume the AI will be tricked into calling the most destructive tool you exposed.
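The scoping rule can be made concrete as a route-level guard. This is a hedged sketch; `verifyToken` stands in for your IdP's verification API (Clerk, Auth0, WorkOS, etc.), and the scope names are illustrative:

```typescript
type TokenClaims = { sub: string; scopes: string[] };

// Wrap a route handler so it only runs when the bearer token is valid
// and carries every required scope.
function requireScopes(
  handler: (req: Request) => Promise<Response>,
  needed: string[],
  verifyToken: (token: string) => Promise<TokenClaims | null>,
) {
  return async (req: Request): Promise<Response> => {
    const token = req.headers.get("authorization")?.replace(/^Bearer /, "");
    const claims = token ? await verifyToken(token) : null;
    if (!claims) return new Response("Unauthorized", { status: 401 });
    // The agent's token must carry every required scope, and never more
    // than the user behind it actually has.
    if (!needed.every((s) => claims.scopes.includes(s))) {
      return new Response("Forbidden: missing scope", { status: 403 });
    }
    return handler(req);
  };
}
```

You would wrap the MCP route handler export with something like `requireScopes(handler, ["crm:read"], verifyToken)` so every tool call is authorized before it reaches the server.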
Observability: log everything
Every tool call and resource read should produce a structured log line with the tool name, the input, the caller (user, agent, client), the result, and the duration. Two reasons:
- Debugging — when an AI agent does the wrong thing, you need the trace to figure out why. AI mistakes are not rare exceptions; expect them on roughly 1–5% of calls.
- Tuning — review the bottom 10% of tool calls weekly. Most accuracy issues trace back to a vague tool description or an overly permissive schema you can tighten.
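The structured log line described above can be a wrapper you put around every tool handler. A hedged sketch; the log shape is our own convention, not part of MCP:

```typescript
type ToolHandler<I, O> = (input: I) => Promise<O>;

// Wrap a tool handler so every call emits one structured JSON log line
// with the tool name, caller, input, outcome, and duration.
function withLogging<I, O>(
  toolName: string,
  caller: string,
  handler: ToolHandler<I, O>,
): ToolHandler<I, O> {
  return async (input: I) => {
    const start = Date.now();
    try {
      const result = await handler(input);
      console.log(JSON.stringify({
        tool: toolName, caller, input, ok: true, durationMs: Date.now() - start,
      }));
      return result;
    } catch (err) {
      console.log(JSON.stringify({
        tool: toolName, caller, input, ok: false, error: String(err), durationMs: Date.now() - start,
      }));
      throw err; // still surface the failure to the MCP client
    }
  };
}
```

In the route handler you would register `withLogging("search_customers", callerId, realHandler)` instead of the bare handler, so the trace exists before the first incident, not after.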
How users actually connect
Once your server is live at, say, https://acme.com/api/mcp, users add it to their AI client in one of three ways:
- Claude Desktop / Claude Code — Settings → MCP Servers → Add → paste the URL → OAuth login
- ChatGPT — Custom GPTs / Connectors → MCP → paste the URL → OAuth login
- Cursor, Windsurf, Vercel Agent — config file or settings UI, all point at the same URL
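For the config-file clients, the entry is a few lines of JSON. A hedged sketch in the shape Claude Code uses for a project-level `.mcp.json` (exact field names vary slightly between clients, so check your client's docs):

```json
{
  "mcpServers": {
    "acme": {
      "type": "http",
      "url": "https://acme.com/api/mcp"
    }
  }
}
```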
Publishing the URL on your docs page with a one-click install button is the new equivalent of an 'API docs' link. Several MCP registries (smithery.ai, modelcontextprotocol.io/servers) also list public servers — submit yours if your product is consumer-facing.
Build vs buy: when does outsourcing make sense
Three reasons we have seen teams reach for a hosted MCP runtime (Arcade, Composio, etc.) instead of building from scratch:
- Multi-tenant OAuth and per-user credential vaulting that you do not want to build and audit
- Long-running tool calls (>30s) with reliable retries and observability
- Adding 50+ pre-built third-party integrations (Gmail, Slack, Salesforce) faster than you could ever write them
If your MCP server is just a thin layer over your own product's API, build it yourself — the SDK plus Vercel adapter is a half-day of work and you keep the data inside your security perimeter. If you are trying to give every customer an AI agent that touches 30 other SaaS tools on their behalf, a hosted runtime is the right call.
What it costs
- Build (5 tools, 2 resources, OAuth via existing IdP, deployed on Vercel) — 1 to 2 engineering weeks, roughly $6K to $20K
- Hosting — pennies; MCP calls are tiny HTTP requests, and Vercel Fluid Compute is cost-efficient under bursty load
- Maintenance — measure weekly for the first month, monthly thereafter; budget for prompt-injection-related hardening as the spec evolves
The takeaway
An MCP server is the smallest, highest-leverage AI investment a SaaS or productized service can make in 2026. The protocol has won, the tooling is mature on Next.js, and the cost is a fraction of a single AI feature. The discipline is in scoping the right initial tool set, getting OAuth right, and logging every call so you can tune the descriptions against real AI behavior in the first month. Done in that order, your customers' AI assistants quietly become a new acquisition and retention channel — without you writing a single line of agent code yourself.
Frequently asked questions
Do I need an MCP server if I already have a REST API?
If your customers or internal teams use AI assistants — yes. The REST API stays; the MCP server is a thin wrapper that makes those same actions discoverable and invokable by Claude, ChatGPT, Cursor, and every other compliant client. Building it is a few days of work and unlocks a channel your REST API alone cannot reach.
Is MCP secure enough for sensitive data?
The 2026 spec is OAuth 2.1 with PKCE plus narrowly scoped tokens, which is the same security model as any modern API. The new risk is prompt injection — an AI agent can be tricked by content it reads into calling the wrong tool. Mitigate it the same way you would mitigate SQL injection: validate inputs, scope permissions narrowly, require human confirmation for destructive actions, and log everything.
Should I build my own MCP server or use a hosted runtime like Arcade or Composio?
Build your own if the server is just a wrapper over your own product's data. The SDK plus Vercel's MCP adapter is a half-day project and keeps your data inside your security perimeter. Use a hosted runtime when you need to give each user an AI agent that touches dozens of third-party tools on their behalf — multi-tenant OAuth and credential vaulting are non-trivial and worth offloading.
Which AI clients actually support MCP in 2026?
Claude Desktop and Claude Code, ChatGPT (via Custom GPTs and Connectors), Gemini Code Assist, Cursor, Windsurf, Zed, the Vercel Agent, and dozens of agent frameworks (LangChain, Mastra, AutoGen, CrewAI). Coverage is now essentially universal across the serious clients — if your product audience uses any AI assistant in 2026, your MCP server reaches them.