Hassann

Posted on • Originally published at apidog.com

MCP Server Testing Playbook: Manual + Automated With Apidog

An “Ableton Live MCP” Show HN post climbed to 118 points and 78 comments earlier this week. The pattern is familiar: someone builds a Model Context Protocol server for an unexpected tool, Claude Desktop users try it, and more “should I write one for X?” posts follow. MCP has moved from an Anthropic-specific experiment to a common agent integration layer in less than a year.

That growth exposes a tooling gap: MCP servers still need better testing workflows. Running JSON-RPC over stdio by hand is fine for a hello-world server, but it breaks down once your server has multiple tools, prompts, resources, and upstream API dependencies. This guide shows how to test MCP servers manually, then automate those tests with Apidog so you can treat an MCP server like any other API: contract, mock, assertions, and CI.

If you are coming from a broader agent context, this agents.md guide pairs well with this article because those conventions make MCP server contracts easier to communicate across a team.

TL;DR

  • MCP is Anthropic’s Model Context Protocol. It uses JSON-RPC 2.0 over stdio or HTTP.
  • MCP exposes three core primitives: tools, resources, and prompts.
  • Test the main MCP calls: initialize, tools/list, tools/call, resources/read, and prompts/get.
  • Start manually with the MCP inspector or raw JSON-RPC over stdio.
  • Capture working request/response pairs and convert them into automated tests in Apidog.
  • Use Apidog assertions to validate response shape, schema, and error handling.
  • Use Apidog mocks to simulate upstream APIs so tests are deterministic.
  • Download Apidog if you want request collections, mocks, and CI runs in one workflow.

What MCP is

The Model Context Protocol specification defines a JSON-RPC 2.0 interface for connecting AI clients to external capabilities.

A client such as Claude Desktop, Cursor, or your own agent starts or connects to an MCP server, runs an initialize handshake, then calls exposed tools, resources, and prompts.

The calls you will test most often are:

  • initialize: negotiates protocol version and server capabilities.
  • tools/list: returns available tools and their argument schemas.
  • tools/call: invokes a tool by name with arguments.
  • resources/list: lists readable URI-addressed resources.
  • resources/read: reads resource content by URI.
  • prompts/list: lists available prompt templates.
  • prompts/get: renders a prompt template with arguments.
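On the wire, each of these is a plain JSON-RPC 2.0 request. A tools/list call, for example, is just:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list",
  "params": {}
}
```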

MCP transport is usually one of these:

  • stdio: newline-delimited JSON-RPC messages over stdin/stdout.
  • streamable HTTP: usually POST /, often with SSE for streaming responses.

Most local MCP servers use stdio. Remote MCP servers typically use HTTP.

Why test this carefully? Because a malformed tools/list response or broken schema can break every MCP client that connects to your server.

What to test in an MCP server

A useful MCP test suite should cover these areas.

1. Protocol conformance

Check the handshake first.

Verify that initialize returns:

  • the expected protocolVersion
  • the capabilities your server actually supports
  • a valid JSON-RPC 2.0 response envelope

Example response shape:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2026-04-01",
    "capabilities": {
      "tools": {},
      "resources": {},
      "prompts": {}
    },
    "serverInfo": {
      "name": "example-mcp-server",
      "version": "1.0.0"
    }
  }
}

2. Tool schema correctness

For every tool returned by tools/list, verify:

  • name exists
  • description exists and is useful
  • inputSchema is valid JSON Schema
  • required arguments are marked as required
  • argument types match what tools/call expects

A weak or missing description can hurt tool selection in clients such as Claude Desktop.
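For reference, a single entry in the tools/list result typically has this shape; the get_weather tool and its city argument here are illustrative:

```json
{
  "name": "get_weather",
  "description": "Get the current weather for a city.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string" }
    },
    "required": ["city"]
  }
}
```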

3. Tool behavior

For each tool, test:

  • happy path
  • missing required argument
  • invalid argument type
  • upstream API failure
  • timeout or retry behavior, if applicable

A successful tools/call should return content blocks such as:

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Weather in Tokyo: 24°C and cloudy."
      }
    ]
  }
}

For tool-level failures, return a normal MCP result with isError: true:

{
  "jsonrpc": "2.0",
  "id": 43,
  "result": {
    "isError": true,
    "content": [
      {
        "type": "text",
        "text": "Missing required argument: city"
      }
    ]
  }
}

Do not use a JSON-RPC protocol error for normal tool failures. Protocol errors should be reserved for protocol-level issues.

4. Resource access

If your server exposes resources, test:

  • resources/list
  • resources/read
  • URI format
  • unreadable or missing resources
  • pagination, if supported

Every URI returned by resources/list should be readable with resources/read.
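Per the MCP specification, a resources/read response wraps each item in a contents array that carries the URI and MIME type alongside the data. A sketch with illustrative values:

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "contents": [
      {
        "uri": "file:///logs/app.log",
        "mimeType": "text/plain",
        "text": "2026-02-01 12:00:00 INFO server started"
      }
    ]
  }
}
```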

5. Prompt rendering

For prompts, verify:

  • prompts/list returns expected templates
  • prompts/get returns a valid messages array
  • arguments are substituted correctly
  • missing prompt arguments return a predictable error
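A valid prompts/get response carries a messages array of role/content pairs. A minimal shape, with illustrative values:

```json
{
  "jsonrpc": "2.0",
  "id": 9,
  "result": {
    "description": "Summarize a city's weather",
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the weather in Tokyo."
        }
      }
    ]
  }
}
```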

6. Failure modes

Test what happens when:

  • an upstream API is unavailable
  • an API key is missing
  • a request times out
  • an argument is malformed
  • multiple tool calls run concurrently

These are the bugs most likely to appear in production.

Manual testing with stdio

Start with the simplest setup:

  • your MCP server
  • a terminal
  • the MCP inspector or raw JSON-RPC

If you have not built a server yet, scaffold one with the official MCP SDK quickstart in Python or TypeScript. The two-tool weather example is enough for testing the workflow.

Run your server with the inspector:

npx @modelcontextprotocol/inspector node your-server.js

The inspector starts a local web UI, connects to your MCP server, and shows requests and responses. Use it to confirm:

  • the server starts
  • initialize succeeds
  • capabilities look correct
  • tools/list returns the expected tools
  • tools/call works for at least one tool

After the inspector flow works, run raw JSON-RPC over stdio so you can capture canonical request/response pairs for Apidog.

Example initialize request:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2026-04-01","capabilities":{}}}' | node your-server.js

Save the request and response.

Repeat for:

  • initialize
  • tools/list
  • tools/call
  • resources/list
  • resources/read
  • prompts/list
  • prompts/get

By the end, you should have 6 to 12 request/response examples that define your server’s wire-level contract.
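The capture step is easy to script. The sketch below is one way to do it, assuming a stdio server that reads one JSON-RPC message per line on stdin and writes exactly one response line per request on stdout; the server command, tool name, and protocol version are placeholders to adjust for your setup:

```python
# Capture canonical request/response pairs from a stdio MCP server.
# Assumes line-delimited JSON-RPC: one request per line in, one
# response per line out. Servers that log to stdout will break this.
import json
import subprocess

# Placeholder calls; swap in your server's tools, resources, and prompts.
CALLS = [
    ("initialize", {"protocolVersion": "2026-04-01", "capabilities": {}}),
    ("tools/list", {}),
    ("tools/call", {"name": "get_weather", "arguments": {"city": "Tokyo"}}),
    ("resources/list", {}),
    ("prompts/list", {}),
]

def capture(command, calls):
    proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    pairs = []
    for request_id, (method, params) in enumerate(calls, start=1):
        request = {"jsonrpc": "2.0", "id": request_id,
                   "method": method, "params": params}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        # Read the single response line and parse it.
        response = json.loads(proc.stdout.readline())
        pairs.append((request, response))
    proc.terminate()
    return pairs

# Usage (hypothetical server command):
# pairs = capture(["node", "your-server.js"], CALLS)
```

Dump `pairs` to a file and you have the fixtures to paste into Apidog in the next section.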

Watch content blocks

A tool result returns content blocks:

{
  "content": [
    {
      "type": "text",
      "text": "Result text"
    }
  ]
}

Image content uses a different shape:

{
  "content": [
    {
      "type": "image",
      "data": "base64-encoded-image-data",
      "mimeType": "image/png"
    }
  ]
}

Mixed content blocks are allowed, but clients may render them differently. Test the exact shapes your server emits.

Watch error handling

Tool execution errors should look like this:

{
  "result": {
    "isError": true,
    "content": [
      {
        "type": "text",
        "text": "The upstream API is unavailable."
      }
    ]
  }
}

Avoid this for normal tool failures:

{
  "error": {
    "code": -32603,
    "message": "Tool failed"
  }
}

That is a JSON-RPC protocol error, not a tool result.

Automating MCP tests in Apidog

Manual testing catches obvious bugs. Automation catches regressions.

The workflow is:

  1. Capture working JSON-RPC request/response pairs.
  2. Create saved requests in Apidog.
  3. Add assertions for response shape and content.
  4. Mock upstream APIs.
  5. Run the suite in CI.

1. Create an Apidog project

Create a new Apidog project for your MCP server.

If your MCP server exposes HTTP, use its HTTP endpoint as the base URL.

If your MCP server only supports stdio, run it behind a thin HTTP wrapper for tests. The production server can still use stdio; the wrapper only exists so your automated API test runner can send requests.
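A wrapper like that can be very small. Here is one possible sketch in Python, assuming the same line-delimited JSON-RPC framing as above; the port and server command are placeholders, and a real wrapper would also want error handling and per-request timeouts:

```python
# Minimal HTTP-to-stdio bridge: POST a JSON-RPC body, get the server's
# JSON-RPC response back. For tests only, not production traffic.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

class StdioBridge:
    def __init__(self, command):
        # Start the stdio MCP server as a child process.
        self.proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE, text=True)

    def call(self, payload):
        # One request line in, one response line out.
        self.proc.stdin.write(json.dumps(payload) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

def make_handler(bridge):
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers["Content-Length"])
            request = json.loads(self.rfile.read(length))
            body = json.dumps(bridge.call(request)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return Handler

# Usage (hypothetical server command and port):
# bridge = StdioBridge(["node", "your-server.js"])
# HTTPServer(("127.0.0.1", 8931), make_handler(bridge)).serve_forever()
```

Because requests are serialized through one stdin/stdout pair, this sketch handles one call at a time; that is fine for contract tests, but concurrency tests need a smarter bridge.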

The same wrapper pattern is useful for other non-HTTP backends, as covered in API testing without Postman in 2026.

2. Save canonical JSON-RPC requests

Create one saved request for each core MCP call.

Example tools/call request body:

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {
      "city": "Tokyo"
    }
  }
}

Save requests for:

  • initialize
  • tools/list
  • one happy-path tools/call per tool
  • one failure-path tools/call per tool
  • resources/list
  • resources/read
  • prompts/list
  • prompts/get

3. Add assertions

The value of automation is not sending the request. It is validating the response.

For tools/list, assert that:

  • $.result.tools exists
  • $.result.tools.length is greater than 0
  • every tool has name
  • every tool has description
  • every tool has inputSchema
  • every inputSchema is valid JSON Schema

For a successful tools/call, assert that:

  • $.result.isError is false or absent
  • $.result.content is an array
  • $.result.content[0].type matches the expected content type
  • expected content fields exist

For a failing tools/call, assert that:

  • $.result.isError is true
  • $.result.content is an array
  • the first content block includes a stable error message, code, or pattern

Prefer stable assertions. For example, assert on isError: true and a regex instead of an exact error string if the message may change.
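Expressed as plain checks, the failure-path assertions look like this; the sketch below validates the sample error response from earlier, and the regex stands in for whatever stable pattern your error messages share:

```python
# Stable failure-path checks: assert on structure and a pattern,
# not on the exact error string, which may change between releases.
import json
import re

response = json.loads("""
{
  "jsonrpc": "2.0",
  "id": 43,
  "result": {
    "isError": true,
    "content": [
      {"type": "text", "text": "Missing required argument: city"}
    ]
  }
}
""")

result = response["result"]
assert result["isError"] is True
assert isinstance(result["content"], list) and len(result["content"]) > 0
# Match a stable pattern rather than the full message.
assert re.search(r"[Mm]issing required argument", result["content"][0]["text"])
```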

4. Mock upstream APIs

Most MCP servers wrap another system:

  • weather APIs
  • GitHub
  • Linear
  • Notion
  • Slack
  • internal databases
  • internal incident management APIs

Do not make CI depend on live upstream APIs for every commit. It creates flaky tests, rate-limit problems, and slower builds.

Use Apidog’s mock server to define each upstream endpoint your MCP server calls. During tests, configure your MCP server to use the mock URL. In production, point it back to the real service.

The mock workflow is also covered in contract-first API development.

A simple mocked upstream response might look like this:

{
  "city": "Tokyo",
  "temperatureC": 24,
  "condition": "cloudy"
}

Then your MCP tool can produce a deterministic response:

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Weather in Tokyo: 24°C and cloudy."
      }
    ]
  }
}

5. Run the suite in CI

Apidog projects can be run from CLI-based workflows. Wire the test run into GitHub Actions or your CI provider.

Example GitHub Actions workflow:

name: MCP server tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22

      - run: npm ci

      - name: Start MCP HTTP wrapper
        run: node test/wrapper.js &

      - name: Run Apidog suite
        run: npx apidog run --project-id $APIDOG_PROJECT --env ci
        env:
          APIDOG_PROJECT: ${{ secrets.APIDOG_PROJECT }}
          APIDOG_TOKEN: ${{ secrets.APIDOG_TOKEN }}

Now every push validates the MCP contract before changes merge.

What good MCP test coverage looks like

A practical MCP suite in Apidog usually includes:

  • 1 initialize request with protocol and capability assertions
  • 1 tools/list request with shape and schema assertions
  • 2 to 4 tools/call requests per tool:
    • happy path
    • missing argument
    • invalid type
    • upstream error
  • 1 resources/list request
  • 1 resources/read request per resource family
  • 1 prompts/list request
  • 1 prompts/get request per prompt template

For a 10-tool server with 3 resource families and 4 prompts, this works out to roughly 30 to 50 requests.

Common MCP testing mistakes

Skipping initialize

Some servers build their tool registry during initialize. If you call tools/list first, the server may crash or return incomplete data.

Always test the normal client flow:

  1. initialize
  2. tools/list
  3. tools/call

Testing only happy paths

Every tool should have failure-path tests.

At minimum, test:

  • missing required argument
  • invalid argument type
  • upstream error
  • empty upstream response

Asserting on exact error strings

Error messages change. Prefer:

  • $.result.isError is true
  • a stable error code
  • a regex
  • a structured field if your server provides one

Letting mocks drift from production

Mocks must reflect real API responses. Re-record or update mock fixtures regularly, especially after upstream API changes.

A mock that returns impossible data gives you green tests for a broken integration.

Forgetting streaming

HTTP MCP servers may stream tool results over Server-Sent Events.

If your server uses SSE, enable streaming on the saved request and assert on the assembled stream.

Not testing concurrency

Agent clients can call tools concurrently. If your server uses shared state, add parallel test runs to catch race conditions.
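A quick way to exercise this is to fire the same tools/call from a thread pool and check every result. The sketch below uses a placeholder `call_tool` function standing in for whatever actually sends the JSON-RPC request:

```python
# Fire the same tool call concurrently to surface race conditions
# in servers that keep shared state between requests.
from concurrent.futures import ThreadPoolExecutor

def call_tool(city):
    # Placeholder: send a tools/call request for `city` and return
    # the parsed result; replace with your real transport.
    return {"isError": False, "city": city}

cities = ["Tokyo", "Berlin", "Lima", "Oslo"] * 5

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(call_tool, cities))

# Every concurrent call should succeed and echo its own input.
assert all(not r["isError"] for r in results)
assert [r["city"] for r in results] == cities
```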

Confusing protocol errors with tool errors

Use JSON-RPC errors for protocol failures.

Use isError: true inside a normal result for tool execution failures.

This is the same class of contract problem discussed in API platform contract-first development.

Real-world use cases

A team building an internal MCP server for an incident management API used Apidog assertions on tools/list to catch schema regressions before they reached engineers using Claude Desktop.

An open-source MCP server maintainer used Apidog mocks to run CI without hitting Notion rate limits. Contributors could run tests without needing access to real API credentials.

A platform team running multiple internal MCP servers created a shared Apidog workspace for server contracts. New servers inherited a base test suite, and reviewers could inspect schema diffs before merging.

Another team used Apidog environments to run the same MCP suite against staging and production. Each environment used different configuration and fixtures while keeping the request definitions unchanged.

Implementation checklist

Use this checklist when adding tests to an MCP server:

  • [ ] Run the server with the MCP inspector.
  • [ ] Confirm initialize succeeds.
  • [ ] Capture raw JSON-RPC requests and responses.
  • [ ] Create saved requests in Apidog.
  • [ ] Add assertions for initialize.
  • [ ] Add assertions for tools/list.
  • [ ] Add happy-path tools/call tests.
  • [ ] Add failure-path tools/call tests.
  • [ ] Add resource tests if your server exposes resources.
  • [ ] Add prompt tests if your server exposes prompts.
  • [ ] Mock upstream APIs.
  • [ ] Run the suite locally.
  • [ ] Run the suite in CI.
  • [ ] Smoke test once in a real MCP client before release.

Conclusion

MCP is now mainstream, but MCP testing is still often manual and fragile. Treat your MCP server as an API. Define the contract, capture canonical requests, mock dependencies, and run assertions in CI.

Key takeaways:

  • An MCP server is a JSON-RPC API.
  • Test it with the same rigor as a REST or GraphQL API.
  • Start with the official inspector, then automate the captured flow.
  • Apidog can manage JSON-RPC requests, assertions, mocks, and CI runs in one project.
  • Cover protocol conformance, schemas, tool behavior, resources, prompts, and failure modes.
  • Mock upstream APIs in Apidog so the test suite stays fast and deterministic.

Next step: open Apidog, create a project, paste the MCP request bodies you captured manually, add JSONPath assertions for tools/list, and run the suite.

FAQ

What is MCP?

MCP, the Model Context Protocol, is Anthropic’s open specification for how AI clients call external tools, resources, and prompts. It uses JSON-RPC 2.0 over stdio or streamable HTTP. The full MCP specification is published on modelcontextprotocol.io.

Can I test an MCP server without an HTTP wrapper?

Yes. The official MCP inspector speaks stdio directly and is useful for manual testing.

For automated testing in Apidog, use a thin HTTP wrapper during CI. Your production server can still use stdio.

How do I mock upstream APIs that my MCP server calls?

Define each upstream endpoint as a mock in your Apidog project. During tests, point your MCP server at the mock URL. At runtime, switch back to production URLs.

The same pattern is covered in API testing tools for QA engineers.

What about streaming tool results?

HTTP MCP servers may stream tool results over Server-Sent Events. Apidog supports SSE in saved requests. Enable streaming in the request settings and assert on the assembled stream.

Should I test the protocol version?

Yes. Pin the protocolVersion your server supports in initialize and assert against it. Version mismatches can cause client incompatibility.

Can I test my MCP server against Claude Desktop?

Yes. You should smoke test against a real client before release.

Do not rely on Claude Desktop as your full regression loop. It is slower, manual, and less deterministic than an automated suite. Use Apidog for regression tests and Claude Desktop for final smoke testing.

Where can I see real MCP server examples?

The official MCP servers repository includes reference implementations for filesystem, GitHub, Slack, Postgres, and more. Reading their tool definitions is one of the fastest ways to learn what good MCP response shapes look like.
