Hassann

Posted on • Originally published at apidog.com

MCP Server Testing Playbook: Manual + Automated With Apidog

An “Ableton Live MCP” Show HN post climbed to 118 points and 78 comments earlier this week. The pattern is familiar: someone builds a Model Context Protocol server for an unexpected tool, Claude Desktop users try it, and more “should I write one for X?” posts follow. MCP has moved from an Anthropic-specific experiment to a common agent integration layer in less than a year.

That growth exposes a tooling gap: MCP servers still need better testing workflows. Running JSON-RPC over stdio by hand is fine for a hello-world server, but it breaks down once your server has multiple tools, prompts, resources, and upstream API dependencies. This guide shows how to test MCP servers manually, then automate those tests with Apidog so you can treat an MCP server like any other API: contract, mock, assertions, and CI.

If you are coming from a broader agent context, this agents.md guide pairs well with this article because those conventions make MCP server contracts easier to communicate across a team.

TL;DR

  • MCP is Anthropic’s Model Context Protocol. It uses JSON-RPC 2.0 over stdio or HTTP.
  • MCP exposes three core primitives: tools, resources, and prompts.
  • Test the main MCP calls: initialize, tools/list, tools/call, resources/read, and prompts/get.
  • Start manually with the MCP inspector or raw JSON-RPC over stdio.
  • Capture working request/response pairs and convert them into automated tests in Apidog.
  • Use Apidog assertions to validate response shape, schema, and error handling.
  • Use Apidog mocks to simulate upstream APIs so tests are deterministic.
  • Download Apidog if you want request collections, mocks, and CI runs in one workflow.

What MCP is

The Model Context Protocol specification defines a JSON-RPC 2.0 interface for connecting AI clients to external capabilities.

A client such as Claude Desktop, Cursor, or your own agent starts or connects to an MCP server, runs an initialize handshake, then calls exposed tools, resources, and prompts.

The calls you will test most often are:

  • initialize: negotiates protocol version and server capabilities.
  • tools/list: returns available tools and their argument schemas.
  • tools/call: invokes a tool by name with arguments.
  • resources/list: lists readable URI-addressed resources.
  • resources/read: reads resource content by URI.
  • prompts/list: lists available prompt templates.
  • prompts/get: renders a prompt template with arguments.
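On the wire, each of these is a plain JSON-RPC 2.0 request. A tools/list call, for example, is just:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list",
  "params": {}
}
```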

MCP transport is usually one of these:

  • stdio: newline-delimited JSON-RPC messages over stdin/stdout.
  • streamable HTTP: usually POST /, often with SSE for streaming responses.

Most local MCP servers use stdio. Remote MCP servers typically use HTTP.

Why test this carefully? Because a malformed tools/list response or broken schema can break every MCP client that connects to your server.

What to test in an MCP server

A useful MCP test suite should cover these areas.

1. Protocol conformance

Check the handshake first.

Verify that initialize returns:

  • the expected protocolVersion
  • the capabilities your server actually supports
  • a valid JSON-RPC 2.0 response envelope

Example response shape:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2026-04-01",
    "capabilities": {
      "tools": {},
      "resources": {},
      "prompts": {}
    },
    "serverInfo": {
      "name": "example-mcp-server",
      "version": "1.0.0"
    }
  }
}

2. Tool schema correctness

For every tool returned by tools/list, verify:

  • name exists
  • description exists and is useful
  • inputSchema is valid JSON Schema
  • required arguments are marked as required
  • argument types match what tools/call expects

A weak or missing description can hurt tool selection in clients such as Claude Desktop.
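For reference, a single entry in the tools/list result typically has this shape; the get_weather tool and its city argument here are illustrative:

```json
{
  "name": "get_weather",
  "description": "Get the current weather for a city.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string" }
    },
    "required": ["city"]
  }
}
```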

3. Tool behavior

For each tool, test:

  • happy path
  • missing required argument
  • invalid argument type
  • upstream API failure
  • timeout or retry behavior, if applicable

A successful tools/call should return content blocks such as:

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Weather in Tokyo: 24°C and cloudy."
      }
    ]
  }
}

For tool-level failures, return a normal MCP result with isError: true:

{
  "jsonrpc": "2.0",
  "id": 43,
  "result": {
    "isError": true,
    "content": [
      {
        "type": "text",
        "text": "Missing required argument: city"
      }
    ]
  }
}

Do not use a JSON-RPC protocol error for normal tool failures. Protocol errors should be reserved for protocol-level issues.

4. Resource access

If your server exposes resources, test:

  • resources/list
  • resources/read
  • URI format
  • unreadable or missing resources
  • pagination, if supported

Every URI returned by resources/list should be readable with resources/read.
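Per the MCP specification, a resources/read response wraps each item in a contents array that carries the URI and MIME type alongside the data. A sketch with illustrative values:

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "contents": [
      {
        "uri": "file:///logs/app.log",
        "mimeType": "text/plain",
        "text": "2026-02-01 12:00:00 INFO server started"
      }
    ]
  }
}
```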

5. Prompt rendering

For prompts, verify:

  • prompts/list returns expected templates
  • prompts/get returns a valid messages array
  • arguments are substituted correctly
  • missing prompt arguments return a predictable error
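A valid prompts/get response carries a messages array of role/content pairs. A minimal shape, with illustrative values:

```json
{
  "jsonrpc": "2.0",
  "id": 9,
  "result": {
    "description": "Summarize a city's weather",
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the weather in Tokyo."
        }
      }
    ]
  }
}
```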

6. Failure modes

Test what happens when:

  • an upstream API is unavailable
  • an API key is missing
  • a request times out
  • an argument is malformed
  • multiple tool calls run concurrently

These are the bugs most likely to appear in production.

Manual testing with stdio

Start with the simplest setup:

  • your MCP server
  • a terminal
  • the MCP inspector or raw JSON-RPC

If you have not built a server yet, scaffold one with the official MCP SDK quickstart in Python or TypeScript. The two-tool weather example is enough for testing the workflow.

Run your server with the inspector:

npx @modelcontextprotocol/inspector node your-server.js

The inspector starts a local web UI, connects to your MCP server, and shows requests and responses. Use it to confirm:

  • the server starts
  • initialize succeeds
  • capabilities look correct
  • tools/list returns the expected tools
  • tools/call works for at least one tool

After the inspector flow works, run raw JSON-RPC over stdio so you can capture canonical request/response pairs for Apidog.

Example initialize request:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2026-04-01","capabilities":{}}}' | node your-server.js

Save the request and response.

Repeat for:

  • initialize
  • tools/list
  • tools/call
  • resources/list
  • resources/read
  • prompts/list
  • prompts/get

By the end, you should have 6 to 12 request/response examples that define your server’s wire-level contract.
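The capture step is easy to script. The sketch below is one way to do it, assuming a stdio server that reads one JSON-RPC message per line on stdin and writes exactly one response line per request on stdout; the server command, tool name, and protocol version are placeholders to adjust for your setup:

```python
# Capture canonical request/response pairs from a stdio MCP server.
# Assumes line-delimited JSON-RPC: one request per line in, one
# response per line out. Servers that log to stdout will break this.
import json
import subprocess

# Placeholder calls; swap in your server's tools, resources, and prompts.
CALLS = [
    ("initialize", {"protocolVersion": "2026-04-01", "capabilities": {}}),
    ("tools/list", {}),
    ("tools/call", {"name": "get_weather", "arguments": {"city": "Tokyo"}}),
    ("resources/list", {}),
    ("prompts/list", {}),
]

def capture(command, calls):
    proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    pairs = []
    for request_id, (method, params) in enumerate(calls, start=1):
        request = {"jsonrpc": "2.0", "id": request_id,
                   "method": method, "params": params}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        # Read the single response line and parse it.
        response = json.loads(proc.stdout.readline())
        pairs.append((request, response))
    proc.terminate()
    return pairs

# Usage (hypothetical server command):
# pairs = capture(["node", "your-server.js"], CALLS)
```

Dump `pairs` to a file and you have the fixtures to paste into Apidog in the next section.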

Watch content blocks

A tool result returns content blocks:

{
  "content": [
    {
      "type": "text",
      "text": "Result text"
    }
  ]
}

Image content uses a different shape:

{
  "content": [
    {
      "type": "image",
      "data": "base64-encoded-image-data",
      "mimeType": "image/png"
    }
  ]
}

Mixed content blocks are allowed, but clients may render them differently. Test the exact shapes your server emits.

Watch error handling

Tool execution errors should look like this:

{
  "result": {
    "isError": true,
    "content": [
      {
        "type": "text",
        "text": "The upstream API is unavailable."
      }
    ]
  }
}

Avoid this for normal tool failures:

{
  "error": {
    "code": -32603,
    "message": "Tool failed"
  }
}

That is a JSON-RPC protocol error, not a tool result.

Automating MCP tests in Apidog

Manual testing catches obvious bugs. Automation catches regressions.

The workflow is:

  1. Capture working JSON-RPC request/response pairs.
  2. Create saved requests in Apidog.
  3. Add assertions for response shape and content.
  4. Mock upstream APIs.
  5. Run the suite in CI.

1. Create an Apidog project

Create a new Apidog project for your MCP server.

If your MCP server exposes HTTP, use its HTTP endpoint as the base URL.

If your MCP server only supports stdio, run it behind a thin HTTP wrapper for tests. The production server can still use stdio; the wrapper only exists so your automated API test runner can send requests.
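A wrapper like that can be very small. Here is one possible sketch in Python, assuming the same line-delimited JSON-RPC framing as above; the port and server command are placeholders, and a real wrapper would also want error handling and per-request timeouts:

```python
# Minimal HTTP-to-stdio bridge: POST a JSON-RPC body, get the server's
# JSON-RPC response back. For tests only, not production traffic.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

class StdioBridge:
    def __init__(self, command):
        # Start the stdio MCP server as a child process.
        self.proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE, text=True)

    def call(self, payload):
        # One request line in, one response line out.
        self.proc.stdin.write(json.dumps(payload) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

def make_handler(bridge):
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers["Content-Length"])
            request = json.loads(self.rfile.read(length))
            body = json.dumps(bridge.call(request)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return Handler

# Usage (hypothetical server command and port):
# bridge = StdioBridge(["node", "your-server.js"])
# HTTPServer(("127.0.0.1", 8931), make_handler(bridge)).serve_forever()
```

Because requests are serialized through one stdin/stdout pair, this sketch handles one call at a time; that is fine for contract tests, but concurrency tests need a smarter bridge.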

The same wrapper pattern is useful for other non-HTTP backends, as covered in API testing without Postman in 2026.

2. Save canonical JSON-RPC requests

Create one saved request for each core MCP call.

Example tools/call request body:

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {
      "city": "Tokyo"
    }
  }
}

Save requests for:

  • initialize
  • tools/list
  • one happy-path tools/call per tool
  • one failure-path tools/call per tool
  • resources/list
  • resources/read
  • prompts/list
  • prompts/get

3. Add assertions

The value of automation is not sending the request. It is validating the response.

For tools/list, assert that:

  • $.result.tools exists
  • $.result.tools.length is greater than 0
  • every tool has name
  • every tool has description
  • every tool has inputSchema
  • every inputSchema is valid JSON Schema

For a successful tools/call, assert that:

  • $.result.isError is false or absent
  • $.result.content is an array
  • $.result.content[0].type matches the expected content type
  • expected content fields exist

For a failing tools/call, assert that:

  • $.result.isError is true
  • $.result.content is an array
  • the first content block includes a stable error message, code, or pattern

Prefer stable assertions. For example, assert on isError: true and a regex instead of an exact error string if the message may change.
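Expressed as plain checks, the failure-path assertions look like this; the sketch below validates the sample error response from earlier, and the regex stands in for whatever stable pattern your error messages share:

```python
# Stable failure-path checks: assert on structure and a pattern,
# not on the exact error string, which may change between releases.
import json
import re

response = json.loads("""
{
  "jsonrpc": "2.0",
  "id": 43,
  "result": {
    "isError": true,
    "content": [
      {"type": "text", "text": "Missing required argument: city"}
    ]
  }
}
""")

result = response["result"]
assert result["isError"] is True
assert isinstance(result["content"], list) and len(result["content"]) > 0
# Match a stable pattern rather than the full message.
assert re.search(r"[Mm]issing required argument", result["content"][0]["text"])
```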

4. Mock upstream APIs

Most MCP servers wrap another system:

  • weather APIs
  • GitHub
  • Linear
  • Notion
  • Slack
  • internal databases
  • internal incident management APIs

Do not make CI depend on live upstream APIs for every commit. It creates flaky tests, rate-limit problems, and slower builds.

Use Apidog’s mock server to define each upstream endpoint your MCP server calls. During tests, configure your MCP server to use the mock URL. In production, point it back to the real service.

The mock workflow is also covered in contract-first API development.

A simple mocked upstream response might look like this:

{
  "city": "Tokyo",
  "temperatureC": 24,
  "condition": "cloudy"
}

Then your MCP tool can produce a deterministic response:

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Weather in Tokyo: 24°C and cloudy."
      }
    ]
  }
}

5. Run the suite in CI

Apidog projects can be run from CLI-based workflows. Wire the test run into GitHub Actions or your CI provider.

Example GitHub Actions workflow:

name: MCP server tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22

      - run: npm ci

      - name: Start MCP HTTP wrapper
        run: node test/wrapper.js &

      - name: Run Apidog suite
        run: npx apidog run --project-id $APIDOG_PROJECT --env ci
        env:
          APIDOG_PROJECT: ${{ secrets.APIDOG_PROJECT }}
          APIDOG_TOKEN: ${{ secrets.APIDOG_TOKEN }}

Now every push validates the MCP contract before changes merge.

What good MCP test coverage looks like

A practical MCP suite in Apidog usually includes:

  • 1 initialize request with protocol and capability assertions
  • 1 tools/list request with shape and schema assertions
  • 2 to 4 tools/call requests per tool:
    • happy path
    • missing argument
    • invalid type
    • upstream error
  • 1 resources/list request
  • 1 resources/read request per resource family
  • 1 prompts/list request
  • 1 prompts/get request per prompt template

For a 10-tool server with 3 resource families and 4 prompts, this works out to roughly 30 to 50 requests.

Common MCP testing mistakes

Skipping initialize

Some servers build their tool registry during initialize. If you call tools/list first, the server may crash or return incomplete data.

Always test the normal client flow:

  1. initialize
  2. tools/list
  3. tools/call

Testing only happy paths

Every tool should have failure-path tests.

At minimum, test:

  • missing required argument
  • invalid argument type
  • upstream error
  • empty upstream response

Asserting on exact error strings

Error messages change. Prefer:

  • $.result.isError is true
  • a stable error code
  • a regex
  • a structured field if your server provides one

Letting mocks drift from production

Mocks must reflect real API responses. Re-record or update mock fixtures regularly, especially after upstream API changes.

A mock that returns impossible data gives you green tests for a broken integration.

Forgetting streaming

HTTP MCP servers may stream tool results over Server-Sent Events.

If your server uses SSE, enable streaming on the saved request and assert on the assembled stream.

Not testing concurrency

Agent clients can call tools concurrently. If your server uses shared state, add parallel test runs to catch race conditions.
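A quick way to exercise this is to fire the same tools/call from a thread pool and check every result. The sketch below uses a placeholder `call_tool` function standing in for whatever actually sends the JSON-RPC request:

```python
# Fire the same tool call concurrently to surface race conditions
# in servers that keep shared state between requests.
from concurrent.futures import ThreadPoolExecutor

def call_tool(city):
    # Placeholder: send a tools/call request for `city` and return
    # the parsed result; replace with your real transport.
    return {"isError": False, "city": city}

cities = ["Tokyo", "Berlin", "Lima", "Oslo"] * 5

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(call_tool, cities))

# Every concurrent call should succeed and echo its own input.
assert all(not r["isError"] for r in results)
assert [r["city"] for r in results] == cities
```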

Confusing protocol errors with tool errors

Use JSON-RPC errors for protocol failures.

Use isError: true inside a normal result for tool execution failures.

This is the same class of contract problem discussed in API platform contract-first development.

Real-world use cases

A team building an internal MCP server for an incident management API used Apidog assertions on tools/list to catch schema regressions before they reached engineers using Claude Desktop.

An open-source MCP server maintainer used Apidog mocks to run CI without hitting Notion rate limits. Contributors could run tests without needing access to real API credentials.

A platform team running multiple internal MCP servers created a shared Apidog workspace for server contracts. New servers inherited a base test suite, and reviewers could inspect schema diffs before merging.

Another team used Apidog environments to run the same MCP suite against staging and production. Each environment used different configuration and fixtures while keeping the request definitions unchanged.

Implementation checklist

Use this checklist when adding tests to an MCP server:

  • [ ] Run the server with the MCP inspector.
  • [ ] Confirm initialize succeeds.
  • [ ] Capture raw JSON-RPC requests and responses.
  • [ ] Create saved requests in Apidog.
  • [ ] Add assertions for initialize.
  • [ ] Add assertions for tools/list.
  • [ ] Add happy-path tools/call tests.
  • [ ] Add failure-path tools/call tests.
  • [ ] Add resource tests if your server exposes resources.
  • [ ] Add prompt tests if your server exposes prompts.
  • [ ] Mock upstream APIs.
  • [ ] Run the suite locally.
  • [ ] Run the suite in CI.
  • [ ] Smoke test once in a real MCP client before release.

Conclusion

MCP is now mainstream, but MCP testing is still often manual and fragile. Treat your MCP server as an API. Define the contract, capture canonical requests, mock dependencies, and run assertions in CI.

Key takeaways:

  • An MCP server is a JSON-RPC API.
  • Test it with the same rigor as a REST or GraphQL API.
  • Start with the official inspector, then automate the captured flow.
  • Apidog can manage JSON-RPC requests, assertions, mocks, and CI runs in one project.
  • Cover protocol conformance, schemas, tool behavior, resources, prompts, and failure modes.
  • Mock upstream APIs in Apidog so the test suite stays fast and deterministic.

Next step: open Apidog, create a project, paste the MCP request bodies you captured manually, add JSONPath assertions for tools/list, and run the suite.

FAQ

What is MCP?

MCP, the Model Context Protocol, is Anthropic’s open specification for how AI clients call external tools, resources, and prompts. It uses JSON-RPC 2.0 over stdio or streamable HTTP. The full MCP specification is published on modelcontextprotocol.io.

Can I test an MCP server without an HTTP wrapper?

Yes. The official MCP inspector speaks stdio directly and is useful for manual testing.

For automated testing in Apidog, use a thin HTTP wrapper during CI. Your production server can still use stdio.

How do I mock upstream APIs that my MCP server calls?

Define each upstream endpoint as a mock in your Apidog project. During tests, point your MCP server at the mock URL. At runtime, switch back to production URLs.

The same pattern is covered in API testing tools for QA engineers.

What about streaming tool results?

HTTP MCP servers may stream tool results over Server-Sent Events. Apidog supports SSE in saved requests. Enable streaming in the request settings and assert on the assembled stream.

Should I test the protocol version?

Yes. Pin the protocolVersion your server supports in initialize and assert against it. Version mismatches can cause client incompatibility.

Can I test my MCP server against Claude Desktop?

Yes. You should smoke test against a real client before release.

Do not rely on Claude Desktop as your full regression loop. It is slower, manual, and less deterministic than an automated suite. Use Apidog for regression tests and Claude Desktop for final smoke testing.

Where can I see real MCP server examples?

The official MCP servers repository includes reference implementations for filesystem, GitHub, Slack, Postgres, and more. Reading their tool definitions is one of the fastest ways to learn what good MCP response shapes look like.
