Preecha

How to Use the Claude Opus 4.7 API?

TL;DR

Claude Opus 4.7 (claude-opus-4-7) is Anthropic’s most capable GA model. It supports a 1M token context window, 128K max output, adaptive thinking, a new xhigh effort level, task budgets, high-res vision up to 3.75 MP, and tool use. This guide shows how to set up the API and implement the main capabilities in Python, TypeScript, and cURL.


Introduction

Anthropic released Claude Opus 4.7 on April 16, 2026. It is the most powerful model in the Claude family and is designed for complex reasoning, autonomous agents, and vision-heavy workflows.

If you already use the Claude API, the Messages API will look familiar. The main code changes are:

  • Extended thinking budgets (budget_tokens) are no longer supported.
  • Sampling parameters (temperature, top_p, and top_k) are no longer supported.
  • Thinking is adaptive-only and off by default.
  • display: "summarized" is required if you want thinking content returned.
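Taken together, the migration amounts to a payload change like the following. Both request bodies are illustrative; the old-style thinking and sampling fields are shown only for contrast with what Opus 4.7 accepts.

```python
# Opus 4.6-style request body (fields marked "removed" are rejected by Opus 4.7):
old_payload = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "temperature": 0.7,                                       # removed in 4.7
    "thinking": {"type": "enabled", "budget_tokens": 8000},   # removed in 4.7
    "messages": [{"role": "user", "content": "..."}],
}

# Opus 4.7-style equivalent:
new_payload = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "thinking": {"type": "adaptive", "display": "summarized"},
    "messages": [{"role": "user", "content": "..."}],
}
```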

This guide walks through API setup, authentication, basic requests, adaptive thinking, high-resolution images, tool use, task budgets, streaming, prompt caching, and multi-turn conversations. It also shows how to test these payloads with Apidog.

Getting Started

1. Get your API key

Create an API key from Anthropic Console:

  • Sign up at console.anthropic.com
  • Open API Keys
  • Click Create Key
  • Copy the key

Store it as an environment variable:

export ANTHROPIC_API_KEY="sk-ant-your-key-here"

2. Install the SDK

Python:

pip install anthropic

TypeScript / Node.js:

npm install @anthropic-ai/sdk

3. Use the Messages API endpoint

All requests go to:

POST https://api.anthropic.com/v1/messages

Required headers:

x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01
content-type: application/json

Basic Text Request

Use this as your smoke test before adding tools, images, streaming, or thinking.

Python

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Explain how HTTP/2 server push works in three sentences."
        }
    ]
)

print(message.content[0].text)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Explain how HTTP/2 server push works in three sentences.",
    },
  ],
});

console.log(message.content[0].text);

cURL

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Explain how HTTP/2 server push works in three sentences."
      }
    ]
  }'

Adaptive Thinking

Adaptive thinking lets Claude allocate reasoning tokens dynamically based on task complexity.

It is not enabled by default. Add a thinking object to the request:

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16384,
    thinking={
        "type": "adaptive",
        "display": "summarized"
    },
    messages=[
        {
            "role": "user",
            "content": """Analyze this algorithm's time complexity and suggest optimizations:

def find_pairs(arr, target):
    result = []
    for i in range(len(arr)):
        for j in range(i+1, len(arr)):
            if arr[i] + arr[j] == target:
                result.append((arr[i], arr[j]))
    return result"""
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Response:", block.text)

Key implementation notes:

  • Use thinking={"type": "adaptive"} to enable adaptive thinking.
  • Do not set budget_tokens; it returns a 400 error.
  • Use display: "summarized" if you want thinking content in the response.
  • If display is omitted, thinking is not returned.
  • Use output_config.effort to influence reasoning depth.

Control reasoning depth with effort

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16384,
    thinking={"type": "adaptive"},
    output_config={"effort": "xhigh"},
    messages=[
        {
            "role": "user",
            "content": "Review this pull request for security vulnerabilities..."
        }
    ]
)

Supported effort levels:

| Level | Best for |
|--------|----------|
| xhigh | Coding, agentic tasks, complex reasoning |
| high | Most intelligence-sensitive work |
| medium | Balanced speed vs. quality |
| low | Simple tasks and fast responses |
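One way to apply the table in code is a small helper that picks an effort level from a broad task category. The category names here are this guide's own labels for illustration, not an API concept:

```python
# Map task categories to effort levels, following the table above.
# The category names are illustrative labels, not API parameters.
EFFORT_BY_TASK = {
    "coding": "xhigh",
    "agentic": "xhigh",
    "analysis": "high",
    "summarization": "medium",
    "classification": "low",
}

def pick_effort(task_type: str) -> str:
    """Return an effort level for output_config, defaulting to 'high'."""
    return EFFORT_BY_TASK.get(task_type, "high")
```

You would then pass the result as `output_config={"effort": pick_effort("coding")}`.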

High-Resolution Vision

Opus 4.7 accepts images up to 2,576 pixels on the long edge, or 3.75 megapixels. Coordinates map 1:1 to actual pixels.

Analyze an image from a URL

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/architecture-diagram.png"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this architecture diagram. List every service and the connections between them."
                }
            ]
        }
    ]
)

print(message.content[0].text)

Analyze a local image with base64

import base64
import anthropic

client = anthropic.Anthropic()

with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What UI bugs do you see in this screenshot?"
                }
            ]
        }
    ]
)

print(message.content[0].text)

Higher-resolution images consume more tokens. Resize images before sending them if you do not need full visual fidelity.
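If you want to stay under the limits mentioned above without guessing, you can compute the downscale target first. This is pure arithmetic; the 2,576 px long-edge and 3.75 MP figures come from this guide, and you would apply the result with an image library such as Pillow before base64-encoding:

```python
import math

MAX_LONG_EDGE = 2576        # pixels on the longest side
MAX_PIXELS = 3_750_000      # 3.75 megapixels

def target_size(width: int, height: int) -> tuple[int, int]:
    """Return (width, height) scaled to fit both limits; unchanged if already within them."""
    scale = min(
        1.0,
        MAX_LONG_EDGE / max(width, height),       # long-edge constraint
        math.sqrt(MAX_PIXELS / (width * height)), # total-pixel constraint
    )
    return (int(width * scale), int(height * scale))
```

For example, with Pillow: `img.resize(target_size(*img.size))`.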

Tool Use

Tool use lets Claude call functions you define. Opus 4.7 tends to use fewer tool calls by default and may prefer reasoning. Increase effort when you want stronger tool-use behavior.

Define a tool

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Returns temperature, conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["city"]
        }
    }
]

Run a tool-use request

import json
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Returns temperature, conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["city"]
        }
    }
]

messages = [
    {
        "role": "user",
        "content": "What's the weather like in Tokyo right now?"
    }
]

# First call: Claude requests a tool
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

if response.stop_reason == "tool_use":
    messages.append({
        "role": "assistant",
        "content": response.content
    })

    tool_results = []

    for block in response.content:
        if block.type == "tool_use":
            # Execute your real function here.
            result = {
                "temperature": 22,
                "conditions": "Partly cloudy",
                "humidity": 65
            }

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result)
            })

    messages.append({
        "role": "user",
        "content": tool_results
    })

    # Second call: Claude uses the tool result
    final_response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    print(final_response.content[0].text)

Agentic Loop Pattern

For autonomous agents, keep calling the model until it stops requesting tools.

def run_agent(system_prompt: str, tools: list, user_message: str) -> str:
    messages = [
        {
            "role": "user",
            "content": user_message
        }
    ]

    while True:
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=16384,
            system=system_prompt,
            tools=tools,
            thinking={"type": "adaptive"},
            output_config={"effort": "xhigh"},
            messages=messages,
        )

        messages.append({
            "role": "assistant",
            "content": response.content
        })

        if response.stop_reason != "tool_use":
            return "".join(
                block.text
                for block in response.content
                if hasattr(block, "text")
            )

        tool_results = []

        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)

                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        messages.append({
            "role": "user",
            "content": tool_results
        })
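The loop above assumes an `execute_tool` helper. A minimal dispatch-table sketch follows; the `get_weather` handler returns canned data purely for illustration, and in practice each handler would call your real service:

```python
import json

def get_weather(city: str, units: str = "celsius") -> dict:
    # Placeholder: call your real weather service here.
    return {"city": city, "temperature": 22, "conditions": "Partly cloudy", "units": units}

# Map tool names (as defined in your tools list) to Python handlers.
TOOL_HANDLERS = {
    "get_weather": get_weather,
}

def execute_tool(name: str, tool_input: dict) -> str:
    """Dispatch a tool_use block to its handler; tool_result content is a string."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(handler(**tool_input))
```

Returning an error payload for unknown tools (rather than raising) lets the model see the failure and recover within the loop.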

Task Budgets (Beta)

Task budgets give Claude a token allowance for an entire agentic loop. The model sees a running countdown and can wrap up work as the budget is consumed.

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    output_config={
        "effort": "high",
        "task_budget": {
            "type": "tokens",
            "total": 128000
        },
    },
    messages=[
        {
            "role": "user",
            "content": "Review the codebase and propose a refactor plan."
        }
    ],
    betas=["task-budgets-2026-03-13"],
)

Important constraints:

  • Minimum budget: 20,000 tokens
  • Advisory, not a hard cap: Claude may overshoot it
  • Different from max_tokens, which is a hard ceiling the model cannot see
  • Requires the task-budgets-2026-03-13 beta (the betas parameter in the SDK, or the anthropic-beta header)

Streaming Responses

Use streaming for chat UIs, CLIs, and long-running responses.

Python

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to parse CSV files with error handling."
        }
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

TypeScript

const stream = await client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: [
    {
      role: "user",
      content: "Write a Python function to parse CSV files with error handling.",
    },
  ],
});

for await (const event of stream) {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    process.stdout.write(event.delta.text);
  }
}

If adaptive thinking is enabled with display: "summarized", thinking blocks stream before the final text response.

If display is omitted, users may see a pause while the model reasons, followed by the text response.

Prompt Caching

Use prompt caching for repeated context, such as long system prompts, codebase summaries, or documents.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a senior code reviewer. Review code for security vulnerabilities, performance issues, and best practices violations...",
            "cache_control": {
                "type": "ephemeral"
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": """Review this function:

def process_user_input(data):
    return eval(data)"""
        }
    ]
)

Cache pricing for Opus 4.7:

| Operation | Cost |
|-----------|------|
| 5-minute cache write | $6.25 / MTok (1.25x base) |
| 1-hour cache write | $10 / MTok (2x base) |
| Cache read (hit) | $0.50 / MTok (0.1x base) |

The economics favor caching quickly: each cache hit saves $4.50/MTok versus fresh input, so a single hit more than covers the $1.25/MTok premium of a 5-minute write, and two hits cover the $5/MTok premium of a 1-hour write.
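Spelled out, the break-even arithmetic looks like this (prices per million tokens from the table, with savings measured against the $5/MTok base input rate):

```python
BASE_INPUT = 5.00       # $/MTok, fresh input
CACHE_READ = 0.50       # $/MTok, cache hit
WRITE_5MIN = 6.25       # $/MTok, 5-minute cache write
WRITE_1HR = 10.00       # $/MTok, 1-hour cache write

saving_per_read = BASE_INPUT - CACHE_READ    # 4.50 saved per cached read
premium_5min = WRITE_5MIN - BASE_INPUT       # 1.25 extra paid at write time
premium_1hr = WRITE_1HR - BASE_INPUT         # 5.00 extra paid at write time

# One read recoups the 5-minute premium; two reads recoup the 1-hour premium.
assert saving_per_read > premium_5min
assert 2 * saving_per_read > premium_1hr > saving_per_read
```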

Multi-Turn Conversations

Maintain conversation state by appending each user and assistant turn to the messages array.

messages = []

# Turn 1
messages.append({
    "role": "user",
    "content": "I need to build a REST API for a todo app."
})

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
)

messages.append({
    "role": "assistant",
    "content": response.content
})

# Turn 2
messages.append({
    "role": "user",
    "content": "Add authentication with JWT tokens."
})

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
)

print(response.content[0].text)
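For long sessions, the messages array grows with every turn, and so does your input-token bill. A simple trimming helper keeps the history bounded; the `keep_turns` cutoff is an arbitrary illustration, and a summarization step would preserve more context:

```python
def trim_history(messages: list[dict], keep_turns: int = 10) -> list[dict]:
    """Keep only the most recent conversation turns.

    One turn = one user message plus one assistant reply, so we keep
    the last 2 * keep_turns entries. The history must start with a
    user message, so drop a leading assistant entry if trimming
    created one.
    """
    trimmed = messages[-2 * keep_turns:]
    if trimmed and trimmed[0]["role"] == "assistant":
        trimmed = trimmed[1:]
    return trimmed
```

Call it before each `messages.create` once conversations get long.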

Testing Your API Calls with Apidog

Building a Claude API integration usually involves complex payloads: multi-turn messages, tool definitions, tool results, base64 images, beta headers, and streaming responses. Apidog can help you inspect and debug those requests visually.


Set up a Claude API request in Apidog:

  1. Create a new project in Apidog.
  2. Add the Claude Messages API endpoint.
  3. Store ANTHROPIC_API_KEY as an environment variable.
  4. Add the required headers:
    • x-api-key
    • anthropic-version
    • content-type
  5. Save reusable request bodies for basic text, vision, tool use, and streaming scenarios.

Test tool-use flows

Tool use usually requires at least two API calls:

  1. Send the initial user message.
  2. Inspect Claude’s tool_use block.
  3. Execute your function outside the model.
  4. Send a tool_result block back.
  5. Read Claude’s final answer.

Apidog lets you chain these requests so you can simulate the full loop and inspect each payload.

Compare models

Run the same request against claude-opus-4-6 and claude-opus-4-7 to compare:

  • Token counts
  • Response quality
  • Latency
  • Tool-use behavior

Apidog’s test runner makes these comparisons repeatable.

Validate schemas

Define JSON schemas for expected response formats and validate responses automatically. This helps catch regressions when you change prompts, tools, or model versions.

Common Errors and Fixes

| Error | Cause | Fix |
|-------|-------|-----|
| 400: thinking.budget_tokens not supported | Using extended thinking syntax | Switch to thinking: {"type": "adaptive"} |
| 400: temperature not supported | Setting unsupported sampling parameters | Remove temperature, top_p, and top_k |
| 400: max_tokens exceeded | New tokenizer produces more tokens | Increase max_tokens, up to 128,000 |
| 429: Rate limited | Too many requests | Implement exponential backoff and check your tier limits |
| Blank thinking blocks | Thinking display defaults to omitted | Add display: "summarized" to the thinking config |
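For the 429 case, a minimal exponential-backoff wrapper looks like this. The retry counts and delay bounds are illustrative; in practice you would pass the SDK's rate-limit exception (e.g. `anthropic.RateLimitError`) as the retryable type:

```python
import random
import time

def backoff_delays(retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing delays with full jitter, capped at `cap` seconds."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def with_retries(fn, retryable=(Exception,), retries=5):
    """Call fn(), retrying on the given exception types with backoff."""
    last_exc = None
    for delay in backoff_delays(retries):
        try:
            return fn()
        except retryable as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```

Usage would be `with_retries(lambda: client.messages.create(...), retryable=(anthropic.RateLimitError,))`.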

Pricing Reference

| Usage | Cost |
|-------|------|
| Input tokens | $5 / MTok |
| Output tokens | $25 / MTok |
| Batch input | $2.50 / MTok |
| Batch output | $12.50 / MTok |
| Cache reads | $0.50 / MTok |
| 5-minute cache writes | $6.25 / MTok |
| 1-hour cache writes | $10 / MTok |

Opus 4.7’s new tokenizer may use up to 35% more tokens for the same text compared to Opus 4.6. Use the /v1/messages/count_tokens endpoint to estimate costs before production deployment.
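A rough worst-case estimate that accounts for the tokenizer change can be computed directly; the 35% figure is the upper bound quoted above, and count_tokens remains the source of truth for real numbers:

```python
PRICE_INPUT = 5.00 / 1_000_000    # $ per input token
PRICE_OUTPUT = 25.00 / 1_000_000  # $ per output token
TOKENIZER_INFLATION = 1.35        # worst-case token growth vs. Opus 4.6

def estimate_cost(opus46_input_tokens: int, expected_output_tokens: int) -> float:
    """Worst-case dollar estimate for a request migrated from Opus 4.6."""
    input_tokens = opus46_input_tokens * TOKENIZER_INFLATION
    return input_tokens * PRICE_INPUT + expected_output_tokens * PRICE_OUTPUT
```

A prompt that was 1M input tokens on Opus 4.6 could cost up to $6.75 in input on Opus 4.7 rather than $5.00.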

Conclusion

Claude Opus 4.7 keeps the familiar Messages API shape but changes how reasoning is configured. Remove extended thinking budgets and unsupported sampling parameters, then use adaptive thinking, effort, task budgets, high-resolution vision, and tool use where they fit your workflow.

A practical implementation path:

  1. Start with a basic text request.
  2. Add adaptive thinking for complex reasoning.
  3. Add tool use for external actions and data retrieval.
  4. Use task budgets for long-running agentic loops.
  5. Stream responses for better UX.
  6. Use prompt caching for repeated context.
  7. Test requests, tool loops, and schemas with Apidog before shipping.
