By The DDH Team · Digital Dashboard Hub

How to Switch from OpenAI to Claude

A complete engineering migration guide — SDK swap, request shape differences, system prompt handling, tool use, streaming, prompt caching, and a hard cost comparison between GPT-5 and Claude Opus 4.8 / Sonnet 4.6. Most teams finish the core migration in an afternoon.

By DDH Research Team at Digital Dashboard Hub·Updated June 27, 2026

Browse all 40+ free prompt tools

Switching from OpenAI to Claude is one of the most common infrastructure moves teams are making in 2026 — driven by Claude's extended context window, superior instruction-following on long structured tasks, and increasingly competitive pricing at every tier. The migration is not a drop-in replacement: the Anthropic API has a different request shape, different system prompt handling, different tool-use syntax, and different streaming response format. You need to touch every integration point.

The good news: the conceptual model maps cleanly. OpenAI's `messages` array becomes Anthropic's `messages` array. OpenAI's `functions` / `tools` become Anthropic's `tools`. OpenAI's `system` parameter (inside the message list as a role='system' message) becomes a top-level `system` parameter in Anthropic. If you track those three transformations, you'll catch 90% of the migration surface. The remaining 10% is prompt caching syntax, streaming event names, and `max_tokens` (required on Anthropic, optional on OpenAI).

Before you migrate, benchmark your current per-month token cost. Paste your volume into our AI Prompt Cost Calculator to get a side-by-side line-item comparison across GPT-5, Claude Opus 4.8, Claude Sonnet 4.6, and Claude Haiku 4.5. Then read the cost-delta section of this guide to understand where Claude is cheaper and where it isn't. Related context: Anthropic vs OpenAI Pricing 2026 and Anthropic Claude Pricing 2026.

Digital Dashboard Hub

Writing good prompts for ONE AI is hard. Writing them for GPT-5, Claude, Gemini, Perplexity, Midjourney and 6 more is a full-time job. DDH's AI Prompt Builder writes once, runs everywhere — locked to your niche, voice, and brand tone.

Free 14 days, no card — AICHAT30 = 30% off Pro. →

OpenAI → Anthropic API mapping at a glance

Feature	Concept	OpenAI
Python SDK	openai (pip install openai)	anthropic (pip install anthropic)
Client init	OpenAI(api_key=...)	Anthropic(api_key=...)
API call method	client.chat.completions.create()	client.messages.create()
System prompt	messages=[{role:'system',...}]	system='...' (top-level param)
max_tokens	Optional (has default)	Required — omitting raises error
Response text	response.choices[0].message.content	response.content[0].text
Tool use	tools=[{type:'function',...}]	tools=[{name, description, input_schema}]
Tool result	role='tool', tool_call_id=...	role='user', content=[{type:'tool_result',...}]
Streaming chunk event	chunk.choices[0].delta.content	event.type == 'content_block_delta'
Prompt caching	Automatic (OpenAI auto-caches)	Explicit cache_control breakpoints
Token usage field	usage.prompt_tokens / completion_tokens	usage.input_tokens / output_tokens

Sources: [Anthropic API docs](https://docs.anthropic.com/en/api/messages) and [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat).

Step 1: Swap the Python SDK

The first change is mechanical: uninstall the `openai` package and install the `anthropic` package. Both are available on PyPI and have compatible Python version requirements (3.8+). ```bash pip uninstall openai pip install anthropic ``` Then replace the import and client initialization. OpenAI uses a single `OpenAI()` client; Anthropic uses a single `Anthropic()` client. Both read from the `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` environment variable respectively. Before (OpenAI): ```python from openai import OpenAI client = OpenAI(api_key=os.environ['OPENAI_API_KEY']) ``` After (Anthropic): ```python import anthropic client = anthropic.Anthropic(api_key=os.environ['ANTHROPIC_API_KEY']) ``` If you have a wrapper class or factory function that instantiates the client, change it there and the rest of your code stays clean. The Anthropic SDK is documented at docs.anthropic.com.

Step 2: Remap the API call and request shape

This is the core migration change. OpenAI uses `client.chat.completions.create()` with a `messages` list that can include a `role='system'` entry. Anthropic uses `client.messages.create()` where the system prompt is a separate top-level parameter and `max_tokens` is required (not optional). Before (OpenAI): ```python response = client.chat.completions.create( model='gpt-5', messages=[ {'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Summarize this text: ...'} ], temperature=0.3 ) text = response.choices[0].message.content ``` After (Anthropic): ```python response = client.messages.create( model='claude-sonnet-4-6', max_tokens=1024, # REQUIRED on Anthropic system='You are a helpful assistant.', messages=[ {'role': 'user', 'content': 'Summarize this text: ...'} ], temperature=0.3 ) text = response.content[0].text ``` Key differences to note: (1) `system` is extracted from `messages` and passed as its own param; (2) `max_tokens` is mandatory — Anthropic raises `anthropic.BadRequestError` if you omit it; (3) the response text lives at `response.content[0].text`, not `response.choices[0].message.content`. Token usage is at `response.usage.input_tokens` and `response.usage.output_tokens` instead of `usage.prompt_tokens` / `usage.completion_tokens`.

If you have a large codebase with many call sites, write a thin adapter function first. Map the old OpenAI call signature to the new Anthropic call signature in one place, verify it works, then do a find-replace across callers. ```python def chat(system: str, user: str, model: str = 'claude-sonnet-4-6', max_tokens: int = 1024) -> str: response = client.messages.create( model=model, max_tokens=max_tokens, system=system, messages=[{'role': 'user', 'content': user}] ) return response.content[0].text ``` This wrapper lets you keep the rest of your codebase unchanged while you validate output quality on Anthropic before doing a full rollout.

Step 3: Migrate multi-turn conversation history

Multi-turn conversations work similarly on both APIs: you maintain a `messages` list and append each turn. The difference is that Anthropic's `messages` list must never contain a `role='system'` entry — system prompt goes in the top-level `system` param only. If your existing code builds a messages list by prepending a system message, strip it out during migration. Before (OpenAI multi-turn): ```python history = [ {'role': 'system', 'content': SYSTEM_PROMPT}, {'role': 'user', 'content': 'Hello'}, {'role': 'assistant', 'content': 'Hi, how can I help?'}, {'role': 'user', 'content': 'What is 2+2?'} ] response = client.chat.completions.create(model='gpt-5', messages=history) ``` After (Anthropic multi-turn): ```python history = [ {'role': 'user', 'content': 'Hello'}, {'role': 'assistant', 'content': 'Hi, how can I help?'}, {'role': 'user', 'content': 'What is 2+2?'} ] response = client.messages.create( model='claude-sonnet-4-6', max_tokens=512, system=SYSTEM_PROMPT, messages=history ) ``` The assistant turn format is also slightly different when you need to include tool results — covered in Step 5. For plain text turns, the structure above is all you need. One additional note: Anthropic enforces that `messages` must alternate between `user` and `assistant` roles. If you ever have two consecutive `user` messages or two consecutive `assistant` messages, the API rejects the request. Make sure your turn-building logic enforces alternation.

Step 4: Migrate streaming responses

Streaming with Anthropic uses the same `stream=True` (or context manager) pattern as OpenAI, but the event structure is different. OpenAI streams `ChatCompletionChunk` objects with `chunk.choices[0].delta.content`. Anthropic streams typed events where you check `event.type` and then access the delta. Before (OpenAI streaming): ```python stream = client.chat.completions.create( model='gpt-5', messages=[{'role': 'user', 'content': prompt}], stream=True ) for chunk in stream: delta = chunk.choices[0].delta.content if delta: print(delta, end='', flush=True) ``` After (Anthropic streaming): ```python with client.messages.stream( model='claude-sonnet-4-6', max_tokens=1024, messages=[{'role': 'user', 'content': prompt}] ) as stream: for text in stream.text_stream: print(text, end='', flush=True) ``` The Anthropic SDK's `.text_stream` iterator is the cleanest path — it filters the raw SSE events down to just the text deltas. If you need access to the raw events (for token counts mid-stream, tool call events, etc.), use `stream` as an iterator directly and check `event.type`. The types are `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, and `message_delta`. The text is at `event.delta.text` when `event.type == 'content_block_delta'`. Full streaming docs at docs.anthropic.com/en/api/messages-streaming.

Step 5: Migrate tool use / function calling

Tool use is where the API shape diverges most significantly. OpenAI's `functions` / `tools` format uses a JSON Schema wrapped in a `function` type object. Anthropic's `tools` format uses `name`, `description`, and `input_schema` directly — no extra `function` wrapper level. The response parsing also differs: OpenAI returns `tool_calls` on the assistant message; Anthropic returns `content` blocks of `type='tool_use'`. Before (OpenAI tool definition): ```python tools = [{ 'type': 'function', 'function': { 'name': 'get_weather', 'description': 'Get current weather for a city', 'parameters': { 'type': 'object', 'properties': { 'city': {'type': 'string', 'description': 'City name'} }, 'required': ['city'] } } }] response = client.chat.completions.create( model='gpt-5', messages=msgs, tools=tools ) # Parsing the tool call: tool_call = response.choices[0].message.tool_calls[0] func_name = tool_call.function.name func_args = json.loads(tool_call.function.arguments) ``` After (Anthropic tool definition): ```python tools = [{ 'name': 'get_weather', 'description': 'Get current weather for a city', 'input_schema': { 'type': 'object', 'properties': { 'city': {'type': 'string', 'description': 'City name'} }, 'required': ['city'] } }] response = client.messages.create( model='claude-sonnet-4-6', max_tokens=1024, messages=msgs, tools=tools ) # Parsing the tool call: tool_block = next(b for b in response.content if b.type == 'tool_use') func_name = tool_block.name func_args = tool_block.input # already a dict, no json.loads needed ```

Returning tool results also differs. OpenAI expects a `role='tool'` message with `tool_call_id`. Anthropic expects a `role='user'` message containing a `content` block of `type='tool_result'` with the `tool_use_id`. Before (OpenAI tool result): ```python msgs.append(response.choices[0].message) # append assistant turn msgs.append({ 'role': 'tool', 'tool_call_id': tool_call.id, 'content': json.dumps({'temperature': '72F', 'condition': 'sunny'}) }) ``` After (Anthropic tool result): ```python msgs.append({'role': 'assistant', 'content': response.content}) msgs.append({ 'role': 'user', 'content': [{ 'type': 'tool_result', 'tool_use_id': tool_block.id, 'content': json.dumps({'temperature': '72F', 'condition': 'sunny'}) }] }) ``` The full tool use flow is documented at docs.anthropic.com/en/docs/tool-use. The pattern is consistent across all Claude models including Haiku 4.5, Sonnet 4.6, and Opus 4.8.

Step 6: Enable prompt caching on Anthropic (explicit vs automatic)

This is one of the most impactful differences between the two providers' implementations. OpenAI caches prompt prefixes automatically — you do nothing and the cache applies. Anthropic requires you to explicitly mark cache breakpoints using `cache_control` blocks. This is more work upfront but gives you precise control over what gets cached. To cache a system prompt on Anthropic: ```python response = client.messages.create( model='claude-sonnet-4-6', max_tokens=1024, system=[ { 'type': 'text', 'text': LONG_SYSTEM_PROMPT, 'cache_control': {'type': 'ephemeral'} } ], messages=msgs ) ``` The `cache_control: {type: 'ephemeral'}` marker tells Anthropic to cache everything up to that breakpoint. Cache writes cost 125% of standard input rate; cache reads cost 10% of standard input rate. The cache lives for up to 5 minutes by default (extendable to 1 hour on supported models). If you call the same model with the same prefix more than twice per session, caching pays for itself. For agentic loops that call the model 10-50 times with the same system prompt, caching alone can cut input costs 80-90%.

You can also cache document blocks in the `messages` array — useful for retrieval-augmented generation where the same large document chunk appears in every call: ```python msgs = [{ 'role': 'user', 'content': [ { 'type': 'text', 'text': LARGE_DOCUMENT_TEXT, 'cache_control': {'type': 'ephemeral'} }, { 'type': 'text', 'text': 'Summarize the key risks from the document above.' } ] }] ``` Full caching documentation at docs.anthropic.com/en/docs/build-with-claude/prompt-caching. Compare the explicit-cache model against OpenAI's automatic caching in our Anthropic vs OpenAI Pricing 2026 breakdown.

Step 7: Adapt your prompt style for Claude

Claude and GPT-5 respond differently to the same prompt phrasing. Claude was trained with a strong emphasis on following explicit instructions literally, respecting explicit output format constraints, and not adding unrequested content. This is generally a feature, but it means prompts written for GPT-5's more verbose, expansive default style sometimes produce shorter, more literal responses on Claude — and vice versa.

Three specific prompt adjustments that matter most in practice: First, Claude responds well to being told the exact output format in the system prompt (JSON schema, markdown heading structure, numbered list vs prose). GPT-5 infers format from examples; Claude follows explicit instructions. Second, Claude handles very long system prompts reliably — you can move your few-shot examples, tool definitions, and context documents all into the system prompt without degradation. OpenAI prompts sometimes need examples spread across the message history. Third, Claude's refusal pattern differs: it tends to give partial answers rather than hard refusals, and it responds well to explicit permissions in the system prompt ('You may discuss X' is more effective than trying to work around a refusal with clever phrasing).

Run your existing evaluation suite against Claude outputs before cutting production traffic over. For most text-generation tasks (summarization, classification, structured extraction, Q&A), Sonnet 4.6 matches GPT-5 quality closely enough that evals pass without prompt changes. For complex multi-step reasoning or code generation tasks, you may need to add 'think step by step' or explicit chain-of-thought instructions. Haiku 4.5 is an excellent replacement for GPT-5 mini on classification and short-answer tasks at significantly lower cost. See Claude vs ChatGPT vs Gemini 2026 for a full benchmark comparison.

Step 8: Handle error types and retry logic

Both SDKs expose typed exceptions, but the class names and retry semantics differ. OpenAI raises `openai.RateLimitError`, `openai.APIConnectionError`, and `openai.BadRequestError`. Anthropic raises `anthropic.RateLimitError`, `anthropic.APIConnectionError`, and `anthropic.BadRequestError` from the `anthropic` module namespace. Before (OpenAI error handling): ```python try: response = client.chat.completions.create(...) except openai.RateLimitError: time.sleep(60) except openai.BadRequestError as e: print('Bad request:', e) ``` After (Anthropic error handling): ```python try: response = client.messages.create(...) except anthropic.RateLimitError: time.sleep(60) except anthropic.BadRequestError as e: print('Bad request:', e) ``` The most common `BadRequestError` on Anthropic during migration is forgetting `max_tokens`. The second most common is passing a system message inside the `messages` list with `role='system'` — Anthropic rejects this and tells you to move it to the top-level `system` parameter. Both are easy fixes once you know what to look for. The Anthropic SDK also has automatic retry with exponential backoff built in — pass `max_retries=3` to the client constructor to enable it: ```python client = anthropic.Anthropic(api_key=API_KEY, max_retries=3) ```

Step 9: Cost delta — GPT-5 vs Claude Opus 4.8 / Sonnet 4.6 / Haiku 4.5

As of June 2026, the per-million-token pricing comparison across the tiers most commonly used in production (sourced from platform.openai.com/docs/pricing and anthropic.com/pricing) shows meaningful differences at both the top and bottom of the model tier ladder. GPT-5 (standard): ~$2.50/M input, ~$10/M output. GPT-5 mini: ~$0.40/M input, ~$1.60/M output. Claude Opus 4.8: $15/M input, $75/M output. Claude Sonnet 4.6: $3/M input, $15/M output. Claude Haiku 4.5: $0.80/M input, $4/M output. The pattern is clear: for mid-tier production workloads, Sonnet 4.6 sits at roughly 1.2x GPT-5's input cost and 1.5x its output cost — so it's slightly more expensive on raw token price. However, three factors shift the TCO calculation in Anthropic's favor for many teams: (1) Sonnet 4.6's 200k token context window means fewer chunking/retrieval calls per task; (2) explicit prompt caching can cut effective input cost to $0.30/M on cached portions; (3) Haiku 4.5 undercuts GPT-5 mini on input cost and is a strong replacement for classification and routing tasks.

For teams replacing GPT-5 with Claude Sonnet 4.6 on a typical mixed workload: expect raw token costs to increase 10-25% before caching, then drop 30-60% after enabling prompt caching on stable prefixes. The net effect for agentic workloads with large system prompts is typically a 20-40% cost reduction versus uncached GPT-5. For document processing workloads where each call passes a fresh document (no caching benefit), Sonnet 4.6 is 20-30% more expensive per call than GPT-5 at the same quality tier. Use our AI Prompt Cost Calculator to model your specific token distribution. Also see Anthropic to Google Migration Cost and Azure to OpenAI Direct Cost Analysis for cross-provider migration cost math.

Step 10: Model selection guide — which Claude model replaces which OpenAI model

The mapping is not one-to-one because the tier structures differ. Here is the pragmatic replacement guide based on task type and quality requirements: GPT-5 nano → Claude Haiku 4.5: best match for high-volume classification, intent detection, short-answer extraction, and structured JSON output tasks. Haiku 4.5 is fast (median response under 500ms for short outputs) and has the lowest per-token cost in the Claude family. Quality is comparable on narrow, well-defined tasks. GPT-5 mini → Claude Haiku 4.5 or Claude Sonnet 4.6 depending on task complexity: for simple Q&A and customer support replies, Haiku 4.5 is sufficient. For multi-step reasoning, code explanation, or tasks requiring nuanced instruction following, go to Sonnet 4.6. GPT-5 (standard) → Claude Sonnet 4.6: the primary production workhorse replacement. Comparable quality on most benchmark tasks, 200k context, strong tool use, reliable instruction following. This is the right default for teams that currently use GPT-5 standard on their main user-facing workloads. GPT-5 pro → Claude Opus 4.8: the frontier-model replacement for the most demanding tasks — complex code generation, strategic analysis, multi-document synthesis. Opus 4.8 is significantly more expensive than GPT-5 at $15/M input vs $2.50/M, so use it selectively. Most teams should use Opus only for tasks where Sonnet 4.6 demonstrably fails their eval criteria.

Step 11: Validate and cut over production traffic

Once your code migration is complete, run validation in three phases before cutting production traffic. Phase 1: unit test each integration point (plain chat, multi-turn, tool use, streaming) using fixtures. Verify the request shape is accepted and the response parses correctly. Phase 2: shadow mode — run both OpenAI and Anthropic in parallel on live traffic, compare outputs using your existing evaluation metrics, and log divergences. Most teams run shadow mode for 24-72 hours. Phase 3: gradual rollout — route 5% of traffic to Anthropic, monitor error rate and quality metrics, then increase to 20%, 50%, 100%.

Common issues found during shadow mode: (1) prompts that relied on GPT-5's verbose default output style producing shorter responses on Claude — fix by adding explicit length guidance to the system prompt; (2) tool call handling bugs where `tool_call_id` was hard-coded to OpenAI format — fix by using `tool_use_id` from the Anthropic response; (3) conversation history accumulation bugs where a `role='system'` message was being appended to history on each turn — fix by separating system prompt from message history. After full cutover, monitor your usage dashboard at console.anthropic.com and compare spend against your pre-migration OpenAI baseline. Set budget alerts at 110% of expected spend for the first two weeks.

Step 12: What to do if you need both OpenAI and Anthropic

Some teams migrate most workloads to Claude but keep GPT-5 for specific tasks where it performs better (image understanding via GPT-4o, specialized fine-tuned models, or existing OpenAI Assistants they haven't migrated). In this case, build a model router that abstracts both clients behind a single interface. ```python def call_llm(provider: str, model: str, system: str, user: str, max_tokens: int = 1024) -> str: if provider == 'anthropic': response = anthropic_client.messages.create( model=model, max_tokens=max_tokens, system=system, messages=[{'role': 'user', 'content': user}] ) return response.content[0].text elif provider == 'openai': response = openai_client.chat.completions.create( model=model, messages=[ {'role': 'system', 'content': system}, {'role': 'user', 'content': user} ] ) return response.choices[0].message.content else: raise ValueError(f'Unknown provider: {provider}') ``` This pattern lets you route by task type, run cost-aware fallback (try Haiku first, escalate to Sonnet on error or low confidence), and A/B test models independently without rewriting calling code. It's also the foundation for building a cost-optimized model tiering system — route cheap tasks to Haiku, expensive tasks to Sonnet, and only send the hardest tasks to Opus. For a full cost breakdown of running this kind of multi-provider setup, use our AI Prompt Cost Calculator to model the blended cost.

Digital Dashboard Hub

The prompt patterns above work 10x better when they live in a library you actually own — tunable to your niche, exportable to GPT-5, Claude, Gemini, Perplexity, Midjourney, Llama. Stop pasting across 6 tools.

Try DDH's AI Prompt Builder — free 14 days, no card. AICHAT30 = 30% off Pro. →

Continue your research on adjacent topics — calculators, rate limits, head-to-head comparisons, and guides.

Related prompt tools

AI Prompt Cost Calculator→Anthropic vs OpenAI Pricing 2026→Anthropic Claude Pricing 2026→Claude vs ChatGPT vs Gemini (2026)→Anthropic to Google Migration Cost→Azure to OpenAI Direct Cost Analysis→AI Cost Optimization Checklist 2026→How Much Does Claude Cost in 2026?→

Frequently Asked Questions

Is switching from OpenAI to Claude a drop-in replacement?

No. The API shape, SDK method names, system prompt placement, tool use format, and streaming event structure all differ. The migration is not a one-line change, but it is well-scoped — most of the work is in the request builder and response parser, not in your business logic. Most teams complete the core migration in one to two days.

What is the biggest mistake people make when migrating to Claude?

Forgetting that max_tokens is required on Anthropic (it's optional on OpenAI). The second most common mistake is passing the system prompt inside the messages list with role='system' — Anthropic rejects this with a clear error. Both are caught immediately on the first test call.

Will my OpenAI prompts work on Claude without changes?

Usually yes for simple prompts. Claude follows explicit instructions very literally, so prompts that rely on GPT-5's verbose default behavior may produce shorter, more direct responses. For most production tasks (summarization, classification, structured output), prompts work with minimal changes. For complex agentic or creative tasks, budget one to two days for prompt tuning.

Is Claude cheaper than GPT-5?

It depends on the tier and workload. Claude Haiku 4.5 is cheaper than GPT-5 mini. Claude Sonnet 4.6 is slightly more expensive than GPT-5 standard on raw token price but can be cheaper after prompt caching on repeated-context workloads. Claude Opus 4.8 is significantly more expensive than GPT-5 standard. Use our AI Prompt Cost Calculator to model your specific volume.

Does Claude support the same tool use / function calling features as OpenAI?

Yes, with different syntax. Anthropic's tool use supports parallel tool calls, tool choice (auto, any, or specific tool name), and structured input validation via JSON Schema — equivalent to OpenAI's function calling capabilities. The request and response format differs (see Step 5 in this guide), but the functional capabilities are equivalent for most production use cases.

How does prompt caching work differently on Claude vs OpenAI?

OpenAI caches automatically — you do nothing and common prompt prefixes get cached. Anthropic requires explicit cache_control breakpoints in your request. Explicit control is more work upfront but lets you guarantee exactly what gets cached and for how long. Cache reads cost 10% of standard input rate on both providers. Cache writes cost 125% on Anthropic (vs free on OpenAI's automatic approach). See Step 6 for the implementation pattern.

What Claude model should I use to replace GPT-5?

For most production workloads: Claude Sonnet 4.6. It matches GPT-5 quality on the majority of tasks with a 200k token context window and strong tool use. For high-volume classification and routing tasks, Haiku 4.5 is the better match for GPT-5 mini. Reserve Claude Opus 4.8 for tasks where Sonnet 4.6 fails your quality eval — Opus is substantially more expensive.

Can I run OpenAI and Anthropic side by side during migration?

Yes, and this is the recommended approach. Run both clients in shadow mode — send the same requests to both providers, compare outputs, and monitor quality and cost before cutting traffic over. See Step 11 for the phased rollout pattern. A model router function (Step 12) makes this easy to manage without duplicating calling code.

Know your cost before you cut over.

Paste your current monthly token volume into our [AI Prompt Cost Calculator](/blog/ai-prompt-cost-calculator) to get a side-by-side cost comparison across GPT-5, Claude Sonnet 4.6, Claude Haiku 4.5, and Claude Opus 4.8 before you migrate a single line of code. Then use DDH Pro's prompt library to grab prompts already tuned for the Claude model tier you're moving to.

Browse all prompt tools →