What each vendor's fine-tuning offering actually is
These three offerings start from very different problem definitions, and the API surface flows from those starting points. Understanding the underlying philosophy is the fastest way to know which vendor fits your situation before you spend a dollar.
**OpenAI fine-tuning** (https://platform.openai.com/docs/guides/fine-tuning) is the most mature hosted fine-tuning API in the market. It supports three different training algorithms: supervised fine-tuning (SFT) for the standard "here are input-output pairs, learn the pattern" workflow; direct preference optimization (DPO) for offline preference learning where you provide chosen-vs-rejected response pairs; and reinforcement fine-tuning (RFT) where you provide a programmatic grader function and the model is trained against that reward signal. RFT is the differentiator no one else matches in 2026 — for tasks like code completion, math problem solving, or any case where you can score outputs cheaply, RFT can extract quality gains that SFT cannot. The API surface is the standard OpenAI REST pattern: upload a jsonl file, create a fine-tune job, monitor status, then call the resulting fine-tuned model by its ID. The result is auto-deployed — no separate provisioning step.
**Anthropic fine-tuning** (https://docs.anthropic.com/en/docs/build-with-claude/fine-tuning) takes the opposite philosophy: there is no direct Anthropic API for fine-tuning. Claude fine-tuning is exclusively available through Amazon Bedrock and Google Vertex AI partner channels. The reasoning Anthropic has given publicly is that fine-tuning Claude requires custom infrastructure, data isolation, and serving guarantees that Anthropic prefers to delegate to cloud partners with mature enterprise security postures. Practically, this means your fine-tuning workflow looks like a Bedrock or Vertex job — you authenticate against the cloud provider, upload jsonl in Anthropic's chat schema, and the fine-tuned model lives on provisioned Bedrock throughput or a Vertex endpoint. The cost of that provisioned throughput is significant and not negligible compared to the training itself.
**Google Vertex AI fine-tuning** (https://cloud.google.com/vertex-ai/generative-ai/docs/models/tune-gemini-overview) is the most generous offering on the free-tier dimension, and the deepest in terms of eval/deployment tooling. Gemini 2.5 Flash supervised fine-tuning includes a substantial monthly free training token allowance — enough that small experiments can run at zero training cost — and the integration with Vertex AI's evaluation pipelines (BLEU, ROUGE, custom metrics) means you can run automated quality checks against held-out sets without writing infrastructure. The catch is that Vertex AI is locked to Google Cloud — you cannot serve a Gemini fine-tune outside Vertex, and the per-hour endpoint deployment cost is a recurring spend even when traffic is low.
**The positioning in one sentence:** OpenAI is for teams who want the deepest method surface (SFT + DPO + RFT) and the lowest serving friction. Anthropic is for teams already on Bedrock/Vertex who specifically need Claude's reasoning quality on their domain data. Google is for teams who want to experiment at zero or near-zero cost on Vertex and use Gemini for production.