How were the two models tested for SEO research?
**Evaluation set:** 63 tasks across nine SEO workflows (7 each), drawn from real briefs three teams shipped Feb-May 2026 — a B2B SaaS in HR tech, a DTC outdoor brand, and an affiliate publisher in personal finance. Keywords, intents, and competitor sets were locked before either model saw the prompt.
**Models:** Claude Opus 4.7 via Anthropic API and Gemini 2.5 Pro via Google AI Studio API with Search grounding. For Claude, raw SERP HTML or Ahrefs/SEMrush exports were pasted; Gemini fetched live.
**Scoring:** Three SEO leads graded on a 6-point rubric — intent accuracy, structural quality, source verifiability, originality, brief usability, helpful-content alignment. Krippendorff's alpha 0.71. Every cited statistic was verified against the primary source.