Claude Opus 4.6
Claude Opus 4.6 is Anthropic's most intelligent model, improving on its predecessor's coding, reasoning, and agentic capabilities. It plans more carefully, sustains agentic tasks for longer, operates more reliably in larger codebases, and is stronger at code review and debugging. In a first for Opus-class models, it offers a 1M-token context window (beta) and up to 128K output tokens. Opus 4.6 achieves state-of-the-art results on Terminal-Bench 2.0 (65.4%), leads on Humanity's Last Exam, BrowseComp, and GDPval-AA, and scores 76% on the 8-needle 1M-token variant of MRCR v2 (vs. 18.5% for Sonnet 4.5), a qualitative leap in long-context performance. It also introduces adaptive thinking and four effort levels (low, medium, high, max) for fine-grained control over intelligence, speed, and cost.
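The snippet below is a minimal sketch of calling Opus 4.6 through the Anthropic Messages API with an explicit effort setting. The model ID is the variant listed under Advanced Specifications, and the `effort` request field is an assumed name for the effort control described above, so check the official API reference for the exact parameter.

```python
# Minimal sketch: one Messages API call to Claude Opus 4.6 with an explicit
# effort setting. Assumptions: the model ID is the variant listed under
# Advanced Specifications, and effort control is exposed as a request field
# named "effort" (hypothetical -- confirm the exact name in the API docs).
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-6-20260205",   # variant ID from this card
    max_tokens=4096,                    # well under the 128K output ceiling
    messages=[{"role": "user", "content": "Review this function for bugs: ..."}],
    extra_body={"effort": "medium"},    # assumed field name for effort control
)
print(response.content[0].text)
```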
Specifications
- Parameters: Unreleased
- Architecture: Decoder-only Transformer
- License: Proprietary
- Context Window: 200,000 tokens (1M tokens in beta; see the sketch after this list)
- Max Output: 128,000 tokens
- Training Data Cutoff: Aug 2025
- Type: Multimodal
- Modalities: text, image
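The context-window entry above references the following sketch, which shows one way the 1M-token beta might be enabled. The beta flag is the one documented for earlier 1M-context betas and may differ for Opus 4.6, and the input file is a stand-in for any long document.

```python
# Sketch: opting into the 1M-token context window (beta) with a large output
# budget. The beta flag below is the one documented for earlier 1M-context
# betas ("context-1m-2025-08-07"); whether Opus 4.6 reuses it is an assumption.
import anthropic

client = anthropic.Anthropic()

# Hypothetical long input -- any document that exceeds the 200K default window.
with open("large_codebase_dump.txt") as f:
    long_document = f.read()

response = client.beta.messages.create(
    model="claude-opus-4-6-20260205",
    betas=["context-1m-2025-08-07"],    # assumed beta flag for the 1M window
    max_tokens=32_000,                  # max output can go up to 128K tokens
    messages=[{
        "role": "user",
        "content": f"{long_document}\n\nSummarize the architecture of this codebase.",
    }],
)
print(response.content[0].text)
```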
Benchmark Scores
- Terminal-Bench: evaluates models on their ability to use terminal commands to solve system administration tasks.
- BrowseComp: measures browsing agents' ability to navigate the web and find hard-to-find, entangled information.
- Humanity's Last Exam: a challenging benchmark of novel problems designed to test the limits of AI capabilities.
- GPQA (Graduate-Level Google-Proof Q&A): evaluates advanced reasoning on graduate-level science questions.
- MMMU: a massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI, with 11.5K college-level questions.
- MMLU (Massive Multitask Language Understanding): tests knowledge across 57 subjects, including mathematics, history, and law.
- MRCR (Multi-Round Coreference Resolution): part of the Michelangelo benchmark suite; evaluates long-context recall across multi-turn conversations.
Advanced Specifications
- Model Family: Claude
- Finetuned From: Claude 4
- API Access: Available
- Chat Interface: Available
- Multilingual Support: Yes
- Variants: claude-opus-4-6-20260205
Capabilities & Limitations
- Capabilities: deep reasoning, agentic coding, complex coding, research, adaptive thinking, effort control, context compaction, computer use, tool use, parallel tool execution, extended thinking, vision, multilingual, long context
- Known Limitations: higher latency than Sonnet and Haiku; may overthink simpler tasks at the default high effort; knowledge cutoff limits real-time data without tools
- Notable Use Cases: agentic coding, deep research, complex system architecture, multi-step agentic workflows, financial analysis, cybersecurity, long-context document analysis, code review and debugging
- Function Calling Support: Yes
- Tool Use Support: Yes (see the sketch below)
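Because function calling and tool use are listed as supported, the sketch below illustrates a tool-use request using Anthropic's documented tools/input_schema format; the get_current_time tool is a hypothetical example, not part of this card.

```python
# Sketch of a tool-use request, since the card lists function calling and tool
# use as supported. The tools/input_schema shape follows Anthropic's documented
# format; the get_current_time tool itself is a made-up example.
import anthropic

client = anthropic.Anthropic()

tools = [{
    "name": "get_current_time",          # hypothetical tool for illustration
    "description": "Return the current time in a given IANA timezone.",
    "input_schema": {
        "type": "object",
        "properties": {"timezone": {"type": "string"}},
        "required": ["timezone"],
    },
}]

response = client.messages.create(
    model="claude-opus-4-6-20260205",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What time is it in Tokyo?"}],
)

# The model does not execute anything itself: if it decides to call the tool,
# the response carries a tool_use block whose input the caller runs, then
# returns as a tool_result message in a follow-up request.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```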