🤖 Claude 4 Released: Coding Capabilities Significantly Improved, Better Performance in Agentic Tasks

Anthropic has released Claude 4 in two versions, Opus 4 and Sonnet 4, with a focus on stronger coding ability and better handling of agentic tasks.

* Coding Ability: Opus 4 scored 72.5% on SWE-bench and 43.2% on Terminal-bench, and is considered the best coding model currently available. GitHub uses Sonnet 4 as the base model for the new coding agent in GitHub Copilot.
* Agentic Tasks: Opus 4 has stronger tool-use capabilities, stays focused during long-running tasks, and can create and maintain "memory files" to store key information, which improves its awareness and coherence over long-term tasks.
* Long-context Processing: Opus 4 has a 200K-token context window and can handle large amounts of information.
* Pricing: Opus 4 costs $15 per million input tokens and $75 per million output tokens; Sonnet 4 costs $3 per million input tokens and $15 per million output tokens, consistent with previous models.
* Others: Claude 4 introduces a thinking-summary feature that uses a smaller model to condense lengthy thinking processes; users who need the complete chain of thought can contact sales for access to developer mode. The knowledge cutoff date is March 2025.
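
To make the pricing figures concrete, here is a minimal Python sketch that estimates the cost of a single request from the per-million-token prices listed above. The dictionary keys (`opus-4`, `sonnet-4`), the `estimate_cost` helper, and the token counts are illustrative assumptions, not official API identifiers or measured usage.

```python
# Prices in USD per million tokens, copied from the pricing bullet above.
# The keys are hypothetical labels, not official API model identifiers.
PRICES_USD_PER_MILLION = {
    "opus-4": {"input": 15.0, "output": 75.0},
    "sonnet-4": {"input": 3.0, "output": 15.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    price = PRICES_USD_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 200K-token context and a 10K-token reply.
    for model in PRICES_USD_PER_MILLION:
        cost = estimate_cost(model, 200_000, 10_000)
        print(f"{model}: ${cost:.2f}")
```

Under these assumed token counts, the same request works out to $3.75 on Opus 4 and $0.75 on Sonnet 4, a fivefold difference that follows directly from the listed prices.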

(HackerNews)

via Teahouse - Telegram Channel