Anthropic introduced Claude Sonnet 4.5, its most capable model for coding, agents, and computer use, and shipped a major refresh to Claude Code: a native VS Code extension (beta), a revamped terminal, and checkpoints that make longer, riskier edits reversible. The update raises the ceiling on autonomous, multi-hour development work while tightening safety and governance.
What happened (the essentials)
Anthropic released Claude Sonnet 4.5 with a focus on agentic work, long-running tasks, and real-world coding. Alongside the model, the company rolled out major changes to Claude Code, its agentic development environment that plans, edits, and executes tasks across your repo. The combined pitch is simple: a more capable model plus safer autonomy tooling, so teams can hand off harder jobs without losing control.
Availability and rollout
Sonnet 4.5 is available now in the Claude apps and via API, and it’s also offered on Amazon Bedrock and Google Cloud Vertex AI for enterprises standardizing on those platforms. Anthropic positions 4.5 as a drop-in upgrade over Sonnet 4, keeping the same developer-facing endpoints and workflows. For day-to-day users, the Claude apps add code execution and file creation (docs, slides, sheets), making the model’s “computer use” skills visible without extra setup.
Claude Code: autonomy upgrades that matter
Claude Code now supports checkpointing: before each model-driven change, the system saves state so you can rewind instantly if an approach goes sideways. That safety net pairs with subagents (specialized helpers), hooks (automations like “run tests after edits”), and background tasks (keep servers running while the agent continues elsewhere). Together, these features make longer horizons practical—refactors, multi-file features, and investigations that used to require constant babysitting.
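To make the checkpoint-and-hook idea concrete, here is a minimal sketch of the pattern in plain Python. It is not Claude Code's implementation: the `CheckpointedWorkspace` class, its method names, and the in-memory "repo" are all hypothetical, chosen only to show how saving state before each edit makes a rewind cheap and how a post-edit hook (such as "run tests after edits") can fire automatically.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a repo: path -> file contents.
Workspace = Dict[str, str]


class CheckpointedWorkspace:
    """Toy workspace that snapshots its state before every edit."""

    def __init__(self, files: Workspace) -> None:
        self.files: Workspace = dict(files)
        self._checkpoints: List[Workspace] = []
        self._hooks: List[Callable[[Workspace], None]] = []

    def add_post_edit_hook(self, hook: Callable[[Workspace], None]) -> None:
        """Register a callback that runs after each edit (e.g. a test runner)."""
        self._hooks.append(hook)

    def apply_edit(self, path: str, new_content: str) -> int:
        """Save a checkpoint, apply the edit, run hooks; return the checkpoint id."""
        self._checkpoints.append(dict(self.files))  # state *before* the change
        self.files[path] = new_content
        for hook in self._hooks:
            hook(self.files)
        return len(self._checkpoints) - 1

    def rewind(self, checkpoint_id: int) -> None:
        """Restore the state saved before the given edit; drop later checkpoints."""
        self.files = dict(self._checkpoints[checkpoint_id])
        del self._checkpoints[checkpoint_id:]


# Usage: two edits, then roll back to the state before the first one.
ws = CheckpointedWorkspace({"app.py": "print('v1')"})
ws.add_post_edit_hook(lambda files: print("hook saw", sorted(files)))
first = ws.apply_edit("app.py", "print('v2')")
ws.apply_edit("util.py", "x = 1")
ws.rewind(first)
```

The design choice worth noting is that the snapshot is taken before the change rather than after, which is what lets a single rewind undo an entire line of work that went sideways.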
A native VS Code experience (plus a better terminal)
Anthropic shipped a VS Code extension (beta) that brings Claude Code into the IDE with inline diffs, a live plan view, and a dedicated sidebar for changes. If you prefer the command line, the terminal interface got a visual refresh, clearer status, and searchable prompt history so you can reuse or edit prior steps. The goal is comfort and clarity: see what the agent intends to do, watch it work, and roll back with a keystroke when needed.
Model capabilities and benchmarks (in plain terms)
Sonnet 4.5 improves on reasoning, coding, and computer use, with a 200K-token context window and support for long, step-by-step thinking when accuracy matters more than latency. On internal and public evaluations, Anthropic reports frontier results for software repair (SWE-bench Verified) and real-world computer tasks (OSWorld), both useful proxies for agents that must plan, click, type, and recover from errors. In practical terms, the model is better at using tools, managing memory, and staying on track over many steps.
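For a feel of what a 200K-token window means in practice, the sketch below checks whether a set of files would plausibly fit. The 200,000 figure comes from the text above; everything else is an assumption for illustration, including the crude four-characters-per-token heuristic (real tokenizers vary by content) and the hypothetical `reserved_for_output` allowance.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # context size quoted in the article
CHARS_PER_TOKEN = 4              # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate, rounded up so short strings still count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(documents: list[str], reserved_for_output: int = 8_000) -> bool:
    """Return True if the documents fit while leaving room for the reply."""
    budget = CONTEXT_WINDOW_TOKENS - reserved_for_output
    return sum(estimate_tokens(doc) for doc in documents) <= budget
```

An estimate like this is only good for early triage; for anything billing- or limit-sensitive, count tokens with the provider's own tooling.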
“30+ hours” and why duration matters
Anthropic says it has observed Sonnet 4.5 maintaining focus for 30+ hours on complex, multi-step tasks, a dramatic jump from prior generations. Duration isn’t a parlor trick: longer horizons let agents finish what they start without a human stepping in to stitch the pieces together. Paired with checkpoints, you can green-light bolder edits while keeping a reliable escape hatch if direction or quality slips.
Why this matters for teams (beyond a faster chat)
Most enterprise blockers are about trust and throughput, not just model IQ. By making the agent’s plan visible, edits reversible, and tool use observable, Claude Code lowers risk while freeing engineers to focus on reviews, architecture, and edge cases. For security and data teams, keeping work inside familiar IDEs, repos, and CI—and on clouds you already trust—reduces friction compared with shipping code through opaque third-party sandboxes.
What you can build right now
Teams are already using Sonnet 4.5 with Claude Code to ship features end-to-end, migrate frameworks, and patch vulnerabilities before exploitation. The Claude Agent SDK (the same building blocks that power Claude Code) lets you create your own agents with permissions, memory, and tool access tuned to your workflows. If you’re not ready for agents, start smaller: PR review copilots, test generators, and refactor assistants that run under checkpoints provide immediate, low-risk wins.
Pricing and access
Sonnet 4.5 is available on claude.ai (apps) and via the Claude API, with partner access on Bedrock and Vertex AI for managed enterprise deployments. Anthropic lists the same per-token pricing as Sonnet 4 ($3 per million input tokens and $15 per million output tokens) and supports prompt caching and batch processing to control costs at scale. The VS Code extension is free during beta; terminal updates ship as part of Claude Code’s regular releases.
Limitations and what to watch
Even with stronger planning, agents still need guardrails: permission prompts, code owners, CI gates, and human review save you from silent drift. Very large repos, flaky tests, or ambiguous specs can still derail autonomy, so expect to invest in project scaffolds and eval suites that keep agents honest. Over the coming weeks, watch for region expansions on cloud marketplaces, SDK tutorials for subagents, and case studies measuring cost-per-merged-PR rather than toy benchmarks.
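Cost-per-merged-PR is simple to compute once you log each agent run. The sketch below shows one reasonable definition, assuming you count the spend on every run (including abandoned ones) against only the PRs that actually merged. The `AgentRun` record and its fields are hypothetical, and the dollar figures in the usage lines are made up purely to show the arithmetic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentRun:
    """One agent attempt at a task (hypothetical record shape)."""
    cost_usd: float  # total model spend for the run
    merged: bool     # did the resulting PR get merged?


def cost_per_merged_pr(runs: List[AgentRun]) -> Optional[float]:
    """Total spend across all runs divided by the number of merged PRs.

    Failed runs still count toward cost, so wasted attempts raise the metric.
    Returns None when nothing merged, since the ratio is undefined.
    """
    merged = sum(1 for run in runs if run.merged)
    if merged == 0:
        return None
    return sum(run.cost_usd for run in runs) / merged


# Illustrative numbers only: $4.00 total spend over two merged PRs.
runs = [AgentRun(1.5, True), AgentRun(0.5, False), AgentRun(2.0, True)]
print(cost_per_merged_pr(runs))  # 2.0
```

Charging failed runs to the numerator is the point of the metric: a model that looks cheap per call but rarely lands a mergeable change will score worse than one that costs more per run and merges reliably.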
Bottom line
Claude Sonnet 4.5 plus the new Claude Code stack makes long-running, reversible, and inspectable autonomy feel practical, not experimental. If you’ve been waiting for an agent that plans, edits, tests, and recovers without constant hand-holding—while staying inside your dev tools—this is the strongest, most production-minded release Anthropic has shipped yet.