Anthropic Introduces Claude 3.7 Sonnet, Bringing “Extended Thinking Mode” and Visible Thought Process

news

Feb 25, 2025 8:58 AM

Read time ~ 5 minutes

UPDATED: Mar 4, 2025 3:21 PM

SAN FRANCISCO — AI safety and research company Anthropic has launched Claude 3.7 Sonnet, a new AI model designed to tackle more complex tasks with a feature called “extended thinking mode.” By letting users toggle deeper or faster reasoning, Anthropic says this update fundamentally changes how AI can be applied to both intricate challenges—like solving advanced mathematics or debugging code—and simpler inquiries like checking the day’s date.

Extended Thinking, Now with a “Thinking Budget”

The most prominent change in Claude 3.7 Sonnet is the ability to devote additional mental “effort” to difficult questions. Users can turn on extended thinking mode to allow Claude to spend more time generating an answer, or they can set a “thinking budget” to limit how many steps it takes before responding. Unlike a separate model or engine, this new approach simply instructs Claude to explore reasoning paths more deeply, improving accuracy and complexity in its answers.

A Visible Thought Process

In a move intended to increase user trust, Anthropic has made Claude’s raw thought process visible. This lets users see the AI’s intermediate reasoning as it arrives at a final answer, offering clearer insight into how the model breaks down problems. According to Anthropic’s alignment researchers, the ability to watch Claude’s “train of thought” can boost confidence and help detect inconsistencies or potential misalignments—though the company emphasizes it does not guarantee “faithfulness,” as the model’s hidden processes may still differ from its visible thoughts.

Action Scaling and “Agent” Abilities

Claude 3.7 Sonnet also introduces upgraded “agentic” features, allowing the model to perform iterative tasks like navigating a computer environment or engaging with complex user interfaces. In Anthropic’s OSWorld benchmark—an evaluation of AI’s multimodal skills—Claude 3.7 Sonnet improved its results over previous versions by handling more steps and using more computational power when needed.

One of the most eye-catching demonstrations is Claude’s ability to play classic Pokémon Red. Whereas prior versions of Claude got stuck early on, Claude 3.7 Sonnet leveraged its extended thinking and agent training to move through the game’s challenges, defeating three Gym Leaders. While playing Pokémon may be niche, Anthropic says the same long-context, open-ended approach can bring real-world advantages to tasks like data analysis, user support, and continuous process automation.

The performance of Claude 3.7 Sonnet versus its predecessor model on the OSWorld evaluation, testing multimodal computer use skills. “Pass @ 1”: the model has only a single attempt to solve a particular problem for it to count as having passed.

Parallel and Serial Compute

Anthropic’s research also highlights “serial test-time compute,” where Claude extends its reasoning steps before producing an output, and “parallel test-time compute,” where multiple thought processes run at once. In internal testing on GPQA—a set of tough science questions—running many parallel samples and using a learned scoring system significantly boosted accuracy. While parallel compute isn’t yet publicly available in the new release, Anthropic notes it could become a powerful method for future enhancements.

Experimental results from using parallel test-time compute scaling to improve Claude 3.7 Sonnet’s performance on the GPQA evaluation. The different lines refer to different methods of scoring the performance. “Majority @ N”: where multiple outputs are generated from a model for the same prompt with the majority vote taken as the final answer; “scoring model”: a separate model which is used to assess the performance of the model being evaluated; “pass @ N”: where models “pass” a test if any of a given number of attempts succeeds.

Safety and Alignment Measures

Despite its improved performance, Claude 3.7 Sonnet remains at Anthropic’s ASL-2 “Frontier Model” safety standard. Comprehensive red-teaming showed that while the model is more sophisticated, it still hit dead ends when tested on illicit tasks (e.g., creating dangerous weapons) and did not fully achieve them. Anthropic also introduced new security measures for Claude’s ability to view and interact with a user’s computer, making it more resistant to “prompt injection” attacks.

Where extended thinking is visible, certain content deemed potentially harmful is encrypted rather than displayed—ensuring Claude’s open-ended reasoning can still happen without exposing sensitive or dangerous ideas to the user. Anthropic says it may re-evaluate whether to keep the thought process fully visible in subsequent releases, especially as AI capabilities progress.

Claude 3.7 Sonnet’s performance on questions from the 2024 American Invitational Mathematics Examination 2024, according to how many thinking tokens it’s allowed per problem. Note that even though we allow Claude to use the entire thinking budget, it generally stops short. We include in the plot the tokens sampled that are used to summarize the final answer.

Availability and Looking Ahead

Claude 3.7 Sonnet is accessible now via Claude.ai and Anthropic’s API for Pro, Team, Enterprise, and qualified developers. The company has published a full System Card detailing safety features, alignment efforts, and further experiments demonstrating Claude’s agentic and extended reasoning abilities.

For Anthropic, the launch of Claude 3.7 Sonnet is a showcase of how large language models might one day approach near-human flexibility—capable of quick answers when needed, yet able to dive deeply into more formidable tasks, all while giving a window into the reasoning behind the machine.

Why this announcement is important for the AI industry?

Anthropic’s Claude 3.7 Sonnet announcement is significant because it highlights greater transparency and controllability in AI systems—two critical goals for the industry as it tackles challenges like hallucination, alignment, and user trust. By enabling deeper or quicker “modes of thought” and letting users see how the AI reaches its conclusions, developers gain more direct oversight of the model’s reasoning. This represents a move toward safer, more accountable AI, potentially setting a new standard for how AI systems are tested, deployed, and trusted in real-world settings.

Interested users and researchers are encouraged to share their feedback at feedback@anthropic.com.

📣 SHARE:

SOURCE: Anthropic

🆔 RELATED PROFILES:

No related profiles found associated with: “Anthropic”

Read more AI news at RadicalShift.AI’s news section.

➕ ADD A NEWS ARTICLE

➕ ADD OTHER CONTENT

You must be logged in to add content to these sections.

👤 Author

Sheryl Rivera

Sheryl Rivera, 47, is a media industry veteran having worked in the space for nearly 25 years across 3 continents, America, Asia and Europe. An Editor-in-Chief at multiple media/news outlets such as EuropaWire, the leading newswire in Europe, EPR Network, a 20-year old group of PR websites to TravelPRNews.com, the native choice on the Web for travel pr news, she has literally dealt with hundreds of thousands of news stories, articles and press releases throughout her career. Earlier, she worked for BusinessWorld, Southeast Asia’s first daily business newspaper. Profound interests in the societal, educational and media aspects of the AI and what fundamental changes it prompts across societies, more particularly the impact the AI technologies and advancements will have on the labor markets across the world.

Edit your profile

YOU MAY ALSO LIKE:

Anthropic Introduces Claude 3.7 Sonnet and Agentic Coding Tool, Claude Code February 25, 2025
Anthropic Enhances AI Development with New Features for Improved Prompting and Example Management November 15, 2024
Anthropic Expands Partnership with Amazon Web Services to Accelerate AI Innovation November 23, 2024
OpenAI Launches Groundbreaking Series of Reasoning Models for Complex Problem Solving September 13, 2024
Anthropic Launches “Citations” to Boost Trust and Transparency in AI Responses January 24, 2025
NVIDIA Q4 2025: AI Boom Delivers Record Revenue and Profits February 27, 2025

🔄 Updates

If you are the owner of, or part of/represent the entity this News article belongs to, you can request additions / changes / amendments / updates to this entry by sending an email request to info@radicalshift.ai. Requests will be handled on a first come first served basis and will be free of charge. If you want to take over this entry, and have full control over it, you have to create an account at RadicalShift.AI and if you are the owner of, or part of/represent the entity this News article belongs to, we will have it transferred over to your account and then you can add/modify/update this entry anytime you want.

🚩 Flag / Report an Issue

Flag / report an issue with the current content entry.

If you’d prefer to make a report via email, you can send it directly to info@radicalshift.ai. Indicate the content entry / News article you are making a report for.

AI.RadicalShift

Anthropic Introduces Claude 3.7 Sonnet, Bringing “Extended Thinking Mode” and Visible Thought Process

📣 SHARE:

🆔 RELATED PROFILES:

⚡ MORE FROM THAT SOURCE:

✨ SIGNALS FOR THAT SOURCE:

📰 THE LATEST NEWS:

Global Power Transformation: NVIDIA and Consortium Harness AI to Revolutionize Electricity Generation and Distribution

Intel Unveils Next-Gen Edge AI Solutions: Empowering Seamless Integration into Legacy Infrastructure

A deeper look into the CoreWeave’s Acquisition of Weights & Biases

Dane Technologies and Brain Corp Unveil Revolutionary Autonomous Inventory Robot

UiPath Expands Its AI-Driven Automation Arsenal with Strategic Peak Acquisition

➕ ADD A NEWS ARTICLE

➕ ADD OTHER CONTENT

👤 Author

🔄 Updates

🚩 Flag / Report an Issue

What is RadicalShift AI?

Latest Entries

🏭 INDUSTRIES / MARKETS:

🏷️ TAGS

📂 ARCHIVE

Anthropic Introduces Claude 3.7 Sonnet, Bringing “Extended Thinking Mode” and Visible Thought Process

📣 SHARE:

🆔 RELATED PROFILES:

⚡ MORE FROM THAT SOURCE:

✨ SIGNALS FOR THAT SOURCE:

📰 THE LATEST NEWS:

Global Power Transformation: NVIDIA and Consortium Harness AI to Revolutionize Electricity Generation and Distribution

Intel Unveils Next-Gen Edge AI Solutions: Empowering Seamless Integration into Legacy Infrastructure

A deeper look into the CoreWeave’s Acquisition of Weights & Biases

Dane Technologies and Brain Corp Unveil Revolutionary Autonomous Inventory Robot

Palantir and Databricks Partner to Power Next-Generation AI Solutions

UiPath Expands Its AI-Driven Automation Arsenal with Strategic Peak Acquisition

➕ ADD A NEWS ARTICLE

➕ ADD OTHER CONTENT

👤 Author

🔄 Updates

🚩 Flag / Report an Issue

What is RadicalShift AI?

Latest Entries

🏭 INDUSTRIES / MARKETS:

🏷️ TAGS

📂 ARCHIVE