New version of Claude 3.5 Sonnet gains the ability to operate a computer

Haiku will also be upgraded to 3.5

Today, Anthropic announced an upgraded version of Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet improves on its predecessor across the board, with the largest gains in coding, an area where it already led. Claude 3.5 Haiku matches the performance of Claude 3 Opus on many evaluations, with no change in price.

The new version of Claude 3.5 Sonnet is now available to all users. Claude 3.5 Haiku will be released later this month.

In addition, the Claude API now supports computer use, which has entered public beta. Developers can use the API to direct Claude to use a computer the way a person does: viewing the screen, moving the cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta.

Rather than building specific tools to help Claude complete individual tasks, Anthropic is teaching it general computer skills, allowing it to use the wide range of standard tools and software programs designed for people. Developers can use this new capability to automate repetitive processes, build and test software, and carry out open-ended tasks such as research.

To make these general skills possible, Anthropic built an API that enables Claude to perceive and interact with computer interfaces. Developers can integrate this API so that Claude can translate instructions (e.g., "Use data from my computer and online to fill out this form") into computer commands (e.g., check a spreadsheet; move the cursor to open a web browser; navigate to relevant web pages; fill out the form using data from those pages; etc.). On OSWorld, a benchmark that evaluates AI models' ability to use computers as people do, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, notably better than the 7.8% scored by the next-best AI system. When given more steps to complete the task, Claude scored 22.0%.
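The translation from an instruction into computer commands described above can be pictured as a small dispatch loop. The sketch below is only an illustration and is not Anthropic's API: the FakeDesktop class, the action names, and the "coordinate" and "text" fields are all assumptions made up for this example, and the list of actions is fixed in advance, whereas a real agent would choose each action after looking at the latest screenshot.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical in-memory stand-in for a real screen (illustration only)."""
    cursor: tuple[int, int] = (0, 0)
    clicks: list[tuple[int, int]] = field(default_factory=list)
    typed: str = ""

    def screenshot(self) -> str:
        # A real system would return an image; a text summary is used here.
        return f"cursor={self.cursor} typed={self.typed!r}"


def execute(desktop: FakeDesktop, action: dict) -> str:
    """Apply one action dictionary to the desktop and report the result."""
    kind = action["action"]
    if kind == "screenshot":
        return desktop.screenshot()
    if kind == "mouse_move":
        desktop.cursor = tuple(action["coordinate"])
        return "moved"
    if kind == "left_click":
        # Record the click at the current cursor position.
        desktop.clicks.append(desktop.cursor)
        return "clicked"
    if kind == "type":
        desktop.typed += action["text"]
        return "typed"
    raise ValueError(f"unsupported action: {kind}")


def run(desktop: FakeDesktop, actions: list[dict]) -> list[str]:
    """Execute the actions in order and collect each result."""
    return [execute(desktop, a) for a in actions]


# Assumed plan: in practice the model would emit these one at a time.
plan = [
    {"action": "screenshot"},
    {"action": "mouse_move", "coordinate": [120, 80]},
    {"action": "left_click"},
    {"action": "type", "text": "hello"},
    {"action": "screenshot"},
]

desktop = FakeDesktop()
results = run(desktop, plan)
```

After the run, desktop records where the cursor ended up, where it clicked, and what was typed, and results holds one entry per action. Keeping every action as a plain dictionary makes the plan easy to log and replay, which is one reasonable design choice among several.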

Claude’s current ability to use computers is imperfect. Some actions that people perform effortlessly (scrolling, dragging, and zooming) are still challenging for Claude. In addition, because an AI that directly operates computers may open new avenues for familiar threats such as spam, misinformation, or fraud, Anthropic has developed new classifiers that identify when computer use is taking place and whether harm is occurring, and it is taking a proactive approach to promoting the safe deployment of this feature.

Thank you for watching this video. If you enjoyed it, please like and subscribe.

Original text: https://www.anthropic.com/news/developing-computer-use