Updates

Summary of METR's predeployment evaluation of GPT-5.6 Sol

June 26, 2026

A summary of METR's independent, predeployment evaluation of GPT-5.6 Sol

Review of the "Risks from automated R&D" section in the Anthropic Risk Report (February 2026)

May 8, 2026

External review from METR of the "Risks from automated R&D" section in Anthropic's February 2026 Risk Report

Red-Teaming Anthropic's Internal Agent Monitoring Systems

March 26, 2026

A METR staff member spent three weeks red-teaming a subset of Anthropic's internal agent monitoring and security systems, discovering several novel vulnerabilities.

Review of the Anthropic Sabotage Risk Report: Claude Opus 4.6

March 12, 2026

External review from METR of Anthropic's Sabotage Risk Report for Claude Opus 4.6

How We Protect Confidential Information

February 17, 2026

Our high-level approach to protecting confidential access and information

Common Elements of Frontier AI Safety Policies (December 2025 Update)

December 9, 2025

Shared components of AI lab commitments to evaluate and mitigate severe risks.

Review of the Anthropic Summer 2025 Pilot Sabotage Risk Report

October 28, 2025

External review from METR of Anthropic's Summer 2025 Sabotage Risk Report

Summary of our gpt-oss methodology review

October 23, 2025

Details on external recommendations from METR for gpt-oss Preparedness experiments and follow-up from OpenAI.

Notes on Scientific Communication at METR

August 12, 2025

How we think about tradeoffs when communicating surprising or nuanced findings.

What should companies share about risks from frontier AI models?

June 27, 2025

Current views on information relevant for visibility into frontier AI risk.

Response to OSTP on AI Action Plan

March 15, 2025

Suggested priorities for the Office of Science and Technology Policy as it develops an AI Action Plan.

Why it’s good for AI reasoning to be legible and faithful

March 11, 2025

Why legible and faithful reasoning is valuable for safely developing powerful AI

Frontier AI Safety Policies

February 8, 2025

List of frontier safety policies published by AI companies, including Amazon, Anthropic, Google DeepMind, G42, Meta, Microsoft, OpenAI, and xAI.

AI models can be dangerous before public deployment

January 17, 2025

Why pre-deployment testing is not an adequate framework for AI risk management

Response to Bureau of Industry and Security’s proposed AI reporting requirements

October 11, 2024

Red-teaming and security suggestions regarding proposed rule by the Bureau of Industry and Security, “Establishment of Reporting Requirements for the Development of Advanced Artificial Intelligence Models and Computing Clusters.”

New Support Through The Audacious Project

October 9, 2024

Funding for Canary will enable research and implementation at scale

Response to U.S. AISI Draft “Managing Misuse Risk for Dual-Use Foundation Models”

September 8, 2024

Suggestions for expanded guidance on capability elicitation and robust model safeguards in the U.S. AI Safety Institute’s draft document “Managing Misuse Risk for Dual-Use Foundation Models” (NIST AI 800-1).

Response to NIST Draft Generative AI Profile

June 2, 2024

Comments on NIST’s draft document “AI Risk Management Framework: Generative AI Profile.”

ML Engineers Needed for New AI R&D Evals Project

May 16, 2024

METR is hiring ML engineers and researchers.

Emma Abele is METR’s new Executive Director

April 26, 2024

Emma moves from President to Executive Director; Beth moves to Head of Research.

2023 Year In Review

February 7, 2024

A summary of what METR accomplished in 2023 – our first full year of operation.

Bounty: Diverse hard tasks for LLM agents

December 16, 2023

METR (formerly ARC Evals) is looking for (1) ideas, (2) detailed specifications, and (3) well-tested implementations for tasks to measure performance of autonomous LLM agents.

ARC Evals is now METR

December 4, 2023

ARC Evals is wrapping up our incubation period at ARC, and spinning off into our own standalone nonprofit.

Responsible Scaling Policies (RSPs)

September 26, 2023

We describe the basic components of Responsible Scaling Policies (RSPs) as well as why we find them promising for reducing catastrophic risks from AI.

ARC Evals is spinning out from ARC

September 19, 2023

ARC Evals plans to spin out from the Alignment Research Center (ARC) in the coming months and become its own standalone organization.

Response to RfC on AI Accountability Policy

June 11, 2023

Input to NTIA’s AI Accountability Policy Request for Comment.