Notes
Rough/unpolished research updates and speculation
Five lessons from having helped run an AI-Biology RCT
19 February 2026

Luca Righetti shares takeaways on the role of randomized controlled trials in AI safety testing.

Read more
Analyzing coding agent transcripts to upper bound productivity gains from AI agents
17 February 2026

Amy Deng investigates whether coding agent transcripts could serve as an alternative method for estimating AI productivity uplift, using 5,305 Claude Code transcripts from METR technical staff.

Read more
Measuring Time Horizon using Claude Code and Codex
13 February 2026

Nikola Jurkovic describes our measurements of time horizon using Claude Code and Codex scaffolds.

Read more
A simpler AI timelines model predicts 99% AI R&D automation in ~2032
10 February 2026

Thomas Kwa describes a simple model for forecasting when AI will automate AI development, based on the AI Futures model but with only 8 parameters.

Read more
Frontier AI safety regulations: A reference for lab staff
29 January 2026

Miles Kodama and Michael Chen summarize key provisions from California's SB 53, the EU Code of Practice, and New York's RAISE Act covering frontier AI developers.

Read more
Clarifying limitations of time horizon
22 January 2026

Thomas Kwa responds to common misinterpretations of our time horizon work, explaining its limitations and its core finding.

Read more
Early Results on Monitorability in QA Settings
6 October 2025

Vincent Cheng, Thomas Kwa, and Neev Parikh share research on how AI agents can hide secondary task-solving from monitors, finding that harder tasks are more detectable and small models can learn to evade larger monitors.

Read more
Claude, GPT, and Gemini All Struggle to Evade Monitors
22 August 2025

Vincent Cheng and Thomas Kwa replicate a Google DeepMind paper on chain-of-thought monitoring, showing evidence that monitoring works on other companies' models.

Read more