OpenAI Releases CoT-Control Evaluation Suite

CoT-Control

Release Date: 2026-03-05
Tasks: >13,000
Benchmarks Included: GPQA, MMLU-Pro, HLE, BFCL, SWE-Bench Verified
Key Finding: Low controllability in frontier models

OpenAI released the open-source CoT-Control evaluation suite and a related research paper on March 5, 2026. The suite comprises more than 13,000 tasks that assess whether reasoning models can control their chain of thought in ways that would let them evade monitoring. Frontier models, including GPT-5.4 Thinking, show low controllability.