Research GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Th February 16, 2026 Mohsin arXiv:2602.12316v1 Announce Type: new Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benc...
Research New J-PAL research and policy initiative to test and scale AI innovati February 16, 2026 Michelle Project AI Evidence will connect governments, tech companies, and nonprofits with world-class economists at MIT and across J-PAL's global network to evaluate and improve AI solutio...
Research New J-PAL research and policy initiative to test and scale AI innovati February 16, 2026 Arthur Project AI Evidence will connect governments, tech companies, and nonprofits with world-class economists at MIT and across J-PAL's global network to evaluate and improve AI solutio...
Research A Theoretical Framework for Adaptive Utility-Weighted Benchmarking February 16, 2026 Mohsin arXiv:2602.12356v1 Announce Type: new Abstract: Benchmarking has long served as a foundational practice in machine learning and, increasingly, in modern AI systems such as large l...
Research GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Th February 16, 2026 Arthur arXiv:2602.12316v1 Announce Type: new Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benc...
Research Accelerating science with AI and simulations February 16, 2026 Arthur Associate Professor Rafael Gómez-Bombarelli has spent his career applying AI to improve scientific discovery. Now he believes we are at an inflection point....
Research GPT-5.2 derives a new result in theoretical physics February 16, 2026 Arthur A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators....