Research A Theoretical Framework for Adaptive Utility-Weighted Benchmarking February 16, 2026 Mohsin arXiv:2602.12356v1 Abstract: Benchmarking has long served as a foundational practice in machine learning and, increasingly, in modern AI systems such as large l...
Research GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Th February 16, 2026 Arthur arXiv:2602.12316v1 Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benc...
Research Accelerating science with AI and simulations February 16, 2026 Arthur Associate Professor Rafael Gómez-Bombarelli has spent his career applying AI to improve scientific discovery. Now he believes we are at an inflection point....
Research New J-PAL research and policy initiative to test and scale AI innovati February 16, 2026 Michelle Project AI Evidence will connect governments, tech companies, and nonprofits with world-class economists at MIT and across J-PAL's global network to evaluate and improve AI solutio...
Research GPT-5.2 derives a new result in theoretical physics February 16, 2026 Arthur A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators....
Research New Research from New Advances AI Capabilities February 15, 2026 Mohsin New research from New contributes to the growing body of work advancing artificial intelligence capabilities. The findings could have significant implications for both academic and...