Research GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Th February 16, 2026 Mohsin
Research A Theoretical Framework for Adaptive Utility-Weighted Benchmarking February 16, 2026 Mohsin arXiv:2602.12356v1 Announce Type: new Abstract: Benchmarking has long served as a foundational practice in machine learning and, increasingly, in modern AI systems such as large l...
Research GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Th February 16, 2026 Arthur arXiv:2602.12316v1 Announce Type: new Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benc...
Daily Brief Accelerating science with AI and simulations February 16, 2026 Michelle Associate Professor Rafael Gómez-Bombarelli has spent his career applying AI to improve scientific discovery. Now he believes we are at an inflection point.
Research New J-PAL research and policy initiative to test and scale AI innovati February 16, 2026 Arthur Project AI Evidence will connect governments, tech companies, and nonprofits with world-class economists at MIT and across J-PAL's global network to evaluate and improve AI solutio...
Daily Brief OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Enviro February 16, 2026 Michelle OpenEnv has been operating at the intersection of ambition and execution, and this week’s announcement shows just how seriously the company is taking its AI ambitions. In a landscape crowded…
Daily Brief Custom Kernels for All from Codex and Claude February 16, 2026 Michelle Custom has been operating at the intersection of ambition and execution, and this week’s announcement shows just how seriously the company is taking its AI ambitions. In a landscape crowded…
Funding Claude Code costs up to $200 a month. Goose does the same thing for fr February 16, 2026 Michelle The artificial intelligence coding revolution comes with a catch: it's expensive. Claude Code, Anthropic's terminal-based AI agent that can write, debug, and deploy code a...
Funding Railway secures $100 million to challenge AWS with AI-native cloud inf February 16, 2026 Michelle Railway, a San Francisco-based cloud platform that has quietly amassed two million developers without spending a dollar on marketing, announced Thursday that it raised $100 million...
Research After all the hype, some AI experts don’t think OpenClaw is all that e February 16, 2026 Arthur "From an AI research perspective, this is nothing novel," one expert told TechCrunch.
Daily Brief Flapping Airplanes on the future of AI: ‘We want to try really radical February 16, 2026 Arthur "We're exploring a different set of tradeoffs."