Research Evidence-Grounded Subspecialty Reasoning: Evaluating a Curated Clinica February 19, 2026 Arthur, Michelle arXiv:2602.16050v1 Announce Type: new Abstract: Background: Large language models have demonstrated strong performance on general medical examinations, but subspecialty clinical r...
Research How Uncertain Is the Grade? A Benchmark of Uncertainty Metrics for LLM February 19, 2026 Mohsin arXiv:2602.16039v1 Announce Type: new Abstract: The rapid rise of large language models (LLMs) is reshaping the landscape of automatic assessment in education. While these systems...
Research Anthropic launches Cowork, a Claude Desktop agent that works in your f February 19, 2026 Michelle Anthropic released Cowork on Monday, a new AI agent capability that extends the power of its wildly successful Claude Code tool to non-technical users — and according to company in...
Research Towards Efficient Constraint Handling in Neural Solvers for Routing Pr February 19, 2026 Michelle arXiv:2602.16012v1 Announce Type: new Abstract: Neural solvers have achieved impressive progress in addressing simple routing problems, particularly excelling in computational eff...
Research New J-PAL research and policy initiative to test and scale AI innovati February 18, 2026 Arthur, Michelle Project AI Evidence will connect governments, tech companies, and nonprofits with world-class economists at MIT and across J-PAL's global network to evaluate and improve AI solutio...
Research Attention-gated U-Net model for semantic segmentation of brain tumors February 18, 2026 Mohsin arXiv:2602.15067v1 Announce Type: new Abstract: Gliomas, among the most common primary brain tumors, vary widely in aggressiveness, prognosis, and histology, making treatment chal...
Research AI is already making online crimes easier. It could get much worse. February 18, 2026 Mohsin Anton Cherepanov is always on the lookout for something interesting. And in late August last year, he spotted just that. It was a file uploaded to VirusTotal, a site cybersecurity...
Research Scaling social science research February 18, 2026 Mohsin GABRIEL is a new open-source toolkit from OpenAI that uses GPT to turn qualitative text and images into quantitative data, helping social scientists analyze research at scale.