Ai2 Launches Asta: a New Standard for Trustworthy AI Agents in Science

The complete ecosystem includes AI agents, benchmarks, and tools to bring clarity and credibility to the scientific AI space

Ai2 (The Allen Institute for AI) today launched Asta, an integrated, open ecosystem designed to transform how science is done with AI agents. At a time when AI tools are flooding the research landscape—often opaque, untested, and unproven—Asta offers a principled alternative: a comprehensive collection that includes an agentic AI research assistant, the first rigorous benchmark suite for scientific agents, and a set of developer resources for building trustworthy tools.

Together, these components form a foundation for high-performance scientific AI that is transparent, evidence-based, and designed to earn the trust of scientists, developers, and institutions.

“AI can be transformative for science, but only if it’s held to the same standards as science itself,” said Ali Farhadi, CEO of Ai2. “With Asta, we’re not just building an assistant but an ecosystem built on transparency, reproducibility, and scientific rigor. It’s designed for real researchers solving real problems—and developers creating the next generation of agentic tools to accelerate scientific discoveries. It’s a bet on a future where AI doesn’t just keep up with science, it helps drive it forward.”

Asta: A New Kind of Research Partner

At its core, Asta is an intelligent, open-source AI assistant designed specifically for scientists. Unlike general-purpose tools, Asta understands the needs of research workflows. It doesn’t just retrieve information; it reviews literature, synthesizes evidence, and (in beta) analyzes data, all while providing citations.

Already in use by researchers at 194 institutions including the University of Chicago and the University of Washington, Asta is accelerating real-world discovery—from identifying therapeutic targets to exploring new areas of inquiry.

“More than ever before, researchers struggle with literature search and synthesis,” said James Evans, Director of the Knowledge Lab at the University of Chicago. “Ai2’s Asta ecosystem of AI agents, benchmarks, and tools helps to break these barriers. Its system is poised to accelerate the path from hunch to insight, transforming how we navigate the vast landscape of scientific understanding.”

A Fully Integrated Ecosystem for Scientific AI

Asta isn’t a standalone tool. It’s a full-stack ecosystem designed to support the entire lifecycle of scientific AI development and use:

Asta: An open-source agentic AI research assistant that helps scientists navigate literature, synthesize findings, and analyze data. It’s fully transparent, source-cited, and designed to integrate into real-world workflows.
AstaBench: A rigorous benchmark suite that sets the bar for scientific AI agent performance across complex, multi-step research tasks, from literature comprehension to code execution and end-to-end discovery. Launching with over 2,400 problems across 11 benchmarks, it provides researchers and developers with a reproducible, evidence-based way to evaluate and compare agents. At launch, AstaBench includes 16 leaderboards spanning agent performance across all benchmark categories, four subcategories, and an overall ranking that includes both performance and cost efficiency.
Asta Resources: A developer toolkit that includes open-source agents, APIs, post-trained language models for science, and access to a Scientific Corpus Tool, an MCP extension of Ai2's Semantic Scholar API infrastructure (200M+ papers). It provides everything needed to build and evaluate trustworthy scientific agents.

“When building Asta, we focused on problems that we faced as researchers,” said Dan Weld, Chief Scientist at Ai2. “We needed AI tools that could really save us time by executing complex multi-step plans, explaining their thinking, and staying grounded in evidence. That’s what Asta delivers. It’s not just another assistant but a collaborator designed to think like a scientist.”

Setting the Standard for Scientific AI

As agentic AI gains momentum, so does the noise. New tools emerge weekly, often with opaque claims and no standard way to evaluate them. Asta fills that gap with AstaBench, a comprehensive framework for testing and comparing AI agents on real scientific tasks rather than synthetic prompts.

Ai2’s Asta v0 science agent led the initial evaluations with a score of 52.5%, nearly 10 points above the next-best system. GPT-5 mini and Claude 3.5 Haiku paired with a specialized framework were also strong contenders thanks to their low costs. Yet AstaBench reveals that many agents struggle with complex tasks like coding, underscoring the challenges ahead and the value of purpose-built scientific agents.

This benchmark suite is paired with Asta Resources, which provide the building blocks developers need to create agents that meet the same high bar. Developers can build agents using Asta Resources and then evaluate them using AstaBench, creating a flywheel of scientific improvement that benefits the entire ecosystem.

What sets Asta apart is not just what it does, but how it’s built: fully open-source, open-access, and grounded in scientific values. While others race to define the field through closed systems and proprietary agents, Ai2 is charting a collective path forward that is transparent, principled, and built to evolve.

Looking Ahead

Asta is just the beginning. As the scientific AI landscape continues to evolve, Ai2 is committed to expanding Asta with new capabilities and tools that push the boundaries of what researchers and developers can do.

One of the most exciting capabilities coming to Asta is data analysis, now in beta. It lets users upload their own real-world datasets and explore them using natural language. Users can ask sophisticated questions and receive rigorous, explainable answers grounded in statistical reasoning. Designed to accelerate data-driven discovery by generating and testing new hypotheses, the feature can support work across domains like social science, biology, and climate research, helping scientists move from raw data to meaningful conclusions.

Future Asta releases will also include advanced features like experiment replication, scientific programming, and long-term research planning—bringing us closer to an AI research assistant that can truly support end-to-end scientific workflows.

Explore the ecosystem and get involved at allenai.org/asta.

About Ai2

Ai2 is a Seattle-based non-profit AI research institute with the mission of building breakthrough AI to solve the world’s biggest problems. Founded in 2014 by the late Paul G. Allen, Ai2 develops foundational AI research and innovative new applications that deliver real-world impact through large-scale open models, open data, robotics, conservation platforms, and more. Ai2 champions true openness through initiatives like OLMo, the world’s first truly open language model framework, Molmo, a family of open state-of-the-art multimodal AI models, and Tulu, the first application of fully open post-training recipes to the largest open-weight models. These solutions empower researchers, engineers, and tech leaders to participate in the creation of state-of-the-art AI and to directly benefit from the many ways it can advance critical fields like medicine, scientific research, climate science, and conservation efforts. For more information, visit allenai.org.

View source version on businesswire.com: https://www.businesswire.com/news/home/20250826827940/en/

Contacts

ai2pr@archetype.co