Research Tools (Prototype): University of Washington & Allen Institute for AI (the People Behind Semantic Scholar) Announce the Launch of Ai2 OpenScholar
Ed. Note: The Ai2 OpenScholar demo currently available provides responses from a datastore of computer science papers (open-access only) NOT the complete Semantic Scholar database. We look forward to seeing the demo grow to include more papers from more disciplines. We’re also excited to see what others do with the open source code and tools being made available. Finally, make sure to review the limitations and future directions section.
From Ai2:
Scientific progress hinges on our ability to find, synthesize, and build on relevant knowledge from the scientific literature. However, the exponential growth of this literature—with millions of papers now published each year—has made it increasingly difficult for scientists to find the information they need or even stay abreast of the latest findings in a single subfield.
To help scientists effectively navigate and synthesize scientific literature, we introduce Ai2 OpenScholar—a collaborative effort between the University of Washington and the Allen Institute for AI. OpenScholar is a retrieval-augmented language model (LM) designed to answer user queries by first searching for relevant papers in the literature and then generating responses grounded in those sources.
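As a rough illustration of the retrieve-then-generate pattern described above, here is a minimal Python sketch. It is not Ai2's implementation: the three-paper corpus, the bag-of-words cosine scorer, and the names `retrieve` and `build_prompt` are all assumptions made for illustration, and the final step of sending the prompt to a language model is left out.

```python
from collections import Counter
import math
import re

# Toy stand-in for a datastore of papers (hypothetical IDs, titles, and text).
PAPERS = [
    {"id": "P1", "title": "Dense retrieval for scientific search",
     "text": "dense retrieval improves scientific literature search"},
    {"id": "P2", "title": "Protein folding with deep learning",
     "text": "deep learning predicts protein structure"},
    {"id": "P3", "title": "Citation accuracy in language models",
     "text": "language models often produce inaccurate citations"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, papers, k=2):
    """Step 1: return up to k papers with nonzero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(p["title"] + " " + p["text"]))), p)
        for p in papers
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(query, retrieved):
    """Step 2: ground the generation request in the retrieved sources."""
    sources = "\n".join(
        f"[{i}] {p['title']}: {p['text']}"
        for i, p in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    question = "citations in language models"
    print(build_prompt(question, retrieve(question, PAPERS)))
```

In a real system the scorer would be a trained retriever and re-ranker, and the prompt would be passed to the language model; the sketch only shows the ordering of the two steps.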
Check out the Ai2 OpenScholar Demo at openscholar.allen.ai!
[Clip]
On ScholarQABench, our new benchmark of open-ended scientific questions, OpenScholar-8B sets the state of the art on factuality and citation accuracy. For instance, on biomedical research questions, GPT-4o hallucinated more than 90% of the scientific papers that it cited, whereas OpenScholar-8B—by construction—remains grounded in real retrieved papers. To evaluate the effectiveness of OpenScholar in a real-world setup, we recruited 20 scientists working in computer science, biomedicine, and physics, and asked them to evaluate OpenScholar responses against expert-written answers. Across these three scientific disciplines, OpenScholar-8B’s responses were considered more useful than expert-written answers for the majority of questions.
To support research in this direction, [emphasis ours] we have open-sourced all of our code, LM, retriever and re-ranker checkpoints, retrieval index, and data, including the training data for our language model and retriever, our OpenScholar datastore of academic papers, and the evaluation data in ScholarQABench. To our knowledge, this is the first open release of a complete pipeline for a scientific assistant LM—from data to training recipes to model checkpoints—and we’re excited to see how the community builds upon it.
OpenScholar is a research prototype, and it is just our first step toward building AI systems that can effectively assist scientists and accelerate scientific discovery.
[Clip]
Limitations and future directions
OpenScholar is a research prototype. As the results above show, we’re confident that it gives more accurate and reliable answers to scientific research questions than other models. However, it still has several limitations, which we highlight in the hope that we can collectively address them in future work:
- OpenScholar may cite papers that are less representative. For instance, when describing a particular method, it may fail to cite the original paper that proposed the method, and instead cite another paper that mentions the method. Example: This response misses the original paper that first described edge evaluation as the main bottleneck in planning.
- OpenScholar may occasionally generate responses that are unsupported by citations, or retrieve papers that are not the most relevant or up-to-date in the field. Example: When asked about large foundation models in robotics, this response cites a paper with a 307M parameter model, whereas the current largest foundation model in robotics (as of November 2024), RT-2, has 55 billion parameters.
- OpenScholar may still generate citations directly from parametric knowledge instead of relying on the papers it has retrieved. These citations might be hallucinated and might not correspond to any real paper. In our demo, such citations do not appear as links. Example: This response cites Si et al. even though that paper was not retrieved.
- Many scientific papers are paywalled. To ensure that we respect all applicable licenses and copyrights, the OpenScholar datastore includes only open-access papers. This can significantly degrade our ability to answer questions in fields where closed-access papers are more prevalent. We hope that future work can address this issue by developing ways of responsibly incorporating such papers (e.g., by restricting verbatim copying from those papers, and instead linking out to their respective publisher sites).
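The third limitation above (citations produced from parametric knowledge rather than from retrieved papers) suggests a simple post-hoc check: compare each citation in a response against the set of papers that were actually retrieved. The Python sketch below is one way this could look, not a description of how the demo does it. The bracketed citation format, the function name `split_citations`, and the keys `P1`, `P2`, and `Si2024` are hypothetical.

```python
import re


def split_citations(response, retrieved_ids):
    """Partition bracketed citation keys into retrieved and not-retrieved.

    Assumes citations appear as [KEY] in the response text; this format is
    an assumption made for illustration.
    """
    cited = []
    for key in re.findall(r"\[([^\[\]]+)\]", response):
        if key not in cited:  # keep first-appearance order, drop repeats
            cited.append(key)
    supported = [key for key in cited if key in retrieved_ids]
    unsupported = [key for key in cited if key not in retrieved_ids]
    return supported, unsupported


if __name__ == "__main__":
    response = (
        "Edge evaluation dominates planning cost [P1]. "
        "Language models can propose research ideas [Si2024]."
    )
    supported, unsupported = split_citations(response, {"P1", "P2"})
    print("render as links:", supported)
    print("render as plain text:", unsupported)
```

Citations in the second list have no retrieved paper behind them, so an interface could show them without a link, as the demo is described as doing.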
Much more information, including visuals and video, in the complete Ai2 intro post
Direct to Ai2 OpenScholar Demo (“Synthesizing 1M+ open access computer science papers”)
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009, he was Director of Online Information Services at Ask.com.