January 23, 2022

Microsoft Says Their Academic Graph (Bibliographic and Citation Data) is Adding About 1 Million Articles Each Week

According to a new post on the Microsoft Research Blog, the Microsoft Academic Graph is growing by “roughly” one million articles each week.

From the Blog Post:

Behind the scenes, we are taking advantage of the fact that machines do not require time to sleep or eat, and have superior memory to humans. We have trained our AI robots to read, classify, and tag every document published to the web in real time. The result is a massive collection of academic knowledge we call the Microsoft Academic Graph (MAG), which is growing at roughly 1 million articles per week. While one set of robots is busy gathering knowledge from the web, another set of robots is dedicated to analyzing citation behaviors and computing the relative importance of each node in the MAG so that users are always presented with information they need and want.

Microsoft Academic is based on the work our team developed for Microsoft Cognitive Services, including open APIs that give developers AI-based semantic search tools and entity-linking capabilities. We’re also applying AI semantic search—which is contextual and conversational—to Cortana, Bing, and more.

Learn MORE, Read the Complete Blog Post

In Microsoft’s Own Words

The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals and conference “venues” and fields of study.


Microsoft’s Academic Search Graph became available about one year ago, and the dataset can be downloaded. The most recent downloadable version of the database was posted about three months ago.

Microsoft Academic Search

On February 29th we posted that a preview version of a new Microsoft Academic search tool had just become available.

The preview version remains under development. Since launch, Microsoft has added an option to combine some mainstream news content (specific to a topic or discipline) with academic and scholarly articles.

Note the drop-down menu option at the top of a results list.


Our February 29th post has plenty of background and additional resources.

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.