May 28, 2022

Research Article: “Nine Million Books and Eleven Million Citations: A Study of Book-Based Scholarly Communication Using OpenCitations” (Preprint)

The following article (preprint) was recently shared on arXiv.


Nine Million Books and Eleven Million Citations: A Study of Book-Based Scholarly Communication Using OpenCitations


Yongjun Zhu
Sungkyunkwan University, Republic of Korea

Erjia Yan
Drexel University

Silvio Peroni
University of Bologna

Chao Che
Dalian University, China


via arXiv


Books have been widely used to share information and contribute to human knowledge. However, the quantitative use of books as a method of scholarly communication is relatively unexamined compared to journal articles and conference papers. This study uses the COCI dataset (a comprehensive open citation dataset provided by OpenCitations) to explore books’ roles in scholarly communication. The COCI data we analyzed includes 445,826,118 citations from 46,534,705 bibliographic entities.

By analyzing such a large amount of data, we provide a thorough, multifaceted understanding of books. Among the investigated factors are 1) temporal changes to book citations; 2) book citation distributions; 3) years to citation peak; 4) citation half-life; and 5) characteristics of the most-cited books. Results show that books have received less than 4% of total citations, and have been cited mainly by journal articles.

Moreover, 97.96% of books have been cited fewer than 10 times. Books take longer than other bibliographic materials to reach peak citation levels, yet are cited for the same duration as journal articles. Most-cited books tend to cover general (yet essential) topics, theories, and technological concepts in mathematics and statistics.

Direct to Full Text Article (Preprint)
16 pages; PDF.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.