May 21, 2022

New Research Article: “Characterizing In-Text Citations in Scientific Articles: A Large-Scale Analysis” (Preprint)

The following article was recently shared on arXiv.


Characterizing In-Text Citations in Scientific Articles: A Large-Scale Analysis


Kevin W. Boyack
SciTech Strategies

Nees Jan van Eck
Leiden University

Giovanni Colavizza
École Polytechnique Fédérale de Lausanne

Ludo Waltman
Leiden University


via arXiv


We report characteristics of in-text citations in over five million full text articles from two large databases – the PubMed Central Open Access subset and Elsevier journals – as functions of time, textual progression, and scientific field. The purpose of this study is to understand the characteristics of in-text citations in a detailed way prior to pursuing other studies focused on answering more substantive research questions. As such, we have analyzed in-text citations in several ways and report many findings here. Perhaps most significantly, we find that there are large field-level differences that are reflected in position within the text, citation interval (or reference age), and citation counts of references. In general, the fields of Biomedical and Health Sciences, Life and Earth Sciences, and Physical Sciences and Engineering have similar reference distributions, although they vary in their specifics. The two remaining fields, Mathematics and Computer Science and Social Science and Humanities, have different reference distributions from the other three fields and between themselves. We also show that in all fields the numbers of sentences, references, and in-text mentions per article have increased over time, and that there are field-level and temporal differences in the numbers of in-text mentions per reference. A final finding is that references mentioned only once tend to be much more highly cited than those mentioned multiple times.

Direct to Full Text Article (22 pages; PDF)

About Gary Price

Gary Price ( is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.