May 25, 2022

Report From eLife Labs: “Investigating the Context of Citations”

From the eLife Labs Blog:

Citation records have already been used to interrogate citation behaviour (such as Greenberg, 2009), enrich the information provided with a citation (see also Di Iorio et al., 2018, for current work considering how to present this information to the reader) and explore a researcher’s scholarly network, for example. With the amount of open citation data growing, driven by the Initiative for Open Citations (I4OC) and OpenCitations with the Open Citations Corpus, we are interested in how these data could be used to derive a deeper understanding of the impact of a research paper.

We have also been experimenting with the application of data science to open science, to create openly available insight and tools that add value. And so we have embarked on a data science project to investigate how natural language processing (NLP) techniques could provide insight into the context of scientific citations available as open citation data. In particular, we are asking whether data science techniques could reveal where a citation occurs within the paper’s narrative and whether the citation has a positive, negative or neutral sentiment (citation polarity). This work began as a short project through the ASI data science fellowship programme.

Read the Complete Report


About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.