May 20, 2018

Research Article: “Citation Count Analysis for Papers with Preprints” (Preprint)

The following research article (preprint) was recently shared by its authors on arXiv.


Citation Count Analysis for Papers with Preprints


Sergey Feldman
Allen Institute for Artificial Intelligence

Kyle Lo
Allen Institute for Artificial Intelligence

Waleed Ammar
Allen Institute for Artificial Intelligence


via arXiv
May 14, 2018


We explore the degree to which papers prepublished on arXiv garner more citations, in an attempt to paint a sharper picture of fairness issues related to prepublishing. A paper’s citation count is estimated using a negative-binomial generalized linear model (GLM) while observing a binary variable which indicates whether the paper has been prepublished. We control for author influence (via the authors’ h-index at the time of paper writing), publication venue, and overall time that paper has been available on arXiv. Our analysis only includes papers that were eventually accepted for publication at top-tier CS conferences, and were posted on arXiv either before or after the acceptance notification. We observe that papers submitted to arXiv before acceptance have, on average, 65% more citations in the following year compared to papers submitted after. We note that this finding is not causal, and discuss possible next steps.
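
For readers wondering how a figure like "65% more citations" relates to a count model, here is a minimal sketch. It assumes the GLM uses a log link, which is a common choice for negative-binomial models but is not stated in the abstract. Under that assumption, the coefficient on the binary "prepublished" indicator converts to a percent change via exp(beta) - 1. The function names and the intercept value are hypothetical and are not taken from the paper.

```python
import math


def percent_change_from_coef(beta: float) -> float:
    """Percent change in the expected count when a binary covariate
    flips from 0 to 1, assuming a log link (an assumption here)."""
    return (math.exp(beta) - 1.0) * 100.0


def coef_from_percent_change(pct: float) -> float:
    """Inverse of the above: the log-link coefficient implied by a percent change."""
    return math.log(1.0 + pct / 100.0)


def expected_citations(intercept: float, beta_prepub: float,
                       prepublished: bool, other_terms: float = 0.0) -> float:
    """Expected citation count under a log-link model.
    other_terms stands in for the controls (h-index, venue, time on arXiv)."""
    return math.exp(intercept + beta_prepub * int(prepublished) + other_terms)


# Coefficient that would correspond to the reported 65% difference.
beta = coef_from_percent_change(65.0)
print(round(beta, 3))  # 0.501

# With all other terms held fixed, the ratio of expected counts is 1.65.
# The intercept of 1.0 is an arbitrary illustrative value.
ratio = expected_citations(1.0, beta, True) / expected_citations(1.0, beta, False)
print(round(ratio, 2))  # 1.65
```

The ratio does not depend on the intercept or on the control terms, which is why a single percentage can summarize the association. As the authors note, it still says nothing about causation.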

Direct to Full Text Article
10 pages; PDF.

On a Related Note (by Gary Price, infoDOCKET Founder/Editor):

The authors of the paper shared above work at the Allen Institute for Artificial Intelligence (AI2). This organization is responsible for the impressive, useful, and constantly improving Semantic Scholar, a research tool we’ve been posting about (and using) since the day it launched in November 2015.

In June 2017 we also posted about this collaborative effort being led by AI2. See: Microsoft, Google, Baidu, and Paul Allen’s AI2 Form Open Academic Search Group




About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching infoDOCKET, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.