Teaching Assistant Professor Jill Naiman has received a $506,912 grant from the National Aeronautics and Space Administration (NASA) to digitize predigital scientific literature. Her project, “The Reading Time Machine: Transforming Astrophysical Literature into Actionable Data,” is a collaboration with Harvard University and the Astrophysics Data System (ADS), a digital library portal operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant. With over 15 million records, ADS is one of the most important archives in the scientific field of astronomy.
“Newer documents are ‘born digital,’ making them machine-readable and parseable,” said Naiman. “This has not only helped domain scientists find relevant research more efficiently, but through methods like natural language processing, it also has facilitated new discoveries in these fields.”
Naiman’s project aims to extend these capabilities to predigital documents by extracting their text, figures, and tables, allowing researchers to apply the same information mining methods that are available to “born digital” documents. This will result in more easily searchable documents and new discoveries. The work will also enhance the screen-reading capabilities of these documents to make them more accessible.
For the project, researchers will use optical character recognition and object detection methods to find and “extract” any tables and figure captions in the text. According to Naiman, this is something that has been done in biomedical literature but not in astronomy. After the images are extracted, they will be classified (e.g., graph, photo, picture of the sky), and the figure labels will be parsed to extract science-relevant information.
“In each step, we plan on publishing a database—to be hosted by ADS—and the code so that other folks can do the same to their ‘old’ scientific literature,” she said. “The wealth of science generated by such ‘indexing’ efforts in other STEM fields has demonstrated that we have only scratched the surface of the discoveries possible when the community has access to science-ready data collected from the literature.”
Naiman earned her PhD in astronomy and astrophysics from the University of California, Santa Cruz, and completed National Science Foundation and Institute of Theory and Computation postdoctoral fellowships at the Harvard-Smithsonian Center for Astrophysics before coming to the University of Illinois. She is a Fiddler Faculty Fellow at the National Center for Supercomputing Applications (NCSA) at Illinois.
Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.