OCLC Finishes Major Upgrade of Core WorldCat Infrastructure
From OCLC:
On June 6, OCLC completed the development work to convert the underlying structure for its WorldCat database to Apache HBase, a distributed platform in use by many global information providers, including Facebook, Adobe and Salesforce.com. This marks the conclusion of a significant technical update to the WorldCat database of more than 300 million library records and more than 2 billion library holdings that will offer new options for data analysis and faster service to libraries and their users.
The Apache Hadoop software collection is a framework that allows for the distributed processing of large data sets across clusters of computers. HBase is a top-level Apache Software Foundation project built on Hadoop that provides major data handling improvements for these very large datasets. OCLC WorldShare applications for library management, resource sharing, metadata and discovery rely on access to a variety of large and growing datasets, including the WorldCat database.
[Clip]
Hadoop provides these enhancements, in part, by scaling data services across hundreds or even thousands of computers, each with several processor cores. This efficiently distributes large amounts of work across a set of machines, allowing for greater flexibility, speed and dependability. OCLC is running Hadoop across more than 150 servers in three clusters.
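For readers curious what "distributing work across a set of machines" looks like in practice, here is a minimal, single-process Python sketch of the general partition, count, and merge pattern. It is an illustration only: it does not use Hadoop or HBase, it is not OCLC's code, and the record IDs, library names, and function names are all hypothetical.

```python
import zlib
from collections import Counter


def partition(records, num_nodes):
    """Assign each (record_id, library) pair to a simulated node by hashing the record ID."""
    nodes = [[] for _ in range(num_nodes)]
    for record_id, library in records:
        # crc32 gives a stable hash, so the same record always lands on the same node.
        index = zlib.crc32(record_id.encode("utf-8")) % num_nodes
        nodes[index].append((record_id, library))
    return nodes


def count_holdings(node_records):
    """Per-node step: count the holdings for the records this node owns."""
    return Counter(record_id for record_id, _ in node_records)


def merge_counts(partial_counts):
    """Combine step: add the per-node counts into one overall result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


# Hypothetical sample data: which library holds which record.
holdings = [
    ("rec-1", "Library A"),
    ("rec-2", "Library A"),
    ("rec-1", "Library B"),
    ("rec-3", "Library C"),
    ("rec-1", "Library C"),
]

nodes = partition(holdings, num_nodes=3)
totals = merge_counts(count_holdings(node) for node in nodes)
print(totals["rec-1"])  # number of libraries holding rec-1
```

In a real cluster each simulated node would be a separate server working in parallel, and the framework would handle data placement and the recovery of failed machines; the sketch only shows why the final counts do not depend on how the records were split up.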
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.