May 19, 2022

In Australia: “Preserving Records of Endangered Languages Through Digital Archives”

From the Australian Research Council: 

Associate Professor Nick Thieberger is a linguistics academic and an Australian Research Council (ARC) Future Fellow, based at The University of Melbourne, who is digitising and archiving the records of our most at-risk indigenous languages.

Professor Thieberger and his team are building and populating the databases that hold these records, and developing methodology that will allow the public, and particularly speakers of endangered languages, to have greater access to the raw materials of language research including transcripts, word lists, and recordings made in the field.

The newest example of this work, and a direct outcome of Professor Thieberger’s Future Fellowship, is the online language archive Digital Daisy Bates, which was launched on 12 June 2018 at the National Library of Australia. The website is the result of the complete digitisation of historical documents prepared in the 1900s by Irish-Australian author and ethnographer, Daisy Bates. It contains 23,000 pages of wordlists of Australian languages mostly from the Western half of Australia, and is completely accessible and easily searchable by the public and researchers.

Professor Thieberger also engaged a Text Encoding Initiative (TEI) specialist in Australia to assist with encoding the texts.


Professor Thieberger is also the Director of a digital archiving project funded through several rounds of the ARC’s Linkage Infrastructure, Equipment and Facilities scheme—the Pacific and Regional Archive for Digital Sources in Endangered Cultures, or PARADISEC, which archives Australian researchers’ recordings from the Asia-Pacific region.

“PARADISEC provides a citable form of primary data, and a way for people to verify that the data cited in analysis actually exists. Its catalog is accessible to people on their phone, often the only means of internet access in remote areas.”


As a Chief Investigator at the ARC Centre of Excellence for the Dynamics of Language, Professor Thieberger is also overseeing the digital preservation of the enormous amounts of new language data that is being produced by researchers at the Centre.

Read the Complete Article

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.